linux-arm-kernel.lists.infradead.org archive mirror
* [RFC/RFT 0/2] ARM: mm: Introduce arch hooks for dma address translation
@ 2014-02-03 23:28 Santosh Shilimkar
       [not found] ` <1391470107-15927-3-git-send-email-santosh.shilimkar@ti.com>
       [not found] ` <1391470107-15927-2-git-send-email-santosh.shilimkar@ti.com>
  0 siblings, 2 replies; 15+ messages in thread
From: Santosh Shilimkar @ 2014-02-03 23:28 UTC (permalink / raw)
  To: linux-arm-kernel

Currently, arch-specific DMA address translation routines can be enabled
only through defines in "mach/memory.h", which makes it impossible to use
them with multi-platform, single-zImage builds.

Hence, introduce arch-specific hooks for the DMA address translation
routines so that they are compatible with multi-platform builds. If an
architecture does not install the hooks, the DMA address translation
routines fall back to the existing implementation.

The series updates the existing machines omap1, ks8695 and iop13xx to
use the new DMA hooks.
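
The fallback behaviour described above can be modelled outside the kernel
roughly as follows. This is only a userspace sketch under stated
assumptions: the struct device argument is dropped, PAGE_SHIFT is taken
as 12, and example_pfn_to_dma() with its 0x10000000 offset is a made-up
machine, not one of the platforms touched by the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12            /* 4 KiB pages, assumed for the sketch */
typedef uint64_t dma_addr_t;     /* userspace stand-in for the kernel type */

/* Default translation: the DMA address equals the physical address. */
dma_addr_t default_pfn_to_dma(unsigned long pfn)
{
	return (dma_addr_t)pfn << PAGE_SHIFT;
}

/* Hypothetical machine whose bus adds a fixed 0x10000000 offset. */
dma_addr_t example_pfn_to_dma(unsigned long pfn)
{
	return ((dma_addr_t)pfn << PAGE_SHIFT) + 0x10000000ULL;
}

/* The hook starts out at the default; machine init code may replace it. */
dma_addr_t (*arch_pfn_to_dma)(unsigned long pfn) = default_pfn_to_dma;

/* Callers always go through the hook, so the choice is made at run time. */
dma_addr_t pfn_to_dma(unsigned long pfn)
{
	assert(arch_pfn_to_dma != NULL);
	return arch_pfn_to_dma(pfn);
}
```

Because the pointer is resolved at run time rather than by a define, one
image can carry both the default and any machine-specific variant.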

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Olof Johansson <olof@lixom.net>
CC: Grygorii Strashko <grygorii.strashko@ti.com>

Santosh Shilimkar (2):
  ARM: mm: introduce arch hooks for dma address translation routines
  ARM: keystone: Install hooks for dma address translation routines

 arch/arm/include/asm/dma-mapping.h          |   25 +++--------
 arch/arm/mach-iop13xx/include/mach/memory.h |   61 ---------------------------
 arch/arm/mach-iop13xx/setup.c               |   58 +++++++++++++++++++++++++
 arch/arm/mach-keystone/keystone.c           |   31 ++++++++++++++
 arch/arm/mach-ks8695/cpu.c                  |   50 ++++++++++++++++++++++
 arch/arm/mach-ks8695/include/mach/memory.h  |   33 ---------------
 arch/arm/mach-omap1/include/mach/memory.h   |   39 -----------------
 arch/arm/mach-omap1/io.c                    |   52 +++++++++++++++++++++++
 arch/arm/mm/dma-mapping.c                   |   29 +++++++++++++
 9 files changed, 226 insertions(+), 152 deletions(-)

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC/RFT 2/2] ARM: keystone: Install hooks for dma address translation routines
       [not found] ` <1391470107-15927-3-git-send-email-santosh.shilimkar@ti.com>
@ 2014-02-04  2:05   ` Olof Johansson
  2014-02-04 14:30     ` Santosh Shilimkar
  0 siblings, 1 reply; 15+ messages in thread
From: Olof Johansson @ 2014-02-04  2:05 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Mon, Feb 3, 2014 at 3:28 PM, Santosh Shilimkar
<santosh.shilimkar@ti.com> wrote:
> Keystone platforms have their physical memory mapped at an address
> outside the 32-bit physical range.  A Keystone machine with 16G of RAM
> would find its memory at 0x0800000000 - 0x0bffffffff.
> The system interconnect allows DMA transfers from the first 2G of
> physical memory (0x08 0000 0000 to 0x08 7FFF FFFF), which is aliased in
> hardware to the 32-bit addressable space (0x80000000 - 0xffffffff),
> because the DMA hardware supports only 32-bit addressing.
>
> Hence, add arch hooks for dma address translation routines.
>
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Olof Johansson <olof@lixom.net>
> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
> ---
>  arch/arm/mach-keystone/keystone.c |   31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
>
> diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
> index 1b43a27..54dae03 100644
> --- a/arch/arm/mach-keystone/keystone.c
> +++ b/arch/arm/mach-keystone/keystone.c
> @@ -14,6 +14,7 @@
>  #include <linux/init.h>
>  #include <linux/of_platform.h>
>  #include <linux/of_address.h>
> +#include <linux/dma-mapping.h>
>
>  #include <asm/setup.h>
>  #include <asm/mach/map.h>
> @@ -53,6 +54,28 @@ static phys_addr_t keystone_virt_to_idmap(unsigned long x)
>         return (phys_addr_t)(x) - CONFIG_PAGE_OFFSET + KEYSTONE_LOW_PHYS_START;
>  }
>
> +static unsigned long keystone_dma_pfn_offset __read_mostly;
> +
> +static dma_addr_t keystone_pfn_to_dma(struct device *dev, unsigned long pfn)
> +{
> +       return PFN_PHYS(pfn - keystone_dma_pfn_offset);
> +}
> +
> +static unsigned long keystone_dma_to_pfn(struct device *dev, dma_addr_t addr)
> +{
> +       return PFN_DOWN(addr) + keystone_dma_pfn_offset;
> +}
> +
> +static void *keystone_dma_to_virt(struct device *dev, dma_addr_t addr)
> +{
> +       return phys_to_virt(addr + PFN_PHYS(keystone_dma_pfn_offset));
> +}
> +
> +static dma_addr_t keystone_virt_to_dma(struct device *dev, void *addr)
> +{
> +       return virt_to_phys(addr) - PFN_PHYS(keystone_dma_pfn_offset);
> +}
> +
>  static void __init keystone_init_meminfo(void)
>  {
>         bool lpae = IS_ENABLED(CONFIG_ARM_LPAE);
> @@ -89,6 +112,14 @@ static void __init keystone_init_meminfo(void)
>         /* Populate the arch idmap hook */
>         arch_virt_to_idmap = keystone_virt_to_idmap;
>
> +       /* Populate the arch DMA hooks */
> +       keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
> +                                          KEYSTONE_LOW_PHYS_START);
> +       __arch_pfn_to_dma = keystone_pfn_to_dma;
> +       __arch_dma_to_pfn = keystone_dma_to_pfn;
> +       __arch_dma_to_virt = keystone_dma_to_virt;
> +       __arch_virt_to_dma = keystone_virt_to_dma;

Is this truly a static window, or is it going through an IOMMU that
just happens to have an identity mapping setup per default?

PPC servers use "ibm,dma-window" to describe the assigned dma address
space for busses/devices, but the window itself doesn't contain any
information about the physical address mapping (since it goes through
an iommu after that). It likely doesn't fit this particular use case,
but it's something we should look at as a base in case we need to
start looking at bindings for this instead of coding it per SoC. We'll
know more once we've seen what a few of the implementations out there
are.


-Olof


* [RFC/RFT 1/2] ARM: mm: introduce arch hooks for dma address translation routines
       [not found] ` <1391470107-15927-2-git-send-email-santosh.shilimkar@ti.com>
@ 2014-02-04  2:18   ` Olof Johansson
  2014-02-04 14:33     ` Santosh Shilimkar
  2014-02-04 16:15   ` Arnd Bergmann
  1 sibling, 1 reply; 15+ messages in thread
From: Olof Johansson @ 2014-02-04  2:18 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,


On Mon, Feb 3, 2014 at 3:28 PM, Santosh Shilimkar
<santosh.shilimkar@ti.com> wrote:
> Currently, arch-specific DMA address translation routines can be enabled
> only through defines, which makes it impossible to use them with
> multi-platform builds.
>
> Hence, introduce arch specific hooks for DMA address translations
> routines to be compatible with multi-platform builds:
> dma_addr_t (*arch_pfn_to_dma)(struct device *dev, unsigned long pfn);
> unsigned long (*arch_dma_to_pfn)(struct device *dev, dma_addr_t addr);
> void* (*arch_dma_to_virt)(struct device *dev, dma_addr_t addr);
> dma_addr_t (*arch_virt_to_dma)(struct device *dev, void *addr);
>
> If an architecture does not use them, the DMA address translation
> routines fall back to the existing implementation.
>
> Also, modify machines omap1, ks8695, iop13xx to use new DMA hooks.

[...]

> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index e701a4d..84acc46 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -55,28 +55,16 @@ static inline int dma_set_mask(struct device *dev, u64 mask)
>   * functions used internally by the DMA-mapping API to provide DMA
>   * addresses. They must not be used by drivers.
>   */
> -#ifndef __arch_pfn_to_dma
> -static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
> -{
> -       return (dma_addr_t)__pfn_to_bus(pfn);
> -}
>
> -static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
> -{
> -       return __bus_to_pfn(addr);
> -}
> +extern dma_addr_t (*__arch_pfn_to_dma)(struct device *dev, unsigned long pfn);
> +extern unsigned long (*__arch_dma_to_pfn)(struct device *dev, dma_addr_t addr);
> +extern void* (*__arch_dma_to_virt)(struct device *dev, dma_addr_t addr);
> +extern dma_addr_t (*__arch_virt_to_dma)(struct device *dev, void *addr);

I tend to prefer having these kinds of function pointers grouped in a
struct instead of in the top-level namespace like this. It also allows
you to use a set_<foo>_ops() interface, and it reduces exposed/exported
internals since only the global struct pointer has to be exported.
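
Something along these lines is what I have in mind, as a rough userspace
sketch rather than a proposal for the final interface. The names
dma_xlate_ops, set_dma_xlate_ops() and the example_* override are all
hypothetical, the struct device argument is left out, and only two of the
four routines are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
typedef uint64_t dma_addr_t;     /* userspace stand-in for the kernel type */

/* Hypothetical ops table; the name and layout are not from the patch. */
struct dma_xlate_ops {
	dma_addr_t (*pfn_to_dma)(unsigned long pfn);
	unsigned long (*dma_to_pfn)(dma_addr_t addr);
};

dma_addr_t identity_pfn_to_dma(unsigned long pfn)
{
	return (dma_addr_t)pfn << PAGE_SHIFT;
}

unsigned long identity_dma_to_pfn(dma_addr_t addr)
{
	return (unsigned long)(addr >> PAGE_SHIFT);
}

/* Example override: bus addresses sit 0x10000 pages above physical ones. */
#define EXAMPLE_PFN_OFFSET 0x10000UL

dma_addr_t example_pfn_to_dma(unsigned long pfn)
{
	return (dma_addr_t)(pfn + EXAMPLE_PFN_OFFSET) << PAGE_SHIFT;
}

unsigned long example_dma_to_pfn(dma_addr_t addr)
{
	return (unsigned long)(addr >> PAGE_SHIFT) - EXAMPLE_PFN_OFFSET;
}

const struct dma_xlate_ops identity_ops = {
	.pfn_to_dma = identity_pfn_to_dma,
	.dma_to_pfn = identity_dma_to_pfn,
};

const struct dma_xlate_ops example_ops = {
	.pfn_to_dma = example_pfn_to_dma,
	.dma_to_pfn = example_dma_to_pfn,
};

/* Only this pointer is global state; it is private to the file. */
static const struct dma_xlate_ops *current_ops = &identity_ops;

/* Single entry point for machines; NULL restores the default table. */
void set_dma_xlate_ops(const struct dma_xlate_ops *ops)
{
	current_ops = ops ? ops : &identity_ops;
}

dma_addr_t pfn_to_dma(unsigned long pfn)
{
	return current_ops->pfn_to_dma(pfn);
}

unsigned long dma_to_pfn(dma_addr_t addr)
{
	return current_ops->dma_to_pfn(addr);
}
```

With this shape a machine installs all of its routines in one call, and
nothing but the setter and the wrappers needs to be visible outside the
file.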

>
> -static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
> -{
> -       return (void *)__bus_to_virt((unsigned long)addr);
> -}

I don't actually see any in-tree users of dma_to_virt(). It can
probably be removed.

> -static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
> -{
> -       return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
> -}
> +/* Keep __arch_pfn_to_dma defined as it's used by some drivers (V4L2)*/
> +#define __arch_pfn_to_dma __arch_pfn_to_dma

Ick. The v4l driver should be fixed. Marek?


[...]

> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 5c43ca5..74111bf 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -39,6 +39,35 @@
>
>  #include "mm.h"
>
> +static inline dma_addr_t __pfn_to_dma(struct device *dev, unsigned long pfn)
> +{
> +       return (dma_addr_t)__pfn_to_bus(pfn);
> +}
> +
> +static inline unsigned long __dma_to_pfn(struct device *dev, dma_addr_t addr)
> +{
> +       return __bus_to_pfn(addr);
> +}
> +
> +static inline void *__dma_to_virt(struct device *dev, dma_addr_t addr)
> +{
> +       return (void *)__bus_to_virt((unsigned long)addr);
> +}
> +
> +static inline dma_addr_t __virt_to_dma(struct device *dev, void *addr)
> +{
> +       return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
> +}
> +
> +dma_addr_t (*__arch_pfn_to_dma)(struct device *dev, unsigned long pfn) = __pfn_to_dma;
> +EXPORT_SYMBOL(__arch_pfn_to_dma);
> +unsigned long (*__arch_dma_to_pfn)(struct device *dev, dma_addr_t addr) = __dma_to_pfn;
> +EXPORT_SYMBOL(__arch_dma_to_pfn);
> +void* (*__arch_dma_to_virt)(struct device *dev, dma_addr_t addr) = __dma_to_virt;
> +EXPORT_SYMBOL(__arch_dma_to_virt);
> +dma_addr_t (*__arch_virt_to_dma)(struct device *dev, void *addr) = __virt_to_dma;
> +EXPORT_SYMBOL(__arch_virt_to_dma);

Independent of whether someone objects to my preference of exporting a
struct, these (or that struct pointer) should probably be
EXPORT_SYMBOL_GPL().


-Olof


* [RFC/RFT 2/2] ARM: keystone: Install hooks for dma address translation routines
  2014-02-04  2:05   ` [RFC/RFT 2/2] ARM: keystone: Install hooks for dma address translation routines Olof Johansson
@ 2014-02-04 14:30     ` Santosh Shilimkar
  2014-02-04 16:01       ` Arnd Bergmann
  0 siblings, 1 reply; 15+ messages in thread
From: Santosh Shilimkar @ 2014-02-04 14:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Monday 03 February 2014 09:05 PM, Olof Johansson wrote:
> Hi,
> 
> On Mon, Feb 3, 2014 at 3:28 PM, Santosh Shilimkar
> <santosh.shilimkar@ti.com> wrote:
>> Keystone platforms have their physical memory mapped at an address
>> outside the 32-bit physical range.  A Keystone machine with 16G of RAM
>> would find its memory at 0x0800000000 - 0x0bffffffff.
>> The system interconnect allows to perform DMA transfers from first 2G of
>> physical memory (0x08 0000 0000 to 08 7FFF FFFF) which aliased in
>> hardware to the 32-bit addressable space (0x80000000 - 0xffffffff),
>> because DMA HW supports only 32-bits addressing.
>>
>> Hence, add arch hooks for dma address translation routines.
>>
>> Cc: Russell King <linux@arm.linux.org.uk>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Cc: Olof Johansson <olof@lixom.net>
>> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
>> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
>> ---
>>  arch/arm/mach-keystone/keystone.c |   31 +++++++++++++++++++++++++++++++
>>  1 file changed, 31 insertions(+)
>>
>> diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
>> index 1b43a27..54dae03 100644
>> --- a/arch/arm/mach-keystone/keystone.c
>> +++ b/arch/arm/mach-keystone/keystone.c
>> @@ -14,6 +14,7 @@
>>  #include <linux/init.h>
>>  #include <linux/of_platform.h>
>>  #include <linux/of_address.h>
>> +#include <linux/dma-mapping.h>
>>
>>  #include <asm/setup.h>
>>  #include <asm/mach/map.h>
>> @@ -53,6 +54,28 @@ static phys_addr_t keystone_virt_to_idmap(unsigned long x)
>>         return (phys_addr_t)(x) - CONFIG_PAGE_OFFSET + KEYSTONE_LOW_PHYS_START;
>>  }
>>
>> +static unsigned long keystone_dma_pfn_offset __read_mostly;
>> +
>> +static dma_addr_t keystone_pfn_to_dma(struct device *dev, unsigned long pfn)
>> +{
>> +       return PFN_PHYS(pfn - keystone_dma_pfn_offset);
>> +}
>> +
>> +static unsigned long keystone_dma_to_pfn(struct device *dev, dma_addr_t addr)
>> +{
>> +       return PFN_DOWN(addr) + keystone_dma_pfn_offset;
>> +}
>> +
>> +static void *keystone_dma_to_virt(struct device *dev, dma_addr_t addr)
>> +{
>> +       return phys_to_virt(addr + PFN_PHYS(keystone_dma_pfn_offset));
>> +}
>> +
>> +static dma_addr_t keystone_virt_to_dma(struct device *dev, void *addr)
>> +{
>> +       return virt_to_phys(addr) - PFN_PHYS(keystone_dma_pfn_offset);
>> +}
>> +
>>  static void __init keystone_init_meminfo(void)
>>  {
>>         bool lpae = IS_ENABLED(CONFIG_ARM_LPAE);
>> @@ -89,6 +112,14 @@ static void __init keystone_init_meminfo(void)
>>         /* Populate the arch idmap hook */
>>         arch_virt_to_idmap = keystone_virt_to_idmap;
>>
>> +       /* Populate the arch DMA hooks */
>> +       keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
>> +                                          KEYSTONE_LOW_PHYS_START);
>> +       __arch_pfn_to_dma = keystone_pfn_to_dma;
>> +       __arch_dma_to_pfn = keystone_dma_to_pfn;
>> +       __arch_dma_to_virt = keystone_dma_to_virt;
>> +       __arch_virt_to_dma = keystone_virt_to_dma;
> 
> Is this truly a static window, or is it going through an IOMMU that
> just happens to have an identity mapping setup per default?
> 
It's a true physical static window, hardwired in hardware. No IOMMU is
involved.
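
To make the arithmetic of the window concrete, here is a small userspace
sketch of the same offset calculation. The two base addresses are the
ones from the commit message; the helper names phys_to_dma() and
dma_to_phys(), the local typedefs and PAGE_SHIFT = 12 are assumptions
made only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define LOW_PHYS_START   0x080000000ULL  /* 32-bit alias seen by DMA masters */
#define HIGH_PHYS_START  0x800000000ULL  /* where the RAM is actually mapped */
#define WINDOW_SIZE      0x080000000ULL  /* only the first 2G is aliased */

/* Same computation as keystone_dma_pfn_offset in the patch. */
#define DMA_PFN_OFFSET ((HIGH_PHYS_START - LOW_PHYS_START) >> PAGE_SHIFT)

typedef uint64_t phys_addr_t;    /* userspace stand-ins for kernel types */
typedef uint32_t dma_addr_t;

dma_addr_t phys_to_dma(phys_addr_t phys)
{
	/* Addresses outside the aliased 2G have no DMA address at all. */
	assert(phys >= HIGH_PHYS_START && phys - HIGH_PHYS_START < WINDOW_SIZE);
	return (dma_addr_t)(phys - (DMA_PFN_OFFSET << PAGE_SHIFT));
}

phys_addr_t dma_to_phys(dma_addr_t dma)
{
	assert(dma >= LOW_PHYS_START);
	return (phys_addr_t)dma + (DMA_PFN_OFFSET << PAGE_SHIFT);
}
```

With these values the offset is 0x780000 pages, so physical
0x08 0000 0000 corresponds to DMA address 0x80000000 and physical
0x08 7FFF FFFF to 0xffffffff.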

> PPC servers use "ibm,dma-window" to describe the assigned dma address
> space for busses/devices, but the window itself doesn't contain any
> information about the physical address mapping (since it goes through
> an iommu after that). It likely doesn't fit this particular use case,
> but it's something we should look at as a base in case we need to
> start looking at bindings for this instead of coding it per SoC. We'll
> know more once we've seen what a few of the implementations out there
> are.
> 
Understood.

Regards,
Santosh


* [RFC/RFT 1/2] ARM: mm: introduce arch hooks for dma address translation routines
  2014-02-04  2:18   ` [RFC/RFT 1/2] ARM: mm: introduce arch " Olof Johansson
@ 2014-02-04 14:33     ` Santosh Shilimkar
  0 siblings, 0 replies; 15+ messages in thread
From: Santosh Shilimkar @ 2014-02-04 14:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Monday 03 February 2014 09:18 PM, Olof Johansson wrote:
> Hi,
> 
> 
> On Mon, Feb 3, 2014 at 3:28 PM, Santosh Shilimkar
> <santosh.shilimkar@ti.com> wrote:
>> Currently arch specific DMA address translation routines can be enabled
>> using only defines which makes impossible to use them in with
>> multi-platform builds.
>>
>> Hence, introduce arch specific hooks for DMA address translations
>> routines to be compatible with multi-platform builds:
>> dma_addr_t (*arch_pfn_to_dma)(struct device *dev, unsigned long pfn);
>> unsigned long (*arch_dma_to_pfn)(struct device *dev, dma_addr_t addr);
>> void* (*arch_dma_to_virt)(struct device *dev, dma_addr_t addr);
>> dma_addr_t (*arch_virt_to_dma)(struct device *dev, void *addr);
>>
>> In case if architecture won't use it - DMA address translation routines
>> will fall-back to existing implementation.
>>
>> Also, modify machines omap1, ks8695, iop13xx to use new DMA hooks.
> 
> [...]
> 
>> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
>> index e701a4d..84acc46 100644
>> --- a/arch/arm/include/asm/dma-mapping.h
>> +++ b/arch/arm/include/asm/dma-mapping.h
>> @@ -55,28 +55,16 @@ static inline int dma_set_mask(struct device *dev, u64 mask)
>>   * functions used internally by the DMA-mapping API to provide DMA
>>   * addresses. They must not be used by drivers.
>>   */
>> -#ifndef __arch_pfn_to_dma
>> -static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
>> -{
>> -       return (dma_addr_t)__pfn_to_bus(pfn);
>> -}
>>
>> -static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
>> -{
>> -       return __bus_to_pfn(addr);
>> -}
>> +extern dma_addr_t (*__arch_pfn_to_dma)(struct device *dev, unsigned long pfn);
>> +extern unsigned long (*__arch_dma_to_pfn)(struct device *dev, dma_addr_t addr);
>> +extern void* (*__arch_dma_to_virt)(struct device *dev, dma_addr_t addr);
>> +extern dma_addr_t (*__arch_virt_to_dma)(struct device *dev, void *addr);
> 
> I tend to prefer having these kind of function pointers grouped in a
> struct instead of in the toplevel namespace like this. It allows you
> to use a set_<foo>_ops() interface too instead and reduces
> exposed/exported internals since only the global struct pointer has to
> be exported.
> 
Agreed.

[..]
 
>> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
>> index 5c43ca5..74111bf 100644
>> --- a/arch/arm/mm/dma-mapping.c
>> +++ b/arch/arm/mm/dma-mapping.c
>> @@ -39,6 +39,35 @@
>>
>>  #include "mm.h"
>>
>> +static inline dma_addr_t __pfn_to_dma(struct device *dev, unsigned long pfn)
>> +{
>> +       return (dma_addr_t)__pfn_to_bus(pfn);
>> +}
>> +
>> +static inline unsigned long __dma_to_pfn(struct device *dev, dma_addr_t addr)
>> +{
>> +       return __bus_to_pfn(addr);
>> +}
>> +
>> +static inline void *__dma_to_virt(struct device *dev, dma_addr_t addr)
>> +{
>> +       return (void *)__bus_to_virt((unsigned long)addr);
>> +}
>> +
>> +static inline dma_addr_t __virt_to_dma(struct device *dev, void *addr)
>> +{
>> +       return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
>> +}
>> +
>> +dma_addr_t (*__arch_pfn_to_dma)(struct device *dev, unsigned long pfn) = __pfn_to_dma;
>> +EXPORT_SYMBOL(__arch_pfn_to_dma);
>> +unsigned long (*__arch_dma_to_pfn)(struct device *dev, dma_addr_t addr) = __dma_to_pfn;
>> +EXPORT_SYMBOL(__arch_dma_to_pfn);
>> +void* (*__arch_dma_to_virt)(struct device *dev, dma_addr_t addr) = __dma_to_virt;
>> +EXPORT_SYMBOL(__arch_dma_to_virt);
>> +dma_addr_t (*__arch_virt_to_dma)(struct device *dev, void *addr) = __virt_to_dma;
>> +EXPORT_SYMBOL(__arch_virt_to_dma);
> 
> Independent on whether someone objects to my preference of exporting a
> struct, these (or that struct pointer) should probably be
> EXPORT_SYMBOL_GPL().
> 
Sure.

Regards,
Santosh


* [RFC/RFT 2/2] ARM: keystone: Install hooks for dma address translation routines
  2014-02-04 14:30     ` Santosh Shilimkar
@ 2014-02-04 16:01       ` Arnd Bergmann
  2014-02-04 16:22         ` Olof Johansson
  0 siblings, 1 reply; 15+ messages in thread
From: Arnd Bergmann @ 2014-02-04 16:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 04 February 2014, Santosh Shilimkar wrote:
> > PPC servers use "ibm,dma-window" to describe the assigned dma address
> > space for busses/devices, but the window itself doesn't contain any
> > information about the physical address mapping (since it goes through
> > an iommu after that). It likely doesn't fit this particular use case,
> > but it's something we should look at as a base in case we need to
> > start looking at bindings for this instead of coding it per SoC. We'll
> > know more once we've seen what a few of the implementations out there
> > are.
> > 
> Understood.

I think you are looking for the "dma-ranges" property, which describes
how a device DMA address space maps into the parent bus address space
for inbound translations. It's not used much in Linux, but it is clearly
specified. The "ibm,dma-window" property OTOH is for the corner case
that you have a small per-partition DMA address space section, which is
not how things are done on most systems these days.
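
As a rough illustration of what a single "dma-ranges" entry expresses,
each entry is a (child address, parent address, length) triple, and the
translation is a bounds check plus an offset. The sketch below is plain
userspace C; the struct, the function names and the ADDR_INVALID sentinel
are made up for the example, and the values in example_range are the
Keystone window from this thread.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_INVALID UINT64_MAX  /* sentinel: address is outside the range */

/* One decoded entry; field names are illustrative only. */
struct dma_range {
	uint64_t child_base;     /* address as issued by the DMA master */
	uint64_t parent_base;    /* same location in the parent bus space */
	uint64_t size;
};

/* The 2G Keystone alias described earlier in the thread. */
const struct dma_range example_range = {
	.child_base  = 0x080000000ULL,
	.parent_base = 0x800000000ULL,
	.size        = 0x080000000ULL,
};

/* Inbound direction: device address to parent bus address. */
uint64_t dma_to_parent(const struct dma_range *r, uint64_t child)
{
	if (child < r->child_base || child - r->child_base >= r->size)
		return ADDR_INVALID;
	return r->parent_base + (child - r->child_base);
}

/* Reverse direction: which address must the device be given? */
uint64_t parent_to_dma(const struct dma_range *r, uint64_t parent)
{
	if (parent < r->parent_base || parent - r->parent_base >= r->size)
		return ADDR_INVALID;
	return r->child_base + (parent - r->parent_base);
}
```

Memory above the first 2G simply has no entry, so parent_to_dma() reports
it as unreachable instead of producing a wrapped address.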

	Arnd


* [RFC/RFT 1/2] ARM: mm: introduce arch hooks for dma address translation routines
       [not found] ` <1391470107-15927-2-git-send-email-santosh.shilimkar@ti.com>
  2014-02-04  2:18   ` [RFC/RFT 1/2] ARM: mm: introduce arch " Olof Johansson
@ 2014-02-04 16:15   ` Arnd Bergmann
  2014-02-04 16:38     ` Santosh Shilimkar
  1 sibling, 1 reply; 15+ messages in thread
From: Arnd Bergmann @ 2014-02-04 16:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 04 February 2014, Santosh Shilimkar wrote:
> Currently arch specific DMA address translation routines can be enabled
> using only defines which makes impossible to use them in with
> multi-platform builds.
> 
> Hence, introduce arch specific hooks for DMA address translations
> routines to be compatible with multi-platform builds:
> dma_addr_t (*arch_pfn_to_dma)(struct device *dev, unsigned long pfn);
> unsigned long (*arch_dma_to_pfn)(struct device *dev, dma_addr_t addr);
> void* (*arch_dma_to_virt)(struct device *dev, dma_addr_t addr);
> dma_addr_t (*arch_virt_to_dma)(struct device *dev, void *addr);
> 
> In case if architecture won't use it - DMA address translation routines
> will fall-back to existing implementation.
>
> Also, modify machines omap1, ks8695, iop13xx to use new DMA hooks.

I think this is going in the wrong direction. DMA translation is not
at all a platform-specific thing, but rather bus-specific. The most
common scenario is that you have some 64-bit capable buses and some
buses that are limited to 32-bit DMA (or less if you are unfortunate).

We also can't rely on {pfn,phys,virt}_to_{bus,dma} and the reverse
to work anywhere outside of the dma_map_ops implementation, because
of IOMMUs in-between.

Of course we do need a proper solution for this problem, but we
can't make it a per-platform decision, and whatever the solution is
needs to take into account both nontrivial linear mappings (offset
or cropped) and IOMMUs, and set the appropriate dma_map_ops for
the device.

I guess for the legacy cases (omap1, iop13xx, ks8695), we can
hardcode dma_map_ops for all devices to get this right. For everything
else, I'd suggest defaulting to the arm_dma_ops unless we get
other information from DT. This means we have to create standardized
properties to handle any combination of these:

1. DMA is coherent
2. DMA space is offset from phys space
3. DMA space is smaller than 32-bit
4. DMA space is larger than 32-bit
5. DMA goes through an IOMMU

The dma-ranges property can deal with 2-4. Highbank already introduced
a "dma-coherent" flag for 1, and we can decide to generalize that.
I don't know what the state of IOMMU support is, but we have to come
up with something better than what we had on PowerPC, because we now
have to deal with a combination of different IOMMUs in the same system,
whereas the most complex case on PowerPC was some devices all going
through one IOMMU and the other devices being linearly mapped.
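
To illustrate how the five points could combine in a per-device
description, here is a userspace sketch of the linear (non-IOMMU) case.
Everything in it is hypothetical: the struct, its field names, the
example devices and the DMA_ADDR_INVALID sentinel are invented for the
illustration and do not correspond to an existing interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DMA_ADDR_INVALID UINT64_MAX

/* Hypothetical per-device description covering points 1-5. */
struct dev_dma_desc {
	bool coherent;           /* 1: no cache maintenance needed */
	bool behind_iommu;       /* 5: the linear rule below does not apply */
	uint64_t phys_offset;    /* 2: phys address minus DMA address */
	uint64_t dma_mask;       /* 3/4: highest DMA address the device emits */
};

/* Map [phys, phys + len) linearly, or report that it is unreachable. */
uint64_t linear_map(const struct dev_dma_desc *d, uint64_t phys, uint64_t len)
{
	uint64_t dma;

	if (d->behind_iommu || len == 0 || phys < d->phys_offset)
		return DMA_ADDR_INVALID;
	dma = phys - d->phys_offset;
	/* The last byte of the buffer must still fit under the mask. */
	if (dma > d->dma_mask || len - 1 > d->dma_mask - dma)
		return DMA_ADDR_INVALID;
	return dma;
}

/* Example devices, one per interesting combination. */
const struct dev_dma_desc offset_dev = {
	.phys_offset = 0x780000000ULL, .dma_mask = 0xFFFFFFFFULL,
};
const struct dev_dma_desc narrow_dev = { .dma_mask = 0x00FFFFFFULL };
const struct dev_dma_desc wide_dev = {
	.coherent = true, .dma_mask = 0xFFFFFFFFFFULL,
};
const struct dev_dma_desc iommu_dev = {
	.behind_iommu = true, .dma_mask = 0xFFFFFFFFULL,
};
```

The coherent flag does not affect the address calculation; it is carried
along only because it would have to live in the same description.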

	Arnd


* [RFC/RFT 2/2] ARM: keystone: Install hooks for dma address translation routines
  2014-02-04 16:01       ` Arnd Bergmann
@ 2014-02-04 16:22         ` Olof Johansson
  0 siblings, 0 replies; 15+ messages in thread
From: Olof Johansson @ 2014-02-04 16:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 4, 2014 at 8:01 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Tuesday 04 February 2014, Santosh Shilimkar wrote:
>> > PPC servers use "ibm,dma-window" to describe the assigned dma address
>> > space for busses/devices, but the window itself doesn't contain any
>> > information about the physical address mapping (since it goes through
>> > an iommu after that). It likely doesn't fit this particular use case,
>> > but it's something we should look at as a base in case we need to
>> > start looking at bindings for this instead of coding it per SoC. We'll
>> > know more once we've seen what a few of the implementations out there
>> > are.
>> >
>> Understood.
>
> I think you are looking for the "dma-ranges" property, which describes
> how a device DMA address space maps into the parent bus address space
> for inbound translations. It's not used much in Linux, but it is clearly
> specified. The "ibm,dma-window" property OTOH is for the corner case
> that you have a small per-partition DMA address space section, which is
> not how things are done on most systems these days.

Ah, that might very well be the case. And it looks like dma-ranges
handles this case already. At least based on the first draft proposal
for dma-ranges that I came across. :)


-Olof


* [RFC/RFT 1/2] ARM: mm: introduce arch hooks for dma address translation routines
  2014-02-04 16:15   ` Arnd Bergmann
@ 2014-02-04 16:38     ` Santosh Shilimkar
  2014-02-04 17:04       ` Arnd Bergmann
  0 siblings, 1 reply; 15+ messages in thread
From: Santosh Shilimkar @ 2014-02-04 16:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 04 February 2014 11:15 AM, Arnd Bergmann wrote:
> On Tuesday 04 February 2014, Santosh Shilimkar wrote:
>> Currently arch specific DMA address translation routines can be enabled
>> using only defines which makes impossible to use them in with
>> multi-platform builds.
>>
>> Hence, introduce arch specific hooks for DMA address translations
>> routines to be compatible with multi-platform builds:
>> dma_addr_t (*arch_pfn_to_dma)(struct device *dev, unsigned long pfn);
>> unsigned long (*arch_dma_to_pfn)(struct device *dev, dma_addr_t addr);
>> void* (*arch_dma_to_virt)(struct device *dev, dma_addr_t addr);
>> dma_addr_t (*arch_virt_to_dma)(struct device *dev, void *addr);
>>
>> In case if architecture won't use it - DMA address translation routines
>> will fall-back to existing implementation.
>>
>> Also, modify machines omap1, ks8695, iop13xx to use new DMA hooks.
> 
> I think this is going into a wrong direction. DMA translation is not
> at all a platform-specific thing, but rather bus specific. The most
> common scenario is that you have some 64-bit capable buses and some
> buses that are limited to 32-bit DMA (or less if you are unfortunate).
>
I may be wrong, but you could have a 64-bit bus with 32-bit DMA
controllers. That is one of the cases I am dealing with.
 
> We also can't rely on {pfn,phys,virt}_to_{bus,dma} and the reverse
> to work anywhere outside of the dma_map_ops implementation, because
> of IOMMUs in-between.
> 
> Of course we do need a proper solution for this problem, but we
> can't make it a per-platform decision, and whatever the solution is
> needs to take into account both nontrivial linear mappings (offset
> or cropped) and IOMMUs, and set the appropriate dma_map_ops for
> the device.
> 
> I guess for the legacy cases (omap1, iop13xx, ks8695), we can
> hardcode dma_map_ops for all devices to get this right. For everything
> else, I'd suggest defaulting to the arm_dma_ops unless we get
> other information from DT. This means we have to create standardized
> properties to handle any combination of these:
> 
That's the case, and the $subject series doesn't change that.

> 1. DMA is coherent
> 2. DMA space is offset from phys space
> 3. DMA space is smaller than 32-bit
> 4. DMA space is larger than 32-bit
> 5. DMA goes through an IOMMU
> 
> The dma-ranges property can deal with 2-4. Highbank already introduced
> a "dma-coherent" flag for 1, and we can decide to generalize that.
> I don't know what the state of IOMMU support is, but we have to come
> up with something better than what we had on PowerPC, because we now
> have to deal with a combination of different IOMMUs in the same system,
> whereas the most complex case on PowerPC was some devices all going
> through one IOMMU and the other devices being linearly mapped.
> 
Just to be clear, the patch set is not fiddling with dma_ops as such.
The dma_ops need a few accessors to convert addresses, and these accessors
differ on a few platforms, hence they need to be patched.

We will look at "dma-ranges" to see if it can address my case.
Thanks for the pointer.

Regards,
Santosh
 


* [RFC/RFT 1/2] ARM: mm: introduce arch hooks for dma address translation routines
  2014-02-04 16:38     ` Santosh Shilimkar
@ 2014-02-04 17:04       ` Arnd Bergmann
  2014-02-05 16:23         ` Dave Martin
  0 siblings, 1 reply; 15+ messages in thread
From: Arnd Bergmann @ 2014-02-04 17:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 04 February 2014 11:38:32 Santosh Shilimkar wrote:
> On Tuesday 04 February 2014 11:15 AM, Arnd Bergmann wrote:
> > On Tuesday 04 February 2014, Santosh Shilimkar wrote:

> > I think this is going into a wrong direction. DMA translation is not
> > at all a platform-specific thing, but rather bus specific. The most
> > common scenario is that you have some 64-bit capable buses and some
> > buses that are limited to 32-bit DMA (or less if you are unfortunate).
> >
> I may be wrong but you could have 64 bit bus but 32 bit DMA controllers.
> That is one of the case I am dealing with.

You are absolutely right. In fact you could have any combination of
bus widths between a device and the RAM and the correct way to deal
with this is probably to follow the dma-ranges properties of each
device in-between and take the intersection (that may not be the
right term in English, but I think you know what I mean).
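
What I mean by "intersection" is roughly the following, written as a
userspace sketch. It assumes every hop's window has already been
translated into one common address space, which glosses over the
per-hop offsets; the struct and function names and the example paths
are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Addresses a hop can pass on; both bounds are inclusive. */
struct addr_window {
	uint64_t start;
	uint64_t end;
};

/* Narrow the full 64-bit space by every hop between device and RAM. */
struct addr_window path_window(const struct addr_window *hops, size_t n)
{
	struct addr_window w = { 0, UINT64_MAX };
	size_t i;

	for (i = 0; i < n; i++) {
		if (hops[i].start > w.start)
			w.start = hops[i].start;
		if (hops[i].end < w.end)
			w.end = hops[i].end;
	}
	return w;
}

/* An empty result means the device cannot reach RAM at all. */
int window_is_empty(struct addr_window w)
{
	return w.start > w.end;
}

/* 32-bit master behind a bridge that forwards only the upper window. */
const struct addr_window example_path[] = {
	{ 0x00000000ULL, 0xFFFFFFFFULL },     /* 32-bit DMA controller */
	{ 0x80000000ULL, 0xFFFFFFFFFFULL },   /* bridge window */
	{ 0, UINT64_MAX },                    /* 64-bit capable bus */
};

/* 24-bit master behind the same bridge: nothing is reachable. */
const struct addr_window disjoint_path[] = {
	{ 0x00000000ULL, 0x00FFFFFFULL },
	{ 0x80000000ULL, 0xFFFFFFFFFFULL },
};
```

The narrowest hop on each side wins, so a 64-bit bus between a 32-bit
controller and RAM does not widen what the controller can address.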

> > I guess for the legacy cases (omap1, iop13xx, ks8695), we can
> > hardcode dma_map_ops for all devices to get this right. For everything
> > else, I'd suggest defaulting to the arm_dma_ops unless we get
> > other information from DT. This means we have to create standardized
> > properties to handle any combination of these:
> > 
> Thats the case and the $subject series doesn't change that.
> 
> > 1. DMA is coherent
> > 2. DMA space is offset from phys space
> > 3. DMA space is smaller than 32-bit
> > 4. DMA space is larger than 32-bit
> > 5. DMA goes through an IOMMU
> > 
> > The dma-ranges property can deal with 2-4. Highbank already introduced
> > a "dma-coherent" flag for 1, and we can decide to generalize that.
> > I don't know what the state of IOMMU support is, but we have to come
> > up with something better than what we had on PowerPC, because we now
> > have to deal with a combination of different IOMMUs in the same system,
> > whereas the most complex case on PowerPC was some devices all going
> > through one IOMMU and the other devices being linearly mapped.
> > 
> Just to be clear, the patch set is not fiddling with dma_ops as such.
> The dma_ops needs few accessors to convert addresses and these accessors
> are different on few platforms. And hence needs to be patched.

Well, iop13xx is certainly not going to be multiplatform any time
soon, so we don't have to worry about those. ks8695 won't be multiplatform
unless I do it I suspect. I don't know about the plans for OMAP1,
but since only the OHCI device is special there, it would be trivial
to do a separate dma_map_ops for that device, or to extend arm_dma_ops
to read the offset in a per-device variable as we probably have to
do for DT/multiplatform as well.
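
A minimal sketch of the per-device offset idea, in plain C rather than
kernel code. The struct below is a stand-in, not the kernel's struct device,
and the dma_offset field and both helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-in for a device carrying its own DMA offset. */
struct sketch_device {
	const char *name;
	uint64_t dma_offset;	/* phys = dma + dma_offset (mod 2^64) */
};

/* CPU physical address -> address the device must be programmed with. */
static uint64_t sketch_phys_to_dma(const struct sketch_device *dev,
				   uint64_t phys)
{
	return phys - dev->dma_offset;
}

/* Address seen by the device -> CPU physical address. */
static uint64_t sketch_dma_to_phys(const struct sketch_device *dev,
				   uint64_t dma)
{
	return dma + dev->dma_offset;
}
```

With an illustrative offset of 0x780000000, physical address 0x800001000
maps to DMA address 0x80001000, and a device with offset 0 keeps the
identity mapping, so only the special devices pay for the translation.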

> We will try to look at "dma-ranges" to see if it can address my case.
> Thanks for the pointer

Ok.

	Arnd


* [RFC/RFT 1/2] ARM: mm: introduce arch hooks for dma address translation routines
  2014-02-04 17:04       ` Arnd Bergmann
@ 2014-02-05 16:23         ` Dave Martin
  2014-02-05 18:37           ` Santosh Shilimkar
  2014-02-06 12:32           ` Arnd Bergmann
  0 siblings, 2 replies; 15+ messages in thread
From: Dave Martin @ 2014-02-05 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 04, 2014 at 06:04:56PM +0100, Arnd Bergmann wrote:
> On Tuesday 04 February 2014 11:38:32 Santosh Shilimkar wrote:
> > On Tuesday 04 February 2014 11:15 AM, Arnd Bergmann wrote:
> > > On Tuesday 04 February 2014, Santosh Shilimkar wrote:
> 
> > > I think this is going into a wrong direction. DMA translation is not
> > > at all a platform-specific thing, but rather bus specific. The most
> > > common scenario is that you have some 64-bit capable buses and some
> > > buses that are limited to 32-bit DMA (or less if you are unfortunate).
> > >
> > I may be wrong but you could have 64 bit bus but 32 bit DMA controllers.
> > That is one of the case I am dealing with.
> 
> You are absolutely right. In fact you could have any combination of
> bus widths between a device and the RAM and the correct way to deal
> with this is probably to follow the dma-ranges properties of each
> device in-between and take the intersection (that may not be the
> right term in English, but I think you know what I mean).
> 
> > > I guess for the legacy cases (omap1, iop13xx, ks8695), we can
> > > hardcode dma_map_ops for all devices to get this right. For everything
> > > else, I'd suggest defaulting to the arm_dma_ops unless we get
> > > other information from DT. This means we have to create standardized
> > > properties to handle any combination of these:
> > > 
> > Thats the case and the $subject series doesn't change that.
> > 
> > > 1. DMA is coherent
> > > 2. DMA space is offset from phys space
> > > 3. DMA space is smaller than 32-bit
> > > 4. DMA space is larger than 32-bit
> > > 5. DMA goes through an IOMMU

As you explain above, these are properties of end-to-end paths between
a bus-mastering device and the destination.  They aren't properties
of a device, or of a bus.

For example, we can have the following system, which ePAPR can't describe
and wouldn't occur with PCI (or, at least would occur in a transparent
way so that software does not need to understand the difference between
this structure and a simple CPU->devices tree).

     C
     |
     v
     I ---+
    / \    \  
   /   \    \ 
  v     v    \
 A ----> B    \
  \            v
   +---------> D

This follows from the unidirectional and minimalistic nature of ARM SoC
buses (AMBA family, AHB, APB etc. ... and most likely many others too).

To describe A's DMA mappings correctly, the additional links must be
described, even though they are irrelevant for direct CPU->device
transactions.
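
As a rough illustration of why a single parent pointer is not enough, the
diagram can be written down as a directed graph. This is a plain C sketch;
the edge list is my reading of the arrows in the picture, and all names
are hypothetical.

```c
#include <stdbool.h>

enum node { NODE_C, NODE_I, NODE_A, NODE_B, NODE_D, NODE_COUNT };

/* bus_link[m][s] is true when master m has a direct path to slave s. */
static const bool bus_link[NODE_COUNT][NODE_COUNT] = {
	[NODE_C] = { [NODE_I] = true },
	[NODE_I] = { [NODE_A] = true, [NODE_B] = true, [NODE_D] = true },
	[NODE_A] = { [NODE_B] = true, [NODE_D] = true },
};

/* Number of slaves a master reaches directly. */
static int direct_targets(enum node from)
{
	int count = 0;

	for (int n = 0; n < NODE_COUNT; n++)
		if (bus_link[from][n])
			count++;
	return count;
}

/* Depth-first search along the one-way links. */
static bool reach(enum node from, enum node to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int n = 0; n < NODE_COUNT; n++)
		if (bus_link[from][n] && !seen[n] &&
		    reach((enum node)n, to, seen))
			return true;
	return false;
}

static bool reachable(enum node from, enum node to)
{
	bool seen[NODE_COUNT] = { false };

	return reach(from, to, seen);
}
```

Here A has two direct targets (B and D), and the links are one-way: C can
reach D, but A cannot reach C.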


> > > 
> > > The dma-ranges property can deal with 2-4. Highbank already introduced
> > > a "dma-coherent" flag for 1, and we can decide to generalize that.
> > > I don't know what the state of IOMMU support is, but we have to come
> > > up with something better than what we had on PowerPC, because we now
> > > have to deal with a combination of different IOMMUs in the same system,
> > > whereas the most complex case on PowerPC was some devices all going
> > > through one IOMMU and the other devices being linearly mapped.
> > > 
> > Just to be clear, the patch set is not fiddling with dma_ops as such.
> > The dma_ops needs few accessors to convert addresses and these accessors
> > are different on few platforms. And hence needs to be patched.
> 
> well, iop13xx is certainly not going to be multiplatform any time
> soon, so we don't have to worry about those. ks8695 won't be multiplatform
> unless I do it I suspect. I don't know about the plans for OMAP1,
> but since only the OHCI device is special there, it would be trivial
> to do a separate dma_map_ops for that device, or to extend arm_dma_ops
> to read the offset in a per-device variable as we probably have to
> do for DT/multiplatform as well.
> 
> > We will try to look at "dma-ranges" to see if it can address my case.
> > Thanks for the pointer

dma-ranges does work for simpler cases.  In particular, it works where all
bus-mastering children of a bus node can a) access each other using the
address space of the bus node, and b) all have the same view of the rest
of the system (which may be different from the view from outside the bus:
the dma-ranges property on the bus describes the difference).
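
For the simple case, the translation a dma-ranges entry describes can be
sketched as below. This is plain C, not the kernel's OF code; it assumes
the <child-address parent-address size> tuples have already been decoded
into 64-bit values, and the names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded <child-address parent-address size> tuple. */
struct dma_range {
	uint64_t child_addr;
	uint64_t parent_addr;
	uint64_t size;
};

/*
 * Translate an address as seen by a bus-mastering child into the
 * address space of the bus's parent. Returns false when no entry
 * covers the address, i.e. the child cannot reach it.
 */
static bool dma_range_to_parent(const struct dma_range *r, size_t n,
				uint64_t child, uint64_t *parent)
{
	for (size_t i = 0; i < n; i++) {
		if (child >= r[i].child_addr &&
		    child - r[i].child_addr < r[i].size) {
			*parent = r[i].parent_addr + (child - r[i].child_addr);
			return true;
		}
	}
	return false;
}
```

Every child of the bus shares the one table, which is exactly the "same
view of the rest of the system" condition in (b).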

Sometimes, we may be able to describe an otherwise undescribable situation
by introducing additional fake bus nodes.  But if there are cross-links
between devices, this won't always work.


This may not be the common case, but it does happen: we need to decide
whether to describe it properly, or to describe a fantasy in the DT
and bodge around it elsewhere when it happens.


Similarly, for IOMMU, the ARM SMMU is an independent component which is
not directly associated with a bus: nor is there guaranteed to be a 1:1
correspondence.  Simply wedging properties in a bus or device node to say
"this is associated with an IOMMU" is not always going to work:  it is
what you flow through on a given device->device path that matters, and
that can vary from path to path.


Santosh, bearing these arguments in mind, do you think that dma-ranges
is natural for your hardware?

The answer may be "yes", but if we're having to twist things to fit,
by having to describe something fake or unreal in DT and/or writing board
specific code to work around it, that motivates coming up with a better
way of describing the hardware in these cases.

I believe Jason also has examples of these issues from the Zynq family
of SoCs, which we were discussing last year.

Cheers
---Dave


* [RFC/RFT 1/2] ARM: mm: introduce arch hooks for dma address translation routines
  2014-02-05 16:23         ` Dave Martin
@ 2014-02-05 18:37           ` Santosh Shilimkar
  2014-02-06 15:38             ` Dave Martin
  2014-02-06 12:32           ` Arnd Bergmann
  1 sibling, 1 reply; 15+ messages in thread
From: Santosh Shilimkar @ 2014-02-05 18:37 UTC (permalink / raw)
  To: linux-arm-kernel

Dave,

On Wednesday 05 February 2014 11:23 AM, Dave Martin wrote:
> On Tue, Feb 04, 2014 at 06:04:56PM +0100, Arnd Bergmann wrote:
>> On Tuesday 04 February 2014 11:38:32 Santosh Shilimkar wrote:
>>> On Tuesday 04 February 2014 11:15 AM, Arnd Bergmann wrote:
>>>> On Tuesday 04 February 2014, Santosh Shilimkar wrote:
>>
>>>> I think this is going into a wrong direction. DMA translation is not
>>>> at all a platform-specific thing, but rather bus specific. The most
>>>> common scenario is that you have some 64-bit capable buses and some
>>>> buses that are limited to 32-bit DMA (or less if you are unfortunate).
>>>>
>>> I may be wrong but you could have 64 bit bus but 32 bit DMA controllers.
>>> That is one of the case I am dealing with.
>>
>> You are absolutely right. In fact you could have any combination of
>> bus widths between a device and the RAM and the correct way to deal
>> with this is probably to follow the dma-ranges properties of each
>> device in-between and take the intersection (that may not be the
>> right term in English, but I think you know what I mean).
>>
>>>> I guess for the legacy cases (omap1, iop13xx, ks8695), we can
>>>> hardcode dma_map_ops for all devices to get this right. For everything
>>>> else, I'd suggest defaulting to the arm_dma_ops unless we get
>>>> other information from DT. This means we have to create standardized
>>>> properties to handle any combination of these:
>>>>
>>> Thats the case and the $subject series doesn't change that.
>>>
>>>> 1. DMA is coherent
>>>> 2. DMA space is offset from phys space
>>>> 3. DMA space is smaller than 32-bit
>>>> 4. DMA space is larger than 32-bit
>>>> 5. DMA goes through an IOMMU
> 
> As you explain above, these are properties of end-to-end paths between
> a bus-mastering device and the destination.  They aren't properties
> of a device, or of a bus.
> 
> For example, we can have the following system, which ePAPR can't describe
> and wouldn't occur with PCI (or, at least would occur in a transparent
> way so that software does not need to understand the difference between
> this structure and a simple CPU->devices tree).
> 
>      C
>      |
>      v
>      I ---+
>     / \    \  
>    /   \    \ 
>   v     v    \
>  A ----> B    \
>   \            v
>    +---------> D
> 
> This follows from the unidirectional and minimalistic nature of ARM SoC
> buses (AMBA family, AHB, APB etc. ... and most likely many others too).
> 
> To describe A's DMA mappings correctly, the additional links must be
> described, even though they are irrelevant for direct CPU->device
> transactions.
> 
> 
>>>>
>>>> The dma-ranges property can deal with 2-4. Highbank already introduced
>>>> a "dma-coherent" flag for 1, and we can decide to generalize that.
>>>> I don't know what the state of IOMMU support is, but we have to come
>>>> up with something better than what we had on PowerPC, because we now
>>>> have to deal with a combination of different IOMMUs in the same system,
>>>> whereas the most complex case on PowerPC was some devices all going
>>>> through one IOMMU and the other devices being linearly mapped.
>>>>
>>> Just to be clear, the patch set is not fiddling with dma_ops as such.
>>> The dma_ops needs few accessors to convert addresses and these accessors
>>> are different on few platforms. And hence needs to be patched.
>>
>> well, iop13xx is certainly not going to be multiplatform any time
>> soon, so we don't have to worry about those. ks8695 won't be multiplatform
>> unless I do it I suspect. I don't know about the plans for OMAP1,
>> but since only the OHCI device is special there, it would be trivial
>> to do a separate dma_map_ops for that device, or to extend arm_dma_ops
>> to read the offset in a per-device variable as we probably have to
>> do for DT/multiplatform as well.
>>
>>> We will try to look at "dma-ranges" to see if it can address my case.
>>> Thanks for the pointer
> 
> dma-ranges does work for simpler cases.  In particular, it works where all
> bus-mastering children of a bus node can a) access each other using the
> address space of the bus node, and b) all have the same view of the rest
> of the system (which may be different from the view from outside the bus:
> the dma-ranges property on the bus describes the difference).
> 
> Sometimes, we may be able to describe an otherwise undescribable situation
> by introducing additional fake bus nodes.  But if there are cross-links
> between devices, this won't always work.
> 
> 
> This may not be the common case, but it does happen: we need to decide
> whether to describe it properly, or to describe a fantasy in the DT
> and bodge around it elsewhere when it happens.
> 
> 
> Similarly, for IOMMU, the ARM SMMU is an independent component which is
> not directly associated with a bus: nor is there guaranteed to be a 1:1
> correspondence.  Simply wedging properties in a bus or device node to say
> "this is associated with an IOMMU" is not always going to work:  it is
> what you flow through on a given device->device path that matters, and
> that can vary from path to path.
> 
> 
> Santosh, bearing these arguments in mind, do you think that dma-ranges
> is natural for your hardware?
> 
> The answer may be "yes", but if we're having to twist things to fit,
> by having to describe something fake or unreal in DT and/or writing board
> specific code to work around it, that motivates coming up with a better
> way of describing the hardware in these cases.
>
The answer is, at least, not fully "yes" based on the limited look at
dma-ranges so far.
- of_translate_dma_address() can be used to translate addresses
from DMA to CPU address space. This should work, but it will be
expensive compared to the classic macros.
- We don't see a way to do CPU -> DMA address translation using DT.
Probably some more digging/pointers are needed.
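
For discussion only, here is a sketch of what the missing CPU -> DMA
direction could look like if the range information were decoded once into
a small table. This is plain C, not an existing kernel or OF interface;
the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded range: DMA base, CPU base, length. */
struct xlate_range {
	uint64_t dma_base;
	uint64_t cpu_base;
	uint64_t size;
};

/*
 * Reverse lookup: find the range covering a CPU address and return
 * the address the device would have to use. Returns false when the
 * CPU address is not visible to the device at all.
 */
static bool cpu_to_dma(const struct xlate_range *tbl, size_t n,
		       uint64_t cpu, uint64_t *dma)
{
	for (size_t i = 0; i < n; i++) {
		if (cpu >= tbl[i].cpu_base &&
		    cpu - tbl[i].cpu_base < tbl[i].size) {
			*dma = tbl[i].dma_base + (cpu - tbl[i].cpu_base);
			return true;
		}
	}
	return false;
}
```

If the table is filled in once per device, each translation is a compare
and an add rather than a walk of the device tree.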

Regards,
Santosh


 


* [RFC/RFT 1/2] ARM: mm: introduce arch hooks for dma address translation routines
  2014-02-05 16:23         ` Dave Martin
  2014-02-05 18:37           ` Santosh Shilimkar
@ 2014-02-06 12:32           ` Arnd Bergmann
  2014-02-06 15:22             ` Dave Martin
  1 sibling, 1 reply; 15+ messages in thread
From: Arnd Bergmann @ 2014-02-06 12:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 05 February 2014, Dave Martin wrote:
> On Tue, Feb 04, 2014 at 06:04:56PM +0100, Arnd Bergmann wrote:
> > On Tuesday 04 February 2014 11:38:32 Santosh Shilimkar wrote:
> > > On Tuesday 04 February 2014 11:15 AM, Arnd Bergmann wrote:
> > > 
> > > > 1. DMA is coherent
> > > > 2. DMA space is offset from phys space
> > > > 3. DMA space is smaller than 32-bit
> > > > 4. DMA space is larger than 32-bit
> > > > 5. DMA goes through an IOMMU
> 
> As you explain above, these are properties of end-to-end paths between
> a bus-mastering device and the destination.  They aren't properties
> of a device, or of a bus.
> 
> For example, we can have the following system, which ePAPR can't describe
> and wouldn't occur with PCI (or, at least would occur in a transparent
> way so that software does not need to understand the difference between
> this structure and a simple CPU->devices tree).
> 
>      C
>      |
>      v
>      I ---+
>     / \    \  
>    /   \    \ 
>   v     v    \
>  A ----> B    \
>   \            v
>    +---------> D
> 
> This follows from the unidirectional and minimalistic nature of ARM SoC
> buses (AMBA family, AHB, APB etc. ... and most likely many others too).
> 
> To describe A's DMA mappings correctly, the additional links must be
> described, even though they are irrelevant for direct CPU->device
> transactions.
 
Can you be more specific about what kind of hardware would use such
a mapping? The interesting cases are normally all about accessing
RAM, while your example seems to be for device-to-device DMA and
that doesn't have to go through dma-ranges.

> dma-ranges does work for simpler cases.  In particular, it works where all
> bus-mastering children of a bus node can a) access each other using the
> address space of the bus node, and b) all have the same view of the rest
> of the system (which may be different from the view from outside the bus:
> the dma-ranges property on the bus describes the difference).
> 
> Sometimes, we may be able to describe an otherwise undescribable situation
> by introducing additional fake bus nodes.  But if there are cross-links
> between devices, this won't always work.
> 
> 
> This may not be the common case, but it does happen: we need to decide
> whether to describe it properly, or to describe a fantasy in the DT
> and bodge around it elsewhere when it happens.

Do you think this could be fully described if we add a "dma-parent"
property that can redirect the "dma-ranges" parent address space to
another node?

If there are devices that have parts of their DMA address space on
various buses, how about a "dma-ranges-ext" property that contains
tuples of <&parent-phandle local-address parent-address size> rather
than just <local-address parent-address size>?
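
To illustrate, a decoded "dma-ranges-ext" entry and its lookup might look
roughly like this. It is a plain C sketch of the proposal, not an existing
binding or kernel interface; the field and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One decoded <&parent-phandle local-address parent-address size> tuple. */
struct dma_range_ext {
	uint32_t parent_phandle;	/* which node this window leads to */
	uint64_t local_addr;
	uint64_t parent_addr;
	uint64_t size;
};

/*
 * Find the entry covering a local address. On success the caller gets
 * both the parent to continue from and the address in that parent's
 * space; NULL means the device cannot reach the address.
 */
static const struct dma_range_ext *
dma_range_ext_lookup(const struct dma_range_ext *tbl, size_t n,
		     uint64_t local, uint64_t *parent_addr)
{
	for (size_t i = 0; i < n; i++) {
		if (local >= tbl[i].local_addr &&
		    local - tbl[i].local_addr < tbl[i].size) {
			*parent_addr = tbl[i].parent_addr +
				       (local - tbl[i].local_addr);
			return &tbl[i];
		}
	}
	return NULL;
}
```

Two entries with different phandles would let one device send part of its
address space to one bus and part to another.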

> Similarly, for IOMMU, the ARM SMMU is an independent component which is
> not directly associated with a bus: nor is there guaranteed to be a 1:1
> correspondence.  Simply wedging properties in a bus or device node to say
> "this is associated with an IOMMU" is not always going to work:  it is
> what you flow through on a given device->device path that matters, and
> that can vary from path to path.

Right, I'm aware that the IOMMU may be per device rather than per-bus.
This could be handled by faking extra buses, or possibly better with
the dma-parent approach above, if that is allowed to point to either
a bus or an IOMMU.

	Arnd


* [RFC/RFT 1/2] ARM: mm: introduce arch hooks for dma address translation routines
  2014-02-06 12:32           ` Arnd Bergmann
@ 2014-02-06 15:22             ` Dave Martin
  0 siblings, 0 replies; 15+ messages in thread
From: Dave Martin @ 2014-02-06 15:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 06, 2014 at 01:32:00PM +0100, Arnd Bergmann wrote:
> On Wednesday 05 February 2014, Dave Martin wrote:
> > On Tue, Feb 04, 2014 at 06:04:56PM +0100, Arnd Bergmann wrote:
> > > On Tuesday 04 February 2014 11:38:32 Santosh Shilimkar wrote:
> > > > On Tuesday 04 February 2014 11:15 AM, Arnd Bergmann wrote:
> > > > 
> > > > > 1. DMA is coherent
> > > > > 2. DMA space is offset from phys space
> > > > > 3. DMA space is smaller than 32-bit
> > > > > 4. DMA space is larger than 32-bit
> > > > > 5. DMA goes through an IOMMU
> > 
> > As you explain above, these are properties of end-to-end paths between
> > a bus-mastering device and the destination.  They aren't properties
> > of a device, or of a bus.
> > 
> > For example, we can have the following system, which ePAPR can't describe
> > and wouldn't occur with PCI (or, at least would occur in a transparent
> > way so that software does not need to understand the difference between
> > this structure and a simple CPU->devices tree).
> > 
> >      C
> >      |
> >      v
> >      I ---+
> >     / \    \  
> >    /   \    \ 
> >   v     v    \
> >  A ----> B    \
> >   \            v
> >    +---------> D
> > 
> > This follows from the unidirectional and minimalistic nature of ARM SoC
> > buses (AMBA family, AHB, APB etc. ... and most likely many others too).
> > 
> > To describe A's DMA mappings correctly, the additional links must be
> > described, even though they are irrelevant for direct CPU->device
> > transactions.
>  
> Can you be more specific about what kind of hardware would use such
> a mapping? The interesting cases are normally all about accessing
> RAM, while your example seems to be for device-to-device DMA and
> that doesn't have to go through dma-ranges.

Imagine that D is RAM.

RAMs may be multi-ported, and it's possible that

Typical cases are where a DMA engine may have a dedicated path to RAM
which might or might not be coherent with other paths, or where there
are multiple, special-purpose RAMs in a system -- for example, some RAM
read by a display controller, which is also independently accessible by

> 
> > dma-ranges does work for simpler cases.  In particular, it works where all
> > bus-mastering children of a bus node can a) access each other using the
> > address space of the bus node, and b) all have the same view of the rest
> > of the system (which may be different from the view from outside the bus:
> > the dma-ranges property on the bus describes the difference).
> > 
> > Sometimes, we may be able to describe an otherwise undescribable situation
> > by introducing additional fake bus nodes.  But if there are cross-links
> > between devices, this won't always work.
> > 
> > 
> > This may not be the common case, but it does happen: we need to decide
> > whether to describe it properly, or to describe a fantasy in the DT
> > and bodge around it elsewhere when it happens.
> 
> Do you think this could be fully described if we add a "dma-parent"
> property that can redirect the "dma-ranges" parent address space to
> another node?

I don't think so.  A "parent" property allows us to describe a DMA tree
that is different from the CPU->device tree.  This works for interrupts,
which really are wired up as a tree.  But it can't describe the cases
which aren't tree-like.

For example, in the above picture, B and D are both DMA-parents of A.

> If there are devices that have parts of their DMA address space on
> various buses, how about a "dma-ranges-ext" property that contains
> tuples of <&parent-phandle local-address parent-address size> rather
> than just <local-address parent-address size>?

In theory, that could work -- it solves my multiple-parents problem above.
But I do think the "parent" concept is being over-twisted here.

When a CPU acts as a bus master and accesses a device, we call the device
(recursively) a child of the CPU.  So it's weird to use "parent" to
describe the same relationship when the master is a device, not a CPU.

I'm also not too keen on the "dma" name, since this implies that there
is something magically different about devices mastering devices as
compared with CPUs mastering devices, and may lead to inventing two
methods of describing master/slave linkage which are more different
than necessary.

However, I think this is primarily a disagreement about naming.

> 
> > Similarly, for IOMMU, the ARM SMMU is an independent component which is
> > not directly associated with a bus: nor is there guaranteed to be a 1:1
> > correspondence.  Simply wedging properties in a bus or device node to say
> > "this is associated with an IOMMU" is not always going to work:  it is
> > what you flow through on a given device->device path that matters, and
> > that can vary from path to path.
> 
> Right, I'm aware that the IOMMU may be per device rather than per-bus.
> This could be handled by faking extra buses, or possibly better with
> the dma-parent approach above, if that is allowed to point to either
> a bus or an IOMMU.

Indeed.  One concept that could work is to treat every component that
transforms or routes transactions as a bus for DT purposes.  This can
get complex, though: an IOMMU is all of:

	a) a slave, accepting control transactions from CPUs and elsewhere

	b) a master, generating transactions of its own to read translation
	   tables etc.

	c) an interconnect, routing and transforming transactions from other
	   connected masters to other connected slaves.

	d) an interrupt source

Note that the IOMMU may be in completely different places in the system
topology with regard to each role.  This is a problem except for (d), where
DT already has a way to describe an independent topology for interrupts.

My own view is that devices of this type should be described using a
single node where all its relationships really do act at a single point
in the topology, and to split it up otherwise.  So, an IOMMU would appear
as a bus node describing its interconnect role, which can most likely
describe the IOMMU's master role too, but with a separate node somewhere
describing its control interface, and a phandle linking them.


DMA address translation is then a matter of traversing links from the
mastering device to the destination device, or vice-versa.  These
links may be described by dma-ranges where the mappings work as defined
by ePAPR, or a separate, phandle-based mechanism similar to the one
you describe, for the other cases.
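
As a rough sketch of "traversing links", each link on the path can be
modelled as one address window with an offset, applied in order from the
mastering device towards the destination. This is plain C with
hypothetical names, and it assumes one window per link for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one link on the path, mapping a window upstream. */
struct xlate_hop {
	uint64_t child_base;
	uint64_t parent_base;
	uint64_t size;
};

/*
 * Apply every hop in order. The address produced by one hop is the
 * input to the next; the walk fails as soon as a hop cannot forward
 * the address.
 */
static bool translate_along_path(const struct xlate_hop *hops, size_t n,
				 uint64_t addr, uint64_t *out)
{
	for (size_t i = 0; i < n; i++) {
		if (addr < hops[i].child_base ||
		    addr - hops[i].child_base >= hops[i].size)
			return false;
		addr = hops[i].parent_base + (addr - hops[i].child_base);
	}
	*out = addr;
	return true;
}
```

Whether a given hop came from dma-ranges or from a phandle-based
description would not matter to a walker of this shape.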


There was some discussion of possible approaches in this thread:

http://lists.infradead.org/pipermail/linux-arm-kernel/2013-December/215582.html


... but we didn't reach a conclusion yet.

Cheers
---Dave


* [RFC/RFT 1/2] ARM: mm: introduce arch hooks for dma address translation routines
  2014-02-05 18:37           ` Santosh Shilimkar
@ 2014-02-06 15:38             ` Dave Martin
  0 siblings, 0 replies; 15+ messages in thread
From: Dave Martin @ 2014-02-06 15:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 05, 2014 at 06:37:39PM +0000, Santosh Shilimkar wrote:
> Dave,
> 
> On Wednesday 05 February 2014 11:23 AM, Dave Martin wrote:

[...]

> > Santosh, bearing these arguments in mind, do you think that dma-ranges
> > is natural for your hardware?
> > 
> > The answer may be "yes", but if we're having to twist things to fit,
> > by having to describe something fake or unreal in DT and/or writing board
> > specific code to work around it, that motivates coming up with a better
> > way of describing the hardware in these cases.
> >
> The answer is, at least, not fully "yes" based on the limited look at
> dma-ranges so far.
> - of_translate_dma_address() can be used to translate addresses
> from DMA to CPU address space. This should work, but it will be
> expensive compared to the classic macros.
> - We don't see a way to do CPU -> DMA address translation using DT.
> Probably some more digging/pointers are needed.

Are you saying that dma-ranges does correctly describe your hardware,
but the kernel frameworks are inadequate or suboptimal for making use of
this information?

This is a different problem from not being able to describe the hardware.

Cheers
---Dave



Thread overview: 15+ messages
2014-02-03 23:28 [RFC/RFT 0/2] ARM: mm: Introduce arch hooks for dma address translation Santosh Shilimkar
     [not found] ` <1391470107-15927-3-git-send-email-santosh.shilimkar@ti.com>
2014-02-04  2:05   ` [RFC/RFT 2/2] ARM: keystone: Install hooks for dma address translation routines Olof Johansson
2014-02-04 14:30     ` Santosh Shilimkar
2014-02-04 16:01       ` Arnd Bergmann
2014-02-04 16:22         ` Olof Johansson
     [not found] ` <1391470107-15927-2-git-send-email-santosh.shilimkar@ti.com>
2014-02-04  2:18   ` [RFC/RFT 1/2] ARM: mm: introduce arch " Olof Johansson
2014-02-04 14:33     ` Santosh Shilimkar
2014-02-04 16:15   ` Arnd Bergmann
2014-02-04 16:38     ` Santosh Shilimkar
2014-02-04 17:04       ` Arnd Bergmann
2014-02-05 16:23         ` Dave Martin
2014-02-05 18:37           ` Santosh Shilimkar
2014-02-06 15:38             ` Dave Martin
2014-02-06 12:32           ` Arnd Bergmann
2014-02-06 15:22             ` Dave Martin
