* [PATCH V2 0/3] Add hardware I/O coherency support for Armada 370/XP
@ 2012-11-16 9:44 ` Gregory CLEMENT
0 siblings, 0 replies; 11+ messages in thread
From: Gregory CLEMENT @ 2012-11-16 9:44 UTC (permalink / raw)
To: linux-arm-kernel
This patch set adds hardware I/O coherency support for Armada 370
and Armada XP. These SoCs come with a unit called the coherency
fabric. Initial support for this unit was introduced with the SMP
patch set. This series extends that support: the coherency fabric
unit allows the Armada XP and the Armada 370 to be used as nearly
coherent architectures.
The first patch exports the DMA operation functions needed to
register our own set of DMA ops.
The second patch introduces a new flag for the address decoding
configuration in order to be able to set the memory windows as
shared memory.
The third patch enables this new feature and registers our own set
of DMA ops, to benefit from this hardware enhancement.
This series depends on the SMP patch set (V3 was posted on Monday).
The git branch called HWIOCC-for-3.8-V2 is also available at
https://github.com/MISL-EBU-System-SW/mainline-public.git.
Changelog:
V1 -> V2:
- Rebased onto v3.7-rc5
- Added a new patch to export the DMA ops functions
- Renamed the functions to use the more generic name mvebu_hwcc
- Removed the non-SMP case during init
- Fixed spelling and wording issues
- Updated the binding documentation for the coherency fabric
Gregory CLEMENT (3):
arm: dma mapping: Export dma ops functions
arm: plat-orion: Add coherency attribute when setup mbus target
arm: mvebu: Add hardware I/O Coherency support
.../devicetree/bindings/arm/coherency-fabric.txt | 9 ++-
arch/arm/boot/dts/armada-370-xp.dtsi | 3 +-
arch/arm/include/asm/dma-mapping.h | 62 +++++++++++++++++
arch/arm/mach-mvebu/addr-map.c | 3 +
arch/arm/mach-mvebu/coherency.c | 73 ++++++++++++++++++++
arch/arm/mm/dma-mapping.c | 36 +++-------
arch/arm/plat-orion/addr-map.c | 4 ++
arch/arm/plat-orion/include/plat/addr-map.h | 1 +
8 files changed, 160 insertions(+), 31 deletions(-)
--
1.7.9.5
* [PATCH V2 1/3] arm: dma mapping: Export dma ops functions
2012-11-16 9:44 ` Gregory CLEMENT
@ 2012-11-16 9:44 ` Gregory CLEMENT
-1 siblings, 0 replies; 11+ messages in thread
From: Gregory CLEMENT @ 2012-11-16 9:44 UTC (permalink / raw)
To: linux-arm-kernel
Expose the DMA operation functions. Until now, only the dma_ops
structs as a whole or some individual DMA operations were exposed.
This patch exposes all the coherent and non-coherent DMA operations,
so that they can be reused when an architecture or driver needs to
create its own set of DMA operations.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
---
arch/arm/include/asm/dma-mapping.h | 62 ++++++++++++++++++++++++++++++++++++
arch/arm/mm/dma-mapping.c | 36 +++++----------------
2 files changed, 70 insertions(+), 28 deletions(-)
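
Illustration (not part of the patch): a minimal userspace sketch of the
pattern this export is meant to enable, where a platform builds its own
operations table by mixing shared helpers with its own overrides. All
names below (model_dma_ops, generic_map_page, platform_map_page, and so
on) are hypothetical stand-ins and not kernel symbols; the 4 KiB page
size is also an assumption. Real code would fill a struct dma_map_ops
with the arm_* functions declared in the header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct dma_map_ops. */
struct model_dma_ops {
	uint64_t (*map_page)(uint64_t pfn, size_t offset);
	int (*set_mask)(uint64_t *dev_mask, uint64_t mask);
};

/* "Generic" helpers: non-static, so other files can reference them. */
uint64_t generic_map_page(uint64_t pfn, size_t offset)
{
	return (pfn << 12) + offset;	/* assumes 4 KiB pages */
}

int generic_set_mask(uint64_t *dev_mask, uint64_t mask)
{
	if (!dev_mask)
		return -1;
	*dev_mask = mask;
	return 0;
}

/* Counter standing in for platform-specific work such as a barrier. */
static unsigned int platform_barriers;

/* Platform override: does its extra work, then reuses the generic helper. */
static uint64_t platform_map_page(uint64_t pfn, size_t offset)
{
	platform_barriers++;
	return generic_map_page(pfn, offset);
}

unsigned int platform_barrier_count(void)
{
	return platform_barriers;
}

/* The platform table mixes one override with one shared helper. */
const struct model_dma_ops platform_ops = {
	.map_page = platform_map_page,
	.set_mask = generic_set_mask,
};
```

With static helpers, the second initializer above could only be written
in the file that defines generic_set_mask(); making the helpers
non-static is what lets a separate file fill in such a table.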
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 2300484..f940a10 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -112,6 +112,60 @@ static inline void dma_free_noncoherent(struct device *dev, size_t size,
extern int dma_supported(struct device *dev, u64 mask);
/**
+ * arm_dma_map_page - map a portion of a page for streaming DMA
+ * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
+ * @page: page that buffer resides in
+ * @offset: offset into page for start of buffer
+ * @size: size of buffer to map
+ * @dir: DMA transfer direction
+ *
+ * Ensure that any data held in the cache is appropriately discarded
+ * or written back.
+ *
+ * The device owns this memory once this call has completed. The CPU
+ * can regain ownership by calling dma_unmap_page().
+ */
+extern dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
+ unsigned long offset, size_t size,
+ enum dma_data_direction dir,
+ struct dma_attrs *attrs);
+
+extern dma_addr_t arm_coherent_dma_map_page(struct device *dev,
+ struct page *page,
+ unsigned long offset, size_t size,
+ enum dma_data_direction dir,
+ struct dma_attrs *attrs);
+
+/**
+ * arm_dma_unmap_page - unmap a buffer previously mapped through dma_map_page()
+ * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
+ * @handle: DMA address of buffer
+ * @size: size of buffer (same as passed to dma_map_page)
+ * @dir: DMA transfer direction (same as passed to dma_map_page)
+ *
+ * Unmap a page streaming mode DMA translation. The handle and size
+ * must match what was provided in the previous dma_map_page() call.
+ * All other usages are undefined.
+ *
+ * After this call, reads by the CPU to the buffer are guaranteed to see
+ * whatever the device wrote there.
+ */
+extern void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
+ size_t size, enum dma_data_direction dir,
+ struct dma_attrs *attrs);
+
+extern void arm_dma_sync_single_for_cpu(struct device *dev,
+ dma_addr_t handle, size_t size,
+ enum dma_data_direction dir);
+
+extern void arm_dma_sync_single_for_device(struct device *dev,
+ dma_addr_t handle, size_t size,
+ enum dma_data_direction dir);
+
+extern int arm_dma_set_mask(struct device *dev, u64 dma_mask);
+
+
+/**
* arm_dma_alloc - allocate consistent memory for DMA
* @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
* @size: required memory size
@@ -125,6 +179,10 @@ extern int dma_supported(struct device *dev, u64 mask);
extern void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
gfp_t gfp, struct dma_attrs *attrs);
+extern void *arm_coherent_dma_alloc(struct device *dev, size_t size,
+ dma_addr_t *handle, gfp_t gfp,
+ struct dma_attrs *attrs);
+
#define dma_alloc_coherent(d, s, h, f) dma_alloc_attrs(d, s, h, f, NULL)
static inline void *dma_alloc_attrs(struct device *dev, size_t size,
@@ -157,6 +215,10 @@ static inline void *dma_alloc_attrs(struct device *dev, size_t size,
extern void arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
dma_addr_t handle, struct dma_attrs *attrs);
+extern void arm_coherent_dma_free(struct device *dev, size_t size,
+ void *cpu_addr, dma_addr_t handle,
+ struct dma_attrs *attrs);
+
#define dma_free_coherent(d, s, c, h) dma_free_attrs(d, s, c, h, NULL)
static inline void dma_free_attrs(struct device *dev, size_t size,
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 58bc3e4..5b60ee6 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -56,20 +56,13 @@ static void __dma_page_dev_to_cpu(struct page *, unsigned long,
size_t, enum dma_data_direction);
/**
- * arm_dma_map_page - map a portion of a page for streaming DMA
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @page: page that buffer resides in
- * @offset: offset into page for start of buffer
- * @size: size of buffer to map
- * @dir: DMA transfer direction
- *
* Ensure that any data held in the cache is appropriately discarded
* or written back.
*
* The device owns this memory once this call has completed. The CPU
* can regain ownership by calling dma_unmap_page().
*/
-static dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
+dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
unsigned long offset, size_t size, enum dma_data_direction dir,
struct dma_attrs *attrs)
{
@@ -78,7 +71,7 @@ static dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
return pfn_to_dma(dev, page_to_pfn(page)) + offset;
}
-static dma_addr_t arm_coherent_dma_map_page(struct device *dev, struct page *page,
+dma_addr_t arm_coherent_dma_map_page(struct device *dev, struct page *page,
unsigned long offset, size_t size, enum dma_data_direction dir,
struct dma_attrs *attrs)
{
@@ -86,12 +79,6 @@ static dma_addr_t arm_coherent_dma_map_page(struct device *dev, struct page *pag
}
/**
- * arm_dma_unmap_page - unmap a buffer previously mapped through dma_map_page()
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @handle: DMA address of buffer
- * @size: size of buffer (same as passed to dma_map_page)
- * @dir: DMA transfer direction (same as passed to dma_map_page)
- *
* Unmap a page streaming mode DMA translation. The handle and size
* must match what was provided in the previous dma_map_page() call.
* All other usages are undefined.
@@ -99,7 +86,7 @@ static dma_addr_t arm_coherent_dma_map_page(struct device *dev, struct page *pag
* After this call, reads by the CPU to the buffer are guaranteed to see
* whatever the device wrote there.
*/
-static void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
+void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
size_t size, enum dma_data_direction dir,
struct dma_attrs *attrs)
{
@@ -108,7 +95,7 @@ static void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
handle & ~PAGE_MASK, size, dir);
}
-static void arm_dma_sync_single_for_cpu(struct device *dev,
+void arm_dma_sync_single_for_cpu(struct device *dev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
{
unsigned int offset = handle & (PAGE_SIZE - 1);
@@ -116,7 +103,7 @@ static void arm_dma_sync_single_for_cpu(struct device *dev,
__dma_page_dev_to_cpu(page, offset, size, dir);
}
-static void arm_dma_sync_single_for_device(struct device *dev,
+void arm_dma_sync_single_for_device(struct device *dev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
{
unsigned int offset = handle & (PAGE_SIZE - 1);
@@ -124,8 +111,6 @@ static void arm_dma_sync_single_for_device(struct device *dev,
__dma_page_cpu_to_dev(page, offset, size, dir);
}
-static int arm_dma_set_mask(struct device *dev, u64 dma_mask);
-
struct dma_map_ops arm_dma_ops = {
.alloc = arm_dma_alloc,
.free = arm_dma_free,
@@ -143,11 +128,6 @@ struct dma_map_ops arm_dma_ops = {
};
EXPORT_SYMBOL(arm_dma_ops);
-static void *arm_coherent_dma_alloc(struct device *dev, size_t size,
- dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs);
-static void arm_coherent_dma_free(struct device *dev, size_t size, void *cpu_addr,
- dma_addr_t handle, struct dma_attrs *attrs);
-
struct dma_map_ops arm_coherent_dma_ops = {
.alloc = arm_coherent_dma_alloc,
.free = arm_coherent_dma_free,
@@ -672,7 +652,7 @@ void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
__builtin_return_address(0));
}
-static void *arm_coherent_dma_alloc(struct device *dev, size_t size,
+void *arm_coherent_dma_alloc(struct device *dev, size_t size,
dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs)
{
pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
@@ -751,7 +731,7 @@ void arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
__arm_dma_free(dev, size, cpu_addr, handle, attrs, false);
}
-static void arm_coherent_dma_free(struct device *dev, size_t size, void *cpu_addr,
+void arm_coherent_dma_free(struct device *dev, size_t size, void *cpu_addr,
dma_addr_t handle, struct dma_attrs *attrs)
{
__arm_dma_free(dev, size, cpu_addr, handle, attrs, true);
@@ -971,7 +951,7 @@ int dma_supported(struct device *dev, u64 mask)
}
EXPORT_SYMBOL(dma_supported);
-static int arm_dma_set_mask(struct device *dev, u64 dma_mask)
+int arm_dma_set_mask(struct device *dev, u64 dma_mask)
{
if (!dev->dma_mask || !dma_supported(dev, dma_mask))
return -EIO;
--
1.7.9.5
* [PATCH V2 2/3] arm: plat-orion: Add coherency attribute when setup mbus target
2012-11-16 9:44 ` Gregory CLEMENT
@ 2012-11-16 9:44 ` Gregory CLEMENT
-1 siblings, 0 replies; 11+ messages in thread
From: Gregory CLEMENT @ 2012-11-16 9:44 UTC (permalink / raw)
To: linux-arm-kernel
Recent SoCs such as Armada 370/XP can handle I/O coherency in
hardware. In this case the transaction attribute of the window must
be flagged as "Shared transaction". Once this flag is set, the
transactions are forced through the coherency block; otherwise they
are driven directly to DRAM.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Yehuda Yitschak <yehuday@marvell.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
arch/arm/plat-orion/addr-map.c | 4 ++++
arch/arm/plat-orion/include/plat/addr-map.h | 1 +
2 files changed, 5 insertions(+)
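
Illustration (not part of the patch): a small standalone sketch of the
attribute value that orion_setup_cpu_mbus_target() computes for each
DRAM chip select once this change is applied. The helper name
mbus_dram_attr() is hypothetical; the expression and the
ATTR_HW_COHERENCY bit are taken from the diff below.

```c
#include <stdint.h>

/* Same bit as ATTR_HW_COHERENCY in the patch: bit 4 of the attribute. */
#define ATTR_HW_COHERENCY (0x1 << 4)

/*
 * Hypothetical helper mirroring the mbus_attr computation: the low
 * nibble has every bit set except the one for chip select `cs`, and
 * bit 4 is added when hardware I/O coherency is requested.
 */
uint8_t mbus_dram_attr(unsigned int cs, int hw_io_coherency)
{
	uint8_t attr = 0xf & ~(1u << cs);

	if (hw_io_coherency)
		attr |= ATTR_HW_COHERENCY;
	return attr;
}
```

For chip select 0 this gives 0x0e without the flag and 0x1e with it, so
the coherency bit does not disturb the chip-select encoding in the low
nibble.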
diff --git a/arch/arm/plat-orion/addr-map.c b/arch/arm/plat-orion/addr-map.c
index a7b8060..febe386 100644
--- a/arch/arm/plat-orion/addr-map.c
+++ b/arch/arm/plat-orion/addr-map.c
@@ -42,6 +42,8 @@ EXPORT_SYMBOL_GPL(mv_mbus_dram_info);
#define WIN_REMAP_LO_OFF 0x0008
#define WIN_REMAP_HI_OFF 0x000c
+#define ATTR_HW_COHERENCY (0x1 << 4)
+
/*
* Default implementation
*/
@@ -163,6 +165,8 @@ void __init orion_setup_cpu_mbus_target(const struct orion_addr_map_cfg *cfg,
w = &orion_mbus_dram_info.cs[cs++];
w->cs_index = i;
w->mbus_attr = 0xf & ~(1 << i);
+ if (cfg->hw_io_coherency)
+ w->mbus_attr |= ATTR_HW_COHERENCY;
w->base = base & 0xffff0000;
w->size = (size | 0x0000ffff) + 1;
}
diff --git a/arch/arm/plat-orion/include/plat/addr-map.h b/arch/arm/plat-orion/include/plat/addr-map.h
index ec63e4a..b76c065 100644
--- a/arch/arm/plat-orion/include/plat/addr-map.h
+++ b/arch/arm/plat-orion/include/plat/addr-map.h
@@ -17,6 +17,7 @@ struct orion_addr_map_cfg {
const int num_wins; /* Total number of windows */
const int remappable_wins;
void __iomem *bridge_virt_base;
+ int hw_io_coherency;
/* If NULL, the default cpu_win_can_remap will be used, using
the value in remappable_wins */
--
1.7.9.5
* [PATCH V2 3/3] arm: mvebu: Add hardware I/O Coherency support
2012-11-16 9:44 ` Gregory CLEMENT
@ 2012-11-16 9:45 ` Gregory CLEMENT
-1 siblings, 0 replies; 11+ messages in thread
From: Gregory CLEMENT @ 2012-11-16 9:45 UTC (permalink / raw)
To: linux-arm-kernel
Armada 370 and XP come with a unit called the coherency fabric. This
unit allows the Armada 370/XP to be used as a nearly coherent
architecture. The coherency mechanism uses snoop filters to ensure
coherency between caches, DRAM and devices. This mechanism needs a
synchronization barrier which guarantees that all the memory writes
initiated by the devices have reached their target and do not reside
in intermediate write buffers. That's why the architecture is not
totally coherent and we need to provide our own functions for some
DMA operations.
Besides using the coherency fabric, the device units have to set the
attribute flag of the address decoding window to select the
appropriate coherency process for the memory transaction. This is
done when each device driver programs the DRAM address windows. The
value of the attribute set by the driver is retrieved through the
orion_addr_map_cfg struct filled in during the early initialization
of the platform.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Yehuda Yitschak <yehuday@marvell.com>
---
.../devicetree/bindings/arm/coherency-fabric.txt | 9 ++-
arch/arm/boot/dts/armada-370-xp.dtsi | 3 +-
arch/arm/mach-mvebu/addr-map.c | 3 +
arch/arm/mach-mvebu/coherency.c | 73 ++++++++++++++++++++
4 files changed, 85 insertions(+), 3 deletions(-)
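
Illustration (not part of the patch): a userspace model of the I/O
synchronization barrier described above, assuming a register whose
bit 0 is set by software and cleared by hardware once the device
writes have drained. The fake_barrier_reg type, its three-poll delay
and the MODEL_DMA_* names are hypothetical and exist only for this
simulation; the real code uses writel()/readl() on the per-CPU fabric
registers, as in the coherency.c hunk below.

```c
#include <stdint.h>

/* Local stand-in for the kernel's enum dma_data_direction. */
enum model_dma_dir {
	MODEL_DMA_BIDIRECTIONAL,
	MODEL_DMA_TO_DEVICE,
	MODEL_DMA_FROM_DEVICE,
};

/* Simulated barrier register: "hardware" clears bit 0 after a few reads. */
struct fake_barrier_reg {
	uint32_t value;
	unsigned int polls_until_done;	/* reads left before bit 0 clears */
	unsigned int barriers_issued;	/* number of barrier requests seen */
};

static void reg_write(struct fake_barrier_reg *r, uint32_t v)
{
	r->value = v;
	r->polls_until_done = 3;	/* arbitrary delay for the simulation */
	r->barriers_issued++;
}

static uint32_t reg_read(struct fake_barrier_reg *r)
{
	if (r->polls_until_done > 0 && --r->polls_until_done == 0)
		r->value &= ~0x1u;
	return r->value;
}

/* Same shape as mvebu_hwcc_sync_io_barrier(): set bit 0, spin until clear. */
void sync_io_barrier(struct fake_barrier_reg *r)
{
	reg_write(r, 0x1);
	while (reg_read(r) & 0x1)
		;	/* busy-wait until the simulated hardware is done */
}

/* The barrier is skipped when data only flows from the CPU to the device. */
void dma_sync_model(struct fake_barrier_reg *r, enum model_dma_dir dir)
{
	if (dir != MODEL_DMA_TO_DEVICE)
		sync_io_barrier(r);
}
```

The direction check reflects the reasoning in the commit message: the
barrier only matters when the device may have written to memory, so a
pure CPU-to-device transfer does not pay for it.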
diff --git a/Documentation/devicetree/bindings/arm/coherency-fabric.txt b/Documentation/devicetree/bindings/arm/coherency-fabric.txt
index 2bfbf67..17d8cd1 100644
--- a/Documentation/devicetree/bindings/arm/coherency-fabric.txt
+++ b/Documentation/devicetree/bindings/arm/coherency-fabric.txt
@@ -5,12 +5,17 @@ Available on Marvell SOCs: Armada 370 and Armada XP
Required properties:
- compatible: "marvell,coherency-fabric"
-- reg: Should contain,coherency fabric registers location and length.
+
+- reg: Should contain coherency fabric registers location and
+ length. First pair for the coherency fabric registers, second pair
+ for the per-CPU fabric registers.
Example:
coherency-fabric@d0020200 {
compatible = "marvell,coherency-fabric";
- reg = <0xd0020200 0xb0>;
+ reg = <0xd0020200 0xb0>,
+ <0xd0021810 0x1c>;
+
};
diff --git a/arch/arm/boot/dts/armada-370-xp.dtsi b/arch/arm/boot/dts/armada-370-xp.dtsi
index b0d075b..98a6b26 100644
--- a/arch/arm/boot/dts/armada-370-xp.dtsi
+++ b/arch/arm/boot/dts/armada-370-xp.dtsi
@@ -38,7 +38,8 @@
coherency-fabric@d0020200 {
compatible = "marvell,coherency-fabric";
- reg = <0xd0020200 0xb0>;
+ reg = <0xd0020200 0xb0>,
+ <0xd0021810 0x1c>;
};
soc {
diff --git a/arch/arm/mach-mvebu/addr-map.c b/arch/arm/mach-mvebu/addr-map.c
index fe454a4..595f6b7 100644
--- a/arch/arm/mach-mvebu/addr-map.c
+++ b/arch/arm/mach-mvebu/addr-map.c
@@ -108,6 +108,9 @@ static int __init armada_setup_cpu_mbus(void)
addr_map_cfg.bridge_virt_base = mbus_unit_addr_decoding_base;
+ if (of_find_compatible_node(NULL, NULL, "marvell,coherency-fabric"))
+ addr_map_cfg.hw_io_coherency = 1;
+
/*
* Disable, clear and configure windows.
*/
diff --git a/arch/arm/mach-mvebu/coherency.c b/arch/arm/mach-mvebu/coherency.c
index 20a0ccc..153fcfa 100644
--- a/arch/arm/mach-mvebu/coherency.c
+++ b/arch/arm/mach-mvebu/coherency.c
@@ -22,6 +22,8 @@
#include <linux/of_address.h>
#include <linux/io.h>
#include <linux/smp.h>
+#include <linux/dma-mapping.h>
+#include <linux/platform_device.h>
#include <asm/smp_plat.h>
#include "armada-370-xp.h"
@@ -32,11 +34,14 @@
* value matching its virtual mapping
*/
static void __iomem *coherency_base = ARMADA_370_XP_REGS_VIRT_BASE + 0x20200;
+static void __iomem *coherency_cpu_base;
/* Coherency fabric registers */
#define COHERENCY_FABRIC_CTL_OFFSET 0x0
#define COHERENCY_FABRIC_CFG_OFFSET 0x4
+#define IO_SYNC_BARRIER_CTL_OFFSET 0x0
+
static struct of_device_id of_coherency_table[] = {
{.compatible = "marvell,coherency-fabric"},
{ /* end of list */ },
@@ -75,6 +80,70 @@ int set_cpu_coherent(unsigned int hw_cpu_id, int smp_group_id)
return 0;
}
+static inline void mvebu_hwcc_sync_io_barrier(void)
+{
+ writel(0x1, coherency_cpu_base + IO_SYNC_BARRIER_CTL_OFFSET);
+ while (readl(coherency_cpu_base + IO_SYNC_BARRIER_CTL_OFFSET) & 0x1);
+}
+
+static dma_addr_t mvebu_hwcc_dma_map_page(struct device *dev, struct page *page,
+ unsigned long offset, size_t size,
+ enum dma_data_direction dir,
+ struct dma_attrs *attrs)
+{
+ if (dir != DMA_TO_DEVICE)
+ mvebu_hwcc_sync_io_barrier();
+ return pfn_to_dma(dev, page_to_pfn(page)) + offset;
+}
+
+
+static void mvebu_hwcc_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
+ size_t size, enum dma_data_direction dir,
+ struct dma_attrs *attrs)
+{
+ if (dir != DMA_TO_DEVICE)
+ mvebu_hwcc_sync_io_barrier();
+}
+
+static void mvebu_hwcc_dma_sync(struct device *dev, dma_addr_t dma_handle,
+ size_t size, enum dma_data_direction dir)
+{
+ if (dir != DMA_TO_DEVICE)
+ mvebu_hwcc_sync_io_barrier();
+}
+
+static struct dma_map_ops mvebu_hwcc_dma_ops = {
+ .alloc = arm_coherent_dma_alloc,
+ .free = arm_coherent_dma_free,
+ .mmap = arm_dma_mmap,
+ .unmap_page = mvebu_hwcc_dma_unmap_page,
+ .get_sgtable = arm_dma_get_sgtable,
+ .map_page = mvebu_hwcc_dma_map_page,
+ .map_sg = arm_dma_map_sg,
+ .unmap_sg = arm_dma_unmap_sg,
+ .sync_single_for_cpu = mvebu_hwcc_dma_sync,
+ .sync_single_for_device = mvebu_hwcc_dma_sync,
+ .sync_sg_for_cpu = arm_dma_sync_sg_for_cpu,
+ .sync_sg_for_device = arm_dma_sync_sg_for_device,
+ .set_dma_mask = arm_dma_set_mask,
+};
+
+static int mvebu_hwcc_platform_notifier(struct notifier_block *nb,
+ unsigned long event, void *__dev)
+{
+ struct device *dev = __dev;
+
+ if (event != BUS_NOTIFY_ADD_DEVICE)
+ return NOTIFY_DONE;
+ set_dma_ops(dev, &mvebu_hwcc_dma_ops);
+
+ return NOTIFY_OK;
+}
+
+static struct notifier_block mvebu_hwcc_platform_nb = {
+ .notifier_call = mvebu_hwcc_platform_notifier,
+};
+
int __init coherency_init(void)
{
struct device_node *np;
@@ -83,6 +152,10 @@ int __init coherency_init(void)
if (np) {
pr_info("Initializing Coherency fabric\n");
coherency_base = of_iomap(np, 0);
+ coherency_cpu_base = of_iomap(np, 1);
+ set_cpu_coherent(cpu_logical_map(smp_processor_id()), 0);
+ bus_register_notifier(&platform_bus_type,
+ &mvebu_hwcc_platform_nb);
}
return 0;
--
1.7.9.5
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH V2 3/3] arm: mvebu: Add hardware I/O Coherency support
@ 2012-11-16 9:45 ` Gregory CLEMENT
0 siblings, 0 replies; 11+ messages in thread
From: Gregory CLEMENT @ 2012-11-16 9:45 UTC (permalink / raw)
To: Jason Cooper, Andrew Lunn, Gregory Clement, Marek Szyprowski
Cc: linux-arm-kernel, Arnd Bergmann, Olof Johansson, Russell King,
Rob Herring, Ben Dooks, Ian Molton, Nicolas Pitre, Lior Amsalem,
Maen Suleiman, Tawfik Bayouk, Shadi Ammouri, Eran Ben-Avi,
Yehuda Yitschak, Nadav Haklai, Ike Pan, Jani Monoses,
Chris Van Hoof, Dan Frazier, Thomas Petazzoni, Leif Lindholm,
Jon Masters, David Marlin, Sebastian Hesselbarth, linux-kernel
Armada 370 and XP come with a unit called the coherency fabric. This unit
allows the Armada 370/XP to be used as a nearly coherent architecture. The
coherency mechanism uses snoop filters to ensure coherency between
caches, DRAM and devices. This mechanism needs a synchronization
barrier which guarantees that all the memory writes initiated by the
devices have reached their target and do not reside in intermediate
write buffers. That's why the architecture is not totally coherent and
we need to provide our own functions for some DMA operations.
Besides the use of the coherency fabric, the device units will have to
set the attribute flag of the decoding address window to select the
appropriate coherency process for the memory transaction. This is done
when each device driver programs the DRAM address windows. The value of
the attribute set by the driver is retrieved through the
orion_addr_map_cfg struct filled during the early initialization of
the platform.
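As a rough illustration of the barrier described above, the following
userspace sketch models the write-then-poll handshake that
mvebu_hwcc_sync_io_barrier() performs in this patch. It is only a model
under stated assumptions: struct fake_barrier_reg, fake_writel(),
fake_readl() and the busy_reads counter are hypothetical stand-ins for the
hardware register, and the idea that the hardware clears bit 0 once the
device write buffers are drained is inferred from the polling loop, not
taken from a datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU I/O sync barrier register. */
struct fake_barrier_reg {
	uint32_t value;          /* bit 0 is set while the barrier is in flight */
	unsigned int busy_reads; /* number of reads that still report "busy" */
};

/* Model of a register write: simply latch the value. */
static void fake_writel(uint32_t v, struct fake_barrier_reg *reg)
{
	reg->value = v;
}

/*
 * Model of a register read: report "busy" for busy_reads reads, then
 * clear bit 0 to signal that the (simulated) write buffers are drained.
 */
static uint32_t fake_readl(struct fake_barrier_reg *reg)
{
	if (reg->busy_reads > 0)
		reg->busy_reads--;
	else
		reg->value &= ~UINT32_C(0x1);
	return reg->value;
}

/*
 * Same shape as the barrier in the patch: trigger with a write of 0x1,
 * then poll until bit 0 reads back as zero. Returns the number of polls
 * so the behaviour can be observed from a test.
 */
static unsigned int sync_io_barrier(struct fake_barrier_reg *reg)
{
	unsigned int polls = 0;

	assert(reg != NULL);
	fake_writel(0x1, reg);
	do {
		polls++;
	} while (fake_readl(reg) & 0x1);

	return polls;
}
```

With busy_reads set to 3, the model needs four polls before bit 0 clears.
The real loop has no upper bound on the number of polls, so it relies on
the hardware always completing the barrier.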
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Yehuda Yitschak <yehuday@marvell.com>
---
.../devicetree/bindings/arm/coherency-fabric.txt | 9 ++-
arch/arm/boot/dts/armada-370-xp.dtsi | 3 +-
arch/arm/mach-mvebu/addr-map.c | 3 +
arch/arm/mach-mvebu/coherency.c | 73 ++++++++++++++++++++
4 files changed, 85 insertions(+), 3 deletions(-)
diff --git a/Documentation/devicetree/bindings/arm/coherency-fabric.txt b/Documentation/devicetree/bindings/arm/coherency-fabric.txt
index 2bfbf67..17d8cd1 100644
--- a/Documentation/devicetree/bindings/arm/coherency-fabric.txt
+++ b/Documentation/devicetree/bindings/arm/coherency-fabric.txt
@@ -5,12 +5,17 @@ Available on Marvell SOCs: Armada 370 and Armada XP
Required properties:
- compatible: "marvell,coherency-fabric"
-- reg: Should contain,coherency fabric registers location and length.
+
+- reg: Should contain coherency fabric registers location and
+ length. First pair for the coherency fabric registers, second pair
 for the per-CPU fabric registers.
Example:
coherency-fabric@d0020200 {
compatible = "marvell,coherency-fabric";
- reg = <0xd0020200 0xb0>;
+ reg = <0xd0020200 0xb0>,
+ <0xd0021810 0x1c>;
+
};
diff --git a/arch/arm/boot/dts/armada-370-xp.dtsi b/arch/arm/boot/dts/armada-370-xp.dtsi
index b0d075b..98a6b26 100644
--- a/arch/arm/boot/dts/armada-370-xp.dtsi
+++ b/arch/arm/boot/dts/armada-370-xp.dtsi
@@ -38,7 +38,8 @@
coherency-fabric@d0020200 {
compatible = "marvell,coherency-fabric";
- reg = <0xd0020200 0xb0>;
+ reg = <0xd0020200 0xb0>,
+ <0xd0021810 0x1c>;
};
soc {
diff --git a/arch/arm/mach-mvebu/addr-map.c b/arch/arm/mach-mvebu/addr-map.c
index fe454a4..595f6b7 100644
--- a/arch/arm/mach-mvebu/addr-map.c
+++ b/arch/arm/mach-mvebu/addr-map.c
@@ -108,6 +108,9 @@ static int __init armada_setup_cpu_mbus(void)
addr_map_cfg.bridge_virt_base = mbus_unit_addr_decoding_base;
+ if (of_find_compatible_node(NULL, NULL, "marvell,coherency-fabric"))
+ addr_map_cfg.hw_io_coherency = 1;
+
/*
* Disable, clear and configure windows.
*/
diff --git a/arch/arm/mach-mvebu/coherency.c b/arch/arm/mach-mvebu/coherency.c
index 20a0ccc..153fcfa 100644
--- a/arch/arm/mach-mvebu/coherency.c
+++ b/arch/arm/mach-mvebu/coherency.c
@@ -22,6 +22,8 @@
#include <linux/of_address.h>
#include <linux/io.h>
#include <linux/smp.h>
+#include <linux/dma-mapping.h>
+#include <linux/platform_device.h>
#include <asm/smp_plat.h>
#include "armada-370-xp.h"
@@ -32,11 +34,14 @@
* value matching its virtual mapping
*/
static void __iomem *coherency_base = ARMADA_370_XP_REGS_VIRT_BASE + 0x20200;
+static void __iomem *coherency_cpu_base;
/* Coherency fabric registers */
#define COHERENCY_FABRIC_CTL_OFFSET 0x0
#define COHERENCY_FABRIC_CFG_OFFSET 0x4
+#define IO_SYNC_BARRIER_CTL_OFFSET 0x0
+
static struct of_device_id of_coherency_table[] = {
{.compatible = "marvell,coherency-fabric"},
{ /* end of list */ },
@@ -75,6 +80,70 @@ int set_cpu_coherent(unsigned int hw_cpu_id, int smp_group_id)
return 0;
}
+static inline void mvebu_hwcc_sync_io_barrier(void)
+{
+ writel(0x1, coherency_cpu_base + IO_SYNC_BARRIER_CTL_OFFSET);
+ while (readl(coherency_cpu_base + IO_SYNC_BARRIER_CTL_OFFSET) & 0x1);
+}
+
+static dma_addr_t mvebu_hwcc_dma_map_page(struct device *dev, struct page *page,
+ unsigned long offset, size_t size,
+ enum dma_data_direction dir,
+ struct dma_attrs *attrs)
+{
+ if (dir != DMA_TO_DEVICE)
+ mvebu_hwcc_sync_io_barrier();
+ return pfn_to_dma(dev, page_to_pfn(page)) + offset;
+}
+
+
+static void mvebu_hwcc_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
+ size_t size, enum dma_data_direction dir,
+ struct dma_attrs *attrs)
+{
+ if (dir != DMA_TO_DEVICE)
+ mvebu_hwcc_sync_io_barrier();
+}
+
+static void mvebu_hwcc_dma_sync(struct device *dev, dma_addr_t dma_handle,
+ size_t size, enum dma_data_direction dir)
+{
+ if (dir != DMA_TO_DEVICE)
+ mvebu_hwcc_sync_io_barrier();
+}
+
+static struct dma_map_ops mvebu_hwcc_dma_ops = {
+ .alloc = arm_coherent_dma_alloc,
+ .free = arm_coherent_dma_free,
+ .mmap = arm_dma_mmap,
+ .unmap_page = mvebu_hwcc_dma_unmap_page,
+ .get_sgtable = arm_dma_get_sgtable,
+ .map_page = mvebu_hwcc_dma_map_page,
+ .map_sg = arm_dma_map_sg,
+ .unmap_sg = arm_dma_unmap_sg,
+ .sync_single_for_cpu = mvebu_hwcc_dma_sync,
+ .sync_single_for_device = mvebu_hwcc_dma_sync,
+ .sync_sg_for_cpu = arm_dma_sync_sg_for_cpu,
+ .sync_sg_for_device = arm_dma_sync_sg_for_device,
+ .set_dma_mask = arm_dma_set_mask,
+};
+
+static int mvebu_hwcc_platform_notifier(struct notifier_block *nb,
+ unsigned long event, void *__dev)
+{
+ struct device *dev = __dev;
+
+ if (event != BUS_NOTIFY_ADD_DEVICE)
+ return NOTIFY_DONE;
+ set_dma_ops(dev, &mvebu_hwcc_dma_ops);
+
+ return NOTIFY_OK;
+}
+
+static struct notifier_block mvebu_hwcc_platform_nb = {
+ .notifier_call = mvebu_hwcc_platform_notifier,
+};
+
int __init coherency_init(void)
{
struct device_node *np;
@@ -83,6 +152,10 @@ int __init coherency_init(void)
if (np) {
pr_info("Initializing Coherency fabric\n");
coherency_base = of_iomap(np, 0);
+ coherency_cpu_base = of_iomap(np, 1);
+ set_cpu_coherent(cpu_logical_map(smp_processor_id()), 0);
+ bus_register_notifier(&platform_bus_type,
+ &mvebu_hwcc_platform_nb);
}
return 0;
--
1.7.9.5
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH V2 1/3] arm: dma mapping: Export dma ops functions
2012-11-16 9:44 ` Gregory CLEMENT
(?)
@ 2012-11-19 10:00 ` Gregory CLEMENT
-1 siblings, 0 replies; 11+ messages in thread
From: Gregory CLEMENT @ 2012-11-19 10:00 UTC (permalink / raw)
To: linux-arm-kernel
On 11/16/2012 10:44 AM, Gregory CLEMENT wrote:
> Expose the DMA operation functions. Until now only the dma_ops
> structs as a whole or some individual DMA operations were exposed. This
> patch exposes all the coherent and non-coherent DMA operations. They can
> be reused when an architecture or driver needs to create its own set of
> DMA operations.
Hello Marek,
If I understand correctly, you are the one who takes care of the ARM
DMA-mapping subsystem.
It would be good if we could have an Acked-by from you for this patch.
Thanks,
Gregory
>
> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
> ---
> arch/arm/include/asm/dma-mapping.h | 62 ++++++++++++++++++++++++++++++++++++
> arch/arm/mm/dma-mapping.c | 36 +++++----------------
> 2 files changed, 70 insertions(+), 28 deletions(-)
>
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index 2300484..f940a10 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -112,6 +112,60 @@ static inline void dma_free_noncoherent(struct device *dev, size_t size,
> extern int dma_supported(struct device *dev, u64 mask);
>
> /**
> + * arm_dma_map_page - map a portion of a page for streaming DMA
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> + * @page: page that buffer resides in
> + * @offset: offset into page for start of buffer
> + * @size: size of buffer to map
> + * @dir: DMA transfer direction
> + *
> + * Ensure that any data held in the cache is appropriately discarded
> + * or written back.
> + *
> + * The device owns this memory once this call has completed. The CPU
> + * can regain ownership by calling dma_unmap_page().
> + */
> +extern dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
> + unsigned long offset, size_t size,
> + enum dma_data_direction dir,
> + struct dma_attrs *attrs);
> +
> +extern dma_addr_t arm_coherent_dma_map_page(struct device *dev,
> + struct page *page,
> + unsigned long offset, size_t size,
> + enum dma_data_direction dir,
> + struct dma_attrs *attrs);
> +
> +/**
> + * arm_dma_unmap_page - unmap a buffer previously mapped through dma_map_page()
> + * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> + * @handle: DMA address of buffer
> + * @size: size of buffer (same as passed to dma_map_page)
> + * @dir: DMA transfer direction (same as passed to dma_map_page)
> + *
> + * Unmap a page streaming mode DMA translation. The handle and size
> + * must match what was provided in the previous dma_map_page() call.
> + * All other usages are undefined.
> + *
> + * After this call, reads by the CPU to the buffer are guaranteed to see
> + * whatever the device wrote there.
> + */
> +extern void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
> + size_t size, enum dma_data_direction dir,
> + struct dma_attrs *attrs);
> +
> +extern void arm_dma_sync_single_for_cpu(struct device *dev,
> + dma_addr_t handle, size_t size,
> + enum dma_data_direction dir);
> +
> +extern void arm_dma_sync_single_for_device(struct device *dev,
> + dma_addr_t handle, size_t size,
> + enum dma_data_direction dir);
> +
> +extern int arm_dma_set_mask(struct device *dev, u64 dma_mask);
> +
> +
> +/**
> * arm_dma_alloc - allocate consistent memory for DMA
> * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> * @size: required memory size
> @@ -125,6 +179,10 @@ extern int dma_supported(struct device *dev, u64 mask);
> extern void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
> gfp_t gfp, struct dma_attrs *attrs);
>
> +extern void *arm_coherent_dma_alloc(struct device *dev, size_t size,
> + dma_addr_t *handle, gfp_t gfp,
> + struct dma_attrs *attrs);
> +
> #define dma_alloc_coherent(d, s, h, f) dma_alloc_attrs(d, s, h, f, NULL)
>
> static inline void *dma_alloc_attrs(struct device *dev, size_t size,
> @@ -157,6 +215,10 @@ static inline void *dma_alloc_attrs(struct device *dev, size_t size,
> extern void arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
> dma_addr_t handle, struct dma_attrs *attrs);
>
> +extern void arm_coherent_dma_free(struct device *dev, size_t size,
> + void *cpu_addr, dma_addr_t handle,
> + struct dma_attrs *attrs);
> +
> #define dma_free_coherent(d, s, c, h) dma_free_attrs(d, s, c, h, NULL)
>
> static inline void dma_free_attrs(struct device *dev, size_t size,
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 58bc3e4..5b60ee6 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -56,20 +56,13 @@ static void __dma_page_dev_to_cpu(struct page *, unsigned long,
> size_t, enum dma_data_direction);
>
> /**
> - * arm_dma_map_page - map a portion of a page for streaming DMA
> - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> - * @page: page that buffer resides in
> - * @offset: offset into page for start of buffer
> - * @size: size of buffer to map
> - * @dir: DMA transfer direction
> - *
> * Ensure that any data held in the cache is appropriately discarded
> * or written back.
> *
> * The device owns this memory once this call has completed. The CPU
> * can regain ownership by calling dma_unmap_page().
> */
> -static dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
> +dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
> unsigned long offset, size_t size, enum dma_data_direction dir,
> struct dma_attrs *attrs)
> {
> @@ -78,7 +71,7 @@ static dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
> return pfn_to_dma(dev, page_to_pfn(page)) + offset;
> }
>
> -static dma_addr_t arm_coherent_dma_map_page(struct device *dev, struct page *page,
> +dma_addr_t arm_coherent_dma_map_page(struct device *dev, struct page *page,
> unsigned long offset, size_t size, enum dma_data_direction dir,
> struct dma_attrs *attrs)
> {
> @@ -86,12 +79,6 @@ static dma_addr_t arm_coherent_dma_map_page(struct device *dev, struct page *pag
> }
>
> /**
> - * arm_dma_unmap_page - unmap a buffer previously mapped through dma_map_page()
> - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
> - * @handle: DMA address of buffer
> - * @size: size of buffer (same as passed to dma_map_page)
> - * @dir: DMA transfer direction (same as passed to dma_map_page)
> - *
> * Unmap a page streaming mode DMA translation. The handle and size
> * must match what was provided in the previous dma_map_page() call.
> * All other usages are undefined.
> @@ -99,7 +86,7 @@ static dma_addr_t arm_coherent_dma_map_page(struct device *dev, struct page *pag
> * After this call, reads by the CPU to the buffer are guaranteed to see
> * whatever the device wrote there.
> */
> -static void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
> +void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
> size_t size, enum dma_data_direction dir,
> struct dma_attrs *attrs)
> {
> @@ -108,7 +95,7 @@ static void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
> handle & ~PAGE_MASK, size, dir);
> }
>
> -static void arm_dma_sync_single_for_cpu(struct device *dev,
> +void arm_dma_sync_single_for_cpu(struct device *dev,
> dma_addr_t handle, size_t size, enum dma_data_direction dir)
> {
> unsigned int offset = handle & (PAGE_SIZE - 1);
> @@ -116,7 +103,7 @@ static void arm_dma_sync_single_for_cpu(struct device *dev,
> __dma_page_dev_to_cpu(page, offset, size, dir);
> }
>
> -static void arm_dma_sync_single_for_device(struct device *dev,
> +void arm_dma_sync_single_for_device(struct device *dev,
> dma_addr_t handle, size_t size, enum dma_data_direction dir)
> {
> unsigned int offset = handle & (PAGE_SIZE - 1);
> @@ -124,8 +111,6 @@ static void arm_dma_sync_single_for_device(struct device *dev,
> __dma_page_cpu_to_dev(page, offset, size, dir);
> }
>
> -static int arm_dma_set_mask(struct device *dev, u64 dma_mask);
> -
> struct dma_map_ops arm_dma_ops = {
> .alloc = arm_dma_alloc,
> .free = arm_dma_free,
> @@ -143,11 +128,6 @@ struct dma_map_ops arm_dma_ops = {
> };
> EXPORT_SYMBOL(arm_dma_ops);
>
> -static void *arm_coherent_dma_alloc(struct device *dev, size_t size,
> - dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs);
> -static void arm_coherent_dma_free(struct device *dev, size_t size, void *cpu_addr,
> - dma_addr_t handle, struct dma_attrs *attrs);
> -
> struct dma_map_ops arm_coherent_dma_ops = {
> .alloc = arm_coherent_dma_alloc,
> .free = arm_coherent_dma_free,
> @@ -672,7 +652,7 @@ void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
> __builtin_return_address(0));
> }
>
> -static void *arm_coherent_dma_alloc(struct device *dev, size_t size,
> +void *arm_coherent_dma_alloc(struct device *dev, size_t size,
> dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs)
> {
> pgprot_t prot = __get_dma_pgprot(attrs, pgprot_kernel);
> @@ -751,7 +731,7 @@ void arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
> __arm_dma_free(dev, size, cpu_addr, handle, attrs, false);
> }
>
> -static void arm_coherent_dma_free(struct device *dev, size_t size, void *cpu_addr,
> +void arm_coherent_dma_free(struct device *dev, size_t size, void *cpu_addr,
> dma_addr_t handle, struct dma_attrs *attrs)
> {
> __arm_dma_free(dev, size, cpu_addr, handle, attrs, true);
> @@ -971,7 +951,7 @@ int dma_supported(struct device *dev, u64 mask)
> }
> EXPORT_SYMBOL(dma_supported);
>
> -static int arm_dma_set_mask(struct device *dev, u64 dma_mask)
> +int arm_dma_set_mask(struct device *dev, u64 dma_mask)
> {
> if (!dev->dma_mask || !dma_supported(dev, dma_mask))
> return -EIO;
>
--
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH V2 3/3] arm: mvebu: Add hardware I/O Coherency support
2012-11-16 9:45 ` Gregory CLEMENT
(?)
@ 2012-11-19 12:50 ` Marek Szyprowski
2012-11-20 21:01 ` Gregory CLEMENT
-1 siblings, 1 reply; 11+ messages in thread
From: Marek Szyprowski @ 2012-11-19 12:50 UTC (permalink / raw)
To: linux-arm-kernel
Hello,
On 11/16/2012 10:45 AM, Gregory CLEMENT wrote:
> Armada 370 and XP come with a unit called the coherency fabric. This unit
> allows the Armada 370/XP to be used as a nearly coherent architecture. The
> coherency mechanism uses snoop filters to ensure coherency between
> caches, DRAM and devices. This mechanism needs a synchronization
> barrier which guarantees that all the memory writes initiated by the
> devices have reached their target and do not reside in intermediate
> write buffers. That's why the architecture is not totally coherent and
> we need to provide our own functions for some DMA operations.
>
> Besides the use of the coherency fabric, the device units will have to
> set the attribute flag of the decoding address window to select the
> appropriate coherency process for the memory transaction. This is done
> when each device driver programs the DRAM address windows. The value of
> the attribute set by the driver is retrieved through the
> orion_addr_map_cfg struct filled during the early initialization of
> the platform.
>
> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
> Reviewed-by: Yehuda Yitschak <yehuday@marvell.com>
> ---
> .../devicetree/bindings/arm/coherency-fabric.txt | 9 ++-
> arch/arm/boot/dts/armada-370-xp.dtsi | 3 +-
> arch/arm/mach-mvebu/addr-map.c | 3 +
> arch/arm/mach-mvebu/coherency.c | 73 ++++++++++++++++++++
> 4 files changed, 85 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/arm/coherency-fabric.txt b/Documentation/devicetree/bindings/arm/coherency-fabric.txt
> index 2bfbf67..17d8cd1 100644
> --- a/Documentation/devicetree/bindings/arm/coherency-fabric.txt
> +++ b/Documentation/devicetree/bindings/arm/coherency-fabric.txt
> @@ -5,12 +5,17 @@ Available on Marvell SOCs: Armada 370 and Armada XP
> Required properties:
>
> - compatible: "marvell,coherency-fabric"
> -- reg: Should contain,coherency fabric registers location and length.
> +
> +- reg: Should contain coherency fabric registers location and
> + length. First pair for the coherency fabric registers, second pair
> + for the per-CPU fabric registers.
>
> Example:
>
> coherency-fabric@d0020200 {
> compatible = "marvell,coherency-fabric";
> - reg = <0xd0020200 0xb0>;
> + reg = <0xd0020200 0xb0>,
> + <0xd0021810 0x1c>;
> +
> };
>
> diff --git a/arch/arm/boot/dts/armada-370-xp.dtsi b/arch/arm/boot/dts/armada-370-xp.dtsi
> index b0d075b..98a6b26 100644
> --- a/arch/arm/boot/dts/armada-370-xp.dtsi
> +++ b/arch/arm/boot/dts/armada-370-xp.dtsi
> @@ -38,7 +38,8 @@
>
> coherency-fabric@d0020200 {
> compatible = "marvell,coherency-fabric";
> - reg = <0xd0020200 0xb0>;
> + reg = <0xd0020200 0xb0>,
> + <0xd0021810 0x1c>;
> };
>
> soc {
> diff --git a/arch/arm/mach-mvebu/addr-map.c b/arch/arm/mach-mvebu/addr-map.c
> index fe454a4..595f6b7 100644
> --- a/arch/arm/mach-mvebu/addr-map.c
> +++ b/arch/arm/mach-mvebu/addr-map.c
> @@ -108,6 +108,9 @@ static int __init armada_setup_cpu_mbus(void)
>
> addr_map_cfg.bridge_virt_base = mbus_unit_addr_decoding_base;
>
> + if (of_find_compatible_node(NULL, NULL, "marvell,coherency-fabric"))
> + addr_map_cfg.hw_io_coherency = 1;
> +
> /*
> * Disable, clear and configure windows.
> */
> diff --git a/arch/arm/mach-mvebu/coherency.c b/arch/arm/mach-mvebu/coherency.c
> index 20a0ccc..153fcfa 100644
> --- a/arch/arm/mach-mvebu/coherency.c
> +++ b/arch/arm/mach-mvebu/coherency.c
> @@ -22,6 +22,8 @@
> #include <linux/of_address.h>
> #include <linux/io.h>
> #include <linux/smp.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/platform_device.h>
> #include <asm/smp_plat.h>
> #include "armada-370-xp.h"
>
> @@ -32,11 +34,14 @@
> * value matching its virtual mapping
> */
> static void __iomem *coherency_base = ARMADA_370_XP_REGS_VIRT_BASE + 0x20200;
> +static void __iomem *coherency_cpu_base;
>
> /* Coherency fabric registers */
> #define COHERENCY_FABRIC_CTL_OFFSET 0x0
> #define COHERENCY_FABRIC_CFG_OFFSET 0x4
>
> +#define IO_SYNC_BARRIER_CTL_OFFSET 0x0
> +
> static struct of_device_id of_coherency_table[] = {
> {.compatible = "marvell,coherency-fabric"},
> { /* end of list */ },
> @@ -75,6 +80,70 @@ int set_cpu_coherent(unsigned int hw_cpu_id, int smp_group_id)
> return 0;
> }
>
> +static inline void mvebu_hwcc_sync_io_barrier(void)
> +{
> + writel(0x1, coherency_cpu_base + IO_SYNC_BARRIER_CTL_OFFSET);
> + while (readl(coherency_cpu_base + IO_SYNC_BARRIER_CTL_OFFSET) & 0x1);
> +}
> +
> +static dma_addr_t mvebu_hwcc_dma_map_page(struct device *dev, struct page *page,
> + unsigned long offset, size_t size,
> + enum dma_data_direction dir,
> + struct dma_attrs *attrs)
> +{
> + if (dir != DMA_TO_DEVICE)
> + mvebu_hwcc_sync_io_barrier();
> + return pfn_to_dma(dev, page_to_pfn(page)) + offset;
> +}
> +
> +
> +static void mvebu_hwcc_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
> + size_t size, enum dma_data_direction dir,
> + struct dma_attrs *attrs)
> +{
> + if (dir != DMA_TO_DEVICE)
> + mvebu_hwcc_sync_io_barrier();
> +}
> +
> +static void mvebu_hwcc_dma_sync(struct device *dev, dma_addr_t dma_handle,
> + size_t size, enum dma_data_direction dir)
> +{
> + if (dir != DMA_TO_DEVICE)
> + mvebu_hwcc_sync_io_barrier();
> +}
> +
> +static struct dma_map_ops mvebu_hwcc_dma_ops = {
> + .alloc = arm_coherent_dma_alloc,
> + .free = arm_coherent_dma_free,
Are you sure that arm_coherent_dma_{alloc,free} are the right functions for
your architecture? If I understand correctly, you need to do implicit
synchronization (I/O barrier) between CPU transactions and device
transactions.
dma_alloc_coherent() provides memory which can be used simultaneously by
both the CPU and devices, so without such a barrier the memory won't be
coherent.
IMHO you should use the arm_dma_{alloc,free} functions, as your hardware is
not truly coherent. Then your mvebu_hwcc_dma_ops will look very similar to
dmabounce_ops from arch/arm/common/dmabounce.c (custom functions only for
map/unmap page and sync_single_for_cpu/device).
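A userspace sketch of the table layout suggested here may help, assuming
the review comments are taken as written: the generic non-coherent
allocator for alloc/free, and barrier-based callbacks only for map/unmap
page and the single-buffer syncs, with map and unmap listed together.
Every model_* name below is hypothetical and merely stands in for a kernel
symbol; this is not the actual patch and it does not use the kernel's
struct dma_map_ops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the enum dma_data_direction values. */
enum model_dir { MODEL_BIDIRECTIONAL, MODEL_TO_DEVICE, MODEL_FROM_DEVICE };

/* Counts barriers so the effect of each callback can be observed. */
static unsigned int model_barrier_count;

static void model_sync_io_barrier(void)
{
	model_barrier_count++;
}

/* Only directions that can carry device writes trigger the barrier. */
static void model_hwcc_sync(enum model_dir dir)
{
	if (dir != MODEL_TO_DEVICE)
		model_sync_io_barrier();
}

/* Stand-ins for the generic (non-coherent) arm_dma_{alloc,free} pair. */
static void *model_dma_alloc(size_t size)
{
	assert(size > 0);
	return malloc(size);
}

static void model_dma_free(void *cpu_addr)
{
	free(cpu_addr);
}

/* Reduced model of an ops table; the real one has more callbacks. */
struct model_dma_ops {
	void *(*alloc)(size_t size);
	void (*free)(void *cpu_addr);
	void (*map_page)(enum model_dir dir);
	void (*unmap_page)(enum model_dir dir);
	void (*sync_single_for_cpu)(enum model_dir dir);
	void (*sync_single_for_device)(enum model_dir dir);
};

/* Generic alloc/free, custom barrier hooks, map and unmap kept adjacent. */
static const struct model_dma_ops model_hwcc_dma_ops = {
	.alloc			= model_dma_alloc,
	.free			= model_dma_free,
	.map_page		= model_hwcc_sync,
	.unmap_page		= model_hwcc_sync,
	.sync_single_for_cpu	= model_hwcc_sync,
	.sync_single_for_device	= model_hwcc_sync,
};
```

In this model a DMA_TO_DEVICE-style mapping never touches the barrier,
while FROM_DEVICE and BIDIRECTIONAL transfers each trigger it once per
callback, which matches the direction check used in the patch.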
> + .mmap = arm_dma_mmap,
> + .unmap_page = mvebu_hwcc_dma_unmap_page,
Please reorder entries to get map and unmap together.
> + .get_sgtable = arm_dma_get_sgtable,
> + .map_page = mvebu_hwcc_dma_map_page,
> + .map_sg = arm_dma_map_sg,
> + .unmap_sg = arm_dma_unmap_sg,
> + .sync_single_for_cpu = mvebu_hwcc_dma_sync,
> + .sync_single_for_device = mvebu_hwcc_dma_sync,
> + .sync_sg_for_cpu = arm_dma_sync_sg_for_cpu,
> + .sync_sg_for_device = arm_dma_sync_sg_for_device,
> + .set_dma_mask = arm_dma_set_mask,
> +};
> +
> +static int mvebu_hwcc_platform_notifier(struct notifier_block *nb,
> + unsigned long event, void *__dev)
> +{
> + struct device *dev = __dev;
> +
> + if (event != BUS_NOTIFY_ADD_DEVICE)
> + return NOTIFY_DONE;
> + set_dma_ops(dev, &mvebu_hwcc_dma_ops);
> +
> + return NOTIFY_OK;
> +}
> +
> +static struct notifier_block mvebu_hwcc_platform_nb = {
> + .notifier_call = mvebu_hwcc_platform_notifier,
> +};
> +
> int __init coherency_init(void)
> {
> struct device_node *np;
> @@ -83,6 +152,10 @@ int __init coherency_init(void)
> if (np) {
> pr_info("Initializing Coherency fabric\n");
> coherency_base = of_iomap(np, 0);
> + coherency_cpu_base = of_iomap(np, 1);
> + set_cpu_coherent(cpu_logical_map(smp_processor_id()), 0);
> + bus_register_notifier(&platform_bus_type,
> + &mvebu_hwcc_platform_nb);
> }
>
> return 0;
Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH V2 3/3] arm: mvebu: Add hardware I/O Coherency support
2012-11-19 12:50 ` Marek Szyprowski
@ 2012-11-20 21:01 ` Gregory CLEMENT
0 siblings, 0 replies; 11+ messages in thread
From: Gregory CLEMENT @ 2012-11-20 21:01 UTC (permalink / raw)
To: linux-arm-kernel
Hello,
On 11/19/2012 01:50 PM, Marek Szyprowski wrote:
> Hello,
>
> On 11/16/2012 10:45 AM, Gregory CLEMENT wrote:
>> Armada 370 and XP come with a unit called the coherency fabric. This unit
>> allows the Armada 370/XP to be used as a nearly coherent architecture. The
>> coherency mechanism uses snoop filters to ensure coherency between
>> caches, DRAM and devices. This mechanism needs a synchronization
>> barrier which guarantees that all the memory writes initiated by the
>> devices have reached their target and do not reside in intermediate
>> write buffers. That's why the architecture is not totally coherent and
>> we need to provide our own functions for some DMA operations.
>>
>> Besides the use of the coherency fabric, the device units will have to
>> set the attribute flag of the decoding address window to select the
>> appropriate coherency process for the memory transaction. This is done
>> when each device driver programs the DRAM address windows. The value of
>> the attribute set by the driver is retrieved through the
>> orion_addr_map_cfg struct filled during the early initialization of
>> the platform.
>>
>> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
>> Reviewed-by: Yehuda Yitschak <yehuday@marvell.com>
>> ---
>> .../devicetree/bindings/arm/coherency-fabric.txt | 9 ++-
>> arch/arm/boot/dts/armada-370-xp.dtsi | 3 +-
>> arch/arm/mach-mvebu/addr-map.c | 3 +
>> arch/arm/mach-mvebu/coherency.c | 73 ++++++++++++++++++++
>> 4 files changed, 85 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/arm/coherency-fabric.txt b/Documentation/devicetree/bindings/arm/coherency-fabric.txt
>> index 2bfbf67..17d8cd1 100644
>> --- a/Documentation/devicetree/bindings/arm/coherency-fabric.txt
>> +++ b/Documentation/devicetree/bindings/arm/coherency-fabric.txt
>> @@ -5,12 +5,17 @@ Available on Marvell SOCs: Armada 370 and Armada XP
>> Required properties:
>>
>> - compatible: "marvell,coherency-fabric"
>> -- reg: Should contain,coherency fabric registers location and length.
>> +
>> +- reg: Should contain coherency fabric registers location and
>> + length. First pair for the coherency fabric registers, second pair
>> + for the per-CPU fabric registers.
>>
>> Example:
>>
>> coherency-fabric@d0020200 {
>> compatible = "marvell,coherency-fabric";
>> - reg = <0xd0020200 0xb0>;
>> + reg = <0xd0020200 0xb0>,
>> + <0xd0021810 0x1c>;
>> +
>> };
>>
>> diff --git a/arch/arm/boot/dts/armada-370-xp.dtsi b/arch/arm/boot/dts/armada-370-xp.dtsi
>> index b0d075b..98a6b26 100644
>> --- a/arch/arm/boot/dts/armada-370-xp.dtsi
>> +++ b/arch/arm/boot/dts/armada-370-xp.dtsi
>> @@ -38,7 +38,8 @@
>>
>> coherency-fabric@d0020200 {
>> compatible = "marvell,coherency-fabric";
>> - reg = <0xd0020200 0xb0>;
>> + reg = <0xd0020200 0xb0>,
>> + <0xd0021810 0x1c>;
>> };
>>
>> soc {
>> diff --git a/arch/arm/mach-mvebu/addr-map.c b/arch/arm/mach-mvebu/addr-map.c
>> index fe454a4..595f6b7 100644
>> --- a/arch/arm/mach-mvebu/addr-map.c
>> +++ b/arch/arm/mach-mvebu/addr-map.c
>> @@ -108,6 +108,9 @@ static int __init armada_setup_cpu_mbus(void)
>>
>> addr_map_cfg.bridge_virt_base = mbus_unit_addr_decoding_base;
>>
>> + if (of_find_compatible_node(NULL, NULL, "marvell,coherency-fabric"))
>> + addr_map_cfg.hw_io_coherency = 1;
>> +
>> /*
>> * Disable, clear and configure windows.
>> */
>> diff --git a/arch/arm/mach-mvebu/coherency.c b/arch/arm/mach-mvebu/coherency.c
>> index 20a0ccc..153fcfa 100644
>> --- a/arch/arm/mach-mvebu/coherency.c
>> +++ b/arch/arm/mach-mvebu/coherency.c
>> @@ -22,6 +22,8 @@
>> #include <linux/of_address.h>
>> #include <linux/io.h>
>> #include <linux/smp.h>
>> +#include <linux/dma-mapping.h>
>> +#include <linux/platform_device.h>
>> #include <asm/smp_plat.h>
>> #include "armada-370-xp.h"
>>
>> @@ -32,11 +34,14 @@
>> * value matching its virtual mapping
>> */
>> static void __iomem *coherency_base = ARMADA_370_XP_REGS_VIRT_BASE + 0x20200;
>> +static void __iomem *coherency_cpu_base;
>>
>> /* Coherency fabric registers */
>> #define COHERENCY_FABRIC_CTL_OFFSET 0x0
>> #define COHERENCY_FABRIC_CFG_OFFSET 0x4
>>
>> +#define IO_SYNC_BARRIER_CTL_OFFSET 0x0
>> +
>> static struct of_device_id of_coherency_table[] = {
>> {.compatible = "marvell,coherency-fabric"},
>> { /* end of list */ },
>> @@ -75,6 +80,70 @@ int set_cpu_coherent(unsigned int hw_cpu_id, int smp_group_id)
>> return 0;
>> }
>>
>> +static inline void mvebu_hwcc_sync_io_barrier(void)
>> +{
>> + writel(0x1, coherency_cpu_base + IO_SYNC_BARRIER_CTL_OFFSET);
>> + while (readl(coherency_cpu_base + IO_SYNC_BARRIER_CTL_OFFSET) & 0x1);
>> +}
>> +
>> +static dma_addr_t mvebu_hwcc_dma_map_page(struct device *dev, struct page *page,
>> + unsigned long offset, size_t size,
>> + enum dma_data_direction dir,
>> + struct dma_attrs *attrs)
>> +{
>> + if (dir != DMA_TO_DEVICE)
>> + mvebu_hwcc_sync_io_barrier();
>> + return pfn_to_dma(dev, page_to_pfn(page)) + offset;
>> +}
>> +
>> +
>> +static void mvebu_hwcc_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
>> + size_t size, enum dma_data_direction dir,
>> + struct dma_attrs *attrs)
>> +{
>> + if (dir != DMA_TO_DEVICE)
>> + mvebu_hwcc_sync_io_barrier();
>> +}
>> +
>> +static void mvebu_hwcc_dma_sync(struct device *dev, dma_addr_t dma_handle,
>> + size_t size, enum dma_data_direction dir)
>> +{
>> + if (dir != DMA_TO_DEVICE)
>> + mvebu_hwcc_sync_io_barrier();
>> +}
>> +
>> +static struct dma_map_ops mvebu_hwcc_dma_ops = {
>> + .alloc = arm_coherent_dma_alloc,
>> + .free = arm_coherent_dma_free,
>
> Are you sure that arm_coherent_dma_{alloc,free} are the right functions for
> your architecture? If I understand correctly, you need to do implicit
> synchronization (io barrier) between CPU transactions and device
> transactions.
>
> dma_alloc_coherent() provides memory which can be used simultaneously by
> both CPU and devices, so without such a barrier the memory won't be coherent.
>
> IMHO you should use the arm_dma_{alloc,free} functions, as your hardware is
> not truly coherent. Then your mvebu_hwcc_dma_ops will look very similar to
> dmabounce_ops from arch/arm/common/dmabounce.c (custom functions only for
> map/unmap page and sync_single_for_cpu/device).
You are totally right. In our first internal version, based on an older
kernel (3.2 or 3.4), we had set arch_is_coherent() to 1 and added
some hooks in __dma_single_cpu_to_dev(), __dma_single_dev_to_cpu(),
__dma_page_cpu_to_dev() and __dma_page_dev_to_cpu(). So when
arch_is_coherent() was removed and we switched to dma_ops, I
assumed that we should treat the architecture as coherent, apart from the
modified functions. But I didn't realize that this older kernel had no
coherent_alloc functions for ARM. So it was wrong to use
the new arm_coherent_dma_alloc.
I have made the change you suggested and I will send a new version
very soon.
Thanks for your review!
>
>> + .mmap = arm_dma_mmap,
>> + .unmap_page = mvebu_hwcc_dma_unmap_page,
>
> Please reorder entries to get map and unmap together.
OK, I will.
Gregory
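
For readers following the review, here is a minimal user-space sketch of the
ops layout discussed above. It is not kernel code and not the actual V3 patch:
every mock_* name is hypothetical, the alloc/free members are plain strings
standing in for arm_dma_alloc/arm_dma_free, and the barrier is simulated by a
counter. It only illustrates the two review points, namely non-coherent
alloc/free with custom map/unmap/sync callbacks, and map/unmap kept adjacent in
the table.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel enum; values chosen for this sketch only. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
};

/* Counts simulated I/O sync barriers (hypothetical, replaces the writel/readl poll). */
static unsigned int mock_barrier_count;

static void mock_sync_io_barrier(void)
{
	mock_barrier_count++;
}

/* Same direction rule as the quoted patch: no barrier for DMA_TO_DEVICE. */
static void mock_map_page(enum dma_data_direction dir)
{
	if (dir != DMA_TO_DEVICE)
		mock_sync_io_barrier();
}

static void mock_unmap_page(enum dma_data_direction dir)
{
	if (dir != DMA_TO_DEVICE)
		mock_sync_io_barrier();
}

static void mock_sync(enum dma_data_direction dir)
{
	if (dir != DMA_TO_DEVICE)
		mock_sync_io_barrier();
}

/* Hypothetical, heavily reduced stand-in for struct dma_map_ops. */
struct mock_dma_ops {
	const char *alloc;	/* name of the allocator that would be used */
	const char *free;
	void (*map_page)(enum dma_data_direction dir);
	void (*unmap_page)(enum dma_data_direction dir);
	void (*sync_single_for_cpu)(enum dma_data_direction dir);
	void (*sync_single_for_device)(enum dma_data_direction dir);
};

static const struct mock_dma_ops mock_hwcc_dma_ops = {
	.alloc			= "arm_dma_alloc",	/* not arm_coherent_dma_alloc */
	.free			= "arm_dma_free",
	.map_page		= mock_map_page,	/* map and unmap kept together */
	.unmap_page		= mock_unmap_page,
	.sync_single_for_cpu	= mock_sync,
	.sync_single_for_device	= mock_sync,
};

/* Run one map/unmap cycle and return how many barriers it issued. */
static unsigned int mock_barriers_for(enum dma_data_direction dir)
{
	mock_barrier_count = 0;
	mock_hwcc_dma_ops.map_page(dir);
	mock_hwcc_dma_ops.unmap_page(dir);
	return mock_barrier_count;
}
```

In this model a DMA_TO_DEVICE map/unmap cycle issues no barrier, while
DMA_FROM_DEVICE and DMA_BIDIRECTIONAL each issue one on map and one on unmap.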
end of thread, other threads:[~2012-11-20 21:01 UTC | newest]
Thread overview: 11+ messages
2012-11-16 9:44 [PATCH V2 0/3] Add hardware I/O coherency support for Armada 370/XP Gregory CLEMENT
2012-11-16 9:44 ` Gregory CLEMENT
2012-11-16 9:44 ` [PATCH V2 1/3] arm: dma mapping: Export dma ops functions Gregory CLEMENT
2012-11-16 9:44 ` Gregory CLEMENT
2012-11-19 10:00 ` Gregory CLEMENT
2012-11-16 9:44 ` [PATCH V2 2/3] arm: plat-orion: Add coherency attribute when setup mbus target Gregory CLEMENT
2012-11-16 9:44 ` Gregory CLEMENT
2012-11-16 9:45 ` [PATCH V2 3/3] arm: mvebu: Add hardware I/O Coherency support Gregory CLEMENT
2012-11-16 9:45 ` Gregory CLEMENT
2012-11-19 12:50 ` Marek Szyprowski
2012-11-20 21:01 ` Gregory CLEMENT