devicetree.vger.kernel.org archive mirror
* [PATCHv5 0/4] ARM: implement workaround for Cortex-A9/PL310/PCIe deadlock
@ 2014-05-19  8:13 Thomas Petazzoni
       [not found] ` <1400487234-4501-1-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Petazzoni @ 2014-05-19  8:13 UTC (permalink / raw)
  To: Russell King, Will Deacon, Catalin Marinas,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Grant Likely, Rob Herring,
	Arnd Bergmann
  Cc: Albin Tonnerre, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Jason Cooper, Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
	Tawfik Bayouk, Nadav Haklai, Lior Amsalem, Ezequiel Garcia,
	Thomas Petazzoni

Russell, Will, Catalin,

This patch series addresses a problem that affects the newer Marvell
Armada 375 and 38x SOCs, based on Cortex-A9+PL310, combined with the
Marvell PCIe hardware unit. When the hardware I/O coherency is
enabled, the combination of Cortex-A9/PL310/Marvell PCIe hardware unit
will quickly cause a deadlock when the PCIe bus is stressed.

The workaround for this problem has been suggested by ARM, and
consists of two things:

 (1) Map the PCIe regions as strongly-ordered

 (2) Disable the outer cache sync of the PL310 when hardware I/O
     coherency is used, since it is unneeded and causes the deadlock.

The following four patches address the problem in the following way:

 * PATCH 1/4 adds a small API in arch/arm/mm/ioremap.c to allow
   sub-architectures to override the memory type used for PCI I/O
   mappings.

 * PATCH 2/4 extends the l2x0 cache driver with a new property
   "arm,io-coherent", valid for the PL310, which makes the driver
   disable the outer cache sync operation. This patch should be routed
   through Russell's tree.

 * PATCH 3/4 actually implements the Armada 375/38x workaround, by
   using MT_UNCACHED for PCI memory mappings, and adding the
   "arm,io-coherent" property to the cache controller Device Tree node
   when appropriate (i.e., when hardware I/O coherency is
   enabled). This patch has no build dependency on the two previous
   patches. It has already been merged by the mvebu maintainers.

 * PATCH 4/4 uses the API introduced in PATCH 1/4 to map PCI I/O
   regions as strongly ordered. It should also go through the mvebu
   maintainers' tree, but was kept separate from PATCH 3/4 for two
   reasons: PATCH 3/4 has already been merged, and PATCH 4/4 has a
   build dependency on PATCH 1/4, which may delay the merging of
   PATCH 4/4.

Changes since v4:

 - Re-introduce the patch to allow sub-architectures to override the
   memory type used for PCI I/O mappings, since switching to
   strongly-ordered for all platforms does not seem to be well
   accepted/understood at this point.

 - Remove the of_device_is_compatible() check for the PL310, when
   testing for 'arm,io-coherent'. Suggested by Rob Herring. However,
   the code testing 'arm,io-coherent' cannot be moved into
   pl310_of_setup(), because this function is called *before* the
   'outer_cache' structure is initialized.

 - Add a separate patch to use the pci_ioremap_set_mem_type() API in
   mach-mvebu/coherency.c.
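
As an illustration only (this is not code from the series), the
ordering constraint mentioned above can be modelled in user space.
All names below (struct model_outer_cache_fns, model_l2x0_of_init(),
and so on) are hypothetical stand-ins. The only behaviour taken from
the series is that the driver copies its operations template into the
global 'outer_cache' with memcpy(), so ->sync has to be cleared after
that copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the kernel's outer cache ops. */
struct model_outer_cache_fns {
	void (*sync)(void);
	void (*flush_all)(void);
};

static void model_sync(void) { }
static void model_flush_all(void) { }

/* Per-controller template, copied into the global at init time. */
static const struct model_outer_cache_fns model_pl310_fns = {
	.sync = model_sync,
	.flush_all = model_flush_all,
};

/* Global operations structure, empty until init copies the template. */
static struct model_outer_cache_fns model_outer_cache;

/*
 * clear_in_early_hook != 0 models clearing ->sync from a setup hook
 * that runs before the copy; 0 models clearing it after the copy.
 */
void model_l2x0_of_init(int io_coherent, int clear_in_early_hook)
{
	model_outer_cache = (struct model_outer_cache_fns){ 0 };

	if (io_coherent && clear_in_early_hook)
		model_outer_cache.sync = NULL;	/* too early, overwritten below */

	memcpy(&model_outer_cache, &model_pl310_fns, sizeof(model_outer_cache));

	if (io_coherent && !clear_in_early_hook)
		model_outer_cache.sync = NULL;	/* survives: done after the copy */
}
```

In this model, clearing ->sync from the early hook has no lasting
effect because the later memcpy() restores the template pointer,
which is the situation described for pl310_of_setup().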

Changes since v3:

 - Withdrew all Acked-by tags since the changes compared to v3 are
   quite significant.

 - Instead of introducing a small mechanism to allow each
   sub-architecture to override the memory type used for PCI I/O
   mappings, simply make all of them mapped MT_UNCACHED instead of
   MT_DEVICE. This also has the nice consequence that there is no
   longer a build dependency between PATCH 3/3 and PATCH 1/3.
   Suggested by Arnd Bergmann.

 - Change the name of the new property of the PL310 DT binding from
   the too generic 'dma-coherent' to 'arm,io-coherent'. Suggested by
   Rob Herring.

 - Instead of adding a complete set of L2 cache operations in
   cache-l2x0.c, simply nullify the outer_cache.sync operation when
   'arm,io-coherent' is specified. Suggested by Rob Herring.

 - Move the Armada 375/38x specific code from mach-mvebu/board-v7.c to
   mach-mvebu/coherency.c, which makes more sense. Suggested by Arnd
   Bergmann.

Changes since v2:

 - Added Acked-by from Catalin on "ARM: mm: allow sub-architectures to
   override PCI I/O memory type".

 - Dropped the patch fixing the of_update_property() function, since
   we're no longer using it.

 - Instead of using a different compatible string to identify PL310
   used in an I/O coherent configuration, use a separate boolean
   property. Suggested by Catalin.

 - Rework the mach-mvebu/coherency.c to add the boolean property
   "dma-coherent" when needed instead of updating the compatible
   string of the cache controller.

Changes since v1:

 - Instead of introducing separate l2x0 initialization functions, rely
   on a separate compatible string to identify whether we're coherent
   or not. The compatible string *has* to be modified at runtime,
   because Armada 375 and 38x are only I/O coherent when in SMP
   mode. In non-SMP mode, they are not I/O coherent, so we cannot
   change the DT to 'arm,pl310-coherent-cache'.

 - Addition of the drivers/of fix to be able to use
   of_update_property() early and fix up the PL310 compatible string,
   as explained in the previous item.

Thanks!

Thomas

Thomas Petazzoni (4):
  ARM: mm: allow sub-architectures to override PCI I/O memory type
  ARM: mm: add support for HW coherent systems in PL310
  ARM: mvebu: implement L2/PCIe deadlock workaround
  ARM: mvebu: use pci_ioremap_set_mem_type() to map PCI I/O as strongly
    ordered

 Documentation/devicetree/bindings/arm/l2cc.txt |  3 ++
 arch/arm/include/asm/io.h                      |  6 ++++
 arch/arm/mach-mvebu/coherency.c                | 40 ++++++++++++++++++++++++++
 arch/arm/mm/cache-l2x0.c                       | 13 +++++++++
 arch/arm/mm/ioremap.c                          |  9 +++++-
 5 files changed, 70 insertions(+), 1 deletion(-)

-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCHv5 1/4] ARM: mm: allow sub-architectures to override PCI I/O memory type
       [not found] ` <1400487234-4501-1-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
@ 2014-05-19  8:13   ` Thomas Petazzoni
  2014-05-19  8:13   ` [PATCHv5 2/4] ARM: mm: add support for HW coherent systems in PL310 Thomas Petazzoni
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Thomas Petazzoni @ 2014-05-19  8:13 UTC (permalink / raw)
  To: Russell King, Will Deacon, Catalin Marinas,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Grant Likely, Rob Herring,
	Arnd Bergmann
  Cc: Albin Tonnerre, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Jason Cooper, Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
	Tawfik Bayouk, Nadav Haklai, Lior Amsalem, Ezequiel Garcia,
	Thomas Petazzoni

Due to a design incompatibility between the PCIe Marvell controller
and the Cortex-A9, stressing PCIe devices with a lot of traffic
quickly causes a deadlock.

One part of the workaround for this is to have all PCIe regions mapped
as strongly-ordered (MT_UNCACHED) instead of the default
MT_DEVICE. While the arch_ioremap_caller() mechanism allows
sub-architecture code to override ioremap(), used to map PCIe memory
regions, there isn't such a mechanism to override the behavior of
pci_ioremap_io().

This commit adds the pci_ioremap_mem_type variable, initialized to
MT_DEVICE by default, which sub-architecture code can override by
calling pci_ioremap_set_mem_type(). We have chosen to expose a single
setter rather than offering the possibility of overriding the entire
pci_ioremap_io(), because implementing pci_ioremap_io() requires
calling functions (get_mem_type()) that are private to the
arch/arm/mm/ code.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
Acked-by: Catalin Marinas <catalin.marinas-5wv7dgnIgG8@public.gmane.org>
---
 arch/arm/include/asm/io.h | 6 ++++++
 arch/arm/mm/ioremap.c     | 9 ++++++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index 8aa4cca..3d23418 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -179,6 +179,12 @@ static inline void __iomem *__typesafe_io(unsigned long addr)
 /* PCI fixed i/o mapping */
 #define PCI_IO_VIRT_BASE	0xfee00000
 
+#if defined(CONFIG_PCI)
+void pci_ioremap_set_mem_type(int mem_type);
+#else
+static inline void pci_ioremap_set_mem_type(int mem_type) {}
+#endif
+
 extern int pci_ioremap_io(unsigned int offset, phys_addr_t phys_addr);
 
 /*
diff --git a/arch/arm/mm/ioremap.c b/arch/arm/mm/ioremap.c
index f9c32ba..d1e5ad7 100644
--- a/arch/arm/mm/ioremap.c
+++ b/arch/arm/mm/ioremap.c
@@ -438,6 +438,13 @@ void __arm_iounmap(volatile void __iomem *io_addr)
 EXPORT_SYMBOL(__arm_iounmap);
 
 #ifdef CONFIG_PCI
+static int pci_ioremap_mem_type = MT_DEVICE;
+
+void pci_ioremap_set_mem_type(int mem_type)
+{
+	pci_ioremap_mem_type = mem_type;
+}
+
 int pci_ioremap_io(unsigned int offset, phys_addr_t phys_addr)
 {
 	BUG_ON(offset + SZ_64K > IO_SPACE_LIMIT);
@@ -445,7 +452,7 @@ int pci_ioremap_io(unsigned int offset, phys_addr_t phys_addr)
 	return ioremap_page_range(PCI_IO_VIRT_BASE + offset,
 				  PCI_IO_VIRT_BASE + offset + SZ_64K,
 				  phys_addr,
-				  __pgprot(get_mem_type(MT_DEVICE)->prot_pte));
+				  __pgprot(get_mem_type(pci_ioremap_mem_type)->prot_pte));
 }
 EXPORT_SYMBOL_GPL(pci_ioremap_io);
 #endif
-- 
1.9.3
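
For readers who want to see the pattern in isolation, here is a
minimal user-space sketch of the "private variable plus setter"
approach used by this patch. It is not kernel code: the enum, the
protection values and every model_* name are made up for the
illustration; only the idea (a file-local memory type, defaulting to
the device type, that a setter can change before the mapping is
created) comes from the patch.

```c
#include <assert.h>

/* Hypothetical memory types; the kernel uses MT_DEVICE, MT_UNCACHED, ... */
enum model_mem_type {
	MODEL_MT_DEVICE,
	MODEL_MT_UNCACHED,
	MODEL_MT_COUNT,
};

/* Made-up protection bits, standing in for get_mem_type()->prot_pte. */
static const unsigned long model_prot_pte[MODEL_MT_COUNT] = {
	[MODEL_MT_DEVICE]   = 0x1,
	[MODEL_MT_UNCACHED] = 0x2,
};

/* File-local default, like pci_ioremap_mem_type in the patch. */
static int model_pci_ioremap_mem_type = MODEL_MT_DEVICE;

/* Setter exposed to platform code; rejects out-of-range types. */
void model_pci_ioremap_set_mem_type(int mem_type)
{
	assert(mem_type >= 0 && mem_type < MODEL_MT_COUNT);
	model_pci_ioremap_mem_type = mem_type;
}

/* Protection bits the I/O mapping would be created with. */
unsigned long model_pci_io_prot(void)
{
	return model_prot_pte[model_pci_ioremap_mem_type];
}
```

Platform code would call the setter once during early init, before
any PCI I/O window is mapped; mappings created earlier keep the type
that was in effect at the time.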


* [PATCHv5 2/4] ARM: mm: add support for HW coherent systems in PL310
       [not found] ` <1400487234-4501-1-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
  2014-05-19  8:13   ` [PATCHv5 1/4] ARM: mm: allow sub-architectures to override PCI I/O memory type Thomas Petazzoni
@ 2014-05-19  8:13   ` Thomas Petazzoni
       [not found]     ` <1400487234-4501-3-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
  2014-05-19  8:13   ` [PATCHv5 3/4] ARM: mvebu: implement L2/PCIe deadlock workaround Thomas Petazzoni
  2014-05-19  8:13   ` [PATCHv5 4/4] ARM: mvebu: use pci_ioremap_set_mem_type() to map PCI I/O as strongly ordered Thomas Petazzoni
  3 siblings, 1 reply; 10+ messages in thread
From: Thomas Petazzoni @ 2014-05-19  8:13 UTC (permalink / raw)
  To: Russell King, Will Deacon, Catalin Marinas,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Grant Likely, Rob Herring,
	Arnd Bergmann
  Cc: Albin Tonnerre, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Jason Cooper, Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
	Tawfik Bayouk, Nadav Haklai, Lior Amsalem, Ezequiel Garcia,
	Thomas Petazzoni

When a PL310 cache is used on a system that provides hardware
coherency, the outer cache sync operation is useless, and can be
skipped. Moreover, on some systems, it is harmful as it causes
deadlocks between the Marvell coherency mechanism, the Marvell PCIe
controller and the Cortex-A9.

To avoid this, this commit introduces a new Device Tree property
'arm,io-coherent' for the L2 cache controller node, valid only for the
PL310 cache. It identifies the usage of the PL310 cache in an I/O
coherent configuration. Internally, it makes the driver disable the
outer cache sync operation.

Note that technically speaking, a fully coherent system wouldn't
require any of the other .outer_cache operations. However, in
practice, when booting secondary CPUs, these are not yet coherent, and
therefore a set of cache maintenance operations are necessary at this
point. This explains why we keep the other .outer_cache operations and
only ->sync is disabled.

While in theory any write to a PL310 register could cause the
deadlock, in practice, disabling ->sync is sufficient to work around
the deadlock, since the other cache maintenance operations are only
used in very specific situations.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
---
 Documentation/devicetree/bindings/arm/l2cc.txt |  3 +++
 arch/arm/mm/cache-l2x0.c                       | 13 +++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/l2cc.txt b/Documentation/devicetree/bindings/arm/l2cc.txt
index b513cb8..af527ee 100644
--- a/Documentation/devicetree/bindings/arm/l2cc.txt
+++ b/Documentation/devicetree/bindings/arm/l2cc.txt
@@ -40,6 +40,9 @@ Optional properties:
 - arm,filter-ranges : <start length> Starting address and length of window to
   filter. Addresses in the filter window are directed to the M1 port. Other
   addresses will go to the M0 port.
+- arm,io-coherent : indicates that the system is operating in a hardware
+  I/O coherent mode. Valid only when the arm,pl310-cache compatible
+  string is used.
 - interrupts : 1 combined interrupt.
 - cache-id-part: cache id part number to be used if it is not present
   on hardware
diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
index 7abde2ce..30f4476 100644
--- a/arch/arm/mm/cache-l2x0.c
+++ b/arch/arm/mm/cache-l2x0.c
@@ -1005,6 +1005,19 @@ int __init l2x0_of_init(u32 aux_val, u32 aux_mask)
 
 	of_init = true;
 	memcpy(&outer_cache, &data->outer_cache, sizeof(outer_cache));
+
+	/*
+	 * outer sync operations are not needed when the system is I/O
+	 * coherent, and potentially harmful in certain situations
+	 * (PCIe/PL310 deadlock on Armada 375/38x due to hardware I/O
+	 * coherency). The other operations are kept because they are
+	 * infrequent (therefore do not cause the deadlock) and needed
+	 * for secondary CPU boot and other power management
+	 * activities.
+	 */
+	if (of_property_read_bool(np, "arm,io-coherent"))
+		outer_cache.sync = NULL;
+
 	l2x0_init(l2x0_base, aux_val, aux_mask);
 
 	return 0;
-- 
1.9.3
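
The effect of clearing ->sync can be sketched outside the kernel as
follows. This is a simplified model, not the driver: the structure,
the counter and the model_* functions are hypothetical, and it
assumes (as the patch relies on) that callers test the ->sync pointer
before calling through it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced ops structure with only the sync hook. */
struct model_outer_cache {
	void (*sync)(void);
};

/* Counts how many times the (modelled) PL310 sync register is written. */
static int model_sync_writes;

static void model_pl310_sync(void)
{
	model_sync_writes++;
}

static struct model_outer_cache model_cache = {
	.sync = model_pl310_sync,
};

/* Caller-side helper: a NULL hook turns the operation into a no-op. */
void model_outer_sync(void)
{
	if (model_cache.sync)
		model_cache.sync();
}

/* Models the 'arm,io-coherent' check: drop the sync hook if present. */
void model_apply_io_coherent(int has_property)
{
	if (has_property)
		model_cache.sync = NULL;
}
```

Once the property has been applied, model_outer_sync() no longer
touches the controller, while any other hooks (not modelled here)
would be left in place.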


* [PATCHv5 3/4] ARM: mvebu: implement L2/PCIe deadlock workaround
       [not found] ` <1400487234-4501-1-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
  2014-05-19  8:13   ` [PATCHv5 1/4] ARM: mm: allow sub-architectures to override PCI I/O memory type Thomas Petazzoni
  2014-05-19  8:13   ` [PATCHv5 2/4] ARM: mm: add support for HW coherent systems in PL310 Thomas Petazzoni
@ 2014-05-19  8:13   ` Thomas Petazzoni
       [not found]     ` <1400487234-4501-4-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
  2014-05-19  8:13   ` [PATCHv5 4/4] ARM: mvebu: use pci_ioremap_set_mem_type() to map PCI I/O as strongly ordered Thomas Petazzoni
  3 siblings, 1 reply; 10+ messages in thread
From: Thomas Petazzoni @ 2014-05-19  8:13 UTC (permalink / raw)
  To: Russell King, Will Deacon, Catalin Marinas,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Grant Likely, Rob Herring,
	Arnd Bergmann
  Cc: Albin Tonnerre, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Jason Cooper, Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
	Tawfik Bayouk, Nadav Haklai, Lior Amsalem, Ezequiel Garcia,
	Thomas Petazzoni

The Marvell Armada 375 and Armada 38x SOCs, which use the Cortex-A9
CPU core, the PL310 cache and the Marvell PCIe hardware block, are
affected by an L2/PCIe deadlock caused by a system erratum when
hardware I/O coherency is used.

This deadlock can be avoided by mapping the PCIe memory areas as
strongly-ordered (note: MT_UNCACHED is strongly-ordered), and by
removing the outer cache sync done in software. This is implemented in
this patch by:

 * Registering a custom arch_ioremap_caller function that ensures
   PCI memory regions are mapped MT_UNCACHED.

 * Adding at runtime the 'arm,io-coherent' property to the PL310 cache
   controller. This cannot be done permanently in the DT, because the
   hardware I/O coherency can only be enabled when CONFIG_SMP is
   enabled, in the current kernel situation.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
---
 arch/arm/mach-mvebu/coherency.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/arch/arm/mach-mvebu/coherency.c b/arch/arm/mach-mvebu/coherency.c
index d5a975b..f6be9c6 100644
--- a/arch/arm/mach-mvebu/coherency.c
+++ b/arch/arm/mach-mvebu/coherency.c
@@ -31,6 +31,7 @@
 #include <linux/clk.h>
 #include <asm/smp_plat.h>
 #include <asm/cacheflush.h>
+#include <asm/mach/map.h>
 #include "armada-370-xp.h"
 #include "coherency.h"
 #include "mvebu-soc-id.h"
@@ -308,9 +309,47 @@ static void __init armada_370_coherency_init(struct device_node *np)
 	set_cpu_coherent();
 }
 
+/*
+ * This ioremap hook is used on Armada 375/38x to ensure that PCIe
+ * memory areas are mapped as MT_UNCACHED instead of MT_DEVICE. This
+ * is needed as a workaround for a deadlock issue between the PCIe
+ * interface and the cache controller.
+ */
+static void __iomem *
+armada_pcie_wa_ioremap_caller(phys_addr_t phys_addr, size_t size,
+			      unsigned int mtype, void *caller)
+{
+	struct resource pcie_mem;
+
+	mvebu_mbus_get_pcie_mem_aperture(&pcie_mem);
+
+	if (pcie_mem.start <= phys_addr && (phys_addr + size) <= pcie_mem.end)
+		mtype = MT_UNCACHED;
+
+	return __arm_ioremap_caller(phys_addr, size, mtype, caller);
+}
+
 static void __init armada_375_380_coherency_init(struct device_node *np)
 {
+	struct device_node *cache_dn;
+
 	coherency_cpu_base = of_iomap(np, 0);
+	arch_ioremap_caller = armada_pcie_wa_ioremap_caller;
+
+	/*
+	 * Add the PL310 property "arm,io-coherent". This makes sure the
+	 * outer sync operation is not used, which allows to
+	 * workaround the system erratum that causes deadlocks when
+	 * doing PCIe in an SMP situation on Armada 375 and Armada
+	 * 38x.
+	 */
+	for_each_compatible_node(cache_dn, NULL, "arm,pl310-cache") {
+		struct property *p;
+
+		p = kzalloc(sizeof(*p), GFP_KERNEL);
+		p->name = kstrdup("arm,io-coherent", GFP_KERNEL);
+		of_add_property(cache_dn, p);
+	}
 }
 
 static int coherency_type(void)
-- 
1.9.3
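
The aperture test in the ioremap hook can be looked at in isolation
with the user-space sketch below. It is only a model: struct
model_resource and model_range_in_aperture() are hypothetical names,
and the sketch assumes that, as with the kernel's struct resource,
'end' is the last valid address of the range (inclusive). Under that
assumption, a comparison of the form "phys_addr + size <= end" would
not match a mapping that finishes exactly on the last byte of the
aperture, so the sketch compares against the inclusive end and also
avoids overflow in the addition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct resource; 'end' is inclusive. */
struct model_resource {
	uint64_t start;
	uint64_t end;
};

/*
 * Return true when [phys, phys + size - 1] lies entirely inside
 * [res->start, res->end]. Written without computing phys + size so
 * that it cannot overflow.
 */
bool model_range_in_aperture(const struct model_resource *res,
			     uint64_t phys, uint64_t size)
{
	if (size == 0)
		return false;
	if (phys < res->start || phys > res->end)
		return false;
	/* Bytes available from phys up to and including res->end. */
	return size - 1 <= res->end - phys;
}
```

The hook would then select the strongly-ordered type only when this
predicate holds, and leave the requested type untouched otherwise.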


* [PATCHv5 4/4] ARM: mvebu: use pci_ioremap_set_mem_type() to map PCI I/O as strongly ordered
       [not found] ` <1400487234-4501-1-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
                     ` (2 preceding siblings ...)
  2014-05-19  8:13   ` [PATCHv5 3/4] ARM: mvebu: implement L2/PCIe deadlock workaround Thomas Petazzoni
@ 2014-05-19  8:13   ` Thomas Petazzoni
       [not found]     ` <1400487234-4501-5-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
  3 siblings, 1 reply; 10+ messages in thread
From: Thomas Petazzoni @ 2014-05-19  8:13 UTC (permalink / raw)
  To: Russell King, Will Deacon, Catalin Marinas,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Grant Likely, Rob Herring,
	Arnd Bergmann
  Cc: Albin Tonnerre, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Jason Cooper, Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
	Tawfik Bayouk, Nadav Haklai, Lior Amsalem, Ezequiel Garcia,
	Thomas Petazzoni

Part of the workaround for the PCIe/SMP/PL310 deadlock on Armada
375/38x is to map PCI regions as strongly ordered. Mapping PCI memory
regions as strongly ordered was already done thanks to the
arch_ioremap_caller mechanism. This patch does the same for the PCI
I/O regions by using the new pci_ioremap_set_mem_type() function.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
---
This patch is kept separate from the rest of the
mach-mvebu/coherency.c code for the workaround, as this patch has a
build dependency on the new API. Since the new API patch will go
through Russell's tree, and this patch through the mvebu tree, there
might be some merging issue, or even the need to delay the merging of
this patch.

 arch/arm/mach-mvebu/coherency.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/mach-mvebu/coherency.c b/arch/arm/mach-mvebu/coherency.c
index f6be9c6..0215614 100644
--- a/arch/arm/mach-mvebu/coherency.c
+++ b/arch/arm/mach-mvebu/coherency.c
@@ -335,6 +335,7 @@ static void __init armada_375_380_coherency_init(struct device_node *np)
 
 	coherency_cpu_base = of_iomap(np, 0);
 	arch_ioremap_caller = armada_pcie_wa_ioremap_caller;
+	pci_ioremap_set_mem_type(MT_UNCACHED);
 
 	/*
 	 * Add the PL310 property "arm,io-coherent". This makes sure the
-- 
1.9.3


* Re: [PATCHv5 2/4] ARM: mm: add support for HW coherent systems in PL310
       [not found]     ` <1400487234-4501-3-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
@ 2014-05-19  9:37       ` Catalin Marinas
  0 siblings, 0 replies; 10+ messages in thread
From: Catalin Marinas @ 2014-05-19  9:37 UTC (permalink / raw)
  To: Thomas Petazzoni
  Cc: Russell King, Will Deacon,
	devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Grant Likely,
	Rob Herring, Arnd Bergmann, Albin Tonnerre,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
	Jason Cooper, Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
	Tawfik Bayouk, Nadav Haklai, Lior Amsalem, Ezequiel Garcia

On Mon, May 19, 2014 at 09:13:52AM +0100, Thomas Petazzoni wrote:
> When a PL310 cache is used on a system that provides hardware
> coherency, the outer cache sync operation is useless, and can be
> skipped. Moreover, on some systems, it is harmful as it causes
> deadlocks between the Marvell coherency mechanism, the Marvell PCIe
> controller and the Cortex-A9.
> 
> To avoid this, this commit introduces a new Device Tree property
> 'arm,io-coherent' for the L2 cache controller node, valid only for the
> PL310 cache. It identifies the usage of the PL310 cache in an I/O
> coherent configuration. Internally, it makes the driver disable the
> outer cache sync operation.
> 
> Note that technically speaking, a fully coherent system wouldn't
> require any of the other .outer_cache operations. However, in
> practice, when booting secondary CPUs, these are not yet coherent, and
> therefore a set of cache maintenance operations are necessary at this
> point. This explains why we keep the other .outer_cache operations and
> only ->sync is disabled.
> 
> While in theory any write to a PL310 register could cause the
> deadlock, in practice, disabling ->sync is sufficient to work around
> the deadlock, since the other cache maintenance operations are only
> used in very specific situations.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>

Acked-by: Catalin Marinas <catalin.marinas-5wv7dgnIgG8@public.gmane.org>

* Re: [PATCHv5 4/4] ARM: mvebu: use pci_ioremap_set_mem_type() to map PCI I/O as strongly ordered
       [not found]     ` <1400487234-4501-5-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
@ 2014-05-19  9:59       ` Catalin Marinas
       [not found]         ` <20140519095949.GD5113-5wv7dgnIgG8@public.gmane.org>
  2014-05-19 10:09       ` Catalin Marinas
  1 sibling, 1 reply; 10+ messages in thread
From: Catalin Marinas @ 2014-05-19  9:59 UTC (permalink / raw)
  To: Thomas Petazzoni
  Cc: Russell King, Will Deacon,
	devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Grant Likely,
	Rob Herring, Arnd Bergmann, Albin Tonnerre,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
	Jason Cooper, Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
	Tawfik Bayouk, Nadav Haklai, Lior Amsalem, Ezequiel Garcia

On Mon, May 19, 2014 at 09:13:54AM +0100, Thomas Petazzoni wrote:
> Part of the workaround for the PCIe/SMP/PL310 deadlock on Armada
> 375/38x is to map PCI mappings strongly ordered. Mapping PCI memory
> regions as strongly ordered was already done thanks to the
> arch_ioremap_caller mechanism. This patch does the same for the PCI
> I/O regions by using the new pci_ioremap_set_mem_type() function.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
> ---
>  arch/arm/mach-mvebu/coherency.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm/mach-mvebu/coherency.c b/arch/arm/mach-mvebu/coherency.c
> index f6be9c6..0215614 100644
> --- a/arch/arm/mach-mvebu/coherency.c
> +++ b/arch/arm/mach-mvebu/coherency.c
> @@ -335,6 +335,7 @@ static void __init armada_375_380_coherency_init(struct device_node *np)
>  
>  	coherency_cpu_base = of_iomap(np, 0);
>  	arch_ioremap_caller = armada_pcie_wa_ioremap_caller;
> +	pci_ioremap_set_mem_type(MT_UNCACHED);

The patch is fine but for this to work in the UP case we need to fix
MT_UNCACHED definition for sections. It seems to create SO memory but
not necessarily writable (unless I miss something). Anyway, untested,
something like this:

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index b68c6b22e1c8..db1bf8cb3a3e 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -267,7 +267,7 @@ static struct mem_type mem_types[] = {
 	[MT_UNCACHED] = {
 		.prot_pte	= PROT_PTE_DEVICE,
 		.prot_l1	= PMD_TYPE_TABLE,
-		.prot_sect	= PMD_TYPE_SECT | PMD_SECT_XN,
+		.prot_sect	= PROT_SECT_DEVICE,
 		.domain		= DOMAIN_IO,
 	},
 	[MT_CACHECLEAN] = {
@@ -461,6 +461,7 @@ static void __init build_mem_type_table(void)
 			mem_types[MT_DEVICE_NONSHARED].prot_sect |= PMD_SECT_XN;
 			mem_types[MT_DEVICE_CACHED].prot_sect |= PMD_SECT_XN;
 			mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_XN;
+			mem_types[MT_UNCACHED].prot_sect |= PMD_SECT_XN;
 
 			/* Also setup NX memory mapping */
 			mem_types[MT_MEMORY_RW].prot_sect |= PMD_SECT_XN;

-- 
Catalin

* Re: [PATCHv5 3/4] ARM: mvebu: implement L2/PCIe deadlock workaround
       [not found]     ` <1400487234-4501-4-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
@ 2014-05-19 10:08       ` Catalin Marinas
  0 siblings, 0 replies; 10+ messages in thread
From: Catalin Marinas @ 2014-05-19 10:08 UTC (permalink / raw)
  To: Thomas Petazzoni
  Cc: Russell King, Will Deacon,
	devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Grant Likely,
	Rob Herring, Arnd Bergmann, Albin Tonnerre,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
	Jason Cooper, Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
	Tawfik Bayouk, Nadav Haklai, Lior Amsalem, Ezequiel Garcia

On Mon, May 19, 2014 at 09:13:53AM +0100, Thomas Petazzoni wrote:
> The Marvell Armada 375 and Armada 38x SOCs, which use the Cortex-A9
> CPU core, the PL310 cache and the Marvell PCIe hardware block, are
> affected by an L2/PCIe deadlock caused by a system erratum when
> hardware I/O coherency is used.
> 
> This deadlock can be avoided by mapping the PCIe memory areas as
> strongly-ordered (note: MT_UNCACHED is strongly-ordered), and by
> removing the outer cache sync done in software. This is implemented in
> this patch by:
> 
>  * Registering a custom arch_ioremap_caller function that ensures
>    PCI memory regions are mapped MT_UNCACHED.
> 
>  * Adding at runtime the 'arm,io-coherent' property to the PL310 cache
>    controller. This cannot be done permanently in the DT, because the
>    hardware I/O coherency can only be enabled when CONFIG_SMP is
>    enabled, in the current kernel situation.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>

Acked-by: Catalin Marinas <catalin.marinas-5wv7dgnIgG8@public.gmane.org>

* Re: [PATCHv5 4/4] ARM: mvebu: use pci_ioremap_set_mem_type() to map PCI I/O as strongly ordered
       [not found]     ` <1400487234-4501-5-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
  2014-05-19  9:59       ` Catalin Marinas
@ 2014-05-19 10:09       ` Catalin Marinas
  1 sibling, 0 replies; 10+ messages in thread
From: Catalin Marinas @ 2014-05-19 10:09 UTC (permalink / raw)
  To: Thomas Petazzoni
  Cc: Russell King, Will Deacon,
	devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Grant Likely,
	Rob Herring, Arnd Bergmann, Albin Tonnerre,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
	Jason Cooper, Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
	Tawfik Bayouk, Nadav Haklai, Lior Amsalem, Ezequiel Garcia

On Mon, May 19, 2014 at 09:13:54AM +0100, Thomas Petazzoni wrote:
> Part of the workaround for the PCIe/SMP/PL310 deadlock on Armada
> 375/38x is to map PCI mappings strongly ordered. Mapping PCI memory
> regions as strongly ordered was already done thanks to the
> arch_ioremap_caller mechanism. This patch does the same for the PCI
> I/O regions by using the new pci_ioremap_set_mem_type() function.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>

Acked-by: Catalin Marinas <catalin.marinas-5wv7dgnIgG8@public.gmane.org>

* Re: [PATCHv5 4/4] ARM: mvebu: use pci_ioremap_set_mem_type() to map PCI I/O as strongly ordered
       [not found]         ` <20140519095949.GD5113-5wv7dgnIgG8@public.gmane.org>
@ 2014-05-19 11:41           ` Thomas Petazzoni
  0 siblings, 0 replies; 10+ messages in thread
From: Thomas Petazzoni @ 2014-05-19 11:41 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Lior Amsalem, devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Russell King, Jason Cooper, Arnd Bergmann, Andrew Lunn,
	Will Deacon, Grant Likely, Gregory Clement, Nadav Haklai,
	Rob Herring, Ezequiel Garcia, Albin Tonnerre, Tawfik Bayouk,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
	Sebastian Hesselbarth

Dear Catalin Marinas,

On Mon, 19 May 2014 10:59:49 +0100, Catalin Marinas wrote:

> The patch is fine but for this to work in the UP case we need to fix
> MT_UNCACHED definition for sections. It seems to create SO memory but
> not necessarily writable (unless I miss something). Anyway, untested,
> something like this:

I just tested in !CONFIG_SMP on Armada 38x, and writing to a PCI I/O
mapping works in both the MT_UNCACHED and MT_DEVICE cases.

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
