From: will.deacon@arm.com (Will Deacon)
To: linux-arm-kernel@lists.infradead.org
Subject: Clarifying dma_wmb behavior in presence of non-coherent masters and outer caches
Date: Fri, 6 Jul 2018 13:26:11 +0100 [thread overview]
Message-ID: <20180706122610.GA6993@arm.com> (raw)
In-Reply-To: <20180702174552.GD23687@arm.com>
Hi Lucas,
On Mon, Jul 02, 2018 at 06:45:52PM +0100, Will Deacon wrote:
> On Mon, Jul 02, 2018 at 03:49:26PM +0200, Lucas Stach wrote:
> > Also the section "External caches", "Before AMBA4 ACE" seems to suggest
> > that the barriers aren't fully propagated to the PL310 write-caches and
> > master ports. I'm unsure if this is just an artifact of the mentioned
> > MMIO access, so handling of normal vs. device memory transactions in
> > the PL310.
>
> I'll check with the hardware folks about the PL310, but I don't see how
> you could propagate barriers over AXI3 anyway because I don't think this
> stuff existed back then.
Damn. As I feared, you can't propagate the barrier to the PL310, so the
A9 is unable to enforce ordering beyond the responses it gets back. The
PL310 will then re-order normal memory accesses (including non-cacheable)
in its store buffer, so we do actually need a bloody CACHE SYNC here.
Patch below. Does it help?
Will
--->8
>From 5a2769b45e94b212d21bc44f5940ed5f3bde5cc2 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Wed, 4 Jul 2018 10:56:22 +0100
Subject: [PATCH] ARM: barrier: Kick outer-cache and SoC fabric in dma_wmb()
Prior to AMBA4-ACE, the AXI bus protocol had no support for barrier
transactions or shareability domains and was therefore unable to enforce
ordering between memory accesses either side of a barrier instruction,
requiring the CPU to handle all of the ordering itself.
Unfortunately, IP such as the venerable PL310 will re-order stores to
Normal memory inside its store-buffer, meaning that accesses to a coherent
DMA buffer (i.e. one allocated by dma_alloc_coherent()) can be observed
out-of-order by a non-coherent DMA engine unless the PL310 is kicked with
a CACHE_SYNC command between buffering the accesses. Whilst this is already
the case for mb() and wmb(), dma_wmb() just expands to a DMB OSHST, leading
to sequences such as:
STR Rdata, [Rbuf, #DATA_OFFSET]
DMB OSHST
STR Rvalid, [Rbuf, #VALID_OFFSET]
being reordered such that the DMA device sees the write of valid but not
the write of data.
This patch ensures that the SoC fabric and the outer-cache are kicked
by dma_wmb() if they are able to buffer transactions outside of the CPU
but before the Point of Coherency.
Reported-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm/include/asm/barrier.h | 14 ++++++++++----
arch/arm/include/asm/outercache.h | 6 ++++++
arch/arm/mm/flush.c | 24 +++++++++++++++++++++++-
3 files changed, 39 insertions(+), 5 deletions(-)
diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 69772e742a0a..e31833f72b69 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -53,17 +53,23 @@
#ifdef CONFIG_ARM_HEAVY_MB
extern void (*soc_mb)(void);
extern void arm_heavy_mb(void);
-#define __arm_heavy_mb(x...) do { dsb(x); arm_heavy_mb(); } while (0)
+extern void arm_heavy_wmb(void);
+extern void arm_heavy_dma_wmb(void);
+#define __arm_heavy_mb() arm_heavy_mb()
+#define __arm_heavy_wmb() arm_heavy_wmb()
+#define __arm_heavy_dma_wmb() arm_heavy_dma_wmb()
#else
-#define __arm_heavy_mb(x...) dsb(x)
+#define __arm_heavy_mb() dsb()
+#define __arm_heavy_wmb() dsb(st)
+#define __arm_heavy_dma_wmb() dmb(oshst)
#endif
#if defined(CONFIG_ARM_DMA_MEM_BUFFERABLE) || defined(CONFIG_SMP)
#define mb() __arm_heavy_mb()
#define rmb() dsb()
-#define wmb() __arm_heavy_mb(st)
+#define wmb() __arm_heavy_wmb()
#define dma_rmb() dmb(osh)
-#define dma_wmb() dmb(oshst)
+#define dma_wmb() __arm_heavy_dma_wmb()
#else
#define mb() barrier()
#define rmb() barrier()
diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
index c2bf24f40177..e744efe6e2a7 100644
--- a/arch/arm/include/asm/outercache.h
+++ b/arch/arm/include/asm/outercache.h
@@ -43,6 +43,12 @@ struct outer_cache_fns {
extern struct outer_cache_fns outer_cache;
+#ifdef CONFIG_OUTER_CACHE_SYNC
+static inline bool outer_cache_has_sync(void) { return outer_cache.sync; }
+#else
+static inline bool outer_cache_has_sync(void) { return 0; }
+#endif
+
#ifdef CONFIG_OUTER_CACHE
/**
* outer_inv_range - invalidate range of outer cache lines
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 58469623b015..1252689362d8 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -24,7 +24,7 @@
#ifdef CONFIG_ARM_HEAVY_MB
void (*soc_mb)(void);
-void arm_heavy_mb(void)
+static void __arm_heavy_sync(void)
{
#ifdef CONFIG_OUTER_CACHE_SYNC
if (outer_cache.sync)
@@ -33,7 +33,29 @@ void arm_heavy_mb(void)
if (soc_mb)
soc_mb();
}
+
+void arm_heavy_mb(void)
+{
+ dsb();
+ __arm_heavy_sync();
+}
EXPORT_SYMBOL(arm_heavy_mb);
+
+void arm_heavy_wmb(void)
+{
+ dsb(st);
+ __arm_heavy_sync();
+}
+EXPORT_SYMBOL(arm_heavy_wmb);
+
+void arm_heavy_dma_wmb(void)
+{
+ if (outer_cache_has_sync() || soc_mb)
+ arm_heavy_wmb();
+ else
+ dmb(oshst);
+}
+EXPORT_SYMBOL(arm_heavy_dma_wmb);
#endif
#ifdef CONFIG_CPU_CACHE_VIPT
--
2.1.4
next prev parent reply other threads:[~2018-07-06 12:26 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-29 12:28 Clarifying dma_wmb behavior in presence of non-coherent masters and outer caches Lucas Stach
2018-06-29 14:25 ` Russell King - ARM Linux
2018-06-29 16:19 ` Lucas Stach
2018-06-29 16:22 ` Will Deacon
2018-06-29 16:48 ` Russell King - ARM Linux
2018-06-29 17:43 ` Will Deacon
2018-06-29 18:01 ` Russell King - ARM Linux
2018-07-02 13:49 ` Lucas Stach
2018-07-02 17:45 ` Will Deacon
2018-07-06 12:26 ` Will Deacon [this message]
2018-07-09 6:20 ` Oleksij Rempel
2018-09-13 13:17 ` Will Deacon
2018-09-13 14:09 ` Oleksij Rempel
2018-07-09 9:45 ` Lucas Stach
2018-06-29 17:14 ` Lucas Stach
2018-06-29 17:46 ` Will Deacon
2018-07-02 9:58 ` Lucas Stach
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180706122610.GA6993@arm.com \
--to=will.deacon@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.