linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: will.deacon@arm.com (Will Deacon)
To: linux-arm-kernel@lists.infradead.org
Subject: Clarifying dma_wmb behavior in presence of non-coherent masters and outer caches
Date: Thu, 13 Sep 2018 14:17:12 +0100	[thread overview]
Message-ID: <20180913131711.GB13242@arm.com> (raw)
In-Reply-To: <72f67412-a011-6552-03be-6730a9302ebb@pengutronix.de>

On Mon, Jul 09, 2018 at 08:20:22AM +0200, Oleksij Rempel wrote:
> On 06.07.2018 14:26, Will Deacon wrote:
> > On Mon, Jul 02, 2018 at 06:45:52PM +0100, Will Deacon wrote:
> >> On Mon, Jul 02, 2018 at 03:49:26PM +0200, Lucas Stach wrote:
> >>> Also the section "External caches", "Before AMBA4 ACE" seems to suggest
> >>> that the barriers aren't fully propagated to the PL310 write-caches and
> >>> master ports. I'm unsure if this is just an artifact of the mentioned
> >>> MMIO access, so handling of normal vs. device memory transactions in
> >>> the PL310.
> >>
> >> I'll check with the hardware folks about the PL310, but I don't see how
> >> you could propagate barriers over AXI3 anyway because I don't think this
> >> stuff existed back then.
> > 
> > Damn. As I feared, you can't propagate the barrier to the PL310, so the
> > A9 is unable to enforce ordering beyond the responses it gets back. The
> > PL310 will then re-order normal memory accesses (including non-cacheable)
> > in its store buffer, so we do actually need a bloody CACHE SYNC here.
> > 
> > Patch below. Does it help?
> 
> Suddenly no.

I just noticed that I'm still carrying this patch in my tree. Are you saying
it doesn't help on your system? It all went quiet, so I assume you fixed
this issue one way or the other?

Will

> > From 5a2769b45e94b212d21bc44f5940ed5f3bde5cc2 Mon Sep 17 00:00:00 2001
> > From: Will Deacon <will.deacon@arm.com>
> > Date: Wed, 4 Jul 2018 10:56:22 +0100
> > Subject: [PATCH] ARM: barrier: Kick outer-cache and SoC fabric in dma_wmb()
> > 
> > Prior to AMBA4-ACE, the AXI bus protocol had no support for barrier
> > transactions or shareability domains and was therefore unable to enforce
> > ordering between memory accesses either side of a barrier instruction,
> > requiring the CPU to handle all of the ordering itself.
> > 
> > Unfortunately, IP such as the venerable PL310 will re-order stores to
> > Normal memory inside its store-buffer, meaning that accesses to a coherent
> > DMA buffer (i.e. one allocated by dma_alloc_coherent()) can be observed
> > out-of-order by a non-coherent DMA engine unless the PL310 is kicked with
> > a CACHE_SYNC command between buffering the accesses. Whilst this is already
> > the case for mb() and wmb(), dma_wmb() just expands to a DMB OSHST, leading
> > to sequences such as:
> > 
> > 	STR	Rdata, [Rbuf, #DATA_OFFSET]
> > 	DMB	OSHST
> > 	STR	Rvalid, [Rbuf, #VALID_OFFSET]
> > 
> > being reordered such that the DMA device sees the write of valid but not
> > the write of data.
> > 
> > This patch ensures that the SoC fabric and the outer-cache are kicked
> > by dma_wmb() if they are able to buffer transactions outside of the CPU
> > but before the Point of Coherency.
> > 
> > Reported-by: Lucas Stach <l.stach@pengutronix.de>
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > ---
> >  arch/arm/include/asm/barrier.h    | 14 ++++++++++----
> >  arch/arm/include/asm/outercache.h |  6 ++++++
> >  arch/arm/mm/flush.c               | 24 +++++++++++++++++++++++-
> >  3 files changed, 39 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
> > index 69772e742a0a..e31833f72b69 100644
> > --- a/arch/arm/include/asm/barrier.h
> > +++ b/arch/arm/include/asm/barrier.h
> > @@ -53,17 +53,23 @@
> >  #ifdef CONFIG_ARM_HEAVY_MB
> >  extern void (*soc_mb)(void);
> >  extern void arm_heavy_mb(void);
> > -#define __arm_heavy_mb(x...) do { dsb(x); arm_heavy_mb(); } while (0)
> > +extern void arm_heavy_wmb(void);
> > +extern void arm_heavy_dma_wmb(void);
> > +#define __arm_heavy_mb()	arm_heavy_mb()
> > +#define __arm_heavy_wmb()	arm_heavy_wmb()
> > +#define __arm_heavy_dma_wmb()	arm_heavy_dma_wmb()
> >  #else
> > -#define __arm_heavy_mb(x...) dsb(x)
> > +#define __arm_heavy_mb()	dsb()
> > +#define __arm_heavy_wmb()	dsb(st)
> > +#define __arm_heavy_dma_wmb()	dmb(oshst)
> >  #endif
> >  
> >  #if defined(CONFIG_ARM_DMA_MEM_BUFFERABLE) || defined(CONFIG_SMP)
> >  #define mb()		__arm_heavy_mb()
> >  #define rmb()		dsb()
> > -#define wmb()		__arm_heavy_mb(st)
> > +#define wmb()		__arm_heavy_wmb()
> >  #define dma_rmb()	dmb(osh)
> > -#define dma_wmb()	dmb(oshst)
> > +#define dma_wmb()	__arm_heavy_dma_wmb()
> >  #else
> >  #define mb()		barrier()
> >  #define rmb()		barrier()
> > diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
> > index c2bf24f40177..e744efe6e2a7 100644
> > --- a/arch/arm/include/asm/outercache.h
> > +++ b/arch/arm/include/asm/outercache.h
> > @@ -43,6 +43,12 @@ struct outer_cache_fns {
> >  
> >  extern struct outer_cache_fns outer_cache;
> >  
> > +#ifdef CONFIG_OUTER_CACHE_SYNC
> > +static inline bool outer_cache_has_sync(void) { return outer_cache.sync; }
> > +#else
> > +static inline bool outer_cache_has_sync(void) { return 0; }
> > +#endif
> > +
> >  #ifdef CONFIG_OUTER_CACHE
> >  /**
> >   * outer_inv_range - invalidate range of outer cache lines
> > diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
> > index 58469623b015..1252689362d8 100644
> > --- a/arch/arm/mm/flush.c
> > +++ b/arch/arm/mm/flush.c
> > @@ -24,7 +24,7 @@
> >  #ifdef CONFIG_ARM_HEAVY_MB
> >  void (*soc_mb)(void);
> >  
> > -void arm_heavy_mb(void)
> > +static void __arm_heavy_sync(void)
> >  {
> >  #ifdef CONFIG_OUTER_CACHE_SYNC
> >  	if (outer_cache.sync)
> > @@ -33,7 +33,29 @@ void arm_heavy_mb(void)
> >  	if (soc_mb)
> >  		soc_mb();
> >  }
> > +
> > +void arm_heavy_mb(void)
> > +{
> > +	dsb();
> > +	__arm_heavy_sync();
> > +}
> >  EXPORT_SYMBOL(arm_heavy_mb);
> > +
> > +void arm_heavy_wmb(void)
> > +{
> > +	dsb(st);
> > +	__arm_heavy_sync();
> > +}
> > +EXPORT_SYMBOL(arm_heavy_wmb);
> > +
> > +void arm_heavy_dma_wmb(void)
> > +{
> > +	if (outer_cache_has_sync() || soc_mb)
> > +		arm_heavy_wmb();
> > +	else
> > +		dmb(oshst);
> > +}
> > +EXPORT_SYMBOL(arm_heavy_dma_wmb);
> >  #endif
> >  
> >  #ifdef CONFIG_CPU_CACHE_VIPT
> > 
> 

  reply	other threads:[~2018-09-13 13:17 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-29 12:28 Clarifying dma_wmb behavior in presence of non-coherent masters and outer caches Lucas Stach
2018-06-29 14:25 ` Russell King - ARM Linux
2018-06-29 16:19   ` Lucas Stach
2018-06-29 16:22   ` Will Deacon
2018-06-29 16:48     ` Russell King - ARM Linux
2018-06-29 17:43       ` Will Deacon
2018-06-29 18:01         ` Russell King - ARM Linux
2018-07-02 13:49         ` Lucas Stach
2018-07-02 17:45           ` Will Deacon
2018-07-06 12:26             ` Will Deacon
2018-07-09  6:20               ` Oleksij Rempel
2018-09-13 13:17                 ` Will Deacon [this message]
2018-09-13 14:09                   ` Oleksij Rempel
2018-07-09  9:45               ` Lucas Stach
2018-06-29 17:14     ` Lucas Stach
2018-06-29 17:46       ` Will Deacon
2018-07-02  9:58         ` Lucas Stach

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180913131711.GB13242@arm.com \
    --to=will.deacon@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).