From mboxrd@z Thu Jan 1 00:00:00 1970
From: Will Deacon
Subject: Re: RFC on writel and writel_relaxed
Date: Tue, 27 Mar 2018 12:24:57 +0100
Message-ID: <20180327112456.GA15531@arm.com>
References: <20180326202545.GB15554@ziepe.ca>
 <20180326210951.GD15554@ziepe.ca>
 <1522101717.7364.14.camel@kernel.crashing.org>
 <20180326222756.GJ15554@ziepe.ca>
 <20180327094159.GA29373@arm.com>
 <1522149602.7364.44.camel@kernel.crashing.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1522149602.7364.44.camel@kernel.crashing.org>
Errors-To: linuxppc-dev-bounces+glppe-linuxppc-embedded-2=m.gmane.org@lists.ozlabs.org
Sender: "Linuxppc-dev"
To: Benjamin Herrenschmidt
Cc: "Paul E. McKenney", Arnd Bergmann, "linux-rdma@vger.kernel.org",
 Sinan Kaya, Jason Gunthorpe, David Laight, Oliver, Alexander Duyck,
 "open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)"
List-Id: linux-rdma@vger.kernel.org

On Tue, Mar 27, 2018 at 10:20:02PM +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2018-03-27 at 10:42 +0100, Will Deacon wrote:
> > >
> > > This example adds a wmb() between two writes to a coherent DMA
> > > area, it is definitely required there. I'm pretty sure I've never seen
> > > any bug reports pointing to a missing wmb() between memory
> > > and MMIO write accesses, but if you remember seeing them in the
> > > list, maybe you can look again for some evidence of something going
> > > wrong on x86 without it?
> >
> > If this is just about ordering accesses to coherent DMA, then using
> > dma_wmb() instead will be much better performance on arm/arm64.
>
> Ah, something we should look into for powerpc as well, as we could use
> an lwsync for that which is also cheaper than a full sync wmb does.
>
> dma_wmb() is basically the same as smp_wmb() without the CONFIG_SMP
> conditional right ?

Almost -- the slight change we have on arm64 is to say that it's
"outer-shareable", which means it also orders non-cacheable accesses in the
case that dma_alloc_coherent is used to allocate a consistent buffer for a
non-coherent device.

Will
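
To make the distinction concrete, here is a rough userspace sketch of the
descriptor-ring pattern being discussed: dma_wmb() orders two writes to the
coherent DMA area, and a heavier wmb() sits between the last memory write and
the MMIO doorbell. Everything named demo_* is hypothetical, and the barrier
macros are C11-fence stand-ins so that the example compiles outside the
kernel; they are not the real definitions from <asm/barrier.h>. The explicit
wmb() plus relaxed doorbell write is used only to keep the ordering visible,
since whether plain writel() implies that barrier is what this thread is
trying to settle.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * ASSUMPTION: userspace stand-ins for the kernel barriers. In the kernel,
 * dma_wmb() and wmb() come from <asm/barrier.h> and map to arch-specific
 * instructions; these fences only model "lighter" vs "full" ordering.
 */
#define dma_wmb() atomic_thread_fence(memory_order_release)
#define wmb()     atomic_thread_fence(memory_order_seq_cst)

#define DEMO_RING_SIZE 4u

/* Hypothetical descriptor layout shared with a device. */
struct demo_desc {
	uint64_t addr;
	uint32_t len;
	uint32_t owned_by_device;	/* device may only read addr/len once set */
};

struct demo_ring {
	struct demo_desc desc[DEMO_RING_SIZE];	/* models dma_alloc_coherent() memory */
	volatile uint32_t doorbell;		/* models an MMIO register */
	uint32_t tail;
};

/* Hypothetical stand-in for writel_relaxed(): a plain volatile store. */
void demo_writel_relaxed(uint32_t val, volatile uint32_t *reg)
{
	*reg = val;
}

/* Queue one buffer and ring the doorbell; returns the slot that was used. */
uint32_t demo_post(struct demo_ring *r, uint64_t addr, uint32_t len)
{
	uint32_t slot = r->tail % DEMO_RING_SIZE;
	struct demo_desc *d = &r->desc[slot];

	assert(!d->owned_by_device);	/* slot must have been handed back */

	d->addr = addr;
	d->len = len;

	/* Coherent-memory vs coherent-memory: the cheaper barrier is enough. */
	dma_wmb();
	d->owned_by_device = 1;

	/* Coherent-memory vs MMIO: use the full write barrier. */
	wmb();
	r->tail++;
	demo_writel_relaxed(r->tail, &r->doorbell);

	return slot;
}
```

If the device polls the ownership flag in memory instead of waiting for a
doorbell, only the dma_wmb() matters, which is where the arm/arm64 saving
mentioned above would come from.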