From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp.codeaurora.org (smtp.codeaurora.org [198.145.29.96]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4074Tc620czDqlc for ; Sat, 24 Mar 2018 00:42:11 +1100 (AEDT) Subject: Re: RFC on writel and writel_relaxed To: Benjamin Herrenschmidt , Oliver Cc: linuxppc dev list , "linux-rdma@vger.kernel.org" , Marc Zyngier , Will Deacon References: <3611eabe-2999-1482-b2b4-6d216bbe4762@codeaurora.org> <4e5c745a-8b9b-959e-8893-d99cd6032484@codeaurora.org> <1521692689.16434.293.camel@kernel.crashing.org> <1521726722.16434.312.camel@kernel.crashing.org> <5ccdb208-4664-0a7f-df5d-2e12cbe4c239@codeaurora.org> <1521764168.16434.324.camel@kernel.crashing.org> From: Sinan Kaya Message-ID: <9c7d3d35-ddaf-60ee-1430-19b5c3f66813@codeaurora.org> Date: Fri, 23 Mar 2018 09:42:04 -0400 MIME-Version: 1.0 In-Reply-To: <1521764168.16434.324.camel@kernel.crashing.org> Content-Type: text/plain; charset=utf-8 List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On 3/22/2018 8:16 PM, Benjamin Herrenschmidt wrote: > On Thu, 2018-03-22 at 12:51 -0500, Sinan Kaya wrote: >> On 3/22/2018 8:52 AM, Benjamin Herrenschmidt wrote: >>>>> No, it's not sufficient. >>> >>> Just to clarify ... barrier() is just a compiler barrier, it means the >>> compiler will generate things in the order they are written. This isn't >>> sufficient on archs with an OO memory model, where an actual memory >>> barrier instruction needs to be emited. >> >> Surprisingly, ARM64 GCC compiler generates a write barrier as >> opposed to preventing code reordering. >> >> I was curious if this is an ARM only thing or not. > > Are you sure of that ? I thought it's the ARM implementation of writel > that had an explicit write barrier in it: Yes, I'm %100 sure. The answer is both writel() and barrier() generates a write barrier instruction. I found this by searching the kernel disassembly for back to back "dsb st" instruction. > > #define writel(v,c) ({ __iowmb(); writel_relaxed((v),(c)); }) > > And __iowmb() is > > #define __iowmb() wmb() > > Note, I'm a bit dubious about this in ARM: > > #define readl(c) ({ u32 __v = readl_relaxed(c); __iormb(); __v; } > > Will, Marc, on powerpc, we put a sync *before* the read in readl etc... > > The reasoning was there could be some DMA setup followed by a side > effect readl rather than a side effect writel to trigger a DMA. Granted > I wouldn't expect modern devices to be that stupid, but I have vague > memory of some devices back in the day having that sort of read ops. > > In general, I though the model offerred by x86 and thus by Linux > readl/writel was full synchronization both before and after the MMIO, > vs either other MMIO or all other forms of ops (cachable memory, locks > etc...). > > Also, can't the above readl_relaxed leak out of a lock ? I think you are asking about PPC, correct? I read somewhere that PPC implementation keeps track of MMIO accesses and has an implicit barrier inside the spin_unlock() code for such accesses. Isn't this true? > > Cheers, > Ben. > > -- Sinan Kaya Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.