Date: Thu, 29 Mar 2018 10:40:48 -0600
From: Jason Gunthorpe
To: David Laight
Cc: Will Deacon, Benjamin Herrenschmidt, Arnd Bergmann, Sinan Kaya,
 Oliver, "open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)",
 "linux-rdma@vger.kernel.org", "Paul E. McKenney", Peter Zijlstra,
 Ingo Molnar, Jonathan Corbet
Subject: Re: RFC on writel and writel_relaxed
Message-ID: <20180329164048.GB13656@ziepe.ca>
References: <1522186185.7364.59.camel@kernel.crashing.org>
 <20180328085338.GA28871@arm.com>
 <1522230616.21446.1.camel@kernel.crashing.org>
 <1522231287.21446.9.camel@kernel.crashing.org>
 <20180328101345.GA30850@arm.com>
 <20180328165732.GA4546@ziepe.ca>
 <20180329091941.GA22926@arm.com>
 <20180329144515.GA13656@ziepe.ca>
 <4ce6f0338fe04574bd3c0633c522a1c7@AcuMS.aculab.com>
In-Reply-To: <4ce6f0338fe04574bd3c0633c522a1c7@AcuMS.aculab.com>
List-Id: Linux on PowerPC Developers Mail List

On Thu, Mar 29, 2018 at 02:58:34PM +0000, David Laight wrote:
> From: Jason Gunthorpe
> > Sent: 29 March 2018 15:45
> ...
> > > > When talking about ordering between the devices, the relevant question
> > > > is what happens if the writel(DEVICE_BAR) triggers DEVICE_BAR to DMA
> > > > from the DEVICE_FOO. 'ordered' means that in this case
> > > > writel(DEVICE_FOO) must be presented to FOO before anything generated
> > > > by BAR.
> > >
> > > Yes, and that isn't the case for arm because the writes can still be
> > > buffered.
> >
> > The statement is not about buffering, or temporal completion order, or
> > the order of acks returning to the CPU. It is about pure transaction
> > ordering inside the interconnect.
> >
> > Can write BAR -> FOO pass write CPU -> FOO?
>
> Almost certainly.
> The first cpu write can almost certainly be 'stalled' at the shared PCIe bridge.
> The second cpu write then completes (to a different target).
> That target then issues a peer to peer transfer that reaches the shared bridge.
> I doubt the order of the transactions is guaranteed when it becomes 'un-stalled'.

The PCI spec has very strong wording on ordering that covers this
case. Stalled bridges have to follow the ordering rules, and posted
writes cannot pass other posted writes.

Since in PCI all three transactions:
  CPU -> FOO
  CPU -> BAR
  BAR -> FOO
must traverse a shared bus segment, they must be placed on that bus in
the above order, and the bridge(s) toward FOO must preserve this order.

ARM's AXI has similar rules, I just can't recall the tiny details
right now :)

> Of course, these are peer to peer transfers, and strange ones at that.
> Normally you'd not be doing peer to peer transfers that access 'memory'
> the cpu has just written to.

It is the best situation I can think of where order of completion to
different devices would matter to a generic Linux driver..

.. And there are patches circulating right now for NVMe that enable
exactly this kind of transfer, and rely on this kind of semantics, so
it is a relevant detail :)

Jason
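
The ordering argument above can be sketched as a toy model. This is
only an illustration under simplifying assumptions, not text from the
PCI spec: the bridge toward FOO is treated as a plain FIFO of posted
writes that may stall but never reorders. The names `Bridge` and
`run_scenario` are hypothetical and invented for this sketch.

```python
from collections import deque


class Bridge:
    """Hypothetical bridge toward FOO: a FIFO of posted writes."""

    def __init__(self):
        self.stalled = False
        self.queue = deque()   # posted writes held inside the bridge
        self.delivered = []    # writes that have reached FOO, in order

    def post(self, write):
        """Accept a posted write from the shared bus segment."""
        self.queue.append(write)
        self.drain()

    def drain(self):
        """Forward queued writes in arrival order unless stalled."""
        while self.queue and not self.stalled:
            self.delivered.append(self.queue.popleft())

    def unstall(self):
        """Release the stall and forward everything that was held."""
        self.stalled = False
        self.drain()


def run_scenario():
    """Replay the three transactions with the FOO bridge stalled."""
    foo_bridge = Bridge()
    foo_bridge.stalled = True          # the first CPU write gets held here
    bus_order = []                     # order seen on the shared segment

    bus_order.append("CPU -> FOO")
    foo_bridge.post("CPU -> FOO")

    bus_order.append("CPU -> BAR")     # different target, not held up

    # BAR reacts by issuing a peer-to-peer write toward FOO.
    bus_order.append("BAR -> FOO")
    foo_bridge.post("BAR -> FOO")

    foo_bridge.unstall()
    return bus_order, foo_bridge.delivered
```

In this model the stall delays delivery but cannot change the order:
`BAR -> FOO` queues behind `CPU -> FOO` in the same bridge, so FOO
sees the CPU write first once the bridge is un-stalled.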