From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Zijlstra
Subject: Re: [RFC] Control dependencies
Date: Mon, 25 Nov 2013 10:42:23 +0100
Message-ID: <20131125094223.GV10022@twins.programming.kicks-ass.net>
References: <20131121161733.GH10022@twins.programming.kicks-ass.net>
 <20131121180237.GX4138@linux.vnet.ibm.com>
 <20131122134630.GQ3866@twins.programming.kicks-ass.net>
 <528F9B8D.2090806@hurleysoftware.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Received: from merlin.infradead.org ([205.233.59.134]:35160 "EHLO
 merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1750766Ab3KYJnC (ORCPT ); Mon, 25 Nov 2013 04:43:02 -0500
Content-Disposition: inline
In-Reply-To: <528F9B8D.2090806@hurleysoftware.com>
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Peter Hurley
Cc: "Paul E. McKenney", linux-kernel@vger.kernel.org, Will Deacon,
 Tim Chen, Ingo Molnar, Andrew Morton, Thomas Gleixner,
 linux-arch@vger.kernel.org, Linus Torvalds, Waiman Long,
 Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
 Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Rik van Riel,
 Raghavendra K T, George Spelvin, "H. Peter Anvin", Arnd Bergmann,
 Aswin Chandramouleeswaran, Scott J Norton, "Figo.zhang"

On Fri, Nov 22, 2013 at 12:59:41PM -0500, Peter Hurley wrote:
> On 11/22/2013 08:46 AM, Peter Zijlstra wrote:
> >How about the below version?
> >
> >---
> >--- a/kernel/events/ring_buffer.c
> >+++ b/kernel/events/ring_buffer.c
> >@@ -61,19 +61,20 @@ static void perf_output_put_handle(struc
> >   *
> >   *   kernel                             user
> >   *
> >-  *   READ ->data_tail                   READ ->data_head
> >-  *   smp_mb()        (A)                smp_rmb()       (C)
> >-  *   WRITE $data                        READ $data
> >-  *   smp_wmb()       (B)                smp_mb()        (D)
> >-  *   STORE ->data_head                  WRITE ->data_tail
> >+  *   if (LOAD ->data_tail) {            LOAD ->data_head
> >+  *                   (A)                smp_rmb()       (C)
> >+  *      STORE $data                     LOAD $data
> >+  *      smp_wmb()    (B)                smp_mb()        (D)
> >+  *      STORE ->data_head               STORE ->data_tail
> >
> I wasn't subscribed to linux-arch so missed the smp_store_release()
> outcome, if there was one.
>
> Are (B) and (D) still slated for changing to STORE.rel semantics,
> aka smp_store_release()?

The earlier proposal would have A and C be smp_load_acquire() and B and
D be smp_store_release().

> I realize that, for the perf ring buffer, (D) is in userspace but
> I'm also interested in non-perf situations where (D) would be in the
> kernel.

So we're still debating the exact semantics of smp_store_release(); it
now looks like it needs a heavier memory barrier than previously
thought. In which case using it wouldn't make sense for me anymore.

Note that C and D are in userspace and not in any hot path (usually).
They're only issued once to read an entire buffer backlog at once, so I
don't really care about them all that much.

A and B otoh are in kernel space and are issued for every single event
written, so I'm interested to get them as cheaply as possible.

With this proposed patch, we remove a full barrier. With the earlier
smp_load_acquire() / smp_store_release() patches we would only downgrade
the full barrier to an acquire barrier, which is still more than no
barrier at all.

And now it looks like the smp_store_release() would actually upgrade the
wmb to a full barrier on some systems at least.
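
For readers who want to poke at the A/B/C/D pairing outside the kernel,
here is a minimal userspace sketch in C11. It is not the perf code: the
names (struct ring, rb_write, rb_read, RB_SIZE) are made up for
illustration, records are fixed-size, and there is exactly one producer
and one consumer. The C11 fences only stand in for smp_wmb(), smp_rmb()
and smp_mb(). One deliberate difference: C11 gives no ordering guarantee
for a control dependency, so (A) is written as an acquire load here,
whereas the patch above relies on the branch alone.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RB_SIZE 8u /* hypothetical capacity, in records */
static_assert((RB_SIZE & (RB_SIZE - 1)) == 0, "RB_SIZE must be a power of two");

struct ring {
	_Atomic uint64_t data_head; /* only written by the producer ("kernel") */
	_Atomic uint64_t data_tail; /* only written by the consumer ("user") */
	uint64_t data[RB_SIZE];
};

/* Producer side. Returns 1 if the record was stored, 0 if there is no room. */
static int rb_write(struct ring *rb, uint64_t v)
{
	uint64_t head = atomic_load_explicit(&rb->data_head, memory_order_relaxed);
	/* (A) LOAD ->data_tail; acquire is an assumption made for portable C11 */
	uint64_t tail = atomic_load_explicit(&rb->data_tail, memory_order_acquire);

	if (head - tail >= RB_SIZE)
		return 0;

	rb->data[head & (RB_SIZE - 1)] = v;        /* STORE $data */
	atomic_thread_fence(memory_order_release); /* (B) stands in for smp_wmb() */
	atomic_store_explicit(&rb->data_head, head + 1, memory_order_relaxed);
	return 1;
}

/* Consumer side. Returns 1 and fills *out if a record was read, else 0. */
static int rb_read(struct ring *rb, uint64_t *out)
{
	uint64_t tail = atomic_load_explicit(&rb->data_tail, memory_order_relaxed);
	uint64_t head = atomic_load_explicit(&rb->data_head, memory_order_relaxed);

	if (head == tail)
		return 0;

	atomic_thread_fence(memory_order_acquire); /* (C) stands in for smp_rmb() */
	*out = rb->data[tail & (RB_SIZE - 1)];     /* LOAD $data */
	atomic_thread_fence(memory_order_seq_cst); /* (D) stands in for smp_mb() */
	atomic_store_explicit(&rb->data_tail, tail + 1, memory_order_relaxed);
	return 1;
}
```

The pairing is the same as in the comment block: (B) pairs with (C) so
the consumer sees the data before it trusts the new head, and (D) pairs
with (A) so the producer does not overwrite a slot the consumer is still
reading. The producer-side cost is the part that runs per event, which
is why (A) and (B) are the ones worth trimming.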