Date: Tue, 26 Nov 2013 11:20:03 -0800
From: "Paul E. McKenney"
Reply-To: paulmck@linux.vnet.ibm.com
To: Linus Torvalds
Cc: Peter Zijlstra, Will Deacon, Tim Chen, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, "linux-kernel@vger.kernel.org", linux-mm,
	"linux-arch@vger.kernel.org", Waiman Long, Andrea Arcangeli,
	Alex Shi, Andi Kleen, Michel Lespinasse, Davidlohr Bueso,
	Matthew R Wilcox, Dave Hansen, Rik van Riel, Peter Hurley,
	Raghavendra K T, George Spelvin, "H. Peter Anvin", Arnd Bergmann,
	Aswin Chandramouleeswaran, Scott J Norton, "Figo.zhang"
Subject: Re: [PATCH v6 4/5] MCS Lock: Barrier corrections
Message-ID: <20131126192003.GA4137@linux.vnet.ibm.com>
References: <20131121221859.GH4138@linux.vnet.ibm.com>
	<20131122155835.GR3866@twins.programming.kicks-ass.net>
	<20131122182632.GW4138@linux.vnet.ibm.com>
	<20131122185107.GJ4971@laptop.programming.kicks-ass.net>
	<20131125173540.GK3694@twins.programming.kicks-ass.net>
	<20131125180250.GR4138@linux.vnet.ibm.com>
	<20131125182715.GG10022@twins.programming.kicks-ass.net>
	<20131125235252.GA4138@linux.vnet.ibm.com>
	<20131126095945.GI10022@twins.programming.kicks-ass.net>

On Tue, Nov 26, 2013 at 11:00:50AM -0800, Linus Torvalds wrote:
> On Tue, Nov 26, 2013 at 1:59 AM, Peter Zijlstra wrote:
> >
> > If you now want to weaken this definition, then that needs consideration
> > because we actually rely on things like
> >
> >	spin_unlock(l1);
> >	spin_lock(l2);
> >
> > being full barriers.
>
> Btw, maybe we should just stop that assumption.  The complexity of this
> discussion makes me go "maybe we should stop with subtle assumptions
> that happen to be obviously true on x86 due to historical
> implementations, but aren't obviously true even *there* any more with
> the MCS lock".

From an RCU viewpoint, I am OK with that approach.

From the viewpoint of documenting our assumptions, I really really like
that approach.
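To make the assumption concrete, here is a classic store-buffering
litmus test -- my own illustration rather than code from anywhere in
the kernel, with x, y, l1, and l2 purely hypothetical -- in which the
r1 == 0 && r2 == 0 outcome is forbidden only if CPU 1's unlock+lock
sequence acts as a full memory barrier:

	/* x and y are both initially zero. */

	CPU 0				CPU 1
	-----				-----
	ACCESS_ONCE(x) = 1;		ACCESS_ONCE(y) = 1;
	smp_mb();			spin_unlock(&l1);
	r1 = ACCESS_ONCE(y);		spin_lock(&l2);
					r2 = ACCESS_ONCE(x);

	/*
	 * If the unlock+lock sequence is a full barrier, the outcome
	 * r1 == 0 && r2 == 0 is forbidden; if it provides only
	 * release+acquire ordering, that outcome is allowed.
	 */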
> We already have a concept of
>
>	smp_mb__before_spinlock();
>	spin_lock();
>
> for sequences where we *know* we need to make getting a spin-lock be a
> full memory barrier.  It's free on x86 (and remains so even with the
> MCS lock, regardless of any subtle issues, if only because even the
> MCS lock starts out with a locked atomic, never mind the contention
> slow-case).  Of course, that macro is only used inside the scheduler,
> and is actually documented to not really be a full memory barrier, but
> it handles the case we actually care about.

This would work well if we made it be smp_mb__after_spinlock(), used as
follows:

	spin_lock();
	smp_mb__after_spinlock();

The reason that it must go after rather than before is to handle the
MCS-style low-overhead handoffs.  During the handoff, you can count on
code at the beginning of the critical section being executed, but things
before the lock cannot possibly help you.

We should also have something for lock releases, for example:

	smp_mb__before_spinunlock();
	spin_unlock();

This allows architectures to choose where to put the overhead, and also
very clearly documents which unlock+lock pairs need the full barriers
(one possible shape for such a pair is sketched at the end of this
message).

Heh.  Must smp_mb__after_spinlock() and smp_mb__before_spinunlock()
provide a full barrier when used separately, or only when used together?
The unlock+lock guarantee requires that they provide a full barrier only
when used together.  However, I believe that the scheduler's use of
smp_mb__before_spinlock() needs a full barrier without pairing.  I have
no idea whether or not we should have a separate API for each flavor of
lock.

> IOW, where do we really care about "unlock+lock" being a memory
> barrier?  And could we make those places explicit, and then do
> something similar to the above to them?

There are several places in RCU that assume unlock+lock is a full memory
barrier, but I would be more than happy to fix them up given an
smp_mb__after_spinlock() and an smp_mb__before_spinunlock(), or
something similar.

							Thanx, Paul
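One way such a pair might be supplied -- purely a sketch, since neither
helper exists at this point and the question above about their strength
when used separately is still open -- is a generic smp_mb() fallback
that strongly ordered architectures can override:

	/*
	 * Hypothetical generic fallbacks, a sketch rather than an
	 * existing kernel API.  An architecture whose lock and unlock
	 * operations already provide full ordering (for example via
	 * x86-style locked atomics) could override these with barrier().
	 */
	#ifndef smp_mb__after_spinlock
	#define smp_mb__after_spinlock()	smp_mb()
	#endif

	#ifndef smp_mb__before_spinunlock
	#define smp_mb__before_spinunlock()	smp_mb()
	#endif

An unlock+lock sequence that needs to act as a full memory barrier would
then be written:

	smp_mb__before_spinunlock();
	spin_unlock(&l1);

	spin_lock(&l2);
	smp_mb__after_spinlock();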