Date: Fri, 26 Feb 2016 13:31:33 -0800
From: "Paul E. McKenney"
To: Sergey Fedorov
Cc: linux-kernel@vger.kernel.org
Subject: Re: Documentation/memory-barriers.txt: How can READ_ONCE() and WRITE_ONCE() provide cache coherence?
Message-ID: <20160226213133.GI3522@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <56D0C02D.6000905@gmail.com>
In-Reply-To: <56D0C02D.6000905@gmail.com>

On Sat, Feb 27, 2016 at 12:14:21AM +0300, Sergey Fedorov wrote:
> Hi,
>
> I just can't understand how this kind of compiler barrier macros may
> provide any form of cache coherence. Sure, such kind of compiler
> barrier is necessary to "reliably" access a variable from multiple
> CPUs. But why it is stated that these macros *provide* cache
> coherence?

Without READ_ONCE(), common sub-expression elimination optimizations
can cause later reads of a given variable to see an older value than
previous reads did.
For a (silly) example:

	a = complicated_pure_function(x);
	b = x;
	c = complicated_pure_function(x);

The compiler is within its rights to transform this into the following:

	a = complicated_pure_function(x);
	b = x;
	c = a;

In this case, the assignment to b might see a newer value of x than did
the later assignment to c.  This violates cache coherence, which states
that all reads from a given variable must agree on the order of values
taken on by that variable.

Using READ_ONCE() prevents this violation of cache coherence, albeit at
the price of evaluating complicated_pure_function() twice rather than
once:

	a = complicated_pure_function(READ_ONCE(x));
	b = READ_ONCE(x);
	c = complicated_pure_function(READ_ONCE(x));

Similar examples exist for WRITE_ONCE().

You -want- the compiler to violate cache coherence for normal accesses
to unshared variables, so you have to tell it when cache coherence is
important.

							Thanx, Paul

> From Documentation/memory-barriers.txt:
> >The READ_ONCE() and WRITE_ONCE() functions can prevent any number of
> >optimizations that, while perfectly safe in single-threaded code, can
> >be fatal in concurrent code.  Here are some examples of these sorts
> >of optimizations:
> >
> > (*) The compiler is within its rights to reorder loads and stores
> >     to the same variable, and in some cases, the CPU is within its
> >     rights to reorder loads to the same variable.  This means that
> >     the following code:
> >
> >	a[0] = x;
> >	a[1] = x;
> >
> >     Might result in an older value of x stored in a[1] than in a[0].
> >     Prevent both the compiler and the CPU from doing this as follows:
> >
> >	a[0] = READ_ONCE(x);
> >	a[1] = READ_ONCE(x);
> >
> >     In short, READ_ONCE() and WRITE_ONCE() provide cache coherence for
> >     accesses from multiple CPUs to a single variable.
>
> Thanks,
> Sergey
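
For anyone who wants to try the READ_ONCE() version of the example,
here is a minimal userspace sketch.  The macro is a simplified stand-in
modeled on the volatile cast that the older ACCESS_ONCE() used, not the
kernel's actual definition, and it assumes GCC or Clang for
__typeof__.  The function body, the calls counter, snapshot(), and
check_snapshot() are all hypothetical and exist only for illustration.

```c
#include <assert.h>

/*
 * Simplified stand-in for READ_ONCE(), modeled on the volatile cast used
 * by the older ACCESS_ONCE().  Assumes GCC or Clang (__typeof__); this
 * is not the kernel's actual definition.
 */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

int x;		/* hypothetical variable shared with other CPUs */
int calls;	/* instrumentation: number of real evaluations */

/* Hypothetical stand-in for complicated_pure_function(). */
int complicated_pure_function(int v)
{
	calls++;	/* counting only, so not strictly pure */
	return 2 * v;
}

/*
 * The volatile access makes the compiler emit a separate load of x for
 * each READ_ONCE() and keeps it from reordering those loads.
 */
void snapshot(int *a, int *b, int *c)
{
	*a = complicated_pure_function(READ_ONCE(x));
	*b = READ_ONCE(x);
	*c = complicated_pure_function(READ_ONCE(x));
}

/* Single-threaded sanity check: nobody else writes x here. */
void check_snapshot(void)
{
	int a, b, c;

	x = 21;
	calls = 0;
	snapshot(&a, &b, &c);
	assert(a == 42 && b == 21 && c == 42);
	assert(calls == 2);	/* evaluated twice, not reused */
}
```

With a concurrent writer to x, the three loads may return different
values, but c can never be computed from an older value of x than the
one stored in b.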