From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Paul E. McKenney"
Subject: Re: [RFC][PATCH 0/5] arch: atomic rework
Date: Tue, 18 Feb 2014 09:16:09 -0800
Message-ID: <20140218171609.GP4250@linux.vnet.ibm.com>
References: <1392486310.18779.6447.camel@triegel.csb> <1392666947.18779.6838.camel@triegel.csb> <530296CD.5050503@warwick.ac.uk> <1392737465.18779.7644.camel@triegel.csb>
Reply-To: paulmck@linux.vnet.ibm.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: gcc-owner@gcc.gnu.org
To: Linus Torvalds
Cc: Torvald Riegel, Alec Teal, Will Deacon, Peter Zijlstra, Ramana Radhakrishnan, David Howells, "linux-arch@vger.kernel.org", "linux-kernel@vger.kernel.org", "akpm@linux-foundation.org", "mingo@kernel.org", "gcc@gcc.gnu.org"
List-Id: linux-arch.vger.kernel.org

On Tue, Feb 18, 2014 at 08:49:13AM -0800, Linus Torvalds wrote:
> On Tue, Feb 18, 2014 at 7:31 AM, Torvald Riegel wrote:
> > On Mon, 2014-02-17 at 16:05 -0800, Linus Torvalds wrote:
> >> And exactly because I know enough, I would *really* like atomics to be
> >> well-defined, and have very clear - and *local* - rules about how they
> >> can be combined and optimized.
> >
> > "Local"?
>
> Yes.
>
> So I think that one of the big advantages of atomics over volatile is
> that they *can* be optimized, and as such I'm not at all against
> trying to generate much better code than for volatile accesses.
>
> But at the same time, that can go too far. For example, one of the
> things we'd want to use atomics for is page table accesses, where it
> is very important that we don't generate multiple accesses to the
> values, because parts of the values can be change *by*hardware* (ie
> accessed and dirty bits).
>
> So imagine that you have some clever global optimizer that sees that
> the program never ever actually sets the dirty bit at all in any
> thread, and then uses that kind of non-local knowledge to make
> optimization decisions. THAT WOULD BE BAD.

Might as well list other reasons why value proofs via whole-program
analysis are unreliable for the Linux kernel:

1.	As Linus said, changes from hardware.

2.	Assembly code that is not visible to the compiler.
	Inline asms will -normally- let the compiler know what
	memory they change, but some just use the "memory" tag.
	Worse yet, I suspect that most compilers don't look all
	that carefully at .S files.

	Any number of other programs contain assembly files.

3.	Kernel modules that have not yet been written. Now, the
	compiler could refrain from trying to prove anything about
	an EXPORT_SYMBOL() or EXPORT_SYMBOL_GPL() variable, but there
	is currently no way to communicate this information to the
	compiler other than marking the variable "volatile".

	Other programs have similar issues, e.g., via dlopen().

4.	Some drivers allow user-mode code to mmap() some of their
	state. Any changes undertaken by the user-mode code would
	be invisible to the compiler.

5.	JITed code produced based on BPF: https://lwn.net/Articles/437981/

And probably other stuff as well.

							Thanx, Paul
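
To make the list above a bit more concrete, here is a minimal C sketch of
the kind of value proof being discussed. It is not kernel code. The
PTE_DIRTY bit position and all three function names are hypothetical,
chosen only for illustration. The "plain" reader is the one a
whole-program optimizer could, in principle, fold to a constant if it saw
no store setting the bit. The "volatile" reader shows the only tool
mentioned above for telling the compiler not to trust such a proof. The
cast-to-volatile idiom is the one compilers honor in practice, rather than
something this sketch can guarantee for every toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only (no real PTE layout implied). */
#define PTE_DIRTY ((uint64_t)1 << 6)

/*
 * Plain access. The compiler may reason about every store it can see.
 * If whole-program analysis found no store that sets PTE_DIRTY, it would
 * be allowed to reduce this function to "return 0", even though hardware,
 * assembly, a not-yet-written module, a user-space mapping, or JITed code
 * might set the bit behind its back.
 */
int pte_dirty_plain(const uint64_t *pte)
{
	assert(pte != NULL);
	return (*pte & PTE_DIRTY) != 0;
}

/*
 * Volatile access. The load goes through a volatile-qualified lvalue, so
 * the compiler is expected to emit the load and not to assume anything
 * about the value it returns.
 */
int pte_dirty_volatile(const uint64_t *pte)
{
	assert(pte != NULL);
	return (*(const volatile uint64_t *)pte & PTE_DIRTY) != 0;
}

/*
 * Stand-in for a writer outside the compiler's view. Here it is ordinary C
 * in the same translation unit, so the compiler does see it. In the cases
 * listed above it would not.
 */
void simulate_hardware_set_dirty(uint64_t *pte)
{
	assert(pte != NULL);
	*(volatile uint64_t *)pte |= PTE_DIRTY;
}
```

In this single-file form both readers agree, because the store is visible
to the compiler. The concern in the list is the case where the store lives
somewhere the compiler cannot see, and only the volatile form tells it not
to draw conclusions from that absence.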