From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Zijlstra
Subject: Re: [PATCH] locking/rwsem: Remove arch specific rwsem files
Date: Mon, 11 Feb 2019 18:04:47 +0100
Message-ID: <20190211170447.GO32477@hirez.programming.kicks-ass.net>
References: <1549850450-10171-1-git-send-email-longman@redhat.com> <20190211115833.GY32511@hirez.programming.kicks-ass.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
Content-Disposition: inline
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: "linux-arm-kernel"
Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=m.gmane.org@lists.infradead.org
To: Waiman Long
Cc: linux-arch@vger.kernel.org, linux-xtensa@linux-xtensa.org, Davidlohr Bueso , linux-ia64@vger.kernel.org, Tim Chen , Arnd Bergmann , linux-sh@vger.kernel.org, linux-hexagon@vger.kernel.org, x86@kernel.org, Will Deacon , linux-kernel@vger.kernel.org, Linus Torvalds , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , linux-alpha@vger.kernel.org, sparclinux@vger.kernel.org, Thomas Gleixner , linuxppc-dev@lists.ozlabs.org, Andrew Morton , linux-arm-kernel@lists.infradead.org
List-Id: linux-arch.vger.kernel.org

On Mon, Feb 11, 2019 at 11:35:24AM -0500, Waiman Long wrote:
> On 02/11/2019 06:58 AM, Peter Zijlstra wrote:
> > Which is clearly worse. Now we can write that as:
> >
> >   int __down_read_trylock2(unsigned long *l)
> >   {
> >           long tmp = READ_ONCE(*l);
> >
> >           while (tmp >= 0) {
> >                   if (try_cmpxchg(l, &tmp, tmp + 1))
> >                           return 1;
> >           }
> >
> >           return 0;
> >   }
> >
> > which generates:
> >
> >   0000000000000030 <__down_read_trylock2>:
> >     30:   48 8b 07                mov    (%rdi),%rax
> >     33:   48 85 c0                test   %rax,%rax
> >     36:   78 18                   js     50 <__down_read_trylock2+0x20>
> >     38:   48 8d 50 01             lea    0x1(%rax),%rdx
> >     3c:   f0 48 0f b1 17          lock cmpxchg %rdx,(%rdi)
> >     41:   75 f0                   jne    33 <__down_read_trylock2+0x3>
> >     43:   b8 01 00 00 00          mov    $0x1,%eax
> >     48:   c3                      retq
> >     49:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
> >     50:   31 c0                   xor    %eax,%eax
> >     52:   c3                      retq
> >
> > Which is a lot better; but not quite there yet.
> >
> >
> > I've tried quite a bit, but I can't seem to get GCC to generate the:
> >
> >         add $1,%rdx
> >         jle
> >
> > required; stuff like:
> >
> >         new = old + 1;
> >         if (new <= 0)
> >
> > generates:
> >
> >         lea 0x1(%rax),%rdx
> >         test %rdx, %rdx
> >         jle
>
> Thanks for the suggested code snippet. So you want to replace "lea
> 0x1(%rax), %rdx" by "add $1,%rdx"?
>
> I think the compiler is doing that so as to use the address generation
> unit for addition instead of using the ALU. That will leave the ALU
> available for doing other arithmetic operation in parallel. I don't
> think it is a good idea to override the compiler and force it to use
> ALU. So I am not going to try doing that. It is only 1 or 2 more of
> codes anyway.

Yeah, I was trying to see what I could make it do..
#2 really should be good enough, but you know how it is once you're poking at it :-)
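
For anyone who wants to poke at the loop outside the kernel, here is a rough user-space sketch of the __down_read_trylock2() pattern quoted above, using C11 atomics. The names down_read_trylock_sketch() and trylock_sketch_selfcheck() are made up for illustration, and so are the memory orderings (acquire on success, relaxed on failure); the kernel's try_cmpxchg() and READ_ONCE() are not available in user space, so atomic_compare_exchange_strong_explicit() stands in for the former because it also writes the observed value back into the expected-value argument on failure. Treating any negative count as "not available to readers" is likewise an assumption of this sketch, not a description of the real rwsem count layout.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Hypothetical user-space analogue of the quoted __down_read_trylock2().
 * Returns 1 if the reader count was incremented, 0 if the count was
 * negative (assumed here to mean the lock is not available to readers).
 */
int down_read_trylock_sketch(_Atomic long *count)
{
	long tmp = atomic_load_explicit(count, memory_order_relaxed);

	while (tmp >= 0) {
		/*
		 * On failure the compare-exchange stores the value it
		 * observed into tmp, so the loop condition is re-checked
		 * against fresh data without a separate reload.
		 */
		if (atomic_compare_exchange_strong_explicit(count, &tmp, tmp + 1,
							    memory_order_acquire,
							    memory_order_relaxed))
			return 1;
	}

	return 0;
}

/* Tiny single-threaded self-check; returns the final reader count. */
long trylock_sketch_selfcheck(void)
{
	_Atomic long free_count = 0;	/* no readers, no writer */
	_Atomic long writer_held = -1;	/* negative: readers must fail */

	assert(down_read_trylock_sketch(&free_count) == 1);
	assert(down_read_trylock_sketch(&free_count) == 1);
	assert(down_read_trylock_sketch(&writer_held) == 0);

	return atomic_load(&free_count);
}
```

Building this with optimization and disassembling the result is one way to compare against the listing quoted above; the exact instructions will depend on compiler and version.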