From: Peter Zijlstra <peterz@infradead.org>
To: Waiman Long <longman@redhat.com>
Cc: linux-arch@vger.kernel.org, linux-xtensa@linux-xtensa.org,
	Davidlohr Bueso <dave@stgolabs.net>,
	linux-ia64@vger.kernel.org, Tim Chen <tim.c.chen@linux.intel.com>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-sh@vger.kernel.org, linux-hexagon@vger.kernel.org,
	x86@kernel.org, Will Deacon <will.deacon@arm.com>,
	linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	linux-alpha@vger.kernel.org, sparclinux@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	linuxppc-dev@lists.ozlabs.org,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] locking/rwsem: Remove arch specific rwsem files
Date: Mon, 11 Feb 2019 18:04:47 +0100
Message-ID: <20190211170447.GO32477@hirez.programming.kicks-ass.net>
In-Reply-To: <e470384d-fb7b-132d-253b-23c495e7c285@redhat.com>

On Mon, Feb 11, 2019 at 11:35:24AM -0500, Waiman Long wrote:
> On 02/11/2019 06:58 AM, Peter Zijlstra wrote:
> > Which is clearly worse. Now we can write that as:
> >
> >   int __down_read_trylock2(unsigned long *l)
> >   {
> > 	  long tmp = READ_ONCE(*l);
> >
> > 	  while (tmp >= 0) {
> > 		  if (try_cmpxchg(l, &tmp, tmp + 1))
> > 			  return 1;
> > 	  }
> >
> > 	  return 0;
> >   }
> >
> > which generates:
> >
> >   0000000000000030 <__down_read_trylock2>:
> >   30:   48 8b 07                mov    (%rdi),%rax
> >   33:   48 85 c0                test   %rax,%rax
> >   36:   78 18                   js     50 <__down_read_trylock2+0x20>
> >   38:   48 8d 50 01             lea    0x1(%rax),%rdx
> >   3c:   f0 48 0f b1 17          lock cmpxchg %rdx,(%rdi)
> >   41:   75 f0                   jne    33 <__down_read_trylock2+0x3>
> >   43:   b8 01 00 00 00          mov    $0x1,%eax
> >   48:   c3                      retq
> >   49:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
> >   50:   31 c0                   xor    %eax,%eax
> >   52:   c3                      retq
> >
> > Which is a lot better; but not quite there yet.
> >
> >
> > I've tried quite a bit, but I can't seem to get GCC to generate the:
> >
> > 	add $1,%rdx
> > 	jle
> >
> > required; stuff like:
> >
> > 	new = old + 1;
> > 	if (new <= 0)
> >
> > generates:
> >
> > 	lea 0x1(%rax),%rdx
> > 	test %rdx, %rdx
> > 	jle
> 
> Thanks for the suggested code snippet. So you want to replace "lea
> 0x1(%rax), %rdx" by "add $1,%rdx"?
> 
> I think the compiler is doing that so as to use the address generation
> unit for addition instead of using the ALU. That will leave the ALU
> available for doing other arithmetic operations in parallel. I don't
> think it is a good idea to override the compiler and force it to use
> ALU. So I am not going to try doing that. It is only 1 or 2 more of
> codes anyway.

Yeah, I was trying to see what I could make it do.. #2 really should be
good enough, but you know how it is once you're poking at it :-)
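
For anyone following along outside the kernel tree, the #2 loop quoted
above can be approximated in userspace with C11 atomics. This is only a
sketch under assumptions: down_read_trylock_sketch() is a hypothetical
name, atomic_compare_exchange_weak_explicit() stands in for the kernel's
try_cmpxchg(), and the acquire/relaxed orderings are my guess at what a
read trylock would want, not something stated in this thread.

```c
#include <stdatomic.h>

/*
 * Hypothetical userspace analogue of __down_read_trylock2().
 * A non-negative count means readers may enter; a negative count
 * means the lock is unavailable to readers.
 */
int down_read_trylock_sketch(_Atomic long *count)
{
	long tmp = atomic_load_explicit(count, memory_order_relaxed);

	while (tmp >= 0) {
		/*
		 * On failure the current value is written back into tmp,
		 * so the loop re-checks the sign without a separate reload,
		 * which is the same shape as the try_cmpxchg() loop.
		 */
		if (atomic_compare_exchange_weak_explicit(count, &tmp, tmp + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return 1;
	}

	return 0;
}
```

Called on a count of 0 this returns 1 and leaves the count at 1; called
on a negative count it returns 0 and leaves the count untouched.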


