[RFC PATCH 1/1] prefetch result in 64 bit atomic ops

public inbox for linux-kernel@vger.kernel.org

* [RFC PATCH 1/1] prefetch result in 64 bit atomic ops
@ 2014-04-14 18:45 Pranith Kumar
  2014-04-15  7:52 ` Will Deacon
  0 siblings, 1 reply; 3+ messages in thread
From: Pranith Kumar @ 2014-04-14 18:45 UTC
  To: catalin.marinas, will.deacon, Joe Perches; +Cc: LKML

Please disregard previous patches. This is the correct one. 

Prefetch the destination before each atomic op, as the ARM32 atomic ops already do.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
---
 arch/arm64/include/asm/atomic.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm64/include/asm/atomic.h b/arch/arm64/include/asm/atomic.h
index 0237f08..845f9be 100644
--- a/arch/arm64/include/asm/atomic.h
+++ b/arch/arm64/include/asm/atomic.h
@@ -48,6 +48,7 @@ static inline void atomic_add(int i, atomic_t *v)
 	unsigned long tmp;
 	int result;
 
+	prefetchw(&v->counter);
 	asm volatile("// atomic_add\n"
 "1:	ldxr	%w0, %2\n"
 "	add	%w0, %w0, %w3\n"
@@ -62,6 +63,7 @@ static inline int atomic_add_return(int i, atomic_t *v)
 	unsigned long tmp;
 	int result;
 
+	prefetchw(&v->counter);
 	asm volatile("// atomic_add_return\n"
 "1:	ldxr	%w0, %2\n"
 "	add	%w0, %w0, %w3\n"
@@ -80,6 +82,7 @@ static inline void atomic_sub(int i, atomic_t *v)
 	unsigned long tmp;
 	int result;
 
+	prefetchw(&v->counter);
 	asm volatile("// atomic_sub\n"
 "1:	ldxr	%w0, %2\n"
 "	sub	%w0, %w0, %w3\n"
@@ -94,6 +97,7 @@ static inline int atomic_sub_return(int i, atomic_t *v)
 	unsigned long tmp;
 	int result;
 
+	prefetchw(&v->counter);
 	asm volatile("// atomic_sub_return\n"
 "1:	ldxr	%w0, %2\n"
 "	sub	%w0, %w0, %w3\n"
@@ -113,6 +117,7 @@ static inline int atomic_cmpxchg(atomic_t *ptr, int old, int new)
 	int oldval;
 
 	smp_mb();
+	prefetchw(&ptr->counter);
 
 	asm volatile("// atomic_cmpxchg\n"
 "1:	ldxr	%w1, %2\n"
@@ -170,6 +175,7 @@ static inline void atomic64_add(u64 i, atomic64_t *v)
 	long result;
 	unsigned long tmp;
 
+	prefetchw(&v->counter);
 	asm volatile("// atomic64_add\n"
 "1:	ldxr	%0, %2\n"
 "	add	%0, %0, %3\n"
@@ -184,6 +190,7 @@ static inline long atomic64_add_return(long i, atomic64_t *v)
 	long result;
 	unsigned long tmp;
 
+	prefetchw(&v->counter);
 	asm volatile("// atomic64_add_return\n"
 "1:	ldxr	%0, %2\n"
 "	add	%0, %0, %3\n"
@@ -202,6 +209,7 @@ static inline void atomic64_sub(u64 i, atomic64_t *v)
 	long result;
 	unsigned long tmp;
 
+	prefetchw(&v->counter);
 	asm volatile("// atomic64_sub\n"
 "1:	ldxr	%0, %2\n"
 "	sub	%0, %0, %3\n"
@@ -216,6 +224,7 @@ static inline long atomic64_sub_return(long i, atomic64_t *v)
 	long result;
 	unsigned long tmp;
 
+	prefetchw(&v->counter);
 	asm volatile("// atomic64_sub_return\n"
 "1:	ldxr	%0, %2\n"
 "	sub	%0, %0, %3\n"
@@ -235,6 +244,7 @@ static inline long atomic64_cmpxchg(atomic64_t *ptr, long old, long new)
 	unsigned long res;
 
 	smp_mb();
+	prefetchw(&ptr->counter);
 
 	asm volatile("// atomic64_cmpxchg\n"
 "1:	ldxr	%1, %2\n"
@@ -258,6 +268,7 @@ static inline long atomic64_dec_if_positive(atomic64_t *v)
 	long result;
 	unsigned long tmp;
 
+	prefetchw(&v->counter);
 	asm volatile("// atomic64_dec_if_positive\n"
 "1:	ldxr	%0, %2\n"
 "	subs	%0, %0, #1\n"
-- 
1.9.1
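
For readers unfamiliar with the pattern, the change above can be
approximated in userspace C. This is only a sketch under stated
assumptions, not the kernel code: sketch_prefetchw() and
sketch_atomic_add_return() are hypothetical names, __builtin_prefetch is
the GCC/Clang builtin rather than the kernel's prefetchw(), and a C11
compare-exchange loop stands in for the exclusive load/store retry loop
that starts at the "1: ldxr" label. As far as I know, prefetchw() on
arm64 is a "prfm pstl1keep" hint, which is roughly what a write prefetch
with the highest locality asks for.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for prefetchw(): hint that the cache
 * line holding *ptr is about to be written. This is a hint only; the
 * compiler and CPU are free to ignore it.
 */
static inline void sketch_prefetchw(const void *ptr)
{
	__builtin_prefetch(ptr, 1, 3);	/* rw = 1 (write), locality = 3 */
}

/*
 * Hypothetical analogue of atomic_add_return(): issue the prefetch
 * first, then enter the retry loop, mirroring where the patch places
 * prefetchw() relative to the asm block.
 */
static inline int sketch_atomic_add_return(int i, atomic_int *v)
{
	int old;

	assert(v != NULL);
	sketch_prefetchw((const void *)v);
	old = atomic_load_explicit(v, memory_order_relaxed);
	/*
	 * On failure, compare_exchange_weak refreshes 'old' with the
	 * current value and we retry, much as the asm branches back to 1.
	 */
	while (!atomic_compare_exchange_weak_explicit(v, &old, old + i,
						      memory_order_seq_cst,
						      memory_order_relaxed))
		;
	return old + i;
}
```

Calling sketch_atomic_add_return(5, &v) on a zero-initialised atomic_int
returns 5 whether or not the hint has any effect, since the prefetch
never changes the result, only (possibly) the timing.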


* Re: [RFC PATCH 1/1] prefetch result in 64 bit atomic ops
  2014-04-14 18:45 [RFC PATCH 1/1] prefetch result in 64 bit atomic ops Pranith Kumar
@ 2014-04-15  7:52 ` Will Deacon
  2014-04-15  8:33   ` Joe Perches
  0 siblings, 1 reply; 3+ messages in thread
From: Will Deacon @ 2014-04-15  7:52 UTC
  To: Pranith Kumar; +Cc: Catalin Marinas, Joe Perches, LKML

Hi Pranith,

On Mon, Apr 14, 2014 at 07:45:22PM +0100, Pranith Kumar wrote:
> Please disregard previous patches. This is the correct one. 
> 
> Prefetch the destination before each atomic op, as the ARM32 atomic ops already do.

Whilst this looks like a potentially sensible optimisation (based on the
results I saw on AArch32), I don't think we can take this patch without some
benchmarks on real silicon. The interaction between the half-barrier atomic
instructions and prfm isn't immediately obvious to me, and we should also
consider looking at streaming preload vs the l1keep option.

Did you write this patch as a basic port of the arch/arm/ patches I wrote,
or was it based on performance figures from real hardware?

Will
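
A measurement of the kind asked for above could start from a harness
along these lines. It is a rough userspace sketch, not a substitute for
numbers from real silicon: the names (sketch_hint, sketch_run,
sketch_time) are hypothetical, the loop is single-threaded and
uncontended, and it relies on the assumption that GCC's AArch64 back end
turns a write prefetch with locality 3 into pstl1keep and locality 0
into pstl1strm. No results are implied here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical selector for the prefetch variant under test. */
enum sketch_hint { HINT_NONE, HINT_KEEP, HINT_STRM };

/*
 * Increment *ctr 'iters' times, optionally issuing a write prefetch
 * before each increment. __builtin_prefetch needs constant rw and
 * locality arguments, hence the switch instead of a parameter.
 * Returns the counter value after the loop.
 */
static long sketch_run(enum sketch_hint hint, long iters, atomic_long *ctr)
{
	assert(ctr != NULL);
	for (long n = 0; n < iters; n++) {
		switch (hint) {
		case HINT_KEEP:
			__builtin_prefetch((const void *)ctr, 1, 3);
			break;
		case HINT_STRM:
			__builtin_prefetch((const void *)ctr, 1, 0);
			break;
		default:
			break;
		}
		atomic_fetch_add_explicit(ctr, 1, memory_order_relaxed);
	}
	return atomic_load(ctr);
}

/* CPU time in seconds consumed by one sketch_run() call. */
static double sketch_time(enum sketch_hint hint, long iters, atomic_long *ctr)
{
	clock_t start = clock();

	sketch_run(hint, iters, ctr);
	return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Comparing sketch_time() across the three hints on a board would give a
first indication only; a meaningful benchmark would also need several
threads contending on the same cache line and the acquire/release
instruction variants mentioned above.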


* Re: [RFC PATCH 1/1] prefetch result in 64 bit atomic ops
  2014-04-15  7:52 ` Will Deacon
@ 2014-04-15  8:33   ` Joe Perches
  0 siblings, 0 replies; 3+ messages in thread
From: Joe Perches @ 2014-04-15  8:33 UTC
  To: Will Deacon; +Cc: Pranith Kumar, Catalin Marinas, LKML

On Tue, 2014-04-15 at 08:52 +0100, Will Deacon wrote:
> Did you write this patch as a basic port of the arch/arm/ patches I wrote,
> or was it based on performance figures from real hardware?

From an earlier thread: https://lkml.org/lkml/2014/4/14/547

On Mon, 2014-04-14 at 14:11 -0400, Pranith Kumar wrote:
> On Mon, Apr 14, 2014 at 2:04 PM, Joe Perches <joe@perches.com> wrote:
> > Are there some benchmark comparison results
> > you neglected to attach?
> 
> I was trying to get these ops to be similar to the 32 bit atomic ops which
> all have prefetch. I did not see any reason why we shouldn't do the same
> here. But no, no hard numbers yet.  Send me a dev board please :)


