* [RFC PATCH 1/1] prefetch result in 64 bit atomic ops
@ 2014-04-14 18:45 Pranith Kumar
From: Pranith Kumar @ 2014-04-14 18:45 UTC
To: catalin.marinas, will.deacon, Joe Perches; +Cc: LKML
Please disregard previous patches. This is the correct one.
prefetch destination as is being done in ARM32 atomic ops
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
---
arch/arm64/include/asm/atomic.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/arm64/include/asm/atomic.h b/arch/arm64/include/asm/atomic.h
index 0237f08..845f9be 100644
--- a/arch/arm64/include/asm/atomic.h
+++ b/arch/arm64/include/asm/atomic.h
@@ -48,6 +48,7 @@ static inline void atomic_add(int i, atomic_t *v)
unsigned long tmp;
int result;
+ prefetchw(&v->counter);
asm volatile("// atomic_add\n"
"1: ldxr %w0, %2\n"
" add %w0, %w0, %w3\n"
@@ -62,6 +63,7 @@ static inline int atomic_add_return(int i, atomic_t *v)
unsigned long tmp;
int result;
+ prefetchw(&v->counter);
asm volatile("// atomic_add_return\n"
"1: ldxr %w0, %2\n"
" add %w0, %w0, %w3\n"
@@ -80,6 +82,7 @@ static inline void atomic_sub(int i, atomic_t *v)
unsigned long tmp;
int result;
+ prefetchw(&v->counter);
asm volatile("// atomic_sub\n"
"1: ldxr %w0, %2\n"
" sub %w0, %w0, %w3\n"
@@ -94,6 +97,7 @@ static inline int atomic_sub_return(int i, atomic_t *v)
unsigned long tmp;
int result;
+ prefetchw(&v->counter);
asm volatile("// atomic_sub_return\n"
"1: ldxr %w0, %2\n"
" sub %w0, %w0, %w3\n"
@@ -113,6 +117,7 @@ static inline int atomic_cmpxchg(atomic_t *ptr, int old, int new)
int oldval;
smp_mb();
+ prefetchw(&ptr->counter);
asm volatile("// atomic_cmpxchg\n"
"1: ldxr %w1, %2\n"
@@ -170,6 +175,7 @@ static inline void atomic64_add(u64 i, atomic64_t *v)
long result;
unsigned long tmp;
+ prefetchw(&v->counter);
asm volatile("// atomic64_add\n"
"1: ldxr %0, %2\n"
" add %0, %0, %3\n"
@@ -184,6 +190,7 @@ static inline long atomic64_add_return(long i, atomic64_t *v)
long result;
unsigned long tmp;
+ prefetchw(&v->counter);
asm volatile("// atomic64_add_return\n"
"1: ldxr %0, %2\n"
" add %0, %0, %3\n"
@@ -202,6 +209,7 @@ static inline void atomic64_sub(u64 i, atomic64_t *v)
long result;
unsigned long tmp;
+ prefetchw(&v->counter);
asm volatile("// atomic64_sub\n"
"1: ldxr %0, %2\n"
" sub %0, %0, %3\n"
@@ -216,6 +224,7 @@ static inline long atomic64_sub_return(long i, atomic64_t *v)
long result;
unsigned long tmp;
+ prefetchw(&v->counter);
asm volatile("// atomic64_sub_return\n"
"1: ldxr %0, %2\n"
" sub %0, %0, %3\n"
@@ -235,6 +244,7 @@ static inline long atomic64_cmpxchg(atomic64_t *ptr, long old, long new)
unsigned long res;
smp_mb();
+ prefetchw(&ptr->counter);
asm volatile("// atomic64_cmpxchg\n"
"1: ldxr %1, %2\n"
@@ -258,6 +268,7 @@ static inline long atomic64_dec_if_positive(atomic64_t *v)
long result;
unsigned long tmp;
+ prefetchw(&v->counter);
asm volatile("// atomic64_dec_if_positive\n"
"1: ldxr %0, %2\n"
" subs %0, %0, #1\n"
--
1.9.1
* Re: [RFC PATCH 1/1] prefetch result in 64 bit atomic ops
@ 2014-04-15 7:52 ` Will Deacon
From: Will Deacon @ 2014-04-15 7:52 UTC
To: Pranith Kumar; +Cc: Catalin Marinas, Joe Perches, LKML
Hi Pranith,
On Mon, Apr 14, 2014 at 07:45:22PM +0100, Pranith Kumar wrote:
> Please disregard previous patches. This is the correct one.
>
> prefetch destination as is being done in ARM32 atomic ops
Whilst this looks like a potentially sensible optimisation (based on the
results I saw on AArch32), I don't think we can take this patch without some
benchmarks on real silicon. The interaction between the half-barrier atomic
instructions and prfm isn't immediately obvious to me, and we should also
consider looking at streaming preload vs the l1keep option.
Did you write this patch as a basic port of the arch/arm/ patches I wrote,
or was it based on performance figures from real hardware?
Will
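For readers unfamiliar with the two preload policies mentioned above, here is a minimal userspace sketch, not kernel code and not part of the patch. It assumes GCC or Clang, where `__builtin_prefetch(addr, 1, locality)` requests a prefetch for write. On AArch64 GCC, locality 3 is expected to emit `prfm pstl1keep` and locality 0 `prfm pstl1strm`, but that mapping is a compiler detail and should be confirmed in the generated assembly. The helper names are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical helper: prefetch for write, keep the line in L1
 * (assumed to map to "prfm pstl1keep" on AArch64). */
static inline void prefetch_write_keep(const void *p)
{
	__builtin_prefetch(p, 1, 3);
}

/* Hypothetical helper: prefetch for write, streaming hint
 * (assumed to map to "prfm pstl1strm" on AArch64). */
static inline void prefetch_write_stream(const void *p)
{
	__builtin_prefetch(p, 1, 0);
}

/* Add i to *v after a "keep" prefetch and return the new value. */
static inline int add_return_keep(atomic_int *v, int i)
{
	prefetch_write_keep(v);
	return atomic_fetch_add_explicit(v, i, memory_order_relaxed) + i;
}

/* Same operation, but with the streaming prefetch hint. */
static inline int add_return_stream(atomic_int *v, int i)
{
	prefetch_write_stream(v);
	return atomic_fetch_add_explicit(v, i, memory_order_relaxed) + i;
}
```

Both variants compute the same result; the prefetch is only a hint, so any difference would show up in timing on real hardware, not in behaviour.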
* Re: [RFC PATCH 1/1] prefetch result in 64 bit atomic ops
@ 2014-04-15 8:33 ` Joe Perches
From: Joe Perches @ 2014-04-15 8:33 UTC
To: Will Deacon; +Cc: Pranith Kumar, Catalin Marinas, LKML
On Tue, 2014-04-15 at 08:52 +0100, Will Deacon wrote:
> Did you write this patch as a basic port of the arch/arm/ patches I wrote,
> or was it based on performance figures from real hardware?
From an earlier thread: https://lkml.org/lkml/2014/4/14/547
On Mon, 2014-04-14 at 14:11 -0400, Pranith Kumar wrote:
> On Mon, Apr 14, 2014 at 2:04 PM, Joe Perches <joe@perches.com> wrote:
> > Are there some benchmark comparison results
> > you neglected to attach?
>
> I was trying to get these ops to be similar to the 32 bit atomic ops which
> all have prefetch. I did not see any reason why we shouldn't do the same
> here. But no, no hard numbers yet. Send me a dev board please :)
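As a rough illustration of the kind of numbers being asked for, the following is a userspace timing sketch, not a kernel benchmark. It assumes a C11 compiler with `__builtin_prefetch` (GCC or Clang). The function name and parameters are hypothetical, and a single-threaded loop like this does not exercise cross-CPU contention, which is where a write prefetch would be expected to matter.

```c
#include <stdatomic.h>
#include <time.h>

/* Hypothetical harness: perform iters relaxed atomic increments on *v,
 * optionally issuing a write prefetch before each one, and return the
 * CPU time consumed in seconds. */
static double time_atomic_adds(atomic_long *v, long iters, int use_prefetch)
{
	clock_t start = clock();

	for (long n = 0; n < iters; n++) {
		if (use_prefetch)
			__builtin_prefetch((const void *)v, 1, 3);
		atomic_fetch_add_explicit(v, 1, memory_order_relaxed);
	}

	return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling it once with `use_prefetch` set to 0 and once set to 1 gives two timings to compare. A meaningful evaluation would still need several threads hammering the same counter on real arm64 silicon, as requested earlier in the thread.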