* [PATCH v2 01/17] xen: x86 & generic: change to __builtin_prefetch()
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-04-02 16:01 ` Ian Campbell
2014-03-26 13:38 ` [PATCH v2 02/17] xen: arm32: resync bitops with Linux v3.14-rc7 Ian Campbell
` (17 subsequent siblings)
18 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel
Cc: Keir Fraser, julien.grall, tim, Ian Campbell, stefano.stabellini
Quoting Andi Kleen in Linux b483570a13be from 2007:
gcc 3.2+ supports __builtin_prefetch, so it's possible to use it on all
architectures. Change the generic fallback in linux/prefetch.h to use it
instead of noping it out. gcc should do the right thing when the
architecture doesn't support prefetching
Undefine the x86-64 inline assembler version and use the fallback.
ARM wants to use the builtins.
Fix a pair of spelling errors, one of which was from Lucas De Marchi in the
Linux tree.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
---
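For readers unfamiliar with the builtin, here is a minimal sketch of how the generic fallback behaves once it maps onto __builtin_prefetch(). The sketch_prefetch/sketch_prefetchw macros, struct node, sum_list() and fill() are hypothetical names made up for illustration and are not part of this patch. The second argument of 1 marks the access as a write, which is what prefetchw() relies on.

```c
#include <stddef.h>

/* Hypothetical stand-ins with the same shape as the generic fallback. */
#define sketch_prefetch(x)  __builtin_prefetch(x)
#define sketch_prefetchw(x) __builtin_prefetch((x), 1)

struct node {
    struct node *next;
    long value;
};

/* Sum a list, hinting the next node into cache while using the current one. */
static long sum_list(const struct node *head)
{
    long total = 0;
    const struct node *n;

    for (n = head; n != NULL; n = n->next) {
        /* Only a hint: it does not fault, even when n->next is NULL. */
        sketch_prefetch(n->next);
        total += n->value;
    }
    return total;
}

/* Fill an array, hinting each element for write before storing to it. */
static void fill(long *dst, size_t count, long v)
{
    size_t i;

    for (i = 0; i < count; i++) {
        sketch_prefetchw(&dst[i]);
        dst[i] = v;
    }
}
```

Because the builtin is only a hint, the compiler is free to emit nothing on targets without a prefetch instruction, which is the "gcc should do the right thing" behaviour the quoted commit message depends on.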
xen/include/xen/prefetch.h | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)
diff --git a/xen/include/xen/prefetch.h b/xen/include/xen/prefetch.h
index 8d7d3ff..ba73998 100644
--- a/xen/include/xen/prefetch.h
+++ b/xen/include/xen/prefetch.h
@@ -28,24 +28,17 @@
prefetchw(x) - prefetches the cacheline at "x" for write
spin_lock_prefetch(x) - prefectches the spinlock *x for taking
- there is also PREFETCH_STRIDE which is the architecure-prefered
+ there is also PREFETCH_STRIDE which is the architecture-preferred
"lookahead" size for prefetching streamed operations.
*/
-/*
- * These cannot be do{}while(0) macros. See the mental gymnastics in
- * the loop macro.
- */
-
#ifndef ARCH_HAS_PREFETCH
-#define ARCH_HAS_PREFETCH
-static inline void prefetch(const void *x) {;}
+#define prefetch(x) __builtin_prefetch(x)
#endif
#ifndef ARCH_HAS_PREFETCHW
-#define ARCH_HAS_PREFETCHW
-static inline void prefetchw(const void *x) {;}
+#define prefetchw(x) __builtin_prefetch(x,1)
#endif
#ifndef ARCH_HAS_SPINLOCK_PREFETCH
--
1.7.10.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v2 01/17] xen: x86 & generic: change to __builtin_prefetch()
2014-03-26 13:38 ` [PATCH v2 01/17] xen: x86 & generic: change to __builtin_prefetch() Ian Campbell
@ 2014-04-02 16:01 ` Ian Campbell
0 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-04-02 16:01 UTC (permalink / raw)
To: xen-devel, Keir Fraser; +Cc: julien.grall, tim, stefano.stabellini
Keir,
On Wed, 2014-03-26 at 13:38 +0000, Ian Campbell wrote:
> Quoting Andi Kleen in Linux b483570a13be from 2007:
> gcc 3.2+ supports __builtin_prefetch, so it's possible to use it on all
> architectures. Change the generic fallback in linux/prefetch.h to use it
> instead of noping it out. gcc should do the right thing when the
> architecture doesn't support prefetching
>
> Undefine the x86-64 inline assembler version and use the fallback.
>
> ARM wants to use the builtins.
>
> Fix a pair of spelling errors, one of which was from Lucas De Marchi in the
> Linux tree.
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> Cc: Keir Fraser <keir@xen.org>
Are you OK with this one?
Thanks,
Ian.
> ---
> xen/include/xen/prefetch.h | 13 +++----------
> 1 file changed, 3 insertions(+), 10 deletions(-)
>
> diff --git a/xen/include/xen/prefetch.h b/xen/include/xen/prefetch.h
> index 8d7d3ff..ba73998 100644
> --- a/xen/include/xen/prefetch.h
> +++ b/xen/include/xen/prefetch.h
> @@ -28,24 +28,17 @@
> prefetchw(x) - prefetches the cacheline at "x" for write
> spin_lock_prefetch(x) - prefectches the spinlock *x for taking
>
> - there is also PREFETCH_STRIDE which is the architecure-prefered
> + there is also PREFETCH_STRIDE which is the architecture-preferred
> "lookahead" size for prefetching streamed operations.
>
> */
>
> -/*
> - * These cannot be do{}while(0) macros. See the mental gymnastics in
> - * the loop macro.
> - */
> -
> #ifndef ARCH_HAS_PREFETCH
> -#define ARCH_HAS_PREFETCH
> -static inline void prefetch(const void *x) {;}
> +#define prefetch(x) __builtin_prefetch(x)
> #endif
>
> #ifndef ARCH_HAS_PREFETCHW
> -#define ARCH_HAS_PREFETCHW
> -static inline void prefetchw(const void *x) {;}
> +#define prefetchw(x) __builtin_prefetch(x,1)
> #endif
>
> #ifndef ARCH_HAS_SPINLOCK_PREFETCH
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v2 02/17] xen: arm32: resync bitops with Linux v3.14-rc7
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
2014-03-26 13:38 ` [PATCH v2 01/17] xen: x86 & generic: change to __builtin_prefetch() Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 13:38 ` [PATCH v2 03/17] xen: arm32: ensure cmpxchg has full barrier semantics Ian Campbell
` (16 subsequent siblings)
18 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
This pulls in the following Linux commits:
commit c36ef4b1762302a493c6cb754073bded084700e2
Author: Will Deacon <will.deacon@arm.com>
Date: Wed Nov 23 11:28:25 2011 +0100
ARM: 7171/1: unwind: add unwind directives to bitops assembly macros
The bitops functions (e.g. _test_and_set_bit) on ARM do not have unwind
annotations and therefore the kernel cannot backtrace out of them on a
fatal error (for example, NULL pointer dereference).
This patch annotates the bitops assembly macros with UNWIND annotations
so that we can produce a meaningful backtrace on error. Callers of the
macros are modified to pass their function name as a macro parameter,
enforcing that the macros are used as standalone function implementations.
Acked-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
commit d779c07dd72098a7416d907494f958213b7726f3
Author: Will Deacon <will.deacon@arm.com>
Date: Thu Jun 27 12:01:51 2013 +0100
ARM: bitops: prefetch the destination word for write prior to strex
The cost of changing a cacheline from shared to exclusive state can be
significant, especially when this is triggered by an exclusive store,
since it may result in having to retry the transaction.
This patch prefixes our atomic bitops implementation with prefetchw,
to try and grab the line in exclusive state from the start. The testop
macro is left alone, since the barrier semantics limit the usefulness
of prefetching data.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
commit b7ec699405f55667caeb46d96229d75bf33a83ad
Author: Will Deacon <will.deacon@arm.com>
Date: Tue Nov 19 15:46:11 2013 +0100
ARM: 7893/1: bitops: only emit .arch_extension mp if CONFIG_SMP
Uwe reported a build failure when targeting a NOMMU platform with my
recent prefetch changes:
arch/arm/lib/changebit.S: Assembler messages:
arch/arm/lib/changebit.S:15: Error: architectural extension `mp' is
not allowed for the current base architecture
This is due to use of the .arch_extension mp directive immediately prior
to an ALT_SMP(...) instruction. Whilst the ALT_SMP macro will expand to
nothing if !CONFIG_SMP, gas will still choke on the directive.
This patch fixes the issue by only emitting the sequence (including the
directive) if CONFIG_SMP=y.
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
---
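As a rough C-level picture of what the prefetch in the bitop macro buys, the sketch below hints the destination word for write and then runs a retry loop, which stands in for the ldrex/strex sequence. It is an illustration under stated assumptions and not the assembly in this patch: sketch_set_bit(), sketch_change_bit() and BITS_PER_WORD are hypothetical names, and the GCC __atomic builtins replace the exclusive load/store instructions. Only the non-returning operations are modelled, since the quoted commit leaves testop without a prefetch.

```c
#include <limits.h>

#define BITS_PER_WORD (sizeof(unsigned int) * CHAR_BIT)

/* Hypothetical helper: atomically OR one bit into a bitmap, no barriers. */
static void sketch_set_bit(unsigned int nr, unsigned int *base)
{
    unsigned int *word = base + nr / BITS_PER_WORD;
    unsigned int mask = 1u << (nr % BITS_PER_WORD);
    unsigned int old;

    /* Ask for the line in a writable state before the retry loop starts. */
    __builtin_prefetch(word, 1);

    old = __atomic_load_n(word, __ATOMIC_RELAXED);
    /* A weak compare-exchange may fail spuriously, much like strex. */
    while (!__atomic_compare_exchange_n(word, &old, old | mask, 1,
                                        __ATOMIC_RELAXED, __ATOMIC_RELAXED))
        ;
}

/* Hypothetical helper: atomically toggle one bit, same structure as above. */
static void sketch_change_bit(unsigned int nr, unsigned int *base)
{
    unsigned int *word = base + nr / BITS_PER_WORD;
    unsigned int mask = 1u << (nr % BITS_PER_WORD);
    unsigned int old;

    __builtin_prefetch(word, 1);

    old = __atomic_load_n(word, __ATOMIC_RELAXED);
    while (!__atomic_compare_exchange_n(word, &old, old ^ mask, 1,
                                        __ATOMIC_RELAXED, __ATOMIC_RELAXED))
        ;
}
```

On failure the builtin refreshes old with the value currently in memory, so the new value is recomputed on every pass of the loop.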
xen/arch/arm/arm32/lib/bitops.h | 17 +++++++++++++++--
xen/arch/arm/arm32/lib/changebit.S | 4 +---
xen/arch/arm/arm32/lib/clearbit.S | 4 +---
xen/arch/arm/arm32/lib/setbit.S | 4 +---
xen/arch/arm/arm32/lib/testchangebit.S | 4 +---
xen/arch/arm/arm32/lib/testclearbit.S | 4 +---
xen/arch/arm/arm32/lib/testsetbit.S | 4 +---
7 files changed, 21 insertions(+), 20 deletions(-)
diff --git a/xen/arch/arm/arm32/lib/bitops.h b/xen/arch/arm/arm32/lib/bitops.h
index 689f2e8..25784c3 100644
--- a/xen/arch/arm/arm32/lib/bitops.h
+++ b/xen/arch/arm/arm32/lib/bitops.h
@@ -1,13 +1,20 @@
#include <xen/config.h>
#if __LINUX_ARM_ARCH__ >= 6
- .macro bitop, instr
+ .macro bitop, name, instr
+ENTRY( \name )
+UNWIND( .fnstart )
ands ip, r1, #3
strneb r1, [ip] @ assert word-aligned
mov r2, #1
and r3, r0, #31 @ Get bit offset
mov r0, r0, lsr #5
add r1, r1, r0, lsl #2 @ Get word offset
+#if __LINUX_ARM_ARCH__ >= 7 && defined(CONFIG_SMP)
+ .arch_extension mp
+ ALT_SMP(W(pldw) [r1])
+ ALT_UP(W(nop))
+#endif
mov r3, r2, lsl r3
1: ldrex r2, [r1]
\instr r2, r2, r3
@@ -15,9 +22,13 @@
cmp r0, #0
bne 1b
bx lr
+UNWIND( .fnend )
+ENDPROC(\name )
.endm
- .macro testop, instr, store
+ .macro testop, name, instr, store
+ENTRY( \name )
+UNWIND( .fnstart )
ands ip, r1, #3
strneb r1, [ip] @ assert word-aligned
mov r2, #1
@@ -36,6 +47,8 @@
cmp r0, #0
movne r0, #1
2: bx lr
+UNWIND( .fnend )
+ENDPROC(\name )
.endm
#else
.macro bitop, name, instr
diff --git a/xen/arch/arm/arm32/lib/changebit.S b/xen/arch/arm/arm32/lib/changebit.S
index 62954bc..11f41d2 100644
--- a/xen/arch/arm/arm32/lib/changebit.S
+++ b/xen/arch/arm/arm32/lib/changebit.S
@@ -13,6 +13,4 @@
#include "bitops.h"
.text
-ENTRY(_change_bit)
- bitop eor
-ENDPROC(_change_bit)
+bitop _change_bit, eor
diff --git a/xen/arch/arm/arm32/lib/clearbit.S b/xen/arch/arm/arm32/lib/clearbit.S
index 42ce416..1b6a569 100644
--- a/xen/arch/arm/arm32/lib/clearbit.S
+++ b/xen/arch/arm/arm32/lib/clearbit.S
@@ -14,6 +14,4 @@
#include "bitops.h"
.text
-ENTRY(_clear_bit)
- bitop bic
-ENDPROC(_clear_bit)
+bitop _clear_bit, bic
diff --git a/xen/arch/arm/arm32/lib/setbit.S b/xen/arch/arm/arm32/lib/setbit.S
index c828851..1f4ef56 100644
--- a/xen/arch/arm/arm32/lib/setbit.S
+++ b/xen/arch/arm/arm32/lib/setbit.S
@@ -13,6 +13,4 @@
#include "bitops.h"
.text
-ENTRY(_set_bit)
- bitop orr
-ENDPROC(_set_bit)
+bitop _set_bit, orr
diff --git a/xen/arch/arm/arm32/lib/testchangebit.S b/xen/arch/arm/arm32/lib/testchangebit.S
index a7f527c..7f4635c 100644
--- a/xen/arch/arm/arm32/lib/testchangebit.S
+++ b/xen/arch/arm/arm32/lib/testchangebit.S
@@ -13,6 +13,4 @@
#include "bitops.h"
.text
-ENTRY(_test_and_change_bit)
- testop eor, str
-ENDPROC(_test_and_change_bit)
+testop _test_and_change_bit, eor, str
diff --git a/xen/arch/arm/arm32/lib/testclearbit.S b/xen/arch/arm/arm32/lib/testclearbit.S
index 8f39c72..4d4152f 100644
--- a/xen/arch/arm/arm32/lib/testclearbit.S
+++ b/xen/arch/arm/arm32/lib/testclearbit.S
@@ -13,6 +13,4 @@
#include "bitops.h"
.text
-ENTRY(_test_and_clear_bit)
- testop bicne, strne
-ENDPROC(_test_and_clear_bit)
+testop _test_and_clear_bit, bicne, strne
diff --git a/xen/arch/arm/arm32/lib/testsetbit.S b/xen/arch/arm/arm32/lib/testsetbit.S
index 1b8d273..54f48f9 100644
--- a/xen/arch/arm/arm32/lib/testsetbit.S
+++ b/xen/arch/arm/arm32/lib/testsetbit.S
@@ -13,6 +13,4 @@
#include "bitops.h"
.text
-ENTRY(_test_and_set_bit)
- testop orreq, streq
-ENDPROC(_test_and_set_bit)
+testop _test_and_set_bit, orreq, streq
--
1.7.10.4
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v2 03/17] xen: arm32: ensure cmpxchg has full barrier semantics
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
2014-03-26 13:38 ` [PATCH v2 01/17] xen: x86 & generic: change to __builtin_prefetch() Ian Campbell
2014-03-26 13:38 ` [PATCH v2 02/17] xen: arm32: resync bitops with Linux v3.14-rc7 Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 13:38 ` [PATCH v2 04/17] xen: arm32: replace hard tabs in atomics.h Ian Campbell
` (15 subsequent siblings)
18 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
Unrelated reads/writes should not pass the xchg.
Provide cmpxchg_local for parity with arm64, although it appears to be unused.
It also helps make the reason for the separation of __cmpxchg_mb more
apparent.
With this our cmpxchg is in sync with Linux v3.14-rc7.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
---
We got our cmpxchg implementation from Linux which AFAICS has always had these
additional barriers. I don't recall us having decided that Xen barriers should
not have this property as well, and if we did we were remiss in not adding a
comment etc... If my memory is faulty then I am happy to replace this patch
with one which adds a comment instead.
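For illustration only, the split between a barrier-less compare-and-exchange and a fully ordered wrapper can be sketched in portable C with the GCC __atomic builtins. The names sketch_cmpxchg_relaxed() and sketch_cmpxchg_mb() are hypothetical, and the sequentially consistent fences are an assumption standing in for smp_mb(); this is not the code added by the patch.

```c
/* Hypothetical barrier-less variant, analogous in role to cmpxchg_local. */
static unsigned long sketch_cmpxchg_relaxed(unsigned long *ptr,
                                            unsigned long old,
                                            unsigned long new_val)
{
    /*
     * On success 'old' is left untouched and equals the previous value;
     * on failure the builtin stores the value actually seen into 'old'.
     * Either way 'old' ends up holding what was in memory beforehand.
     */
    __atomic_compare_exchange_n(ptr, &old, new_val, 0,
                                __ATOMIC_RELAXED, __ATOMIC_RELAXED);
    return old;
}

/* Hypothetical fully ordered variant, analogous in role to __cmpxchg_mb. */
static unsigned long sketch_cmpxchg_mb(unsigned long *ptr,
                                       unsigned long old,
                                       unsigned long new_val)
{
    unsigned long ret;

    __atomic_thread_fence(__ATOMIC_SEQ_CST);   /* stands in for smp_mb() */
    ret = sketch_cmpxchg_relaxed(ptr, old, new_val);
    __atomic_thread_fence(__ATOMIC_SEQ_CST);   /* stands in for smp_mb() */

    return ret;
}
```

A caller detects success by comparing the return value with the expected old value, which is the same convention cmpxchg() uses.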
---
xen/include/asm-arm/arm32/system.h | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/xen/include/asm-arm/arm32/system.h b/xen/include/asm-arm/arm32/system.h
index 9f233fe..dfaa3b6 100644
--- a/xen/include/asm-arm/arm32/system.h
+++ b/xen/include/asm-arm/arm32/system.h
@@ -113,9 +113,29 @@ static always_inline unsigned long __cmpxchg(
return oldval;
}
-#define cmpxchg(ptr,o,n) \
- ((__typeof__(*(ptr)))__cmpxchg((ptr),(unsigned long)(o), \
- (unsigned long)(n),sizeof(*(ptr))))
+static inline unsigned long __cmpxchg_mb(volatile void *ptr, unsigned long old,
+ unsigned long new, int size)
+{
+ unsigned long ret;
+
+ smp_mb();
+ ret = __cmpxchg(ptr, old, new, size);
+ smp_mb();
+
+ return ret;
+}
+
+#define cmpxchg(ptr,o,n) \
+ ((__typeof__(*(ptr)))__cmpxchg_mb((ptr), \
+ (unsigned long)(o), \
+ (unsigned long)(n), \
+ sizeof(*(ptr))))
+
+#define cmpxchg_local(ptr,o,n) \
+ ((__typeof__(*(ptr)))__cmpxchg((ptr), \
+ (unsigned long)(o), \
+ (unsigned long)(n), \
+ sizeof(*(ptr))))
#define local_irq_disable() asm volatile ( "cpsid i @ local_irq_disable\n" : : : "cc" )
#define local_irq_enable() asm volatile ( "cpsie i @ local_irq_enable\n" : : : "cc" )
--
1.7.10.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v2 04/17] xen: arm32: replace hard tabs in atomics.h
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (2 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 03/17] xen: arm32: ensure cmpxchg has full barrier semantics Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 13:38 ` [PATCH v2 05/17] xen: arm32: resync atomics with (almost) v3.14-rc7 Ian Campbell
` (14 subsequent siblings)
18 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
This file is from Linux and the intention was to keep the formatting the same
to make resyncing easier. Put the hard tabs back and adjust the emacs magic to
reflect the desired use of whitespace.
Adjust the 64-bit emacs magic too.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
---
xen/include/asm-arm/arm32/atomic.h | 166 ++++++++++++++++++------------------
xen/include/asm-arm/arm64/atomic.h | 4 +-
2 files changed, 85 insertions(+), 85 deletions(-)
diff --git a/xen/include/asm-arm/arm32/atomic.h b/xen/include/asm-arm/arm32/atomic.h
index 523c745..3f024d4 100644
--- a/xen/include/asm-arm/arm32/atomic.h
+++ b/xen/include/asm-arm/arm32/atomic.h
@@ -18,122 +18,122 @@
*/
static inline void atomic_add(int i, atomic_t *v)
{
- unsigned long tmp;
- int result;
-
- __asm__ __volatile__("@ atomic_add\n"
-"1: ldrex %0, [%3]\n"
-" add %0, %0, %4\n"
-" strex %1, %0, [%3]\n"
-" teq %1, #0\n"
-" bne 1b"
- : "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
- : "r" (&v->counter), "Ir" (i)
- : "cc");
+ unsigned long tmp;
+ int result;
+
+ __asm__ __volatile__("@ atomic_add\n"
+"1: ldrex %0, [%3]\n"
+" add %0, %0, %4\n"
+" strex %1, %0, [%3]\n"
+" teq %1, #0\n"
+" bne 1b"
+ : "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
+ : "r" (&v->counter), "Ir" (i)
+ : "cc");
}
static inline int atomic_add_return(int i, atomic_t *v)
{
- unsigned long tmp;
- int result;
+ unsigned long tmp;
+ int result;
- smp_mb();
+ smp_mb();
- __asm__ __volatile__("@ atomic_add_return\n"
-"1: ldrex %0, [%3]\n"
-" add %0, %0, %4\n"
-" strex %1, %0, [%3]\n"
-" teq %1, #0\n"
-" bne 1b"
- : "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
- : "r" (&v->counter), "Ir" (i)
- : "cc");
+ __asm__ __volatile__("@ atomic_add_return\n"
+"1: ldrex %0, [%3]\n"
+" add %0, %0, %4\n"
+" strex %1, %0, [%3]\n"
+" teq %1, #0\n"
+" bne 1b"
+ : "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
+ : "r" (&v->counter), "Ir" (i)
+ : "cc");
- smp_mb();
+ smp_mb();
- return result;
+ return result;
}
static inline void atomic_sub(int i, atomic_t *v)
{
- unsigned long tmp;
- int result;
-
- __asm__ __volatile__("@ atomic_sub\n"
-"1: ldrex %0, [%3]\n"
-" sub %0, %0, %4\n"
-" strex %1, %0, [%3]\n"
-" teq %1, #0\n"
-" bne 1b"
- : "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
- : "r" (&v->counter), "Ir" (i)
- : "cc");
+ unsigned long tmp;
+ int result;
+
+ __asm__ __volatile__("@ atomic_sub\n"
+"1: ldrex %0, [%3]\n"
+" sub %0, %0, %4\n"
+" strex %1, %0, [%3]\n"
+" teq %1, #0\n"
+" bne 1b"
+ : "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
+ : "r" (&v->counter), "Ir" (i)
+ : "cc");
}
static inline int atomic_sub_return(int i, atomic_t *v)
{
- unsigned long tmp;
- int result;
+ unsigned long tmp;
+ int result;
- smp_mb();
+ smp_mb();
- __asm__ __volatile__("@ atomic_sub_return\n"
-"1: ldrex %0, [%3]\n"
-" sub %0, %0, %4\n"
-" strex %1, %0, [%3]\n"
-" teq %1, #0\n"
-" bne 1b"
- : "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
- : "r" (&v->counter), "Ir" (i)
- : "cc");
+ __asm__ __volatile__("@ atomic_sub_return\n"
+"1: ldrex %0, [%3]\n"
+" sub %0, %0, %4\n"
+" strex %1, %0, [%3]\n"
+" teq %1, #0\n"
+" bne 1b"
+ : "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
+ : "r" (&v->counter), "Ir" (i)
+ : "cc");
- smp_mb();
+ smp_mb();
- return result;
+ return result;
}
static inline int atomic_cmpxchg(atomic_t *ptr, int old, int new)
{
- unsigned long oldval, res;
+ unsigned long oldval, res;
- smp_mb();
+ smp_mb();
- do {
- __asm__ __volatile__("@ atomic_cmpxchg\n"
- "ldrex %1, [%3]\n"
- "mov %0, #0\n"
- "teq %1, %4\n"
- "strexeq %0, %5, [%3]\n"
- : "=&r" (res), "=&r" (oldval), "+Qo" (ptr->counter)
- : "r" (&ptr->counter), "Ir" (old), "r" (new)
- : "cc");
- } while (res);
+ do {
+ __asm__ __volatile__("@ atomic_cmpxchg\n"
+ "ldrex %1, [%3]\n"
+ "mov %0, #0\n"
+ "teq %1, %4\n"
+ "strexeq %0, %5, [%3]\n"
+ : "=&r" (res), "=&r" (oldval), "+Qo" (ptr->counter)
+ : "r" (&ptr->counter), "Ir" (old), "r" (new)
+ : "cc");
+ } while (res);
- smp_mb();
+ smp_mb();
- return oldval;
+ return oldval;
}
static inline void atomic_clear_mask(unsigned long mask, unsigned long *addr)
{
- unsigned long tmp, tmp2;
-
- __asm__ __volatile__("@ atomic_clear_mask\n"
-"1: ldrex %0, [%3]\n"
-" bic %0, %0, %4\n"
-" strex %1, %0, [%3]\n"
-" teq %1, #0\n"
-" bne 1b"
- : "=&r" (tmp), "=&r" (tmp2), "+Qo" (*addr)
- : "r" (addr), "Ir" (mask)
- : "cc");
+ unsigned long tmp, tmp2;
+
+ __asm__ __volatile__("@ atomic_clear_mask\n"
+"1: ldrex %0, [%3]\n"
+" bic %0, %0, %4\n"
+" strex %1, %0, [%3]\n"
+" teq %1, #0\n"
+" bne 1b"
+ : "=&r" (tmp), "=&r" (tmp2), "+Qo" (*addr)
+ : "r" (addr), "Ir" (mask)
+ : "cc");
}
-#define atomic_inc(v) atomic_add(1, v)
-#define atomic_dec(v) atomic_sub(1, v)
+#define atomic_inc(v) atomic_add(1, v)
+#define atomic_dec(v) atomic_sub(1, v)
-#define atomic_inc_and_test(v) (atomic_add_return(1, v) == 0)
-#define atomic_dec_and_test(v) (atomic_sub_return(1, v) == 0)
+#define atomic_inc_and_test(v) (atomic_add_return(1, v) == 0)
+#define atomic_dec_and_test(v) (atomic_sub_return(1, v) == 0)
#define atomic_inc_return(v) (atomic_add_return(1, v))
#define atomic_dec_return(v) (atomic_sub_return(1, v))
#define atomic_sub_and_test(i, v) (atomic_sub_return(i, v) == 0)
@@ -145,7 +145,7 @@ static inline void atomic_clear_mask(unsigned long mask, unsigned long *addr)
* Local variables:
* mode: C
* c-file-style: "BSD"
- * c-basic-offset: 4
- * indent-tabs-mode: nil
+ * c-basic-offset: 8
+ * indent-tabs-mode: t
* End:
*/
diff --git a/xen/include/asm-arm/arm64/atomic.h b/xen/include/asm-arm/arm64/atomic.h
index a279755..b04e6d5 100644
--- a/xen/include/asm-arm/arm64/atomic.h
+++ b/xen/include/asm-arm/arm64/atomic.h
@@ -157,7 +157,7 @@ static inline int __atomic_add_unless(atomic_t *v, int a, int u)
* Local variables:
* mode: C
* c-file-style: "BSD"
- * c-basic-offset: 4
- * indent-tabs-mode: nil
+ * c-basic-offset: 8
+ * indent-tabs-mode: t
* End:
*/
--
1.7.10.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v2 05/17] xen: arm32: resync atomics with (almost) v3.14-rc7
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (3 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 04/17] xen: arm32: replace hard tabs in atomics.h Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 14:10 ` Julien Grall
2014-03-26 13:38 ` [PATCH v2 06/17] xen: arm32: resync mem* with Linux v3.14-rc7 Ian Campbell
` (13 subsequent siblings)
18 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
Almost because I am omitting aed3a4e "ARM: 7868/1: arm/arm64: remove
atomic_clear_mask() ..." which I will apply to both arm32 and arm64
simultaneously in a later patch.
This pulls in the following Linux patches:
commit f38d999c4d16fc0fce4270374f15fbb2d8713c09
Author: Will Deacon <will.deacon@arm.com>
Date: Thu Jul 4 11:43:18 2013 +0100
ARM: atomics: prefetch the destination word for write prior to strex
The cost of changing a cacheline from shared to exclusive state can be
significant, especially when this is triggered by an exclusive store,
since it may result in having to retry the transaction.
This patch prefixes our atomic access implementations with pldw
instructions (on CPUs which support them) to try and grab the line in
exclusive state from the start. Only the barrier-less functions are
updated, since memory barriers can limit the usefulness of prefetching
data.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
commit 4dcc1cf7316a26e112f5c9fcca531ff98ef44700
Author: Chen Gang <gang.chen@asianux.com>
Date: Sat Oct 26 15:07:25 2013 +0100
ARM: 7867/1: include: asm: use 'int' instead of 'unsigned long' for 'oldval
For atomic_cmpxchg(), the type of 'oldval' need be 'int' to match the
type of "*ptr" (used by 'ldrex' instruction) and 'old' (used by 'teq'
instruction).
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
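The two changes pulled in here can be pictured with a small C sketch. It is an approximation under stated assumptions: sketch_atomic_t, sketch_atomic_add() and sketch_atomic_cmpxchg() are hypothetical names, and the GCC __atomic builtins replace the ldrex/strex loops. The first function shows the write prefetch placed ahead of a barrier-less update; the second keeps the observed value in an int so that it has the same type as the counter and as the old argument.

```c
/* Hypothetical counter type with the same layout idea as atomic_t. */
typedef struct {
    int counter;
} sketch_atomic_t;

/* Barrier-less add: hint the line for write, then update it atomically. */
static void sketch_atomic_add(int i, sketch_atomic_t *v)
{
    __builtin_prefetch(&v->counter, 1);
    __atomic_fetch_add(&v->counter, i, __ATOMIC_RELAXED);
}

/* Fully ordered compare-and-exchange returning the value that was seen. */
static int sketch_atomic_cmpxchg(sketch_atomic_t *ptr, int old, int new_val)
{
    int oldval = old;   /* int, matching ptr->counter and 'old' */

    /* On failure the builtin writes the value actually seen into oldval. */
    __atomic_compare_exchange_n(&ptr->counter, &oldval, new_val, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return oldval;
}
```

No prefetch is added to the ordered function, in line with the quoted rationale that barriers limit the usefulness of prefetching.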
xen/include/asm-arm/arm32/atomic.h | 6 +++++-
xen/include/asm-arm/atomic.h | 1 +
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/xen/include/asm-arm/arm32/atomic.h b/xen/include/asm-arm/arm32/atomic.h
index 3f024d4..d309f66 100644
--- a/xen/include/asm-arm/arm32/atomic.h
+++ b/xen/include/asm-arm/arm32/atomic.h
@@ -21,6 +21,7 @@ static inline void atomic_add(int i, atomic_t *v)
unsigned long tmp;
int result;
+ prefetchw(&v->counter);
__asm__ __volatile__("@ atomic_add\n"
"1: ldrex %0, [%3]\n"
" add %0, %0, %4\n"
@@ -59,6 +60,7 @@ static inline void atomic_sub(int i, atomic_t *v)
unsigned long tmp;
int result;
+ prefetchw(&v->counter);
__asm__ __volatile__("@ atomic_sub\n"
"1: ldrex %0, [%3]\n"
" sub %0, %0, %4\n"
@@ -94,7 +96,8 @@ static inline int atomic_sub_return(int i, atomic_t *v)
static inline int atomic_cmpxchg(atomic_t *ptr, int old, int new)
{
- unsigned long oldval, res;
+ int oldval;
+ unsigned long res;
smp_mb();
@@ -118,6 +121,7 @@ static inline void atomic_clear_mask(unsigned long mask, unsigned long *addr)
{
unsigned long tmp, tmp2;
+ prefetchw(addr);
__asm__ __volatile__("@ atomic_clear_mask\n"
"1: ldrex %0, [%3]\n"
" bic %0, %0, %4\n"
diff --git a/xen/include/asm-arm/atomic.h b/xen/include/asm-arm/atomic.h
index 69c8f3f..2c92de9 100644
--- a/xen/include/asm-arm/atomic.h
+++ b/xen/include/asm-arm/atomic.h
@@ -2,6 +2,7 @@
#define __ARCH_ARM_ATOMIC__
#include <xen/config.h>
+#include <xen/prefetch.h>
#include <asm/system.h>
#define build_atomic_read(name, size, width, type, reg)\
--
1.7.10.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v2 05/17] xen: arm32: resync atomics with (almost) v3.14-rc7
2014-03-26 13:38 ` [PATCH v2 05/17] xen: arm32: resync atomics with (almost) v3.14-rc7 Ian Campbell
@ 2014-03-26 14:10 ` Julien Grall
0 siblings, 0 replies; 31+ messages in thread
From: Julien Grall @ 2014-03-26 14:10 UTC (permalink / raw)
To: Ian Campbell; +Cc: stefano.stabellini, tim, xen-devel
On 03/26/2014 01:38 PM, Ian Campbell wrote:
> Almost because I am omitting aed3a4e "ARM: 7868/1: arm/arm64: remove
> atomic_clear_mask() ..." which I will apply to both arm32 and arm64
> simultaneously in a later patch.
>
> This pulls in the following Linux patches:
>
> commit f38d999c4d16fc0fce4270374f15fbb2d8713c09
> Author: Will Deacon <will.deacon@arm.com>
> Date: Thu Jul 4 11:43:18 2013 +0100
>
> ARM: atomics: prefetch the destination word for write prior to strex
>
> The cost of changing a cacheline from shared to exclusive state can be
> significant, especially when this is triggered by an exclusive store,
> since it may result in having to retry the transaction.
>
> This patch prefixes our atomic access implementations with pldw
> instructions (on CPUs which support them) to try and grab the line in
> exclusive state from the start. Only the barrier-less functions are
> updated, since memory barriers can limit the usefulness of prefetching
> data.
>
> Acked-by: Nicolas Pitre <nico@linaro.org>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
>
> commit 4dcc1cf7316a26e112f5c9fcca531ff98ef44700
> Author: Chen Gang <gang.chen@asianux.com>
> Date: Sat Oct 26 15:07:25 2013 +0100
>
> ARM: 7867/1: include: asm: use 'int' instead of 'unsigned long' for 'oldval
>
> For atomic_cmpxchg(), the type of 'oldval' need be 'int' to match the
> type of "*ptr" (used by 'ldrex' instruction) and 'old' (used by 'teq'
> instruction).
>
> Reviewed-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Chen Gang <gang.chen@asianux.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
--
Julien Grall
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v2 06/17] xen: arm32: resync mem* with Linux v3.14-rc7
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (4 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 05/17] xen: arm32: resync atomics with (almost) v3.14-rc7 Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 13:38 ` [PATCH v2 07/17] xen: arm32: add optimised memchr routine Ian Campbell
` (12 subsequent siblings)
18 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
This pulls in the following Linux commits:
commit 455bd4c430b0c0a361f38e8658a0d6cb469942b5
Author: Ivan Djelic <ivan.djelic@parrot.com>
Date: Wed Mar 6 20:09:27 2013 +0100
ARM: 7668/1: fix memset-related crashes caused by recent GCC (4.7.2) optimizations
Recent GCC versions (e.g. GCC-4.7.2) perform optimizations based on
assumptions about the implementation of memset and similar functions.
The current ARM optimized memset code does not return the value of
its first argument, as is usually expected from standard implementations.
For instance in the following function:
void debug_mutex_lock_common(struct mutex *lock, struct mutex_waiter *waite
{
memset(waiter, MUTEX_DEBUG_INIT, sizeof(*waiter));
waiter->magic = waiter;
INIT_LIST_HEAD(&waiter->list);
}
compiled as:
800554d0 <debug_mutex_lock_common>:
800554d0: e92d4008 push {r3, lr}
800554d4: e1a00001 mov r0, r1
800554d8: e3a02010 mov r2, #16 ; 0x10
800554dc: e3a01011 mov r1, #17 ; 0x11
800554e0: eb04426e bl 80165ea0 <memset>
800554e4: e1a03000 mov r3, r0
800554e8: e583000c str r0, [r3, #12]
800554ec: e5830000 str r0, [r3]
800554f0: e5830004 str r0, [r3, #4]
800554f4: e8bd8008 pop {r3, pc}
GCC assumes memset returns the value of pointer 'waiter' in register r0; ca
register/memory corruptions.
This patch fixes the return value of the assembly version of memset.
It adds a 'mov' instruction and merges an additional load+store into
existing load/store instructions.
For ease of review, here is a breakdown of the patch into 4 simple steps:
Step 1
======
Perform the following substitutions:
ip -> r8, then
r0 -> ip,
and insert 'mov ip, r0' as the first statement of the function.
At this point, we have a memset() implementation returning the proper result
but corrupting r8 on some paths (the ones that were using ip).
Step 2
======
Make sure r8 is saved and restored when (! CALGN(1)+0) == 1:
save r8:
- str lr, [sp, #-4]!
+ stmfd sp!, {r8, lr}
and restore r8 on both exit paths:
- ldmeqfd sp!, {pc} @ Now <64 bytes to go.
+ ldmeqfd sp!, {r8, pc} @ Now <64 bytes to go.
(...)
tst r2, #16
stmneia ip!, {r1, r3, r8, lr}
- ldr lr, [sp], #4
+ ldmfd sp!, {r8, lr}
Step 3
======
Make sure r8 is saved and restored when (! CALGN(1)+0) == 0:
save r8:
- stmfd sp!, {r4-r7, lr}
+ stmfd sp!, {r4-r8, lr}
and restore r8 on both exit paths:
bgt 3b
- ldmeqfd sp!, {r4-r7, pc}
+ ldmeqfd sp!, {r4-r8, pc}
(...)
tst r2, #16
stmneia ip!, {r4-r7}
- ldmfd sp!, {r4-r7, lr}
+ ldmfd sp!, {r4-r8, lr}
Step 4
======
Rewrite register list "r4-r7, r8" as "r4-r8".
Signed-off-by: Ivan Djelic <ivan.djelic@parrot.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Dirk Behme <dirk.behme@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
commit 418df63adac56841ef6b0f1fcf435bc64d4ed177
Author: Nicolas Pitre <nicolas.pitre@linaro.org>
Date: Tue Mar 12 13:00:42 2013 +0100
ARM: 7670/1: fix the memset fix
Commit 455bd4c430b0 ("ARM: 7668/1: fix memset-related crashes caused by
recent GCC (4.7.2) optimizations") attempted to fix a compliance issue
with the memset return value. However the memset itself became broken
by that patch for misaligned pointers.
This fixes the above by branching over the entry code from the
misaligned fixup code to avoid reloading the original pointer.
Also, because the function entry alignment is wrong in the Thumb mode
compilation, that fixup code is moved to the end.
While at it, the entry instructions are slightly reworked to help dual
issue pipelines.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
---
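The contract that both quoted fixes restore can be shown in a few lines of C. This is a sketch for illustration, not a translation of memset.S: sketch_memset() is a hypothetical name and it uses plain byte stores throughout. The working pointer p plays the role that ip takes over in the assembly, so the original destination survives to be returned, including on the misaligned entry path.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte-wise memset that keeps dest intact as its return value. */
static void *sketch_memset(void *dest, int c, size_t n)
{
    unsigned char *p = dest;                /* working pointer, like ip */
    unsigned char byte = (unsigned char)c;

    /* Head: advance the working copy until it is word aligned. */
    while (n > 0 && ((uintptr_t)p & (sizeof(unsigned long) - 1)) != 0) {
        *p++ = byte;
        n--;
    }

    /* Body: p is aligned here; the real routine would use wide stores. */
    while (n > 0) {
        *p++ = byte;
        n--;
    }

    /* The caller's pointer was never modified, so return it as is. */
    return dest;
}
```

A compiler that knows memset returns its first argument may reuse the return register instead of reloading the pointer, which is why a routine that advances the destination register in place breaks callers such as the debug_mutex_lock_common() example quoted above.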
xen/arch/arm/arm32/lib/memset.S | 100 +++++++++++++++++++--------------------
1 file changed, 48 insertions(+), 52 deletions(-)
diff --git a/xen/arch/arm/arm32/lib/memset.S b/xen/arch/arm/arm32/lib/memset.S
index d2937a3..c8ab257 100644
--- a/xen/arch/arm/arm32/lib/memset.S
+++ b/xen/arch/arm/arm32/lib/memset.S
@@ -16,27 +16,15 @@
.text
.align 5
- .word 0
-
-1: subs r2, r2, #4 @ 1 do we have enough
- blt 5f @ 1 bytes to align with?
- cmp r3, #2 @ 1
- strltb r1, [r0], #1 @ 1
- strleb r1, [r0], #1 @ 1
- strb r1, [r0], #1 @ 1
- add r2, r2, r3 @ 1 (r2 = r2 - (4 - r3))
-/*
- * The pointer is now aligned and the length is adjusted. Try doing the
- * memset again.
- */
ENTRY(memset)
ands r3, r0, #3 @ 1 unaligned?
- bne 1b @ 1
+ mov ip, r0 @ preserve r0 as return value
+ bne 6f @ 1
/*
- * we know that the pointer in r0 is aligned to a word boundary.
+ * we know that the pointer in ip is aligned to a word boundary.
*/
- orr r1, r1, r1, lsl #8
+1: orr r1, r1, r1, lsl #8
orr r1, r1, r1, lsl #16
mov r3, r1
cmp r2, #16
@@ -45,29 +33,28 @@ ENTRY(memset)
#if ! CALGN(1)+0
/*
- * We need an extra register for this loop - save the return address and
- * use the LR
+ * We need 2 extra registers for this loop - use r8 and the LR
*/
- str lr, [sp, #-4]!
- mov ip, r1
+ stmfd sp!, {r8, lr}
+ mov r8, r1
mov lr, r1
2: subs r2, r2, #64
- stmgeia r0!, {r1, r3, ip, lr} @ 64 bytes at a time.
- stmgeia r0!, {r1, r3, ip, lr}
- stmgeia r0!, {r1, r3, ip, lr}
- stmgeia r0!, {r1, r3, ip, lr}
+ stmgeia ip!, {r1, r3, r8, lr} @ 64 bytes at a time.
+ stmgeia ip!, {r1, r3, r8, lr}
+ stmgeia ip!, {r1, r3, r8, lr}
+ stmgeia ip!, {r1, r3, r8, lr}
bgt 2b
- ldmeqfd sp!, {pc} @ Now <64 bytes to go.
+ ldmeqfd sp!, {r8, pc} @ Now <64 bytes to go.
/*
* No need to correct the count; we're only testing bits from now on
*/
tst r2, #32
- stmneia r0!, {r1, r3, ip, lr}
- stmneia r0!, {r1, r3, ip, lr}
+ stmneia ip!, {r1, r3, r8, lr}
+ stmneia ip!, {r1, r3, r8, lr}
tst r2, #16
- stmneia r0!, {r1, r3, ip, lr}
- ldr lr, [sp], #4
+ stmneia ip!, {r1, r3, r8, lr}
+ ldmfd sp!, {r8, lr}
#else
@@ -76,54 +63,63 @@ ENTRY(memset)
* whole cache lines at once.
*/
- stmfd sp!, {r4-r7, lr}
+ stmfd sp!, {r4-r8, lr}
mov r4, r1
mov r5, r1
mov r6, r1
mov r7, r1
- mov ip, r1
+ mov r8, r1
mov lr, r1
cmp r2, #96
- tstgt r0, #31
+ tstgt ip, #31
ble 3f
- and ip, r0, #31
- rsb ip, ip, #32
- sub r2, r2, ip
- movs ip, ip, lsl #(32 - 4)
- stmcsia r0!, {r4, r5, r6, r7}
- stmmiia r0!, {r4, r5}
- tst ip, #(1 << 30)
- mov ip, r1
- strne r1, [r0], #4
+ and r8, ip, #31
+ rsb r8, r8, #32
+ sub r2, r2, r8
+ movs r8, r8, lsl #(32 - 4)
+ stmcsia ip!, {r4, r5, r6, r7}
+ stmmiia ip!, {r4, r5}
+ tst r8, #(1 << 30)
+ mov r8, r1
+ strne r1, [ip], #4
3: subs r2, r2, #64
- stmgeia r0!, {r1, r3-r7, ip, lr}
- stmgeia r0!, {r1, r3-r7, ip, lr}
+ stmgeia ip!, {r1, r3-r8, lr}
+ stmgeia ip!, {r1, r3-r8, lr}
bgt 3b
- ldmeqfd sp!, {r4-r7, pc}
+ ldmeqfd sp!, {r4-r8, pc}
tst r2, #32
- stmneia r0!, {r1, r3-r7, ip, lr}
+ stmneia ip!, {r1, r3-r8, lr}
tst r2, #16
- stmneia r0!, {r4-r7}
- ldmfd sp!, {r4-r7, lr}
+ stmneia ip!, {r4-r7}
+ ldmfd sp!, {r4-r8, lr}
#endif
4: tst r2, #8
- stmneia r0!, {r1, r3}
+ stmneia ip!, {r1, r3}
tst r2, #4
- strne r1, [r0], #4
+ strne r1, [ip], #4
/*
* When we get here, we've got less than 4 bytes to zero. We
* may have an unaligned pointer as well.
*/
5: tst r2, #2
- strneb r1, [r0], #1
- strneb r1, [r0], #1
+ strneb r1, [ip], #1
+ strneb r1, [ip], #1
tst r2, #1
- strneb r1, [r0], #1
+ strneb r1, [ip], #1
mov pc, lr
+
+6: subs r2, r2, #4 @ 1 do we have enough
+ blt 5b @ 1 bytes to align with?
+ cmp r3, #2 @ 1
+ strltb r1, [ip], #1 @ 1
+ strleb r1, [ip], #1 @ 1
+ strb r1, [ip], #1 @ 1
+ add r2, r2, r3 @ 1 (r2 = r2 - (4 - r3))
+ b 1b
ENDPROC(memset)
--
1.7.10.4
* [PATCH v2 07/17] xen: arm32: add optimised memchr routine
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (5 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 06/17] xen: arm32: resync mem* with Linux v3.14-rc7 Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 13:38 ` [PATCH v2 08/17] xen: arm32: add optimised strchr and strrchr routines Ian Campbell
` (11 subsequent siblings)
18 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
This isn't used enough to be critical, but it completes the set of mem*.
Taken from Linux v3.14-rc7.
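
For readers less familiar with ARM assembly, the loop in memchr.S corresponds roughly to the C below. This is a sketch under assumptions: sketch_memchr is a hypothetical name, and the conversion of c to unsigned char follows the C standard's description of memchr, whereas the assembly compares each loaded byte against r1 exactly as passed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C counterpart of the memchr loop; not the Xen routine. */
void *sketch_memchr(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char want = (unsigned char)c;

    assert(s != NULL || n == 0);

    while (n-- > 0) {              /* subs r2, r2, #1 ; bmi 2f */
        if (*p == want)            /* ldrb r3, [r0], #1 ; teq r3, r1 */
            return (void *)p;      /* sub r0, r0, #1 undoes the post-increment */
        p++;
    }
    return NULL;                   /* movne r0, #0 */
}
```

A zero length never touches the buffer and yields NULL, which matches the early exit through the bmi in the assembly.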
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
---
xen/arch/arm/arm32/lib/Makefile | 2 +-
xen/arch/arm/arm32/lib/memchr.S | 28 ++++++++++++++++++++++++++++
xen/include/asm-arm/string.h | 3 +++
3 files changed, 32 insertions(+), 1 deletion(-)
create mode 100644 xen/arch/arm/arm32/lib/memchr.S
diff --git a/xen/arch/arm/arm32/lib/Makefile b/xen/arch/arm/arm32/lib/Makefile
index 4cf41f4..fa4e241 100644
--- a/xen/arch/arm/arm32/lib/Makefile
+++ b/xen/arch/arm/arm32/lib/Makefile
@@ -1,4 +1,4 @@
-obj-y += memcpy.o memmove.o memset.o memzero.o
+obj-y += memcpy.o memmove.o memset.o memchr.o memzero.o
obj-y += findbit.o setbit.o
obj-y += setbit.o clearbit.o changebit.o
obj-y += testsetbit.o testclearbit.o testchangebit.o
diff --git a/xen/arch/arm/arm32/lib/memchr.S b/xen/arch/arm/arm32/lib/memchr.S
new file mode 100644
index 0000000..fd64ed8
--- /dev/null
+++ b/xen/arch/arm/arm32/lib/memchr.S
@@ -0,0 +1,28 @@
+/*
+ * linux/arch/arm/lib/memchr.S
+ *
+ * Copyright (C) 1995-2000 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * ASM optimised string functions
+ */
+
+#include <xen/config.h>
+
+#include "assembler.h"
+
+ .text
+ .align 5
+ENTRY(memchr)
+1: subs r2, r2, #1
+ bmi 2f
+ ldrb r3, [r0], #1
+ teq r3, r1
+ bne 1b
+ sub r0, r0, #1
+2: movne r0, #0
+ mov pc, lr
+ENDPROC(memchr)
diff --git a/xen/include/asm-arm/string.h b/xen/include/asm-arm/string.h
index abfa9d2..2c9f4f7 100644
--- a/xen/include/asm-arm/string.h
+++ b/xen/include/asm-arm/string.h
@@ -14,6 +14,9 @@ extern void *memmove(void *dest, const void *src, size_t n);
#define __HAVE_ARCH_MEMSET
extern void * memset(void *, int, __kernel_size_t);
+#define __HAVE_ARCH_MEMCHR
+extern void * memchr(const void *, int, __kernel_size_t);
+
extern void __memzero(void *ptr, __kernel_size_t n);
#define memset(p,v,n) \
--
1.7.10.4
* [PATCH v2 08/17] xen: arm32: add optimised strchr and strrchr routines
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (6 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 07/17] xen: arm32: add optimised memchr routine Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 13:38 ` [PATCH v2 09/17] xen: arm: remove atomic_clear_mask() Ian Campbell
` (10 subsequent siblings)
18 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
Taken from Linux v3.14-rc7.
These aren't widely used enough to be critical, but we may as well have them.
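
As a reading aid, the two assembly loops can be paraphrased in C as follows. The names sketch_strchr and sketch_strrchr are hypothetical and the code is only a model of the control flow. Both sketches narrow c to a byte as the C standard describes; strchr.S does this with the "and r1, r1, #0xff", while strrchr.S as imported compares against r1 unmasked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of strchr.S: stop at the first match or at NUL. */
char *sketch_strchr(const char *s, int c)
{
    unsigned char want = (unsigned char)c;
    unsigned char ch;

    assert(s != NULL);

    do {
        ch = (unsigned char)*s++;           /* ldrb r2, [r0], #1 */
    } while (ch != want && ch != '\0');     /* teq / teqne / bne 1b */

    return ch == want ? (char *)(s - 1) : NULL;
}

/* Hypothetical model of strrchr.S: remember the last match, scan to NUL. */
char *sketch_strrchr(const char *s, int c)
{
    const char *last = NULL;                /* mov r3, #0 */
    unsigned char want = (unsigned char)c;
    unsigned char ch;

    assert(s != NULL);

    do {
        ch = (unsigned char)*s++;
        if (ch == want)
            last = s - 1;                   /* subeq r3, r0, #1 */
    } while (ch != '\0');

    return (char *)last;
}
```

Because the match test happens before the terminator test, searching for '\0' returns a pointer to the terminator in both models.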
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
---
xen/arch/arm/arm32/lib/Makefile | 1 +
xen/arch/arm/arm32/lib/strchr.S | 29 +++++++++++++++++++++++++++++
xen/arch/arm/arm32/lib/strrchr.S | 28 ++++++++++++++++++++++++++++
xen/include/asm-arm/string.h | 12 ++++++++++++
4 files changed, 70 insertions(+)
create mode 100644 xen/arch/arm/arm32/lib/strchr.S
create mode 100644 xen/arch/arm/arm32/lib/strrchr.S
diff --git a/xen/arch/arm/arm32/lib/Makefile b/xen/arch/arm/arm32/lib/Makefile
index fa4e241..e9fbc59 100644
--- a/xen/arch/arm/arm32/lib/Makefile
+++ b/xen/arch/arm/arm32/lib/Makefile
@@ -2,4 +2,5 @@ obj-y += memcpy.o memmove.o memset.o memchr.o memzero.o
obj-y += findbit.o setbit.o
obj-y += setbit.o clearbit.o changebit.o
obj-y += testsetbit.o testclearbit.o testchangebit.o
+obj-y += strchr.o strrchr.o
obj-y += lib1funcs.o lshrdi3.o div64.o
diff --git a/xen/arch/arm/arm32/lib/strchr.S b/xen/arch/arm/arm32/lib/strchr.S
new file mode 100644
index 0000000..f01740e
--- /dev/null
+++ b/xen/arch/arm/arm32/lib/strchr.S
@@ -0,0 +1,29 @@
+/*
+ * linux/arch/arm/lib/strchr.S
+ *
+ * Copyright (C) 1995-2000 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * ASM optimised string functions
+ */
+
+#include <xen/config.h>
+
+#include "assembler.h"
+
+ .text
+ .align 5
+ENTRY(strchr)
+ and r1, r1, #0xff
+1: ldrb r2, [r0], #1
+ teq r2, r1
+ teqne r2, #0
+ bne 1b
+ teq r2, r1
+ movne r0, #0
+ subeq r0, r0, #1
+ mov pc, lr
+ENDPROC(strchr)
diff --git a/xen/arch/arm/arm32/lib/strrchr.S b/xen/arch/arm/arm32/lib/strrchr.S
new file mode 100644
index 0000000..88fc0de
--- /dev/null
+++ b/xen/arch/arm/arm32/lib/strrchr.S
@@ -0,0 +1,28 @@
+/*
+ * linux/arch/arm/lib/strrchr.S
+ *
+ * Copyright (C) 1995-2000 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * ASM optimised string functions
+ */
+
+#include <xen/config.h>
+
+#include "assembler.h"
+
+ .text
+ .align 5
+ENTRY(strrchr)
+ mov r3, #0
+1: ldrb r2, [r0], #1
+ teq r2, r1
+ subeq r3, r0, #1
+ teq r2, #0
+ bne 1b
+ mov r0, r3
+ mov pc, lr
+ENDPROC(strrchr)
diff --git a/xen/include/asm-arm/string.h b/xen/include/asm-arm/string.h
index 2c9f4f7..7d8b35a 100644
--- a/xen/include/asm-arm/string.h
+++ b/xen/include/asm-arm/string.h
@@ -4,6 +4,18 @@
#include <xen/config.h>
#if defined(CONFIG_ARM_32)
+
+/*
+ * We don't do inline string functions, since the
+ * optimised inline asm versions are not small.
+ */
+
+#define __HAVE_ARCH_STRRCHR
+extern char * strrchr(const char * s, int c);
+
+#define __HAVE_ARCH_STRCHR
+extern char * strchr(const char * s, int c);
+
#define __HAVE_ARCH_MEMCPY
extern void * memcpy(void *, const void *, __kernel_size_t);
--
1.7.10.4
* [PATCH v2 09/17] xen: arm: remove atomic_clear_mask()
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (7 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 08/17] xen: arm32: add optimised strchr and strrchr routines Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 13:38 ` [PATCH v2 10/17] xen: arm64: disable alignment traps Ian Campbell
` (9 subsequent siblings)
18 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
atomic_clear_mask() has no users.
This brings arm32 atomic.h into sync with Linux v3.14-rc7.
arm64/atomic.h requires other patches for this to be the case.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
---
xen/include/asm-arm/arm32/atomic.h | 16 ----------------
xen/include/asm-arm/arm64/atomic.h | 14 --------------
2 files changed, 30 deletions(-)
diff --git a/xen/include/asm-arm/arm32/atomic.h b/xen/include/asm-arm/arm32/atomic.h
index d309f66..3d601d1 100644
--- a/xen/include/asm-arm/arm32/atomic.h
+++ b/xen/include/asm-arm/arm32/atomic.h
@@ -117,22 +117,6 @@ static inline int atomic_cmpxchg(atomic_t *ptr, int old, int new)
return oldval;
}
-static inline void atomic_clear_mask(unsigned long mask, unsigned long *addr)
-{
- unsigned long tmp, tmp2;
-
- prefetchw(addr);
- __asm__ __volatile__("@ atomic_clear_mask\n"
-"1: ldrex %0, [%3]\n"
-" bic %0, %0, %4\n"
-" strex %1, %0, [%3]\n"
-" teq %1, #0\n"
-" bne 1b"
- : "=&r" (tmp), "=&r" (tmp2), "+Qo" (*addr)
- : "r" (addr), "Ir" (mask)
- : "cc");
-}
-
#define atomic_inc(v) atomic_add(1, v)
#define atomic_dec(v) atomic_sub(1, v)
diff --git a/xen/include/asm-arm/arm64/atomic.h b/xen/include/asm-arm/arm64/atomic.h
index b04e6d5..6b37945 100644
--- a/xen/include/asm-arm/arm64/atomic.h
+++ b/xen/include/asm-arm/arm64/atomic.h
@@ -110,20 +110,6 @@ static inline int atomic_cmpxchg(atomic_t *ptr, int old, int new)
return oldval;
}
-static inline void atomic_clear_mask(unsigned long mask, unsigned long *addr)
-{
- unsigned long tmp, tmp2;
-
- asm volatile("// atomic_clear_mask\n"
-"1: ldxr %0, %2\n"
-" bic %0, %0, %3\n"
-" stxr %w1, %0, %2\n"
-" cbnz %w1, 1b"
- : "=&r" (tmp), "=&r" (tmp2), "+Q" (*addr)
- : "Ir" (mask)
- : "cc");
-}
-
#define atomic_xchg(v, new) (xchg(&((v)->counter), new))
static inline int __atomic_add_unless(atomic_t *v, int a, int u)
--
1.7.10.4
* [PATCH v2 10/17] xen: arm64: disable alignment traps
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (8 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 09/17] xen: arm: remove atomic_clear_mask() Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 13:38 ` [PATCH v2 11/17] xen: arm64: atomics: fix use of acquire + release for full barrier semantics Ian Campbell
` (8 subsequent siblings)
18 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
The mem* primitives which I am about to import from Linux in a subsequent
patch rely on the hardware handling misalignment.
The benefits of an optimised memcpy etc. outweigh the downsides of no longer
trapping misaligned accesses.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
---
v2: fix typo in commit message and correct comment which refers to alignment
trapping status.
---
xen/arch/arm/arm64/head.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 9547ef5..d3e773d 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -239,9 +239,9 @@ skip_bss:
* Write-implies-XN disabled (for now),
* D-cache disabled (for now),
* I-cache enabled,
- * Alignment checking enabled,
+ * Alignment checking disabled,
* MMU translation disabled (for now). */
- ldr x0, =(HSCTLR_BASE|SCTLR_A)
+ ldr x0, =(HSCTLR_BASE)
msr SCTLR_EL2, x0
/* Rebuild the boot pagetable's first-level entries. The structure
--
1.7.10.4
* [PATCH v2 11/17] xen: arm64: atomics: fix use of acquire + release for full barrier semantics
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (9 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 10/17] xen: arm64: disable alignment traps Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 13:38 ` [PATCH v2 12/17] xen: arm64: reinstate hard tabs in system.h cmpxchg Ian Campbell
` (7 subsequent siblings)
18 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
Xen, like Linux, expects full barrier semantics for bitops, atomics and
cmpxchgs. The issue was discovered in Linux, which is where our implementations
of these come from, so for the gory details here is Will Deacon's description
from Linux commit 8e86f0b409a4:
Linux requires a number of atomic operations to provide full barrier
semantics, that is no memory accesses after the operation can be
observed before any accesses up to and including the operation in
program order.
On arm64, these operations have been incorrectly implemented as follows:
// A, B, C are independent memory locations
<Access [A]>
// atomic_op (B)
1: ldaxr x0, [B] // Exclusive load with acquire
<op(B)>
stlxr w1, x0, [B] // Exclusive store with release
cbnz w1, 1b
<Access [C]>
The assumption here being that two half barriers are equivalent to a
full barrier, so the only permitted ordering would be A -> B -> C
(where B is the atomic operation involving both a load and a store).
Unfortunately, this is not the case by the letter of the architecture
and, in fact, the accesses to A and C are permitted to pass their
nearest half barrier resulting in orderings such as Bl -> A -> C -> Bs
or Bl -> C -> A -> Bs (where Bl is the load-acquire on B and Bs is the
store-release on B). This is a clear violation of the full barrier
requirement.
The simple way to fix this is to implement the same algorithm as ARMv7
using explicit barriers:
<Access [A]>
// atomic_op (B)
dmb ish // Full barrier
1: ldxr x0, [B] // Exclusive load
<op(B)>
stxr w1, x0, [B] // Exclusive store
cbnz w1, 1b
dmb ish // Full barrier
<Access [C]>
but this has the undesirable effect of introducing *two* full barrier
instructions. A better approach is actually the following, non-intuitive
sequence:
<Access [A]>
// atomic_op (B)
1: ldxr x0, [B] // Exclusive load
<op(B)>
stlxr w1, x0, [B] // Exclusive store with release
cbnz w1, 1b
dmb ish // Full barrier
<Access [C]>
The simple observations here are:
- The dmb ensures that no subsequent accesses (e.g. the access to C)
can enter or pass the atomic sequence.
- The dmb also ensures that no prior accesses (e.g. the access to A)
can pass the atomic sequence.
- Therefore, no prior access can pass a subsequent access, or
vice-versa (i.e. A is strictly ordered before C).
- The stlxr ensures that no prior access can pass the store component
of the atomic operation.
The only tricky part remaining is the ordering between the ldxr and the
access to A, since the absence of the first dmb means that we're now
permitting re-ordering between the ldxr and any prior accesses.
From an (arbitrary) observer's point of view, there are two scenarios:
1. We have observed the ldxr. This means that if we perform a store to
[B], the ldxr will still return older data. If we can observe the
ldxr, then we can potentially observe the permitted re-ordering
with the access to A, which is clearly an issue when compared to
the dmb variant of the code. Thankfully, the exclusive monitor will
save us here since it will be cleared as a result of the store and
the ldxr will retry. Notice that any use of a later memory
observation to imply observation of the ldxr will also imply
observation of the access to A, since the stlxr/dmb ensure strict
ordering.
2. We have not observed the ldxr. This means we can perform a store
and influence the later ldxr. However, that doesn't actually tell
us anything about the access to [A], so we've not lost anything
here either when compared to the dmb variant.
This patch implements this solution for our barriered atomic operations,
ensuring that we satisfy the full barrier requirements where they are
needed.
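
The shape of the final sequence (plain exclusive load, store with release, then one full barrier) can be loosely mirrored with C11 atomics. This is an analogy only, offered as a sketch: sketch_add_return is a hypothetical name, the C11 memory model is not the same thing as the architectural ordering rules discussed above, and a compiler is free to pick different instructions for it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical analogue of the barriered atomic_add_return():
 * relaxed load, release store in a retry loop, then a full fence.
 */
int sketch_add_return(atomic_int *v, int i)
{
    int old;

    assert(v != NULL);

    old = atomic_load_explicit(v, memory_order_relaxed);       /* ~ ldxr */

    /* ~ stlxr + cbnz: on failure `old` is refreshed and we retry. */
    while (!atomic_compare_exchange_weak_explicit(v, &old, old + i,
                                                  memory_order_release,
                                                  memory_order_relaxed))
        ;

    atomic_thread_fence(memory_order_seq_cst);                 /* ~ dmb ish */

    return old + i;
}
```

The design choice being illustrated is the single trailing fence: the release on the store orders prior accesses before it, and the fence keeps later accesses from moving ahead of the operation.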
Cc: <stable@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
---
xen/arch/arm/arm64/lib/bitops.S | 3 +-
xen/include/asm-arm/arm64/atomic.h | 13 +++++---
xen/include/asm-arm/arm64/system.h | 61 ++++++++++++++++++------------------
3 files changed, 42 insertions(+), 35 deletions(-)
diff --git a/xen/arch/arm/arm64/lib/bitops.S b/xen/arch/arm/arm64/lib/bitops.S
index 80cc903..e1ad239 100644
--- a/xen/arch/arm/arm64/lib/bitops.S
+++ b/xen/arch/arm/arm64/lib/bitops.S
@@ -46,11 +46,12 @@ ENTRY( \name )
mov x2, #1
add x1, x1, x0, lsr #3 // Get word offset
lsl x4, x2, x3 // Create mask
-1: ldaxr w2, [x1]
+1: ldxr w2, [x1]
lsr w0, w2, w3 // Save old value of bit
\instr w2, w2, w4 // toggle bit
stlxr w5, w2, [x1]
cbnz w5, 1b
+ dmb ish
and w0, w0, #1
3: ret
ENDPROC(\name )
diff --git a/xen/include/asm-arm/arm64/atomic.h b/xen/include/asm-arm/arm64/atomic.h
index 6b37945..3f37ed5 100644
--- a/xen/include/asm-arm/arm64/atomic.h
+++ b/xen/include/asm-arm/arm64/atomic.h
@@ -48,7 +48,7 @@ static inline int atomic_add_return(int i, atomic_t *v)
int result;
asm volatile("// atomic_add_return\n"
-"1: ldaxr %w0, %2\n"
+"1: ldxr %w0, %2\n"
" add %w0, %w0, %w3\n"
" stlxr %w1, %w0, %2\n"
" cbnz %w1, 1b"
@@ -56,6 +56,7 @@ static inline int atomic_add_return(int i, atomic_t *v)
: "Ir" (i)
: "cc", "memory");
+ smp_mb();
return result;
}
@@ -80,7 +81,7 @@ static inline int atomic_sub_return(int i, atomic_t *v)
int result;
asm volatile("// atomic_sub_return\n"
-"1: ldaxr %w0, %2\n"
+"1: ldxr %w0, %2\n"
" sub %w0, %w0, %w3\n"
" stlxr %w1, %w0, %2\n"
" cbnz %w1, 1b"
@@ -88,6 +89,7 @@ static inline int atomic_sub_return(int i, atomic_t *v)
: "Ir" (i)
: "cc", "memory");
+ smp_mb();
return result;
}
@@ -96,17 +98,20 @@ static inline int atomic_cmpxchg(atomic_t *ptr, int old, int new)
unsigned long tmp;
int oldval;
+ smp_mb();
+
asm volatile("// atomic_cmpxchg\n"
-"1: ldaxr %w1, %2\n"
+"1: ldxr %w1, %2\n"
" cmp %w1, %w3\n"
" b.ne 2f\n"
-" stlxr %w0, %w4, %2\n"
+" stxr %w0, %w4, %2\n"
" cbnz %w0, 1b\n"
"2:"
: "=&r" (tmp), "=&r" (oldval), "+Q" (ptr->counter)
: "Ir" (old), "r" (new)
: "cc", "memory");
+ smp_mb();
return oldval;
}
diff --git a/xen/include/asm-arm/arm64/system.h b/xen/include/asm-arm/arm64/system.h
index 570af5c..0db96e0 100644
--- a/xen/include/asm-arm/arm64/system.h
+++ b/xen/include/asm-arm/arm64/system.h
@@ -8,49 +8,50 @@ static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size
{
unsigned long ret, tmp;
- switch (size) {
- case 1:
- asm volatile("// __xchg1\n"
- "1: ldaxrb %w0, %2\n"
- " stlxrb %w1, %w3, %2\n"
- " cbnz %w1, 1b\n"
- : "=&r" (ret), "=&r" (tmp), "+Q" (*(u8 *)ptr)
+ switch (size) {
+ case 1:
+ asm volatile("// __xchg1\n"
+ "1: ldxrb %w0, %2\n"
+ " stlxrb %w1, %w3, %2\n"
+ " cbnz %w1, 1b\n"
+ : "=&r" (ret), "=&r" (tmp), "+Q" (*(u8 *)ptr)
: "r" (x)
: "cc", "memory");
- break;
- case 2:
- asm volatile("// __xchg2\n"
- "1: ldaxrh %w0, %2\n"
- " stlxrh %w1, %w3, %2\n"
- " cbnz %w1, 1b\n"
- : "=&r" (ret), "=&r" (tmp), "+Q" (*(u16 *)ptr)
+ break;
+ case 2:
+ asm volatile("// __xchg2\n"
+ "1: ldxrh %w0, %2\n"
+ " stlxrh %w1, %w3, %2\n"
+ " cbnz %w1, 1b\n"
+ : "=&r" (ret), "=&r" (tmp), "+Q" (*(u16 *)ptr)
: "r" (x)
: "cc", "memory");
- break;
- case 4:
- asm volatile("// __xchg4\n"
- "1: ldaxr %w0, %2\n"
- " stlxr %w1, %w3, %2\n"
- " cbnz %w1, 1b\n"
- : "=&r" (ret), "=&r" (tmp), "+Q" (*(u32 *)ptr)
+ break;
+ case 4:
+ asm volatile("// __xchg4\n"
+ "1: ldxr %w0, %2\n"
+ " stlxr %w1, %w3, %2\n"
+ " cbnz %w1, 1b\n"
+ : "=&r" (ret), "=&r" (tmp), "+Q" (*(u32 *)ptr)
: "r" (x)
: "cc", "memory");
- break;
- case 8:
- asm volatile("// __xchg8\n"
- "1: ldaxr %0, %2\n"
- " stlxr %w1, %3, %2\n"
- " cbnz %w1, 1b\n"
- : "=&r" (ret), "=&r" (tmp), "+Q" (*(u64 *)ptr)
+ break;
+ case 8:
+ asm volatile("// __xchg8\n"
+ "1: ldxr %0, %2\n"
+ " stlxr %w1, %3, %2\n"
+ " cbnz %w1, 1b\n"
+ : "=&r" (ret), "=&r" (tmp), "+Q" (*(u64 *)ptr)
: "r" (x)
: "cc", "memory");
break;
default:
__bad_xchg(ptr, size), ret = 0;
break;
- }
+ }
- return ret;
+ smp_mb();
+ return ret;
}
#define xchg(ptr,x) \
--
1.7.10.4
* [PATCH v2 12/17] xen: arm64: reinstate hard tabs in system.h cmpxchg
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (10 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 11/17] xen: arm64: atomics: fix use of acquire + release for full barrier semantics Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 13:38 ` [PATCH v2 13/17] xen: arm64: asm: remove redundant "cc" clobbers Ian Campbell
` (6 subsequent siblings)
18 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
These functions are from Linux and the intention was to keep the formatting
the same to make resyncing easier.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
---
xen/include/asm-arm/arm64/system.h | 196 ++++++++++++++++++------------------
1 file changed, 98 insertions(+), 98 deletions(-)
diff --git a/xen/include/asm-arm/arm64/system.h b/xen/include/asm-arm/arm64/system.h
index 0db96e0..9fa698b 100644
--- a/xen/include/asm-arm/arm64/system.h
+++ b/xen/include/asm-arm/arm64/system.h
@@ -6,7 +6,7 @@ extern void __bad_xchg(volatile void *, int);
static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size)
{
- unsigned long ret, tmp;
+ unsigned long ret, tmp;
switch (size) {
case 1:
@@ -15,8 +15,8 @@ static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size
" stlxrb %w1, %w3, %2\n"
" cbnz %w1, 1b\n"
: "=&r" (ret), "=&r" (tmp), "+Q" (*(u8 *)ptr)
- : "r" (x)
- : "cc", "memory");
+ : "r" (x)
+ : "cc", "memory");
break;
case 2:
asm volatile("// __xchg2\n"
@@ -24,8 +24,8 @@ static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size
" stlxrh %w1, %w3, %2\n"
" cbnz %w1, 1b\n"
: "=&r" (ret), "=&r" (tmp), "+Q" (*(u16 *)ptr)
- : "r" (x)
- : "cc", "memory");
+ : "r" (x)
+ : "cc", "memory");
break;
case 4:
asm volatile("// __xchg4\n"
@@ -33,8 +33,8 @@ static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size
" stlxr %w1, %w3, %2\n"
" cbnz %w1, 1b\n"
: "=&r" (ret), "=&r" (tmp), "+Q" (*(u32 *)ptr)
- : "r" (x)
- : "cc", "memory");
+ : "r" (x)
+ : "cc", "memory");
break;
case 8:
asm volatile("// __xchg8\n"
@@ -42,12 +42,12 @@ static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size
" stlxr %w1, %3, %2\n"
" cbnz %w1, 1b\n"
: "=&r" (ret), "=&r" (tmp), "+Q" (*(u64 *)ptr)
- : "r" (x)
- : "cc", "memory");
- break;
- default:
- __bad_xchg(ptr, size), ret = 0;
- break;
+ : "r" (x)
+ : "cc", "memory");
+ break;
+ default:
+ __bad_xchg(ptr, size), ret = 0;
+ break;
}
smp_mb();
@@ -55,107 +55,107 @@ static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size
}
#define xchg(ptr,x) \
- ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
+ ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
extern void __bad_cmpxchg(volatile void *ptr, int size);
static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
- unsigned long new, int size)
+ unsigned long new, int size)
{
- unsigned long oldval = 0, res;
-
- switch (size) {
- case 1:
- do {
- asm volatile("// __cmpxchg1\n"
- " ldxrb %w1, %2\n"
- " mov %w0, #0\n"
- " cmp %w1, %w3\n"
- " b.ne 1f\n"
- " stxrb %w0, %w4, %2\n"
- "1:\n"
- : "=&r" (res), "=&r" (oldval), "+Q" (*(u8 *)ptr)
- : "Ir" (old), "r" (new)
- : "cc");
- } while (res);
- break;
-
- case 2:
- do {
- asm volatile("// __cmpxchg2\n"
- " ldxrh %w1, %2\n"
- " mov %w0, #0\n"
- " cmp %w1, %w3\n"
- " b.ne 1f\n"
- " stxrh %w0, %w4, %2\n"
- "1:\n"
- : "=&r" (res), "=&r" (oldval), "+Q" (*(u16 *)ptr)
- : "Ir" (old), "r" (new)
- : "cc");
- } while (res);
- break;
-
- case 4:
- do {
- asm volatile("// __cmpxchg4\n"
- " ldxr %w1, %2\n"
- " mov %w0, #0\n"
- " cmp %w1, %w3\n"
- " b.ne 1f\n"
- " stxr %w0, %w4, %2\n"
- "1:\n"
- : "=&r" (res), "=&r" (oldval), "+Q" (*(u32 *)ptr)
- : "Ir" (old), "r" (new)
- : "cc");
- } while (res);
- break;
-
- case 8:
- do {
- asm volatile("// __cmpxchg8\n"
- " ldxr %1, %2\n"
- " mov %w0, #0\n"
- " cmp %1, %3\n"
- " b.ne 1f\n"
- " stxr %w0, %4, %2\n"
- "1:\n"
- : "=&r" (res), "=&r" (oldval), "+Q" (*(u64 *)ptr)
- : "Ir" (old), "r" (new)
- : "cc");
- } while (res);
- break;
-
- default:
+ unsigned long oldval = 0, res;
+
+ switch (size) {
+ case 1:
+ do {
+ asm volatile("// __cmpxchg1\n"
+ " ldxrb %w1, %2\n"
+ " mov %w0, #0\n"
+ " cmp %w1, %w3\n"
+ " b.ne 1f\n"
+ " stxrb %w0, %w4, %2\n"
+ "1:\n"
+ : "=&r" (res), "=&r" (oldval), "+Q" (*(u8 *)ptr)
+ : "Ir" (old), "r" (new)
+ : "cc");
+ } while (res);
+ break;
+
+ case 2:
+ do {
+ asm volatile("// __cmpxchg2\n"
+ " ldxrh %w1, %2\n"
+ " mov %w0, #0\n"
+ " cmp %w1, %w3\n"
+ " b.ne 1f\n"
+ " stxrh %w0, %w4, %2\n"
+ "1:\n"
+ : "=&r" (res), "=&r" (oldval), "+Q" (*(u16 *)ptr)
+ : "Ir" (old), "r" (new)
+ : "cc");
+ } while (res);
+ break;
+
+ case 4:
+ do {
+ asm volatile("// __cmpxchg4\n"
+ " ldxr %w1, %2\n"
+ " mov %w0, #0\n"
+ " cmp %w1, %w3\n"
+ " b.ne 1f\n"
+ " stxr %w0, %w4, %2\n"
+ "1:\n"
+ : "=&r" (res), "=&r" (oldval), "+Q" (*(u32 *)ptr)
+ : "Ir" (old), "r" (new)
+ : "cc");
+ } while (res);
+ break;
+
+ case 8:
+ do {
+ asm volatile("// __cmpxchg8\n"
+ " ldxr %1, %2\n"
+ " mov %w0, #0\n"
+ " cmp %1, %3\n"
+ " b.ne 1f\n"
+ " stxr %w0, %4, %2\n"
+ "1:\n"
+ : "=&r" (res), "=&r" (oldval), "+Q" (*(u64 *)ptr)
+ : "Ir" (old), "r" (new)
+ : "cc");
+ } while (res);
+ break;
+
+ default:
__bad_cmpxchg(ptr, size);
oldval = 0;
- }
+ }
- return oldval;
+ return oldval;
}
static inline unsigned long __cmpxchg_mb(volatile void *ptr, unsigned long old,
- unsigned long new, int size)
+ unsigned long new, int size)
{
- unsigned long ret;
+ unsigned long ret;
- smp_mb();
- ret = __cmpxchg(ptr, old, new, size);
- smp_mb();
+ smp_mb();
+ ret = __cmpxchg(ptr, old, new, size);
+ smp_mb();
- return ret;
+ return ret;
}
-#define cmpxchg(ptr,o,n) \
- ((__typeof__(*(ptr)))__cmpxchg_mb((ptr), \
- (unsigned long)(o), \
- (unsigned long)(n), \
- sizeof(*(ptr))))
-
-#define cmpxchg_local(ptr,o,n) \
- ((__typeof__(*(ptr)))__cmpxchg((ptr), \
- (unsigned long)(o), \
- (unsigned long)(n), \
- sizeof(*(ptr))))
+#define cmpxchg(ptr,o,n) \
+ ((__typeof__(*(ptr)))__cmpxchg_mb((ptr), \
+ (unsigned long)(o), \
+ (unsigned long)(n), \
+ sizeof(*(ptr))))
+
+#define cmpxchg_local(ptr,o,n) \
+ ((__typeof__(*(ptr)))__cmpxchg((ptr), \
+ (unsigned long)(o), \
+ (unsigned long)(n), \
+ sizeof(*(ptr))))
/* Uses uimm4 as a bitmask to select the clearing of one or more of
* the DAIF exception mask bits:
--
1.7.10.4
* [PATCH v2 13/17] xen: arm64: asm: remove redundant "cc" clobbers
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (11 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 12/17] xen: arm64: reinstate hard tabs in system.h cmpxchg Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 13:38 ` [PATCH v2 14/17] xen: arm64: assembly optimised mem* and str* Ian Campbell
` (5 subsequent siblings)
18 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
This resyncs atomics and cmpxchgs with Linux v3.14-rc7 by importing:
commit 95c4189689f92fba7ecf9097173404d4928c6e9b
Author: Will Deacon <will.deacon@arm.com>
Date: Tue Feb 4 12:29:13 2014 +0000
arm64: asm: remove redundant "cc" clobbers
cbnz/tbnz don't update the condition flags, so remove the "cc" clobbers
from inline asm blocks that only use these instructions to implement
conditional branches.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
---
xen/include/asm-arm/arm64/atomic.h | 12 +++++-------
xen/include/asm-arm/arm64/spinlock.h | 6 +++---
xen/include/asm-arm/arm64/system.h | 8 ++++----
3 files changed, 12 insertions(+), 14 deletions(-)
diff --git a/xen/include/asm-arm/arm64/atomic.h b/xen/include/asm-arm/arm64/atomic.h
index 3f37ed5..b5d50f2 100644
--- a/xen/include/asm-arm/arm64/atomic.h
+++ b/xen/include/asm-arm/arm64/atomic.h
@@ -38,8 +38,7 @@ static inline void atomic_add(int i, atomic_t *v)
" stxr %w1, %w0, %2\n"
" cbnz %w1, 1b"
: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)
- : "Ir" (i)
- : "cc");
+ : "Ir" (i));
}
static inline int atomic_add_return(int i, atomic_t *v)
@@ -54,7 +53,7 @@ static inline int atomic_add_return(int i, atomic_t *v)
" cbnz %w1, 1b"
: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)
: "Ir" (i)
- : "cc", "memory");
+ : "memory");
smp_mb();
return result;
@@ -71,8 +70,7 @@ static inline void atomic_sub(int i, atomic_t *v)
" stxr %w1, %w0, %2\n"
" cbnz %w1, 1b"
: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)
- : "Ir" (i)
- : "cc");
+ : "Ir" (i));
}
static inline int atomic_sub_return(int i, atomic_t *v)
@@ -87,7 +85,7 @@ static inline int atomic_sub_return(int i, atomic_t *v)
" cbnz %w1, 1b"
: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)
: "Ir" (i)
- : "cc", "memory");
+ : "memory");
smp_mb();
return result;
@@ -109,7 +107,7 @@ static inline int atomic_cmpxchg(atomic_t *ptr, int old, int new)
"2:"
: "=&r" (tmp), "=&r" (oldval), "+Q" (ptr->counter)
: "Ir" (old), "r" (new)
- : "cc", "memory");
+ : "cc");
smp_mb();
return oldval;
diff --git a/xen/include/asm-arm/arm64/spinlock.h b/xen/include/asm-arm/arm64/spinlock.h
index 3a36cfd..04300bc 100644
--- a/xen/include/asm-arm/arm64/spinlock.h
+++ b/xen/include/asm-arm/arm64/spinlock.h
@@ -70,7 +70,7 @@ static always_inline int _raw_read_trylock(raw_rwlock_t *rw)
"1:\n"
: "=&r" (tmp), "+r" (tmp2), "+Q" (rw->lock)
:
- : "cc", "memory");
+ : "memory");
return !tmp2;
}
@@ -86,7 +86,7 @@ static always_inline int _raw_write_trylock(raw_rwlock_t *rw)
"1:\n"
: "=&r" (tmp), "+Q" (rw->lock)
: "r" (0x80000000)
- : "cc", "memory");
+ : "memory");
return !tmp;
}
@@ -102,7 +102,7 @@ static inline void _raw_read_unlock(raw_rwlock_t *rw)
" cbnz %w1, 1b\n"
: "=&r" (tmp), "=&r" (tmp2), "+Q" (rw->lock)
:
- : "cc", "memory");
+ : "memory");
}
static inline void _raw_write_unlock(raw_rwlock_t *rw)
diff --git a/xen/include/asm-arm/arm64/system.h b/xen/include/asm-arm/arm64/system.h
index 9fa698b..fa50ead 100644
--- a/xen/include/asm-arm/arm64/system.h
+++ b/xen/include/asm-arm/arm64/system.h
@@ -16,7 +16,7 @@ static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size
" cbnz %w1, 1b\n"
: "=&r" (ret), "=&r" (tmp), "+Q" (*(u8 *)ptr)
: "r" (x)
- : "cc", "memory");
+ : "memory");
break;
case 2:
asm volatile("// __xchg2\n"
@@ -25,7 +25,7 @@ static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size
" cbnz %w1, 1b\n"
: "=&r" (ret), "=&r" (tmp), "+Q" (*(u16 *)ptr)
: "r" (x)
- : "cc", "memory");
+ : "memory");
break;
case 4:
asm volatile("// __xchg4\n"
@@ -34,7 +34,7 @@ static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size
" cbnz %w1, 1b\n"
: "=&r" (ret), "=&r" (tmp), "+Q" (*(u32 *)ptr)
: "r" (x)
- : "cc", "memory");
+ : "memory");
break;
case 8:
asm volatile("// __xchg8\n"
@@ -43,7 +43,7 @@ static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size
" cbnz %w1, 1b\n"
: "=&r" (ret), "=&r" (tmp), "+Q" (*(u64 *)ptr)
: "r" (x)
- : "cc", "memory");
+ : "memory");
break;
default:
__bad_xchg(ptr, size), ret = 0;
--
1.7.10.4
* [PATCH v2 14/17] xen: arm64: assembly optimised mem* and str*
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (12 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 13/17] xen: arm64: asm: remove redundant "cc" clobbers Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 13:38 ` [PATCH v2 15/17] xen: arm64: optimised clear_page Ian Campbell
` (4 subsequent siblings)
18 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
Taken from Linux v3.14-rc7.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
---
xen/arch/arm/arm64/lib/Makefile | 2 ++
xen/arch/arm/arm64/lib/memchr.S | 43 +++++++++++++++++++++++++++++
xen/arch/arm/arm64/lib/memcpy.S | 52 +++++++++++++++++++++++++++++++++++
xen/arch/arm/arm64/lib/memmove.S | 56 ++++++++++++++++++++++++++++++++++++++
xen/arch/arm/arm64/lib/memset.S | 52 +++++++++++++++++++++++++++++++++++
xen/arch/arm/arm64/lib/strchr.S | 41 ++++++++++++++++++++++++++++
xen/arch/arm/arm64/lib/strrchr.S | 42 ++++++++++++++++++++++++++++
xen/include/asm-arm/string.h | 4 +--
8 files changed, 290 insertions(+), 2 deletions(-)
create mode 100644 xen/arch/arm/arm64/lib/memchr.S
create mode 100644 xen/arch/arm/arm64/lib/memcpy.S
create mode 100644 xen/arch/arm/arm64/lib/memmove.S
create mode 100644 xen/arch/arm/arm64/lib/memset.S
create mode 100644 xen/arch/arm/arm64/lib/strchr.S
create mode 100644 xen/arch/arm/arm64/lib/strrchr.S
diff --git a/xen/arch/arm/arm64/lib/Makefile b/xen/arch/arm/arm64/lib/Makefile
index 32c02c4..9f3b236 100644
--- a/xen/arch/arm/arm64/lib/Makefile
+++ b/xen/arch/arm/arm64/lib/Makefile
@@ -1 +1,3 @@
+obj-y += memcpy.o memmove.o memset.o memchr.o
obj-y += bitops.o find_next_bit.o
+obj-y += strchr.o strrchr.o
diff --git a/xen/arch/arm/arm64/lib/memchr.S b/xen/arch/arm/arm64/lib/memchr.S
new file mode 100644
index 0000000..3cc1b01
--- /dev/null
+++ b/xen/arch/arm/arm64/lib/memchr.S
@@ -0,0 +1,43 @@
+/*
+ * Based on arch/arm/lib/memchr.S
+ *
+ * Copyright (C) 1995-2000 Russell King
+ * Copyright (C) 2013 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/config.h>
+
+/*
+ * Find a character in an area of memory.
+ *
+ * Parameters:
+ * x0 - buf
+ * x1 - c
+ * x2 - n
+ * Returns:
+ * x0 - address of first occurrence of 'c' or 0
+ */
+ENTRY(memchr)
+ and w1, w1, #0xff
+1: subs x2, x2, #1
+ b.mi 2f
+ ldrb w3, [x0], #1
+ cmp w3, w1
+ b.ne 1b
+ sub x0, x0, #1
+ ret
+2: mov x0, #0
+ ret
+ENDPROC(memchr)
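
For readers less used to A64 assembly, the memchr loop above can be read roughly as the following C. This is only a sketch: sketch_memchr is a hypothetical name, and the comments map each statement to the instruction it is assumed to mirror.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C rendering of the assembly loop. */
static void *sketch_memchr(const void *buf, int c, size_t n)
{
	const unsigned char *p = buf;
	unsigned char ch = (unsigned char)c;	/* and w1, w1, #0xff */

	assert(n == 0 || buf != NULL);
	while (n-- != 0) {			/* subs x2, x2, #1; b.mi 2f */
		if (*p++ == ch)			/* ldrb w3, [x0], #1; cmp */
			return (void *)(p - 1);	/* sub x0, x0, #1 */
	}
	return NULL;				/* mov x0, #0 */
}
```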
diff --git a/xen/arch/arm/arm64/lib/memcpy.S b/xen/arch/arm/arm64/lib/memcpy.S
new file mode 100644
index 0000000..c8197c6
--- /dev/null
+++ b/xen/arch/arm/arm64/lib/memcpy.S
@@ -0,0 +1,52 @@
+/*
+ * Copyright (C) 2013 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/config.h>
+
+/*
+ * Copy a buffer from src to dest (alignment handled by the hardware)
+ *
+ * Parameters:
+ * x0 - dest
+ * x1 - src
+ * x2 - n
+ * Returns:
+ * x0 - dest
+ */
+ENTRY(memcpy)
+ mov x4, x0
+ subs x2, x2, #8
+ b.mi 2f
+1: ldr x3, [x1], #8
+ subs x2, x2, #8
+ str x3, [x4], #8
+ b.pl 1b
+2: adds x2, x2, #4
+ b.mi 3f
+ ldr w3, [x1], #4
+ sub x2, x2, #4
+ str w3, [x4], #4
+3: adds x2, x2, #2
+ b.mi 4f
+ ldrh w3, [x1], #2
+ sub x2, x2, #2
+ strh w3, [x4], #2
+4: adds x2, x2, #1
+ b.mi 5f
+ ldrb w3, [x1]
+ strb w3, [x4]
+5: ret
+ENDPROC(memcpy)
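
The structure of the memcpy routine above (an 8-byte main loop, then at most one 4-byte, one 2-byte and one 1-byte copy) can be sketched in C as follows. sketch_memcpy is a hypothetical name; the fixed-size memcpy() calls stand in for the unaligned ldr/str pairs, which portable C cannot express directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C rendering of the chunked copy. */
static void *sketch_memcpy(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	assert(n == 0 || (dest != NULL && src != NULL));

	/* Main loop: 8 bytes per iteration (ldr x3 / str x3). */
	for (; n >= 8; n -= 8, d += 8, s += 8)
		memcpy(d, s, 8);

	/* Tail: fewer than 8 bytes remain, so the low bits of n select the chunks. */
	if (n & 4) { memcpy(d, s, 4); d += 4; s += 4; }	/* ldr w3 / str w3 */
	if (n & 2) { memcpy(d, s, 2); d += 2; s += 2; }	/* ldrh / strh */
	if (n & 1) *d = *s;				/* ldrb / strb */

	return dest;					/* x0 is returned unchanged */
}
```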
diff --git a/xen/arch/arm/arm64/lib/memmove.S b/xen/arch/arm/arm64/lib/memmove.S
new file mode 100644
index 0000000..1bf0936
--- /dev/null
+++ b/xen/arch/arm/arm64/lib/memmove.S
@@ -0,0 +1,56 @@
+/*
+ * Copyright (C) 2013 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/config.h>
+
+/*
+ * Move a buffer from src to dest (alignment handled by the hardware).
+ * If dest <= src, call memcpy, otherwise copy in reverse order.
+ *
+ * Parameters:
+ * x0 - dest
+ * x1 - src
+ * x2 - n
+ * Returns:
+ * x0 - dest
+ */
+ENTRY(memmove)
+ cmp x0, x1
+ b.ls memcpy
+ add x4, x0, x2
+ add x1, x1, x2
+ subs x2, x2, #8
+ b.mi 2f
+1: ldr x3, [x1, #-8]!
+ subs x2, x2, #8
+ str x3, [x4, #-8]!
+ b.pl 1b
+2: adds x2, x2, #4
+ b.mi 3f
+ ldr w3, [x1, #-4]!
+ sub x2, x2, #4
+ str w3, [x4, #-4]!
+3: adds x2, x2, #2
+ b.mi 4f
+ ldrh w3, [x1, #-2]!
+ sub x2, x2, #2
+ strh w3, [x4, #-2]!
+4: adds x2, x2, #1
+ b.mi 5f
+ ldrb w3, [x1, #-1]
+ strb w3, [x4, #-1]
+5: ret
+ENDPROC(memmove)
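
The direction choice in memmove above is what makes overlapping copies safe. A byte-at-a-time C sketch of that choice is below; the assembly uses the same 8/4/2/1 chunking as memcpy, which is omitted here for brevity, and sketch_memmove is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C rendering of the direction selection. */
static void *sketch_memmove(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	assert(n == 0 || (dest != NULL && src != NULL));

	if ((uintptr_t)d <= (uintptr_t)s) {
		/* dest <= src: a forward copy never overwrites unread source. */
		while (n--)
			*d++ = *s++;
	} else {
		/* dest > src: start from the end and walk backwards. */
		d += n;
		s += n;
		while (n--)
			*--d = *--s;
	}
	return dest;
}
```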
diff --git a/xen/arch/arm/arm64/lib/memset.S b/xen/arch/arm/arm64/lib/memset.S
new file mode 100644
index 0000000..25a4fb6
--- /dev/null
+++ b/xen/arch/arm/arm64/lib/memset.S
@@ -0,0 +1,52 @@
+/*
+ * Copyright (C) 2013 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/config.h>
+
+/*
+ * Fill in the buffer with character c (alignment handled by the hardware)
+ *
+ * Parameters:
+ * x0 - buf
+ * x1 - c
+ * x2 - n
+ * Returns:
+ * x0 - buf
+ */
+ENTRY(memset)
+ mov x4, x0
+ and w1, w1, #0xff
+ orr w1, w1, w1, lsl #8
+ orr w1, w1, w1, lsl #16
+ orr x1, x1, x1, lsl #32
+ subs x2, x2, #8
+ b.mi 2f
+1: str x1, [x4], #8
+ subs x2, x2, #8
+ b.pl 1b
+2: adds x2, x2, #4
+ b.mi 3f
+ sub x2, x2, #4
+ str w1, [x4], #4
+3: adds x2, x2, #2
+ b.mi 4f
+ sub x2, x2, #2
+ strh w1, [x4], #2
+4: adds x2, x2, #1
+ b.mi 5f
+ strb w1, [x4]
+5: ret
+ENDPROC(memset)
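
The three orr instructions at the top of memset replicate the fill byte into every byte of a 64-bit register before the store loop runs. A rough C equivalent is sketched below; sketch_fill_pattern and sketch_memset are hypothetical names, and the fixed-size memcpy() calls stand in for the str/strh stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replicate the low byte of c into all eight bytes of a 64-bit value. */
static uint64_t sketch_fill_pattern(int c)
{
	uint64_t p = (uint64_t)c & 0xff;	/* and w1, w1, #0xff */

	p |= p << 8;				/* orr w1, w1, w1, lsl #8 */
	p |= p << 16;				/* orr w1, w1, w1, lsl #16 */
	p |= p << 32;				/* orr x1, x1, x1, lsl #32 */
	return p;
}

/* Hypothetical C rendering of the fill loop and its tail. */
static void *sketch_memset(void *buf, int c, size_t n)
{
	unsigned char *p = buf;
	uint64_t pat = sketch_fill_pattern(c);

	assert(n == 0 || buf != NULL);

	for (; n >= 8; n -= 8, p += 8)
		memcpy(p, &pat, 8);		/* str x1, [x4], #8 */
	if (n & 4) { memcpy(p, &pat, 4); p += 4; }
	if (n & 2) { memcpy(p, &pat, 2); p += 2; }
	if (n & 1) *p = (unsigned char)pat;
	return buf;
}
```

Because every byte of the pattern is identical, the partial stores in the tail give the same result on either endianness.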
diff --git a/xen/arch/arm/arm64/lib/strchr.S b/xen/arch/arm/arm64/lib/strchr.S
new file mode 100644
index 0000000..9e265e4
--- /dev/null
+++ b/xen/arch/arm/arm64/lib/strchr.S
@@ -0,0 +1,41 @@
+/*
+ * Based on arch/arm/lib/strchr.S
+ *
+ * Copyright (C) 1995-2000 Russell King
+ * Copyright (C) 2013 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/config.h>
+
+/*
+ * Find the first occurrence of a character in a string.
+ *
+ * Parameters:
+ * x0 - str
+ * x1 - c
+ * Returns:
+ * x0 - address of first occurrence of 'c' or 0
+ */
+ENTRY(strchr)
+ and w1, w1, #0xff
+1: ldrb w2, [x0], #1
+ cmp w2, w1
+ ccmp w2, wzr, #4, ne
+ b.ne 1b
+ sub x0, x0, #1
+ cmp w2, w1
+ csel x0, x0, xzr, eq
+ ret
+ENDPROC(strchr)
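
The cmp/ccmp pair in strchr folds two loop-exit tests (match found, or terminator reached) into a single b.ne. In C the same logic looks roughly like this; sketch_strchr is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C rendering of the strchr loop. */
static char *sketch_strchr(const char *str, int c)
{
	unsigned char ch = (unsigned char)c;	/* and w1, w1, #0xff */
	unsigned char cur;

	assert(str != NULL);

	/* Keep going while the byte is neither the target nor the NUL. */
	do {
		cur = (unsigned char)*str++;	/* ldrb w2, [x0], #1 */
	} while (cur != ch && cur != '\0');	/* cmp; ccmp; b.ne 1b */

	/* csel: hand back the address only if the loop stopped on a match. */
	return cur == ch ? (char *)(str - 1) : NULL;
}
```

Searching for '\0' returns a pointer to the terminator, because the match test is evaluated before the end-of-string test decides the result.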
diff --git a/xen/arch/arm/arm64/lib/strrchr.S b/xen/arch/arm/arm64/lib/strrchr.S
new file mode 100644
index 0000000..3791754
--- /dev/null
+++ b/xen/arch/arm/arm64/lib/strrchr.S
@@ -0,0 +1,42 @@
+/*
+ * Based on arch/arm/lib/strrchr.S
+ *
+ * Copyright (C) 1995-2000 Russell King
+ * Copyright (C) 2013 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/config.h>
+
+/*
+ * Find the last occurrence of a character in a string.
+ *
+ * Parameters:
+ * x0 - str
+ * x1 - c
+ * Returns:
+ * x0 - address of last occurrence of 'c' or 0
+ */
+ENTRY(strrchr)
+ mov x3, #0
+ and w1, w1, #0xff
+1: ldrb w2, [x0], #1
+ cbz w2, 2f
+ cmp w2, w1
+ b.ne 1b
+ sub x3, x0, #1
+ b 1b
+2: mov x0, x3
+ ret
+ENDPROC(strrchr)
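
A rough C reading of the strrchr loop above is sketched below (sketch_strrchr is a hypothetical name). One detail worth noting when comparing against ISO C: the loop tests for the terminator (cbz) before comparing against c, so as written a search for '\0' appears to return 0 rather than a pointer to the terminator. The sketch mirrors the assembly, not the ISO C behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C rendering of the strrchr loop. */
static char *sketch_strrchr(const char *str, int c)
{
	unsigned char ch = (unsigned char)c;	/* and w1, w1, #0xff */
	const char *last = NULL;		/* mov x3, #0 */
	unsigned char cur;

	assert(str != NULL);

	while ((cur = (unsigned char)*str++) != '\0') {	/* ldrb; cbz w2, 2f */
		if (cur == ch)				/* cmp w2, w1 */
			last = str - 1;			/* sub x3, x0, #1 */
	}
	return (char *)last;				/* mov x0, x3 */
}
```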
diff --git a/xen/include/asm-arm/string.h b/xen/include/asm-arm/string.h
index 7d8b35a..3242762 100644
--- a/xen/include/asm-arm/string.h
+++ b/xen/include/asm-arm/string.h
@@ -3,8 +3,6 @@
#include <xen/config.h>
-#if defined(CONFIG_ARM_32)
-
/*
* We don't do inline string functions, since the
* optimised inline asm versions are not small.
@@ -29,6 +27,8 @@ extern void * memset(void *, int, __kernel_size_t);
#define __HAVE_ARCH_MEMCHR
extern void * memchr(const void *, int, __kernel_size_t);
+#if defined(CONFIG_ARM_32)
+
extern void __memzero(void *ptr, __kernel_size_t n);
#define memset(p,v,n) \
--
1.7.10.4
* [PATCH v2 15/17] xen: arm64: optimised clear_page
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (13 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 14/17] xen: arm64: assembly optimised mem* and str* Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 14:16 ` Julien Grall
2014-03-26 13:38 ` [PATCH v2 16/17] xen: arm: refactor xchg and cmpxchg into their own headers Ian Campbell
` (3 subsequent siblings)
18 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
Taken from Linux v3.14-rc7.
The clear_page declaration now needs to be within the !__ASSEMBLY__ section.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
xen/arch/arm/arm64/lib/Makefile | 1 +
xen/arch/arm/arm64/lib/clear_page.S | 36 +++++++++++++++++++++++++++++++++++
xen/include/asm-arm/page.h | 9 +++++++--
3 files changed, 44 insertions(+), 2 deletions(-)
create mode 100644 xen/arch/arm/arm64/lib/clear_page.S
diff --git a/xen/arch/arm/arm64/lib/Makefile b/xen/arch/arm/arm64/lib/Makefile
index 9f3b236..b895afa 100644
--- a/xen/arch/arm/arm64/lib/Makefile
+++ b/xen/arch/arm/arm64/lib/Makefile
@@ -1,3 +1,4 @@
obj-y += memcpy.o memmove.o memset.o memchr.o
+obj-y += clear_page.o
obj-y += bitops.o find_next_bit.o
obj-y += strchr.o strrchr.o
diff --git a/xen/arch/arm/arm64/lib/clear_page.S b/xen/arch/arm/arm64/lib/clear_page.S
new file mode 100644
index 0000000..8d5cadb
--- /dev/null
+++ b/xen/arch/arm/arm64/lib/clear_page.S
@@ -0,0 +1,36 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/config.h>
+
+/*
+ * Clear page @dest
+ *
+ * Parameters:
+ * x0 - dest
+ */
+ENTRY(clear_page)
+ mrs x1, dczid_el0
+ and w1, w1, #0xf
+ mov x2, #4
+ lsl x1, x2, x1
+
+1: dc zva, x0
+ add x0, x0, x1
+ tst x0, #(PAGE_SIZE - 1)
+ b.ne 1b
+ ret
+ENDPROC(clear_page)
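
The first four instructions of clear_page derive the DC ZVA block size from DCZID_EL0: the low four bits (BS) hold log2 of the block size in 4-byte words, so the size in bytes is 4 << BS. The loop then zeroes one block per iteration until the address wraps to the next page boundary. The C sketch below models this arithmetic only: memset() stands in for dc zva, the DCZID value is passed in as a parameter instead of being read with mrs, and SKETCH_PAGE_SIZE and the function names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* DCZID_EL0[3:0] is log2(block size in words); convert to bytes. */
static size_t sketch_dczva_block_bytes(uint64_t dczid)
{
	return (size_t)4 << (dczid & 0xf);	/* and w1, w1, #0xf; lsl x1, x2, x1 */
}

/* Zero one page, one DC ZVA sized block at a time. */
static void sketch_clear_page(void *page, uint64_t dczid)
{
	unsigned char *p = page;
	size_t step = sketch_dczva_block_bytes(dczid);

	assert(((uintptr_t)p & (SKETCH_PAGE_SIZE - 1)) == 0);
	assert(step <= SKETCH_PAGE_SIZE);

	do {
		memset(p, 0, step);		/* stands in for: dc zva, x0 */
		p += step;			/* add x0, x0, x1 */
	} while (((uintptr_t)p & (SKETCH_PAGE_SIZE - 1)) != 0);	/* tst; b.ne 1b */
}
```

The mask keeps only the BS field; bit 4 of the register (DZP, which reports whether DC ZVA is prohibited) is not examined by the routine above.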
diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index d18ec2a..e880ae8 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -341,6 +341,13 @@ static inline int gva_to_ipa(vaddr_t va, paddr_t *paddr)
/* Bits in the PAR returned by va_to_par */
#define PAR_FAULT 0x1
+
+#ifdef CONFIG_ARM_32
+#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
+#else
+extern void clear_page(void *to);
+#endif
+
#endif /* __ASSEMBLY__ */
/*
@@ -382,8 +389,6 @@ static inline int gva_to_ipa(vaddr_t va, paddr_t *paddr)
#define third_table_offset(va) TABLE_OFFSET(third_linear_offset(va))
#define zeroeth_table_offset(va) TABLE_OFFSET(zeroeth_linear_offset(va))
-#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
-
#define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)
#endif /* __ARM_PAGE_H__ */
--
1.7.10.4
* Re: [PATCH v2 15/17] xen: arm64: optimised clear_page
2014-03-26 13:38 ` [PATCH v2 15/17] xen: arm64: optimised clear_page Ian Campbell
@ 2014-03-26 14:16 ` Julien Grall
2014-04-02 16:23 ` [PATCH v3 " Ian Campbell
0 siblings, 1 reply; 31+ messages in thread
From: Julien Grall @ 2014-03-26 14:16 UTC (permalink / raw)
To: Ian Campbell; +Cc: stefano.stabellini, tim, xen-devel
Hi Ian,
On 03/26/2014 01:38 PM, Ian Campbell wrote:
> Taken from Linux v3.14-rc7.
>
> The clear_page header now needs to be within the !__ASSEMBLY__
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> ---
[..]
> +
> +#ifdef CONFIG_ARM_32
> +#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
> +#else
> +extern void clear_page(void *to);
> +#endif
> +
I would prefer it if you moved the definitions into arm{32,64}/page.h.
Regards,
--
Julien Grall
* [PATCH v3 15/17] xen: arm64: optimised clear_page
2014-03-26 14:16 ` Julien Grall
@ 2014-04-02 16:23 ` Ian Campbell
2014-04-02 16:26 ` Julien Grall
0 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2014-04-02 16:23 UTC (permalink / raw)
To: Julien Grall; +Cc: stefano.stabellini, tim, xen-devel
> > +#ifdef CONFIG_ARM_32
> > +#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
> > +#else
> > +extern void clear_page(void *to);
> > +#endif
> > +
>
> I would prefer if you move the defined in arm{32,64}/page.h.
Ack. Just resending this one:
------------<8------------------
From f679cbc3a191c43aa303b4e74f241e8755430ba0 Mon Sep 17 00:00:00 2001
From: Ian Campbell <ian.campbell@citrix.com>
Date: Wed, 19 Mar 2014 17:19:56 +0000
Subject: [PATCH] xen: arm64: optimised clear_page
Taken from Linux v3.14-rc7.
The clear_page declaration now needs to be within the !__ASSEMBLY__ section.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
---
v3: Move prototypes to arm{32,64}/page.h
---
xen/arch/arm/arm64/lib/Makefile | 1 +
xen/arch/arm/arm64/lib/clear_page.S | 36 +++++++++++++++++++++++++++++++++++
xen/include/asm-arm/arm32/page.h | 2 ++
xen/include/asm-arm/arm64/page.h | 2 ++
xen/include/asm-arm/page.h | 2 --
5 files changed, 41 insertions(+), 2 deletions(-)
create mode 100644 xen/arch/arm/arm64/lib/clear_page.S
diff --git a/xen/arch/arm/arm64/lib/Makefile b/xen/arch/arm/arm64/lib/Makefile
index 9f3b236..b895afa 100644
--- a/xen/arch/arm/arm64/lib/Makefile
+++ b/xen/arch/arm/arm64/lib/Makefile
@@ -1,3 +1,4 @@
obj-y += memcpy.o memmove.o memset.o memchr.o
+obj-y += clear_page.o
obj-y += bitops.o find_next_bit.o
obj-y += strchr.o strrchr.o
diff --git a/xen/arch/arm/arm64/lib/clear_page.S b/xen/arch/arm/arm64/lib/clear_page.S
new file mode 100644
index 0000000..8d5cadb
--- /dev/null
+++ b/xen/arch/arm/arm64/lib/clear_page.S
@@ -0,0 +1,36 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/config.h>
+
+/*
+ * Clear page @dest
+ *
+ * Parameters:
+ * x0 - dest
+ */
+ENTRY(clear_page)
+ mrs x1, dczid_el0
+ and w1, w1, #0xf
+ mov x2, #4
+ lsl x1, x2, x1
+
+1: dc zva, x0
+ add x0, x0, x1
+ tst x0, #(PAGE_SIZE - 1)
+ b.ne 1b
+ ret
+ENDPROC(clear_page)
diff --git a/xen/include/asm-arm/arm32/page.h b/xen/include/asm-arm/arm32/page.h
index 191a108..a4c1a1a 100644
--- a/xen/include/asm-arm/arm32/page.h
+++ b/xen/include/asm-arm/arm32/page.h
@@ -111,6 +111,8 @@ static inline uint64_t gva_to_ipa_par(vaddr_t va)
return par;
}
+#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
+
#endif /* __ASSEMBLY__ */
#endif /* __ARM_ARM32_PAGE_H__ */
diff --git a/xen/include/asm-arm/arm64/page.h b/xen/include/asm-arm/arm64/page.h
index 20b4c5a..91e1914 100644
--- a/xen/include/asm-arm/arm64/page.h
+++ b/xen/include/asm-arm/arm64/page.h
@@ -105,6 +105,8 @@ static inline uint64_t gva_to_ipa_par(vaddr_t va)
return par;
}
+extern void clear_page(void *to);
+
#endif /* __ASSEMBLY__ */
#endif /* __ARM_ARM64_PAGE_H__ */
diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index d18ec2a..c77ba85 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -382,8 +382,6 @@ static inline int gva_to_ipa(vaddr_t va, paddr_t *paddr)
#define third_table_offset(va) TABLE_OFFSET(third_linear_offset(va))
#define zeroeth_table_offset(va) TABLE_OFFSET(zeroeth_linear_offset(va))
-#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
-
#define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)
#endif /* __ARM_PAGE_H__ */
--
1.7.10.4
> Regards,
>
* Re: [PATCH v3 15/17] xen: arm64: optimised clear_page
2014-04-02 16:23 ` [PATCH v3 " Ian Campbell
@ 2014-04-02 16:26 ` Julien Grall
0 siblings, 0 replies; 31+ messages in thread
From: Julien Grall @ 2014-04-02 16:26 UTC (permalink / raw)
To: Ian Campbell; +Cc: stefano.stabellini, tim, xen-devel
On 04/02/2014 05:23 PM, Ian Campbell wrote:
>
>>> +#ifdef CONFIG_ARM_32
>>> +#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
>>> +#else
>>> +extern void clear_page(void *to);
>>> +#endif
>>> +
>>
>> I would prefer if you move the defined in arm{32,64}/page.h.
>
> Ack. Just resending this one:
Thanks!
> ------------<8------------------
>
>
> From f679cbc3a191c43aa303b4e74f241e8755430ba0 Mon Sep 17 00:00:00 2001
> From: Ian Campbell <ian.campbell@citrix.com>
> Date: Wed, 19 Mar 2014 17:19:56 +0000
> Subject: [PATCH] xen: arm64: optimised clear_page
>
> Taken from Linux v3.14-rc7.
>
> The clear_page header now needs to be within the !__ASSEMBLY__
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Julien Grall <julien.grall@linaro.org>
> ---
> v3: Move prototypes to arm{32,64}/page.h
> ---
> xen/arch/arm/arm64/lib/Makefile | 1 +
> xen/arch/arm/arm64/lib/clear_page.S | 36 +++++++++++++++++++++++++++++++++++
> xen/include/asm-arm/arm32/page.h | 2 ++
> xen/include/asm-arm/arm64/page.h | 2 ++
> xen/include/asm-arm/page.h | 2 --
> 5 files changed, 41 insertions(+), 2 deletions(-)
> create mode 100644 xen/arch/arm/arm64/lib/clear_page.S
>
> diff --git a/xen/arch/arm/arm64/lib/Makefile b/xen/arch/arm/arm64/lib/Makefile
> index 9f3b236..b895afa 100644
> --- a/xen/arch/arm/arm64/lib/Makefile
> +++ b/xen/arch/arm/arm64/lib/Makefile
> @@ -1,3 +1,4 @@
> obj-y += memcpy.o memmove.o memset.o memchr.o
> +obj-y += clear_page.o
> obj-y += bitops.o find_next_bit.o
> obj-y += strchr.o strrchr.o
> diff --git a/xen/arch/arm/arm64/lib/clear_page.S b/xen/arch/arm/arm64/lib/clear_page.S
> new file mode 100644
> index 0000000..8d5cadb
> --- /dev/null
> +++ b/xen/arch/arm/arm64/lib/clear_page.S
> @@ -0,0 +1,36 @@
> +/*
> + * Copyright (C) 2012 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/config.h>
> +
> +/*
> + * Clear page @dest
> + *
> + * Parameters:
> + * x0 - dest
> + */
> +ENTRY(clear_page)
> + mrs x1, dczid_el0
> + and w1, w1, #0xf
> + mov x2, #4
> + lsl x1, x2, x1
> +
> +1: dc zva, x0
> + add x0, x0, x1
> + tst x0, #(PAGE_SIZE - 1)
> + b.ne 1b
> + ret
> +ENDPROC(clear_page)
> diff --git a/xen/include/asm-arm/arm32/page.h b/xen/include/asm-arm/arm32/page.h
> index 191a108..a4c1a1a 100644
> --- a/xen/include/asm-arm/arm32/page.h
> +++ b/xen/include/asm-arm/arm32/page.h
> @@ -111,6 +111,8 @@ static inline uint64_t gva_to_ipa_par(vaddr_t va)
> return par;
> }
>
> +#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
> +
> #endif /* __ASSEMBLY__ */
>
> #endif /* __ARM_ARM32_PAGE_H__ */
> diff --git a/xen/include/asm-arm/arm64/page.h b/xen/include/asm-arm/arm64/page.h
> index 20b4c5a..91e1914 100644
> --- a/xen/include/asm-arm/arm64/page.h
> +++ b/xen/include/asm-arm/arm64/page.h
> @@ -105,6 +105,8 @@ static inline uint64_t gva_to_ipa_par(vaddr_t va)
> return par;
> }
>
> +extern void clear_page(void *to);
> +
> #endif /* __ASSEMBLY__ */
>
> #endif /* __ARM_ARM64_PAGE_H__ */
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index d18ec2a..c77ba85 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -382,8 +382,6 @@ static inline int gva_to_ipa(vaddr_t va, paddr_t *paddr)
> #define third_table_offset(va) TABLE_OFFSET(third_linear_offset(va))
> #define zeroeth_table_offset(va) TABLE_OFFSET(zeroeth_linear_offset(va))
>
> -#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
> -
> #define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)
>
> #endif /* __ARM_PAGE_H__ */
>
--
Julien Grall
* [PATCH v2 16/17] xen: arm: refactor xchg and cmpxchg into their own headers
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (14 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 15/17] xen: arm64: optimised clear_page Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 14:18 ` Julien Grall
2014-03-26 13:38 ` [PATCH v2 17/17] xen: arm: document what low level primitives we have imported from Linux Ian Campbell
` (2 subsequent siblings)
18 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
Since these functions are taken from Linux, this makes it easier to compare
against the Linux cmpxchg.h headers (which were split out from Linux's
system.h a while back).
Since these functions are from Linux the intention is to use Linux coding
style, therefore include a suitable emacs magic block.
For this reason also fix up the indentation in the 32-bit version to use hard
tabs while moving it. The 64-bit version was already correct.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
xen/include/asm-arm/arm32/cmpxchg.h | 146 ++++++++++++++++++++++++++++++
xen/include/asm-arm/arm32/system.h | 135 +---------------------------
xen/include/asm-arm/arm64/cmpxchg.h | 167 +++++++++++++++++++++++++++++++++++
xen/include/asm-arm/arm64/system.h | 155 +-------------------------------
4 files changed, 315 insertions(+), 288 deletions(-)
create mode 100644 xen/include/asm-arm/arm32/cmpxchg.h
create mode 100644 xen/include/asm-arm/arm64/cmpxchg.h
diff --git a/xen/include/asm-arm/arm32/cmpxchg.h b/xen/include/asm-arm/arm32/cmpxchg.h
new file mode 100644
index 0000000..70c6090
--- /dev/null
+++ b/xen/include/asm-arm/arm32/cmpxchg.h
@@ -0,0 +1,146 @@
+#ifndef __ASM_ARM32_CMPXCHG_H
+#define __ASM_ARM32_CMPXCHG_H
+
+extern void __bad_xchg(volatile void *, int);
+
+static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size)
+{
+ unsigned long ret;
+ unsigned int tmp;
+
+ smp_mb();
+
+ switch (size) {
+ case 1:
+ asm volatile("@ __xchg1\n"
+ "1: ldrexb %0, [%3]\n"
+ " strexb %1, %2, [%3]\n"
+ " teq %1, #0\n"
+ " bne 1b"
+ : "=&r" (ret), "=&r" (tmp)
+ : "r" (x), "r" (ptr)
+ : "memory", "cc");
+ break;
+ case 4:
+ asm volatile("@ __xchg4\n"
+ "1: ldrex %0, [%3]\n"
+ " strex %1, %2, [%3]\n"
+ " teq %1, #0\n"
+ " bne 1b"
+ : "=&r" (ret), "=&r" (tmp)
+ : "r" (x), "r" (ptr)
+ : "memory", "cc");
+ break;
+ default:
+ __bad_xchg(ptr, size), ret = 0;
+ break;
+ }
+ smp_mb();
+
+ return ret;
+}
+
+/*
+ * Atomic compare and exchange. Compare OLD with MEM, if identical,
+ * store NEW in MEM. Return the initial value in MEM. Success is
+ * indicated by comparing RETURN with OLD.
+ */
+
+extern void __bad_cmpxchg(volatile void *ptr, int size);
+
+static always_inline unsigned long __cmpxchg(
+ volatile void *ptr, unsigned long old, unsigned long new, int size)
+{
+ unsigned long oldval, res;
+
+ switch (size) {
+ case 1:
+ do {
+ asm volatile("@ __cmpxchg1\n"
+ " ldrexb %1, [%2]\n"
+ " mov %0, #0\n"
+ " teq %1, %3\n"
+ " strexbeq %0, %4, [%2]\n"
+ : "=&r" (res), "=&r" (oldval)
+ : "r" (ptr), "Ir" (old), "r" (new)
+ : "memory", "cc");
+ } while (res);
+ break;
+ case 2:
+ do {
+ asm volatile("@ __cmpxchg2\n"
+ " ldrexh %1, [%2]\n"
+ " mov %0, #0\n"
+ " teq %1, %3\n"
+ " strexheq %0, %4, [%2]\n"
+ : "=&r" (res), "=&r" (oldval)
+ : "r" (ptr), "Ir" (old), "r" (new)
+ : "memory", "cc");
+ } while (res);
+ break;
+ case 4:
+ do {
+ asm volatile("@ __cmpxchg4\n"
+ " ldrex %1, [%2]\n"
+ " mov %0, #0\n"
+ " teq %1, %3\n"
+ " strexeq %0, %4, [%2]\n"
+ : "=&r" (res), "=&r" (oldval)
+ : "r" (ptr), "Ir" (old), "r" (new)
+ : "memory", "cc");
+ } while (res);
+ break;
+#if 0
+ case 8:
+ do {
+ asm volatile("@ __cmpxchg8\n"
+ " ldrexd %1, [%2]\n"
+ " mov %0, #0\n"
+ " teq %1, %3\n"
+ " strexdeq %0, %4, [%2]\n"
+ : "=&r" (res), "=&r" (oldval)
+ : "r" (ptr), "Ir" (old), "r" (new)
+ : "memory", "cc");
+ } while (res);
+ break;
+#endif
+ default:
+ __bad_cmpxchg(ptr, size);
+ oldval = 0;
+ }
+
+ return oldval;
+}
+
+static inline unsigned long __cmpxchg_mb(volatile void *ptr, unsigned long old,
+ unsigned long new, int size)
+{
+ unsigned long ret;
+
+ smp_mb();
+ ret = __cmpxchg(ptr, old, new, size);
+ smp_mb();
+
+ return ret;
+}
+
+#define cmpxchg(ptr,o,n) \
+ ((__typeof__(*(ptr)))__cmpxchg_mb((ptr), \
+ (unsigned long)(o), \
+ (unsigned long)(n), \
+ sizeof(*(ptr))))
+
+#define cmpxchg_local(ptr,o,n) \
+ ((__typeof__(*(ptr)))__cmpxchg((ptr), \
+ (unsigned long)(o), \
+ (unsigned long)(n), \
+ sizeof(*(ptr))))
+#endif
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 8
+ * indent-tabs-mode: t
+ * End:
+ */
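
The contract stated in the comment above ("Return the initial value in MEM. Success is indicated by comparing RETURN with OLD.") is what callers build retry loops on. The sketch below shows that calling pattern only. sketch_cmpxchg is a hypothetical stand-in implemented with the GCC/Clang __atomic_compare_exchange_n builtin so the example runs on any target; it is not the ldrex/strex code from this header, and sketch_atomic_max is likewise an invented example caller.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in with the same contract as cmpxchg(): store new_val only if
 * *ptr == old, and return the value that was found in *ptr.
 */
static unsigned long sketch_cmpxchg(unsigned long *ptr, unsigned long old,
				    unsigned long new_val)
{
	assert(ptr != NULL);
	/* On failure the builtin writes the observed value back into 'old'. */
	__atomic_compare_exchange_n(ptr, &old, new_val, 0,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return old;
}

/* Example caller: raise *ptr to at least val and return the resulting value. */
static unsigned long sketch_atomic_max(unsigned long *ptr, unsigned long val)
{
	unsigned long cur = __atomic_load_n(ptr, __ATOMIC_RELAXED);

	for (;;) {
		unsigned long seen;

		if (cur >= val)
			return cur;		/* already large enough */
		seen = sketch_cmpxchg(ptr, cur, val);
		if (seen == cur)
			return val;		/* RETURN == OLD: swap succeeded */
		cur = seen;			/* lost a race: retry with new value */
	}
}
```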
diff --git a/xen/include/asm-arm/arm32/system.h b/xen/include/asm-arm/arm32/system.h
index dfaa3b6..b47b942 100644
--- a/xen/include/asm-arm/arm32/system.h
+++ b/xen/include/asm-arm/arm32/system.h
@@ -2,140 +2,7 @@
#ifndef __ASM_ARM32_SYSTEM_H
#define __ASM_ARM32_SYSTEM_H
-extern void __bad_xchg(volatile void *, int);
-
-static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size)
-{
- unsigned long ret;
- unsigned int tmp;
-
- smp_mb();
-
- switch (size) {
- case 1:
- asm volatile("@ __xchg1\n"
- "1: ldrexb %0, [%3]\n"
- " strexb %1, %2, [%3]\n"
- " teq %1, #0\n"
- " bne 1b"
- : "=&r" (ret), "=&r" (tmp)
- : "r" (x), "r" (ptr)
- : "memory", "cc");
- break;
- case 4:
- asm volatile("@ __xchg4\n"
- "1: ldrex %0, [%3]\n"
- " strex %1, %2, [%3]\n"
- " teq %1, #0\n"
- " bne 1b"
- : "=&r" (ret), "=&r" (tmp)
- : "r" (x), "r" (ptr)
- : "memory", "cc");
- break;
- default:
- __bad_xchg(ptr, size), ret = 0;
- break;
- }
- smp_mb();
-
- return ret;
-}
-
-/*
- * Atomic compare and exchange. Compare OLD with MEM, if identical,
- * store NEW in MEM. Return the initial value in MEM. Success is
- * indicated by comparing RETURN with OLD.
- */
-
-extern void __bad_cmpxchg(volatile void *ptr, int size);
-
-static always_inline unsigned long __cmpxchg(
- volatile void *ptr, unsigned long old, unsigned long new, int size)
-{
- unsigned long /*long*/ oldval, res;
-
- switch (size) {
- case 1:
- do {
- asm volatile("@ __cmpxchg1\n"
- " ldrexb %1, [%2]\n"
- " mov %0, #0\n"
- " teq %1, %3\n"
- " strexbeq %0, %4, [%2]\n"
- : "=&r" (res), "=&r" (oldval)
- : "r" (ptr), "Ir" (old), "r" (new)
- : "memory", "cc");
- } while (res);
- break;
- case 2:
- do {
- asm volatile("@ __cmpxchg2\n"
- " ldrexh %1, [%2]\n"
- " mov %0, #0\n"
- " teq %1, %3\n"
- " strexheq %0, %4, [%2]\n"
- : "=&r" (res), "=&r" (oldval)
- : "r" (ptr), "Ir" (old), "r" (new)
- : "memory", "cc");
- } while (res);
- break;
- case 4:
- do {
- asm volatile("@ __cmpxchg4\n"
- " ldrex %1, [%2]\n"
- " mov %0, #0\n"
- " teq %1, %3\n"
- " strexeq %0, %4, [%2]\n"
- : "=&r" (res), "=&r" (oldval)
- : "r" (ptr), "Ir" (old), "r" (new)
- : "memory", "cc");
- } while (res);
- break;
-#if 0
- case 8:
- do {
- asm volatile("@ __cmpxchg8\n"
- " ldrexd %1, [%2]\n"
- " mov %0, #0\n"
- " teq %1, %3\n"
- " strexdeq %0, %4, [%2]\n"
- : "=&r" (res), "=&r" (oldval)
- : "r" (ptr), "Ir" (old), "r" (new)
- : "memory", "cc");
- } while (res);
- break;
-#endif
- default:
- __bad_cmpxchg(ptr, size);
- oldval = 0;
- }
-
- return oldval;
-}
-
-static inline unsigned long __cmpxchg_mb(volatile void *ptr, unsigned long old,
- unsigned long new, int size)
-{
- unsigned long ret;
-
- smp_mb();
- ret = __cmpxchg(ptr, old, new, size);
- smp_mb();
-
- return ret;
-}
-
-#define cmpxchg(ptr,o,n) \
- ((__typeof__(*(ptr)))__cmpxchg_mb((ptr), \
- (unsigned long)(o), \
- (unsigned long)(n), \
- sizeof(*(ptr))))
-
-#define cmpxchg_local(ptr,o,n) \
- ((__typeof__(*(ptr)))__cmpxchg((ptr), \
- (unsigned long)(o), \
- (unsigned long)(n), \
- sizeof(*(ptr))))
+#include <asm/arm32/cmpxchg.h>
#define local_irq_disable() asm volatile ( "cpsid i @ local_irq_disable\n" : : : "cc" )
#define local_irq_enable() asm volatile ( "cpsie i @ local_irq_enable\n" : : : "cc" )
diff --git a/xen/include/asm-arm/arm64/cmpxchg.h b/xen/include/asm-arm/arm64/cmpxchg.h
new file mode 100644
index 0000000..4e930ce
--- /dev/null
+++ b/xen/include/asm-arm/arm64/cmpxchg.h
@@ -0,0 +1,167 @@
+#ifndef __ASM_ARM64_CMPXCHG_H
+#define __ASM_ARM64_CMPXCHG_H
+
+extern void __bad_xchg(volatile void *, int);
+
+static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size)
+{
+ unsigned long ret, tmp;
+
+ switch (size) {
+ case 1:
+ asm volatile("// __xchg1\n"
+ "1: ldxrb %w0, %2\n"
+ " stlxrb %w1, %w3, %2\n"
+ " cbnz %w1, 1b\n"
+ : "=&r" (ret), "=&r" (tmp), "+Q" (*(u8 *)ptr)
+ : "r" (x)
+ : "memory");
+ break;
+ case 2:
+ asm volatile("// __xchg2\n"
+ "1: ldxrh %w0, %2\n"
+ " stlxrh %w1, %w3, %2\n"
+ " cbnz %w1, 1b\n"
+ : "=&r" (ret), "=&r" (tmp), "+Q" (*(u16 *)ptr)
+ : "r" (x)
+ : "memory");
+ break;
+ case 4:
+ asm volatile("// __xchg4\n"
+ "1: ldxr %w0, %2\n"
+ " stlxr %w1, %w3, %2\n"
+ " cbnz %w1, 1b\n"
+ : "=&r" (ret), "=&r" (tmp), "+Q" (*(u32 *)ptr)
+ : "r" (x)
+ : "memory");
+ break;
+ case 8:
+ asm volatile("// __xchg8\n"
+ "1: ldxr %0, %2\n"
+ " stlxr %w1, %3, %2\n"
+ " cbnz %w1, 1b\n"
+ : "=&r" (ret), "=&r" (tmp), "+Q" (*(u64 *)ptr)
+ : "r" (x)
+ : "memory");
+ break;
+ default:
+ __bad_xchg(ptr, size), ret = 0;
+ break;
+ }
+
+ smp_mb();
+ return ret;
+}
+
+#define xchg(ptr,x) \
+ ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
+
+extern void __bad_cmpxchg(volatile void *ptr, int size);
+
+static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
+ unsigned long new, int size)
+{
+ unsigned long oldval = 0, res;
+
+ switch (size) {
+ case 1:
+ do {
+ asm volatile("// __cmpxchg1\n"
+ " ldxrb %w1, %2\n"
+ " mov %w0, #0\n"
+ " cmp %w1, %w3\n"
+ " b.ne 1f\n"
+ " stxrb %w0, %w4, %2\n"
+ "1:\n"
+ : "=&r" (res), "=&r" (oldval), "+Q" (*(u8 *)ptr)
+ : "Ir" (old), "r" (new)
+ : "cc");
+ } while (res);
+ break;
+
+ case 2:
+ do {
+ asm volatile("// __cmpxchg2\n"
+ " ldxrh %w1, %2\n"
+ " mov %w0, #0\n"
+ " cmp %w1, %w3\n"
+ " b.ne 1f\n"
+ " stxrh %w0, %w4, %2\n"
+ "1:\n"
+ : "=&r" (res), "=&r" (oldval), "+Q" (*(u16 *)ptr)
+ : "Ir" (old), "r" (new)
+ : "cc");
+ } while (res);
+ break;
+
+ case 4:
+ do {
+ asm volatile("// __cmpxchg4\n"
+ " ldxr %w1, %2\n"
+ " mov %w0, #0\n"
+ " cmp %w1, %w3\n"
+ " b.ne 1f\n"
+ " stxr %w0, %w4, %2\n"
+ "1:\n"
+ : "=&r" (res), "=&r" (oldval), "+Q" (*(u32 *)ptr)
+ : "Ir" (old), "r" (new)
+ : "cc");
+ } while (res);
+ break;
+
+ case 8:
+ do {
+ asm volatile("// __cmpxchg8\n"
+ " ldxr %1, %2\n"
+ " mov %w0, #0\n"
+ " cmp %1, %3\n"
+ " b.ne 1f\n"
+ " stxr %w0, %4, %2\n"
+ "1:\n"
+ : "=&r" (res), "=&r" (oldval), "+Q" (*(u64 *)ptr)
+ : "Ir" (old), "r" (new)
+ : "cc");
+ } while (res);
+ break;
+
+ default:
+ __bad_cmpxchg(ptr, size);
+ oldval = 0;
+ }
+
+ return oldval;
+}
+
+static inline unsigned long __cmpxchg_mb(volatile void *ptr, unsigned long old,
+ unsigned long new, int size)
+{
+ unsigned long ret;
+
+ smp_mb();
+ ret = __cmpxchg(ptr, old, new, size);
+ smp_mb();
+
+ return ret;
+}
+
+#define cmpxchg(ptr,o,n) \
+ ((__typeof__(*(ptr)))__cmpxchg_mb((ptr), \
+ (unsigned long)(o), \
+ (unsigned long)(n), \
+ sizeof(*(ptr))))
+
+#define cmpxchg_local(ptr,o,n) \
+ ((__typeof__(*(ptr)))__cmpxchg((ptr), \
+ (unsigned long)(o), \
+ (unsigned long)(n), \
+ sizeof(*(ptr))))
+
+#endif
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 8
+ * indent-tabs-mode: t
+ * End:
+ */
diff --git a/xen/include/asm-arm/arm64/system.h b/xen/include/asm-arm/arm64/system.h
index fa50ead..6efced3 100644
--- a/xen/include/asm-arm/arm64/system.h
+++ b/xen/include/asm-arm/arm64/system.h
@@ -2,160 +2,7 @@
#ifndef __ASM_ARM64_SYSTEM_H
#define __ASM_ARM64_SYSTEM_H
-extern void __bad_xchg(volatile void *, int);
-
-static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size)
-{
- unsigned long ret, tmp;
-
- switch (size) {
- case 1:
- asm volatile("// __xchg1\n"
- "1: ldxrb %w0, %2\n"
- " stlxrb %w1, %w3, %2\n"
- " cbnz %w1, 1b\n"
- : "=&r" (ret), "=&r" (tmp), "+Q" (*(u8 *)ptr)
- : "r" (x)
- : "memory");
- break;
- case 2:
- asm volatile("// __xchg2\n"
- "1: ldxrh %w0, %2\n"
- " stlxrh %w1, %w3, %2\n"
- " cbnz %w1, 1b\n"
- : "=&r" (ret), "=&r" (tmp), "+Q" (*(u16 *)ptr)
- : "r" (x)
- : "memory");
- break;
- case 4:
- asm volatile("// __xchg4\n"
- "1: ldxr %w0, %2\n"
- " stlxr %w1, %w3, %2\n"
- " cbnz %w1, 1b\n"
- : "=&r" (ret), "=&r" (tmp), "+Q" (*(u32 *)ptr)
- : "r" (x)
- : "memory");
- break;
- case 8:
- asm volatile("// __xchg8\n"
- "1: ldxr %0, %2\n"
- " stlxr %w1, %3, %2\n"
- " cbnz %w1, 1b\n"
- : "=&r" (ret), "=&r" (tmp), "+Q" (*(u64 *)ptr)
- : "r" (x)
- : "memory");
- break;
- default:
- __bad_xchg(ptr, size), ret = 0;
- break;
- }
-
- smp_mb();
- return ret;
-}
-
-#define xchg(ptr,x) \
- ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
-
-extern void __bad_cmpxchg(volatile void *ptr, int size);
-
-static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
- unsigned long new, int size)
-{
- unsigned long oldval = 0, res;
-
- switch (size) {
- case 1:
- do {
- asm volatile("// __cmpxchg1\n"
- " ldxrb %w1, %2\n"
- " mov %w0, #0\n"
- " cmp %w1, %w3\n"
- " b.ne 1f\n"
- " stxrb %w0, %w4, %2\n"
- "1:\n"
- : "=&r" (res), "=&r" (oldval), "+Q" (*(u8 *)ptr)
- : "Ir" (old), "r" (new)
- : "cc");
- } while (res);
- break;
-
- case 2:
- do {
- asm volatile("// __cmpxchg2\n"
- " ldxrh %w1, %2\n"
- " mov %w0, #0\n"
- " cmp %w1, %w3\n"
- " b.ne 1f\n"
- " stxrh %w0, %w4, %2\n"
- "1:\n"
- : "=&r" (res), "=&r" (oldval), "+Q" (*(u16 *)ptr)
- : "Ir" (old), "r" (new)
- : "cc");
- } while (res);
- break;
-
- case 4:
- do {
- asm volatile("// __cmpxchg4\n"
- " ldxr %w1, %2\n"
- " mov %w0, #0\n"
- " cmp %w1, %w3\n"
- " b.ne 1f\n"
- " stxr %w0, %w4, %2\n"
- "1:\n"
- : "=&r" (res), "=&r" (oldval), "+Q" (*(u32 *)ptr)
- : "Ir" (old), "r" (new)
- : "cc");
- } while (res);
- break;
-
- case 8:
- do {
- asm volatile("// __cmpxchg8\n"
- " ldxr %1, %2\n"
- " mov %w0, #0\n"
- " cmp %1, %3\n"
- " b.ne 1f\n"
- " stxr %w0, %4, %2\n"
- "1:\n"
- : "=&r" (res), "=&r" (oldval), "+Q" (*(u64 *)ptr)
- : "Ir" (old), "r" (new)
- : "cc");
- } while (res);
- break;
-
- default:
- __bad_cmpxchg(ptr, size);
- oldval = 0;
- }
-
- return oldval;
-}
-
-static inline unsigned long __cmpxchg_mb(volatile void *ptr, unsigned long old,
- unsigned long new, int size)
-{
- unsigned long ret;
-
- smp_mb();
- ret = __cmpxchg(ptr, old, new, size);
- smp_mb();
-
- return ret;
-}
-
-#define cmpxchg(ptr,o,n) \
- ((__typeof__(*(ptr)))__cmpxchg_mb((ptr), \
- (unsigned long)(o), \
- (unsigned long)(n), \
- sizeof(*(ptr))))
-
-#define cmpxchg_local(ptr,o,n) \
- ((__typeof__(*(ptr)))__cmpxchg((ptr), \
- (unsigned long)(o), \
- (unsigned long)(n), \
- sizeof(*(ptr))))
+#include <asm/arm64/cmpxchg.h>
/* Uses uimm4 as a bitmask to select the clearing of one or more of
* the DAIF exception mask bits:
--
1.7.10.4
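For readers comparing the two headers: both keep the same cmpxchg() contract, which is to return the value found in memory and let the caller compare it with the expected old value. What follows is only a minimal sketch of that contract, not the code from this patch. It assumes a GCC or Clang toolchain and uses the __atomic builtin in place of the ldrex/strex and ldxr/stxr loops; sketch_cmpxchg32() and sketch_atomic_add() are hypothetical names, and the sequentially consistent ordering is an assumption standing in for the smp_mb() pair in __cmpxchg_mb().

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for cmpxchg(): compare *ptr with old and, if they
 * are equal, store new_val. Returns the value that was found in *ptr, so
 * the caller detects success by comparing the return value with old.
 */
static inline uint32_t sketch_cmpxchg32(volatile uint32_t *ptr,
					uint32_t old, uint32_t new_val)
{
	uint32_t expected = old;

	/* On failure the builtin writes the current value into expected. */
	__atomic_compare_exchange_n(ptr, &expected, new_val, 0,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return expected;
}

/* Typical retry loop: atomically add delta and return the new value. */
static inline uint32_t sketch_atomic_add(volatile uint32_t *ptr, uint32_t delta)
{
	uint32_t cur = *ptr;

	for (;;) {
		uint32_t seen = sketch_cmpxchg32(ptr, cur, cur + delta);

		if (seen == cur)
			return cur + delta;
		/* Another CPU changed the value; retry with what we saw. */
		cur = seen;
	}
}
```

The retry loop is the usual way such a primitive is consumed: a failed compare returns the fresh value, which becomes the next expected value.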
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v2 16/17] xen: arm: refactor xchg and cmpxchg into their own headers
2014-03-26 13:38 ` [PATCH v2 16/17] xen: arm: refactor xchg and cmpxchg into their own headers Ian Campbell
@ 2014-03-26 14:18 ` Julien Grall
0 siblings, 0 replies; 31+ messages in thread
From: Julien Grall @ 2014-03-26 14:18 UTC (permalink / raw)
To: Ian Campbell; +Cc: stefano.stabellini, tim, xen-devel
On 03/26/2014 01:38 PM, Ian Campbell wrote:
> Since these functions are taken from Linux this makes it easier to compare
> against the Linux cmpxchg.h headers (which were split out from Linux's
> system.h a while back).
>
> Since these functions are from Linux the intention is to use Linux coding
> style, therefore include a suitable emacs magic block.
>
> For this reason also fix up the indentation in the 32-bit version to use hard
> tabs while moving it. The 64-bit version was already correct.
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
--
Julien Grall
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v2 17/17] xen: arm: document what low level primitives we have imported from Linux
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (15 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 16/17] xen: arm: refactor xchg and cmpxchg into their own headers Ian Campbell
@ 2014-03-26 13:38 ` Ian Campbell
2014-03-26 14:20 ` Julien Grall
2014-03-27 16:59 ` [PATCH v2 00/17] xen: arm: resync low level asm primitive " Tim Deegan
2014-04-03 16:32 ` Ian Campbell
18 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2014-03-26 13:38 UTC (permalink / raw)
To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini
As part of the recent update I had to reverse engineer what we had, which was
very tedious. Check in my notes so that I have a reference for next time.
Now the secret is to remember to update this file every time!
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Don't mention assembler.h, it is part of Xen not imported from Linux
---
xen/arch/arm/README.LinuxPrimitives | 158 +++++++++++++++++++++++++++++++++++
1 file changed, 158 insertions(+)
create mode 100644 xen/arch/arm/README.LinuxPrimitives
diff --git a/xen/arch/arm/README.LinuxPrimitives b/xen/arch/arm/README.LinuxPrimitives
new file mode 100644
index 0000000..6cd03ca
--- /dev/null
+++ b/xen/arch/arm/README.LinuxPrimitives
@@ -0,0 +1,158 @@
+Xen on ARM uses various low level assembly primitives from the Linux
+kernel. This file tracks what files have been imported and when they
+were last updated.
+
+=====================================================================
+arm64:
+=====================================================================
+
+bitops: last sync @ v3.14-rc7 (last commit: 8e86f0b)
+
+linux/arch/arm64/lib/bitops.S xen/arch/arm/arm64/lib/bitops.S
+linux/arch/arm64/include/asm/bitops.h xen/include/asm-arm/arm64/bitops.h
+
+---------------------------------------------------------------------
+
+cmpxchg: last sync @ v3.14-rc7 (last commit: 95c4189)
+
+linux/arch/arm64/include/asm/cmpxchg.h xen/include/asm-arm/arm64/cmpxchg.h
+
+Skipped:
+ 60010e5 arm64: cmpxchg: update macros to prevent warnings
+
+---------------------------------------------------------------------
+
+atomics: last sync @ v3.14-rc7 (last commit: 95c4189)
+
+linux/arch/arm64/include/asm/atomic.h xen/include/asm-arm/arm64/atomic.h
+
+---------------------------------------------------------------------
+
+spinlocks: last sync @ v3.14-rc7 (last commit: 95c4189)
+
+linux/arch/arm64/include/asm/spinlock.h xen/include/asm-arm/arm64/spinlock.h
+
+Skipped:
+ 5686b06 arm64: lockref: add support for lockless lockrefs using cmpxchg
+ 52ea2a5 arm64: locks: introduce ticket-based spinlock implementation
+
+---------------------------------------------------------------------
+
+mem*: last sync @ v3.14-rc7 (last commit: 4a89922)
+
+linux/arch/arm64/lib/memchr.S xen/arch/arm/arm64/lib/memchr.S
+linux/arch/arm64/lib/memcpy.S xen/arch/arm/arm64/lib/memcpy.S
+linux/arch/arm64/lib/memmove.S xen/arch/arm/arm64/lib/memmove.S
+linux/arch/arm64/lib/memset.S xen/arch/arm/arm64/lib/memset.S
+
+for i in memchr.S memcpy.S memmove.S memset.S ; do
+ diff -u linux/arch/arm64/lib/$i xen/arch/arm/arm64/lib/$i
+done
+
+---------------------------------------------------------------------
+
+str*: last sync @ v3.14-rc7 (last commit: 2b8cac8)
+
+linux/arch/arm/lib/strchr.S xen/arch/arm/arm64/lib/strchr.S
+linux/arch/arm/lib/strrchr.S xen/arch/arm/arm64/lib/strrchr.S
+
+---------------------------------------------------------------------
+
+{clear,copy}_page: last sync @ v3.14-rc7 (last commit: f27bb13)
+
+linux/arch/arm64/lib/clear_page.S unused in Xen
+linux/arch/arm64/lib/copy_page.S xen/arch/arm/arm64/lib/copy_page.S
+
+=====================================================================
+arm32
+=====================================================================
+
+bitops: last sync @ v3.14-rc7 (last commit: b7ec699)
+
+linux/arch/arm/lib/bitops.h xen/arch/arm/arm32/lib/bitops.h
+linux/arch/arm/lib/changebit.S xen/arch/arm/arm32/lib/changebit.S
+linux/arch/arm/lib/clearbit.S xen/arch/arm/arm32/lib/clearbit.S
+linux/arch/arm/lib/findbit.S xen/arch/arm/arm32/lib/findbit.S
+linux/arch/arm/lib/setbit.S xen/arch/arm/arm32/lib/setbit.S
+linux/arch/arm/lib/testchangebit.S xen/arch/arm/arm32/lib/testchangebit.S
+linux/arch/arm/lib/testclearbit.S xen/arch/arm/arm32/lib/testclearbit.S
+linux/arch/arm/lib/testsetbit.S xen/arch/arm/arm32/lib/testsetbit.S
+
+for i in bitops.h changebit.S clearbit.S findbit.S setbit.S testchangebit.S \
+ testclearbit.S testsetbit.S; do
+ diff -u ../linux/arch/arm/lib/$i xen/arch/arm/arm32/lib/$i;
+done
+
+---------------------------------------------------------------------
+
+cmpxchg: last sync @ v3.14-rc7 (last commit: 775ebcc)
+
+linux/arch/arm/include/asm/cmpxchg.h xen/include/asm-arm/arm32/cmpxchg.h
+
+---------------------------------------------------------------------
+
+atomics: last sync @ v3.14-rc7 (last commit: aed3a4e)
+
+linux/arch/arm/include/asm/atomic.h xen/include/asm-arm/arm32/atomic.h
+
+---------------------------------------------------------------------
+
+spinlocks: last sync: 15e7e5c1ebf5
+
+linux/arch/arm/include/asm/spinlock.h xen/include/asm-arm/arm32/spinlock.h
+
+resync to v3.14-rc7:
+
+ 7c8746a ARM: 7955/1: spinlock: ensure we have a compiler barrier before sev
+ 0cbad9c ARM: 7854/1: lockref: add support for lockless lockrefs using cmpxchg64
+ 9bb17be ARM: locks: prefetch the destination word for write prior to strex
+ 27a8479 ARM: smp_on_up: move inline asm ALT_SMP patching macro out of spinlock.
+ 00efaa0 ARM: 7812/1: rwlocks: retry trylock operation if strex fails on free lo
+ afa31d8 ARM: 7811/1: locks: use early clobber in arch_spin_trylock
+ 73a6fdc ARM: spinlock: use inner-shareable dsb variant prior to sev instruction
+
+---------------------------------------------------------------------
+
+mem*: last sync @ v3.14-rc7 (last commit: 418df63a)
+
+linux/arch/arm/lib/copy_template.S xen/arch/arm/arm32/lib/copy_template.S
+linux/arch/arm/lib/memchr.S xen/arch/arm/arm32/lib/memchr.S
+linux/arch/arm/lib/memcpy.S xen/arch/arm/arm32/lib/memcpy.S
+linux/arch/arm/lib/memmove.S xen/arch/arm/arm32/lib/memmove.S
+linux/arch/arm/lib/memset.S xen/arch/arm/arm32/lib/memset.S
+linux/arch/arm/lib/memzero.S xen/arch/arm/arm32/lib/memzero.S
+
+linux/arch/arm/lib/strchr.S xen/arch/arm/arm32/lib/strchr.S
+linux/arch/arm/lib/strrchr.S xen/arch/arm/arm32/lib/strrchr.S
+
+for i in copy_template.S memchr.S memcpy.S memmove.S memset.S \
+ memzero.S ; do
+ diff -u linux/arch/arm/lib/$i xen/arch/arm/arm32/lib/$i
+done
+
+---------------------------------------------------------------------
+
+str*: last sync @ v3.13-rc7 (last commit: 93ed397)
+
+linux/arch/arm/lib/strchr.S xen/arch/arm/arm32/lib/strchr.S
+linux/arch/arm/lib/strrchr.S xen/arch/arm/arm32/lib/strrchr.S
+
+---------------------------------------------------------------------
+
+{clear,copy}_page: last sync: Never
+
+linux/arch/arm/lib/copy_page.S unused in Xen
+
+clear_page == memset
+
+---------------------------------------------------------------------
+
+libgcc: last sync @ v3.14-rc7 (last commit: 01885bc)
+
+linux/arch/arm/lib/lib1funcs.S xen/arch/arm/arm32/lib/lib1funcs.S
+linux/arch/arm/lib/lshrdi3.S xen/arch/arm/arm32/lib/lshrdi3.S
+linux/arch/arm/lib/div64.S xen/arch/arm/arm32/lib/div64.S
+
+for i in lib1funcs.S lshrdi3.S div64.S ; do
+ diff -u linux/arch/arm/lib/$i xen/arch/arm/arm32/lib/$i
+done
--
1.7.10.4
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (16 preceding siblings ...)
2014-03-26 13:38 ` [PATCH v2 17/17] xen: arm: document what low level primitives we have imported from Linux Ian Campbell
@ 2014-03-27 16:59 ` Tim Deegan
2014-03-27 17:02 ` Ian Campbell
2014-04-03 16:32 ` Ian Campbell
18 siblings, 1 reply; 31+ messages in thread
From: Tim Deegan @ 2014-03-27 16:59 UTC (permalink / raw)
To: Ian Campbell
Cc: Julien Grall, Stefano Stabellini, Keir Fraser, Jan Beulich,
xen-devel
At 13:36 +0000 on 26 Mar (1395837409), Ian Campbell wrote:
> (Jan/Keir -- only the first patch is of interest to you, Jan has
> reviewed it but an Ack from you Keir would be useful)
>
> The following resyncs the bitops, atomics, cmpxchg and various optimised
> library functions (str*, mem*, clear_page) from Linux. It also adds
> various additional optimised variants, especially for arm64 which was
> lacking them in Linux when we started.
Acked-by: Tim Deegan <tim@xen.org>
One thing that occurred to me, reading so soon after Julien's clang
series, is that we might just export mem{cpy,move,set} symbols
directly as both their plain names and __aeabi_ versions, to save
having an eabi.c that just wraps them.
Tim.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux
2014-03-27 16:59 ` [PATCH v2 00/17] xen: arm: resync low level asm primitive " Tim Deegan
@ 2014-03-27 17:02 ` Ian Campbell
2014-03-27 17:04 ` Tim Deegan
0 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2014-03-27 17:02 UTC (permalink / raw)
To: Tim Deegan
Cc: Julien Grall, Stefano Stabellini, Keir Fraser, Jan Beulich,
xen-devel
On Thu, 2014-03-27 at 17:59 +0100, Tim Deegan wrote:
> At 13:36 +0000 on 26 Mar (1395837409), Ian Campbell wrote:
> > (Jan/Keir -- only the first patch is of interest to you, Jan has
> > reviewed it but an Ack from you Keir would be useful)
> >
> > The following resyncs the bitops, atomics, cmpxchg and various optimised
> > library functions (str*, mem*, clear_page) from Linux. It also adds
> > various additional optimised variants, especially for arm64 which was
> > lacking them in Linux when we started.
>
> Acked-by: Tim Deegan <tim@xen.org>
Thanks.
> One thing that occurred to me, reading so soon after Julien's clang
> series, is that we might just export mem{cpy,move,set} symbols
> directly as both their plain names and __aeabi_ versions, to save
> having an eabi.c that just wraps them.
You mean in xen/arch/arm/arm32/lib/memcpy.S:
ENTRY(memcpy)
ENTRY(__aeabi_memcpy)
/* body */
ENDPROC(???)
Or something? That does seem quite sensible actually.
Ian.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux
2014-03-27 17:02 ` Ian Campbell
@ 2014-03-27 17:04 ` Tim Deegan
2014-03-27 17:18 ` Ian Campbell
0 siblings, 1 reply; 31+ messages in thread
From: Tim Deegan @ 2014-03-27 17:04 UTC (permalink / raw)
To: Ian Campbell
Cc: Julien Grall, Stefano Stabellini, Keir Fraser, Jan Beulich,
xen-devel
At 17:02 +0000 on 27 Mar (1395936128), Ian Campbell wrote:
> On Thu, 2014-03-27 at 17:59 +0100, Tim Deegan wrote:
> > At 13:36 +0000 on 26 Mar (1395837409), Ian Campbell wrote:
> > > (Jan/Keir -- only the first patch is of interest to you, Jan has
> > > reviewed it but an Ack from you Keir would be useful)
> > >
> > > The following resyncs the bitops, atomics, cmpxchg and various optimised
> > > library functions (str*, mem*, clear_page) from Linux. It also adds
> > > various additional optimised variants, especially for arm64 which was
> > > lacking them in Linux when we started.
> >
> > Acked-by: Tim Deegan <tim@xen.org>
>
> Thanks.
>
> > One thing that occurred to me, reading so soon after Julien's clang
> > series, is that we might just export mem{cpy,move,set} symbols
> > directly as both their plain names and __aeabi_ versions, to save
> > having an eabi.c that just wraps them.
>
> You mean in xen/arch/arm/arm32/lib/memcpy.S:
>
> ENTRY(memcpy)
> ENTRY(__aeabi_memcpy)
> /* body */
> ENDPROC(???)
>
> Or something? That does seem quite sensible actually.
Yep, something along those lines. Or equivalent linker runes, I
suppose, though that's nastier.
Tim.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux
2014-03-27 17:04 ` Tim Deegan
@ 2014-03-27 17:18 ` Ian Campbell
2014-03-28 18:00 ` Julien Grall
0 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2014-03-27 17:18 UTC (permalink / raw)
To: Tim Deegan
Cc: Julien Grall, Keir Fraser, Stefano Stabellini, Jan Beulich,
xen-devel
On Thu, 2014-03-27 at 18:04 +0100, Tim Deegan wrote:
> At 17:02 +0000 on 27 Mar (1395936128), Ian Campbell wrote:
> > On Thu, 2014-03-27 at 17:59 +0100, Tim Deegan wrote:
> > > At 13:36 +0000 on 26 Mar (1395837409), Ian Campbell wrote:
> > > > (Jan/Keir -- only the first patch is of interest to you, Jan has
> > > > reviewed it but an Ack from you Keir would be useful)
> > > >
> > > > The following resyncs the bitops, atomics, cmpxchg and various optimised
> > > > library functions (str*, mem*, clear_page) from Linux. It also adds
> > > > various additional optimised variants, especially for arm64 which was
> > > > lacking them in Linux when we started.
> > >
> > > Acked-by: Tim Deegan <tim@xen.org>
> >
> > Thanks.
> >
> > > One thing that occurred to me, reading so soon after Julien's clang
> > > series, is that we might just export mem{cpy,move,set} symbols
> > > directly as both their plain names and __aeabi_ versions, to save
> > > having an eabi.c that just wraps them.
> >
> > You mean in xen/arch/arm/arm32/lib/memcpy.S:
> >
> > ENTRY(memcpy)
> > ENTRY(__aeabi_memcpy)
> > /* body */
> > ENDPROC(???)
> >
> > Or something? That does seem quite sensible actually.
>
> Yep, something along those lines. Or equivalent linker runes, I
> suppose, though that's nastier.
Right. Not sure if ENDPROC needs duplicating or not, and if not whether
the unclosed ENTRY would be an issue. I suppose that would then require
linker runes.
Ian.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux
2014-03-27 17:18 ` Ian Campbell
@ 2014-03-28 18:00 ` Julien Grall
0 siblings, 0 replies; 31+ messages in thread
From: Julien Grall @ 2014-03-28 18:00 UTC (permalink / raw)
To: Ian Campbell
Cc: Keir Fraser, Tim Deegan, xen-devel, Julien Grall,
Stefano Stabellini, Jan Beulich
On 03/27/2014 05:18 PM, Ian Campbell wrote:
>> Yep, something along those lines. Or equivalent linker runes, I
>> suppose, though that's nastier.
>
> Right. Not sure if ENDPROC needs duplicating or not, and if not whether
> the unclosed ENTRY would be an issue. I suppose that would then require
> linker runes.
ENTRY and ENDPROC are independent of each other:
- ENTRY is mandatory for both names, since it is what makes the symbol global.
- ENDPROC only records information that is useful for debugging.
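
For comparison, the same "one body, two global names" idea can be sketched at the C level. This is only an illustration, not the assembly change discussed above: it assumes GCC or Clang on an ELF target and uses the alias attribute where the .S file would use two ENTRY labels. The names sketch_memcpy and sketch_aeabi_memcpy are hypothetical and do not exist in Xen.

```c
#include <stddef.h>

/* One body, exported under its plain name ... */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

/* ... and under a second global name that resolves to the same address. */
void *sketch_aeabi_memcpy(void *dst, const void *src, size_t n)
	__attribute__((alias("sketch_memcpy")));
```

Both names end up as separate global symbols for the same code, so no wrapper function is needed.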
Regards,
--
Julien Grall
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux
2014-03-26 13:36 [PATCH v2 00/17] xen: arm: resync low level asm primitive from Linux Ian Campbell
` (17 preceding siblings ...)
2014-03-27 16:59 ` [PATCH v2 00/17] xen: arm: resync low level asm primitive " Tim Deegan
@ 2014-04-03 16:32 ` Ian Campbell
18 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2014-04-03 16:32 UTC (permalink / raw)
To: xen-devel; +Cc: Julien Grall, Keir Fraser
On Wed, 2014-03-26 at 13:36 +0000, Ian Campbell wrote:
> The following resyncs the bitops, atomics, cmpxchg and various optimised
> library functions (str*, mem*, clear_page) from Linux. It also adds
> various additional optimised variants, especially for arm64 which was
> lacking them in Linux when we started.
Applied with the various acks etc.
Ian.
^ permalink raw reply [flat|nested] 31+ messages in thread