* Re: [patch 1/3] IA64: Slim down __clear_bit_unlock
2007-11-21 22:58 [patch 1/3] IA64: Slim down __clear_bit_unlock akpm
@ 2007-11-22 8:40 ` Zoltan Menyhart
2007-12-13 23:58 ` akpm
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Zoltan Menyhart @ 2007-11-22 8:40 UTC (permalink / raw)
To: linux-ia64
Almost there :-)
> +static __inline__ void
> +__clear_bit_unlock(int nr, volatile void *addr)
> +{
> + __u32 mask, new;
> + volatile __u32 *m;
> +
> + m = (volatile __u32 *)addr + (nr >> 5);
Still cannot see why you need an ".acq" on this load.
Why do you use "volatile"?
What about this one?
static __inline__ void
__clear_bit_unlock(int const nr, void * const addr)
{
__u32 * const m = addr + (nr >> 5);
__u32 new;
new = *m & ~(1 << (nr & 31));
barrier();
asm volatile ("st4.rel.nta [%0] = %1\n\t" :: "r"(m), "r"(new));
}
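For readers less familiar with the word/mask arithmetic both versions rely on, here is a minimal, portable sketch with hypothetical helper names (not from the patch): `nr >> 5` selects the 32-bit word and `nr & 31` the bit within it.

```c
#include <stdint.h>

/* Hypothetical helpers illustrating the arithmetic in the patch:
 * for bit number nr in an array of 32-bit words,
 *   nr >> 5  is the word index (nr / 32),
 *   nr & 31  is the bit position within that word (nr % 32). */
static uint32_t word_index(int nr)
{
	return (uint32_t)(nr >> 5);
}

static uint32_t bit_mask(int nr)
{
	return 1u << (nr & 31);
}

/* Non-atomic clear: the same read-modify-write the proposal performs
 * (minus the release-store ordering, which needs the inline asm). */
static void clear_bit_sketch(int nr, uint32_t *base)
{
	base[word_index(nr)] &= ~bit_mask(nr);
}
```

For example, clearing bit 35 touches word 1, bit 3.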
Thanks,
Zoltan Menyhart
* [patch 1/3] IA64: Slim down __clear_bit_unlock
2007-11-21 22:58 [patch 1/3] IA64: Slim down __clear_bit_unlock akpm
2007-11-22 8:40 ` Zoltan Menyhart
@ 2007-12-13 23:58 ` akpm
2008-01-02 9:54 ` Zoltan Menyhart
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: akpm @ 2007-12-13 23:58 UTC (permalink / raw)
To: linux-ia64
From: Christoph Lameter <clameter@sgi.com>
__clear_bit_unlock does not need to perform atomic operations on the
variable. Avoid a cmpxchg and simply do a store with release semantics.
Add a barrier to make sure that the compiler does not do funky things.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/asm-ia64/bitops.h | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
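The barrier() added by the patch is a pure compiler barrier. As a sketch of the idea outside the kernel (GCC-style inline asm; the names here are illustrative, and this provides compiler-level ordering only, not CPU-level ordering, which on ia64 comes from the st4.rel store):

```c
/* What the kernel's barrier() expands to on GCC: an empty asm
 * statement with a "memory" clobber.  It emits no instructions,
 * but the compiler may not cache memory values in registers
 * across it or move loads/stores past it. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

static unsigned int flag;

/* Illustrative publish(): the barrier keeps the compiler from
 * sinking the data store below the flag store.  Real cross-CPU
 * ordering would additionally need a release store or fence. */
static void publish(unsigned int *data, unsigned int value)
{
	*data = value;
	compiler_barrier();
	flag = 1u;
}
```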
diff -puN include/asm-ia64/bitops.h~ia64-slim-down-__clear_bit_unlock include/asm-ia64/bitops.h
--- a/include/asm-ia64/bitops.h~ia64-slim-down-__clear_bit_unlock
+++ a/include/asm-ia64/bitops.h
@@ -124,10 +124,21 @@ clear_bit_unlock (int nr, volatile void
/**
* __clear_bit_unlock - Non-atomically clear a bit with release
*
- * This is like clear_bit_unlock, but the implementation may use a non-atomic
- * store (this one uses an atomic, however).
+ * This is like clear_bit_unlock, but the implementation uses a store
+ * with release semantics. See also __raw_spin_unlock().
*/
-#define __clear_bit_unlock clear_bit_unlock
+static __inline__ void
+__clear_bit_unlock(int nr, volatile void *addr)
+{
+ __u32 mask, new;
+ volatile __u32 *m;
+
+ m = (volatile __u32 *)addr + (nr >> 5);
+ mask = ~(1 << (nr & 31));
+ new = *m & mask;
+ barrier();
+ asm volatile ("st4.rel.nta [%0] = %1\n\t" :: "r"(m), "r"(new));
+}
/**
* __clear_bit - Clears a bit in memory (non-atomic version)
_
* Re: [patch 1/3] IA64: Slim down __clear_bit_unlock
2007-11-21 22:58 [patch 1/3] IA64: Slim down __clear_bit_unlock akpm
2007-11-22 8:40 ` Zoltan Menyhart
2007-12-13 23:58 ` akpm
@ 2008-01-02 9:54 ` Zoltan Menyhart
2008-01-02 20:19 ` Christoph Lameter
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Zoltan Menyhart @ 2008-01-02 9:54 UTC (permalink / raw)
To: linux-ia64
Apparently, you reposted the same patch as you did on the 21st of November.
So let me ask again: why do you use "volatile" here, and why do you need
an ".acq" on this load:
> + m = (volatile __u32 *)addr + (nr >> 5);
I still think the right solution is in my post on the 22nd of November.
Thanks,
Zoltan Menyhart
* Re: [patch 1/3] IA64: Slim down __clear_bit_unlock
2007-11-21 22:58 [patch 1/3] IA64: Slim down __clear_bit_unlock akpm
` (2 preceding siblings ...)
2008-01-02 9:54 ` Zoltan Menyhart
@ 2008-01-02 20:19 ` Christoph Lameter
2008-01-03 13:36 ` Zoltan Menyhart
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Christoph Lameter @ 2008-01-02 20:19 UTC (permalink / raw)
To: linux-ia64
On Wed, 2 Jan 2008, Zoltan Menyhart wrote:
> I still think the right solution is in my post on the 22nd of November.
Please get us a patch against mainline so that we can consider merging
your changes.
* Re: [patch 1/3] IA64: Slim down __clear_bit_unlock
2007-11-21 22:58 [patch 1/3] IA64: Slim down __clear_bit_unlock akpm
` (3 preceding siblings ...)
2008-01-02 20:19 ` Christoph Lameter
@ 2008-01-03 13:36 ` Zoltan Menyhart
2008-01-03 22:14 ` Luck, Tony
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Zoltan Menyhart @ 2008-01-03 13:36 UTC (permalink / raw)
To: linux-ia64
[-- Attachment #1: Type: text/plain, Size: 734 bytes --]
Please have a look at the patch below.
Taking this opportunity, in addition:
- I removed the useless "volatile" stuff from the non-atomic versions
of the bit operations.
- I removed the unnecessary barrier() from __clear_bit_unlock().
ia64_st4_rel_nta() makes sure all the modifications are globally
seen before the bit is seen to be off.
- I made __clear_bit() modeled after __set_bit() and __change_bit().
- I corrected some comments stating that a memory barrier is provided,
when in reality only the acquisition side of the memory barrier is provided.
- I corrected some comments, e.g. test_and_clear_bit() was talking
about "bit to set".
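As a portable analogue of the first point (C11 atomics, not the kernel's actual implementation), the release store alone gives the ordering that previously needed the explicit barrier():

```c
#include <stdatomic.h>
#include <stdint.h>

/* Sketch of __clear_bit_unlock using C11 atomics in place of
 * ia64_st4_rel_nta(): the release store orders every earlier
 * store before the bit is observed clear, so no separate
 * compiler barrier is needed between the load and the store. */
static void clear_bit_release(int nr, _Atomic uint32_t *base)
{
	_Atomic uint32_t *m = base + (nr >> 5);
	uint32_t new_val = atomic_load_explicit(m, memory_order_relaxed)
			 & ~(1u << (nr & 31));
	atomic_store_explicit(m, new_val, memory_order_release);
}
```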
Signed-off-by: Zoltan Menyhart, <Zoltan.Menyhart@bull.net>
Thanks,
Zoltan Menyhart
[-- Attachment #2: diff --]
[-- Type: text/plain, Size: 4690 bytes --]
--- include/asm/bitops.h-old 2007-12-21 02:25:48.000000000 +0100
+++ include/asm/bitops.h 2008-01-03 14:25:17.000000000 +0100
@@ -60,7 +60,7 @@
* may be that only one operation succeeds.
*/
static __inline__ void
-__set_bit (int nr, volatile void *addr)
+__set_bit (int nr, void *addr)
{
*((__u32 *) addr + (nr >> 5)) |= (1 << (nr & 31));
}
@@ -122,38 +122,40 @@
}
/**
- * __clear_bit_unlock - Non-atomically clear a bit with release
+ * __clear_bit_unlock - Non-atomically clears a bit in memory with release
+ * @nr: Bit to clear
+ * @addr: Address to start counting from
*
- * This is like clear_bit_unlock, but the implementation uses a store
+ * Similarly to clear_bit_unlock, the implementation uses a store
* with release semantics. See also __raw_spin_unlock().
*/
static __inline__ void
-__clear_bit_unlock(int nr, volatile void *addr)
+__clear_bit_unlock(int nr, void *addr)
{
- __u32 mask, new;
- volatile __u32 *m;
+ __u32 * const m = (__u32 *) addr + (nr >> 5);
+ __u32 const new = *m & ~(1 << (nr & 31));
- m = (volatile __u32 *)addr + (nr >> 5);
- mask = ~(1 << (nr & 31));
- new = *m & mask;
- barrier();
ia64_st4_rel_nta(m, new);
}
/**
* __clear_bit - Clears a bit in memory (non-atomic version)
+ * @nr: the bit to clear
+ * @addr: the address to start counting from
+ *
+ * Unlike clear_bit(), this function is non-atomic and may be reordered.
+ * If it's called on the same region of memory simultaneously, the effect
+ * may be that only one operation succeeds.
*/
static __inline__ void
-__clear_bit (int nr, volatile void *addr)
+__clear_bit (int nr, void *addr)
{
- volatile __u32 *p = (__u32 *) addr + (nr >> 5);
- __u32 m = 1 << (nr & 31);
- *p &= ~m;
+ *((__u32 *) addr + (nr >> 5)) &= ~(1 << (nr & 31));
}
/**
* change_bit - Toggle a bit in memory
- * @nr: Bit to clear
+ * @nr: Bit to toggle
* @addr: Address to start counting from
*
* change_bit() is atomic and may not be reordered.
@@ -178,7 +180,7 @@
/**
* __change_bit - Toggle a bit in memory
- * @nr: the bit to set
+ * @nr: the bit to toggle
* @addr: the address to start counting from
*
* Unlike change_bit(), this function is non-atomic and may be reordered.
@@ -186,7 +188,7 @@
* may be that only one operation succeeds.
*/
static __inline__ void
-__change_bit (int nr, volatile void *addr)
+__change_bit (int nr, void *addr)
{
*((__u32 *) addr + (nr >> 5)) ^= (1 << (nr & 31));
}
@@ -197,7 +199,7 @@
* @addr: Address to count from
*
* This operation is atomic and cannot be reordered.
- * It also implies a memory barrier.
+ * It also implies the acquisition side of the memory barrier.
*/
static __inline__ int
test_and_set_bit (int nr, volatile void *addr)
@@ -235,7 +237,7 @@
* but actually fail. You must protect multiple accesses with a lock.
*/
static __inline__ int
-__test_and_set_bit (int nr, volatile void *addr)
+__test_and_set_bit (int nr, void *addr)
{
__u32 *p = (__u32 *) addr + (nr >> 5);
__u32 m = 1 << (nr & 31);
@@ -247,11 +249,11 @@
/**
* test_and_clear_bit - Clear a bit and return its old value
- * @nr: Bit to set
+ * @nr: Bit to clear
* @addr: Address to count from
*
* This operation is atomic and cannot be reordered.
- * It also implies a memory barrier.
+ * It also implies the acquisition side of the memory barrier.
*/
static __inline__ int
test_and_clear_bit (int nr, volatile void *addr)
@@ -272,7 +274,7 @@
/**
* __test_and_clear_bit - Clear a bit and return its old value
- * @nr: Bit to set
+ * @nr: Bit to clear
* @addr: Address to count from
*
* This operation is non-atomic and can be reordered.
@@ -280,7 +282,7 @@
* but actually fail. You must protect multiple accesses with a lock.
*/
static __inline__ int
-__test_and_clear_bit(int nr, volatile void * addr)
+__test_and_clear_bit(int nr, void * addr)
{
__u32 *p = (__u32 *) addr + (nr >> 5);
__u32 m = 1 << (nr & 31);
@@ -292,11 +294,11 @@
/**
* test_and_change_bit - Change a bit and return its old value
- * @nr: Bit to set
+ * @nr: Bit to change
* @addr: Address to count from
*
* This operation is atomic and cannot be reordered.
- * It also implies a memory barrier.
+ * It also implies the acquisition side of the memory barrier.
*/
static __inline__ int
test_and_change_bit (int nr, volatile void *addr)
@@ -315,8 +317,12 @@
return (old & bit) != 0;
}
-/*
- * WARNING: non atomic version.
+/**
+ * __test_and_change_bit - Change a bit and return its old value
+ * @nr: Bit to change
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
*/
static __inline__ int
__test_and_change_bit (int nr, void *addr)
* RE: [patch 1/3] IA64: Slim down __clear_bit_unlock
2007-11-21 22:58 [patch 1/3] IA64: Slim down __clear_bit_unlock akpm
` (4 preceding siblings ...)
2008-01-03 13:36 ` Zoltan Menyhart
@ 2008-01-03 22:14 ` Luck, Tony
2008-01-11 2:02 ` Nick Piggin
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Luck, Tony @ 2008-01-03 22:14 UTC (permalink / raw)
To: linux-ia64
> - I removed the useless "volatile" stuff from the non-atomic versions
> of the bit operations.
This is the correct thing to do ... but I wonder how we can validate
that there are no callers that were depending on the extra ordering
that the old volatile versions were providing. Clearly any such callers
are broken ... but finding them the hard way (by executing the kernel
and waiting to see if something strange happens) is going to be unpleasant.
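One crude, hypothetical way to start such an audit is simply to enumerate the call sites of the non-atomic helpers and review them by hand (a sketch only; grepping cannot prove the absence of ordering-dependent callers):

```shell
# Illustrative audit sketch: grep for call sites of the non-atomic
# bit helpers.  Here it runs over a small sample file; against a
# kernel tree you would pass e.g. drivers/ fs/ kernel/ mm/ instead.
cat > sample.c <<'EOF'
void f(unsigned long *p) { __clear_bit(3, p); __set_bit(4, p); }
EOF
grep -nE '__(set|clear|change)_bit\(' sample.c
```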
-Tony
* Re: [patch 1/3] IA64: Slim down __clear_bit_unlock
2007-11-21 22:58 [patch 1/3] IA64: Slim down __clear_bit_unlock akpm
` (5 preceding siblings ...)
2008-01-03 22:14 ` Luck, Tony
@ 2008-01-11 2:02 ` Nick Piggin
2008-01-15 13:15 ` [patch 1/3] IA64: Slim down __clear_bit_unlock #2 Zoltan Menyhart
2008-01-16 6:26 ` Nick Piggin
8 siblings, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2008-01-11 2:02 UTC (permalink / raw)
To: linux-ia64
On Friday 04 January 2008 00:36, Zoltan Menyhart wrote:
> Please have a look at the patch below.
OK, I just had a couple of comments...
> Taking this opportunity, in addition:
> - I removed the useless "volatile" stuff from the non-atomic versions
> of the bit operations.
This is a relatively big thing to be doing. I actually want to
remove all volatiles (except maybe in special accessor functions)
from the kernel, so great :) However it needs to be in a separate
patch, and it needs to be done for all architectures and
asm-generic to spread out the burden of testing. You should also
cc lkml and Linus on that one.
Make it on top of the __clear_bit_unlock work, so the ia64 specific
patch doesn't get held up.
> - I removed the unnecessary barrier() from __clear_bit_unlock().
> ia64_st4_rel_nta() makes sure all the modifications are globally
> seen before the bit is seen to be off.
Fine. I guess it doesn't need a comment because you ia64 guys know
this intimately.
> - I made __clear_bit() modeled after __set_bit() and __change_bit().
> - I corrected some comments stating that a memory barrier is provided,
> when in reality only the acquisition side of the memory barrier is provided.
> - I corrected some comments, e.g. test_and_clear_bit() was talking
> about "bit to set".
>
> Signed-off-by: Zoltan Menyhart, <Zoltan.Menyhart@bull.net>
I guess removing the acquire barrier from close to the release barrier
is a good idea. I won't ask for performance numbers because I guess
it is too hard to get a meaningful number for such a small and
obviously better change. It would just be good to know that code size
ends up being as small or smaller.
Anyway, I don't want to actually say ack to the ia64 parts without
having done any compilation or testing myself, but I would like
especially the volatile change to be moved. I guess Tony does too :)
Thanks,
Nick
* Re: [patch 1/3] IA64: Slim down __clear_bit_unlock #2
2007-11-21 22:58 [patch 1/3] IA64: Slim down __clear_bit_unlock akpm
` (6 preceding siblings ...)
2008-01-11 2:02 ` Nick Piggin
@ 2008-01-15 13:15 ` Zoltan Menyhart
2008-01-16 6:26 ` Nick Piggin
8 siblings, 0 replies; 10+ messages in thread
From: Zoltan Menyhart @ 2008-01-15 13:15 UTC (permalink / raw)
To: linux-ia64
[-- Attachment #1: Type: text/plain, Size: 2319 bytes --]
Please have a look at the patch below.
Taking this opportunity, in addition:
- I removed the unnecessary barrier() from __clear_bit_unlock().
ia64_st4_rel_nta() makes sure all the modifications are globally
seen before the bit is seen to be off.
- I made __clear_bit() modeled after __set_bit() and __change_bit().
- I corrected some comments stating that a memory barrier is provided,
when in reality only the acquisition side of the memory barrier is provided.
- I corrected some comments, e.g. test_and_clear_bit() was talking
about "bit to set".
Here is the code generated from my and the old versions.
(Though I do not know why the "and" is moved into the 2nd bundle.):
test_new()
{
__clear_bit_unlock(3, &data);
}
test_old()
{
old__clear_bit_unlock(3, &data);
}
0000000000000000 <test_new>:
0: 02 00 00 00 01 00 [MII] nop.m 0x0
6: e0 00 04 00 48 00 addl r14=0,r1;;
c: 00 00 04 00 nop.i 0x0
10: 0b 10 00 1c 10 10 [MMI] ld4 r2=[r14];;
16: f0 b8 0b 58 44 00 and r15=-9,r2
1c: 00 00 04 00 nop.i 0x0;;
20: 0a 00 3c 1c b6 11 [MMI] st4.rel.nta [r14]=r15;;
26: 00 00 00 02 00 00 nop.m 0x0
2c: 01 70 00 84 mov r8=r14
30: 1d 00 00 00 01 00 [MFB] nop.m 0x0
36: 00 00 00 02 00 80 nop.f 0x0
3c: 08 00 84 00 br.ret.sptk.many b0;;
0000000000000040 <test_old>:
40: 02 00 00 00 01 00 [MII] nop.m 0x0
46: e0 00 04 00 48 00 addl r14=0,r1;;
4c: 00 00 04 00 nop.i 0x0
50: 03 10 00 1c b0 10 [MII] ld4.acq r2=[r14]
56: 00 00 00 02 00 e0 nop.i 0x0;;
5c: 71 17 b0 88 and r15=-9,r2;;
60: 0a 00 3c 1c b6 11 [MMI] st4.rel.nta [r14]=r15;;
66: 00 00 00 02 00 00 nop.m 0x0
6c: 01 70 00 84 mov r8=r14
70: 1d 00 00 00 01 00 [MFB] nop.m 0x0
76: 00 00 00 02 00 80 nop.f 0x0
7c: 08 00 84 00 br.ret.sptk.many b0;;
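In C11 terms (an analogy only; the kernel code uses ia64 intrinsics and inline asm), the difference between the two listings is the memory order of the load: the old version's ld4.acq corresponds to an acquire load, the new version's plain ld4 to a relaxed one, and both end in a release store:

```c
#include <stdatomic.h>
#include <stdint.h>

/* New version: plain load + release store (ld4 ... st4.rel.nta). */
static uint32_t clear_new(_Atomic uint32_t *m, uint32_t mask)
{
	uint32_t v = atomic_load_explicit(m, memory_order_relaxed) & ~mask;
	atomic_store_explicit(m, v, memory_order_release);
	return v;
}

/* Old version: acquire load + release store (ld4.acq ... st4.rel.nta).
 * The acquire adds ordering that the unlock path does not need. */
static uint32_t clear_old(_Atomic uint32_t *m, uint32_t mask)
{
	uint32_t v = atomic_load_explicit(m, memory_order_acquire) & ~mask;
	atomic_store_explicit(m, v, memory_order_release);
	return v;
}
```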
Signed-off-by: Zoltan Menyhart, <Zoltan.Menyhart@bull.net>
Thanks,
Zoltan Menyhart
[-- Attachment #2: new-diff --]
[-- Type: text/plain, Size: 3644 bytes --]
--- include/asm-ia64/bitops.h-old 2007-12-21 02:25:48.000000000 +0100
+++ include/asm-ia64/bitops.h-tmp 2008-01-15 13:16:26.000000000 +0100
@@ -122,38 +122,40 @@
}
/**
- * __clear_bit_unlock - Non-atomically clear a bit with release
+ * __clear_bit_unlock - Non-atomically clears a bit in memory with release
+ * @nr: Bit to clear
+ * @addr: Address to start counting from
*
- * This is like clear_bit_unlock, but the implementation uses a store
+ * Similarly to clear_bit_unlock, the implementation uses a store
* with release semantics. See also __raw_spin_unlock().
*/
static __inline__ void
-__clear_bit_unlock(int nr, volatile void *addr)
+__clear_bit_unlock(int nr, void *addr)
{
- __u32 mask, new;
- volatile __u32 *m;
+ __u32 * const m = (__u32 *) addr + (nr >> 5);
+ __u32 const new = *m & ~(1 << (nr & 31));
- m = (volatile __u32 *)addr + (nr >> 5);
- mask = ~(1 << (nr & 31));
- new = *m & mask;
- barrier();
ia64_st4_rel_nta(m, new);
}
/**
* __clear_bit - Clears a bit in memory (non-atomic version)
+ * @nr: the bit to clear
+ * @addr: the address to start counting from
+ *
+ * Unlike clear_bit(), this function is non-atomic and may be reordered.
+ * If it's called on the same region of memory simultaneously, the effect
+ * may be that only one operation succeeds.
*/
static __inline__ void
__clear_bit (int nr, volatile void *addr)
{
- volatile __u32 *p = (__u32 *) addr + (nr >> 5);
- __u32 m = 1 << (nr & 31);
- *p &= ~m;
+ *((__u32 *) addr + (nr >> 5)) &= ~(1 << (nr & 31));
}
/**
* change_bit - Toggle a bit in memory
- * @nr: Bit to clear
+ * @nr: Bit to toggle
* @addr: Address to start counting from
*
* change_bit() is atomic and may not be reordered.
@@ -178,7 +180,7 @@
/**
* __change_bit - Toggle a bit in memory
- * @nr: the bit to set
+ * @nr: the bit to toggle
* @addr: the address to start counting from
*
* Unlike change_bit(), this function is non-atomic and may be reordered.
@@ -197,7 +199,7 @@
* @addr: Address to count from
*
* This operation is atomic and cannot be reordered.
- * It also implies a memory barrier.
+ * It also implies the acquisition side of the memory barrier.
*/
static __inline__ int
test_and_set_bit (int nr, volatile void *addr)
@@ -247,11 +249,11 @@
/**
* test_and_clear_bit - Clear a bit and return its old value
- * @nr: Bit to set
+ * @nr: Bit to clear
* @addr: Address to count from
*
* This operation is atomic and cannot be reordered.
- * It also implies a memory barrier.
+ * It also implies the acquisition side of the memory barrier.
*/
static __inline__ int
test_and_clear_bit (int nr, volatile void *addr)
@@ -272,7 +274,7 @@
/**
* __test_and_clear_bit - Clear a bit and return its old value
- * @nr: Bit to set
+ * @nr: Bit to clear
* @addr: Address to count from
*
* This operation is non-atomic and can be reordered.
@@ -292,11 +294,11 @@
/**
* test_and_change_bit - Change a bit and return its old value
- * @nr: Bit to set
+ * @nr: Bit to change
* @addr: Address to count from
*
* This operation is atomic and cannot be reordered.
- * It also implies a memory barrier.
+ * It also implies the acquisition side of the memory barrier.
*/
static __inline__ int
test_and_change_bit (int nr, volatile void *addr)
@@ -315,8 +317,12 @@
return (old & bit) != 0;
}
-/*
- * WARNING: non atomic version.
+/**
+ * __test_and_change_bit - Change a bit and return its old value
+ * @nr: Bit to change
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
*/
static __inline__ int
__test_and_change_bit (int nr, void *addr)
* Re: [patch 1/3] IA64: Slim down __clear_bit_unlock #2
2007-11-21 22:58 [patch 1/3] IA64: Slim down __clear_bit_unlock akpm
` (7 preceding siblings ...)
2008-01-15 13:15 ` [patch 1/3] IA64: Slim down __clear_bit_unlock #2 Zoltan Menyhart
@ 2008-01-16 6:26 ` Nick Piggin
8 siblings, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2008-01-16 6:26 UTC (permalink / raw)
To: linux-ia64
On Wednesday 16 January 2008 00:15, Zoltan Menyhart wrote:
> Please have a look at the patch below.
>
> Taking this opportunity, in addition:
> - I removed the unnecessary barrier() from __clear_bit_unlock().
> ia64_st4_rel_nta() makes sure all the modifications are globally
> seen before the bit is seen to be off.
> - I made __clear_bit() modeled after __set_bit() and __change_bit().
> - I corrected some comments stating that a memory barrier is provided,
> when in reality only the acquisition side of the memory barrier is provided.
> - I corrected some comments, e.g. test_and_clear_bit() was talking
> about "bit to set".
>
> Here is the code generated from my and the old versions.
> (Though I do not know why the "and" is moved into the 2nd bundle.):
>
> test_new()
> {
> __clear_bit_unlock(3, &data);
> }
>
> test_old()
> {
> old__clear_bit_unlock(3, &data);
> }
>
> 0000000000000000 <test_new>:
> 0: 02 00 00 00 01 00 [MII] nop.m 0x0
> 6: e0 00 04 00 48 00 addl r14=0,r1;;
> c: 00 00 04 00 nop.i 0x0
> 10: 0b 10 00 1c 10 10 [MMI] ld4 r2=[r14];;
> 16: f0 b8 0b 58 44 00 and r15=-9,r2
> 1c: 00 00 04 00 nop.i 0x0;;
> 20: 0a 00 3c 1c b6 11 [MMI] st4.rel.nta [r14]=r15;;
> 26: 00 00 00 02 00 00 nop.m 0x0
> 2c: 01 70 00 84 mov r8=r14
> 30: 1d 00 00 00 01 00 [MFB] nop.m 0x0
> 36: 00 00 00 02 00 80 nop.f 0x0
> 3c: 08 00 84 00 br.ret.sptk.many b0;;
>
> 0000000000000040 <test_old>:
> 40: 02 00 00 00 01 00 [MII] nop.m 0x0
> 46: e0 00 04 00 48 00 addl r14=0,r1;;
> 4c: 00 00 04 00 nop.i 0x0
> 50: 03 10 00 1c b0 10 [MII] ld4.acq r2=[r14]
> 56: 00 00 00 02 00 e0 nop.i 0x0;;
> 5c: 71 17 b0 88 and r15=-9,r2;;
> 60: 0a 00 3c 1c b6 11 [MMI] st4.rel.nta [r14]=r15;;
> 66: 00 00 00 02 00 00 nop.m 0x0
> 6c: 01 70 00 84 mov r8=r14
> 70: 1d 00 00 00 01 00 [MFB] nop.m 0x0
> 76: 00 00 00 02 00 80 nop.f 0x0
> 7c: 08 00 84 00 br.ret.sptk.many b0;;
>
> Signed-off-by: Zoltan Menyhart, <Zoltan.Menyhart@bull.net>
I don't see any problem with this patch.
FWIW:
Acked-by: Nick Piggin <npiggin@suse.de>