From: Pan Xinhui <xinhui@linux.vnet.ibm.com>
To: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
benh@kernel.crashing.org, paulus@samba.org, mpe@ellerman.id.au,
paulmck@linux.vnet.ibm.com, tglx@linutronix.de
Subject: Re: [PATCH V3] powerpc: Implement {cmp}xchg for u8 and u16
Date: Fri, 22 Apr 2016 09:59:22 +0800
Message-ID: <5719857A.5080201@linux.vnet.ibm.com>
In-Reply-To: <20160421155257.GA20657@insomnia>

On 2016-04-21 23:52, Boqun Feng wrote:
> On Thu, Apr 21, 2016 at 11:35:07PM +0800, Pan Xinhui wrote:
>> On 2016-04-20 22:24, Peter Zijlstra wrote:
>>> On Wed, Apr 20, 2016 at 09:24:00PM +0800, Pan Xinhui wrote:
>>>
>>>> +#define __XCHG_GEN(cmp, type, sfx, skip, v) \
>>>> +static __always_inline unsigned long \
>>>> +__cmpxchg_u32##sfx(v unsigned int *p, unsigned long old, \
>>>> +		   unsigned long new); \
>>>> +static __always_inline u32 \
>>>> +__##cmp##xchg_##type##sfx(v void *ptr, u32 old, u32 new) \
>>>> +{ \
>>>> +	int size = sizeof (type); \
>>>> +	int off = (unsigned long)ptr % sizeof(u32); \
>>>> +	volatile u32 *p = ptr - off; \
>>>> +	int bitoff = BITOFF_CAL(size, off); \
>>>> +	u32 bitmask = ((0x1 << size * BITS_PER_BYTE) - 1) << bitoff; \
>>>> +	u32 oldv, newv, tmp; \
>>>> +	u32 ret; \
>>>> +	oldv = READ_ONCE(*p); \
>>>> +	do { \
>>>> +		ret = (oldv & bitmask) >> bitoff; \
>>>> +		if (skip && ret != old) \
>>>> +			break; \
>>>> +		newv = (oldv & ~bitmask) | (new << bitoff); \
>>>> +		tmp = oldv; \
>>>> +		oldv = __cmpxchg_u32##sfx((v u32*)p, oldv, newv); \
>>>> +	} while (tmp != oldv); \
>>>> +	return ret; \
>>>> +}
>>>
>>> So for an LL/SC based arch using cmpxchg() like that is sub-optimal.
>>>
>>> Why did you choose to write it entirely in C?
>>>
>> Yes, you are right. The C version does more loads and stores.
>> However, xchg_u8/u16 is currently used only by qspinlock, and I did
>> not see any performance regression there.
>> So I wrote it in C for simplicity. :)
>>
>> Of course, I have run xchg tests.
>> We ran code like xchg((u8*)&v, j++); in several threads,
>> and the results are:
>> [ 768.374264] use time[1550072]ns in xchg_u8_asm
>
> How was xchg_u8_asm() implemented, using lbarx or using a 32bit ll/sc
> loop with shifting and masking in it?
>
Yes, it uses a 32-bit ll/sc loop.
It looks like this:
	__asm__ __volatile__(
	"1: lwarx %0,0,%3\n"
	" and %1,%0,%5\n"
	" or %1,%1,%4\n"
	PPC405_ERR77(0,%2)
	" stwcx. %1,0,%3\n"
	" bne- 1b"
	: "=&r" (_oldv), "=&r" (tmp), "+m" (*(volatile unsigned int *)_p)
	: "r" (_p), "r" (_newv), "r" (_oldv_mask)
	: "cc", "memory");
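
For reference, the values that the and/or pair above works on can be
sketched in plain C. This is only an illustration, not the patch code:
field_bitoff(), field_mask() and merge_field() are hypothetical names,
and the big-endian offset formula is an assumption about what
BITOFF_CAL computes, since its definition is not quoted in this mail.
I also assume here that _oldv_mask is the inverted field mask and that
_newv is already shifted into place. Only sizes 1 and 2 are meaningful.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bit offset of a size-byte field located off bytes into its aligned
 * 32-bit word. Assumption: this is what BITOFF_CAL is expected to
 * compute; the macro itself is not shown in the thread.
 */
static unsigned int field_bitoff(size_t size, size_t off)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	/* Big endian: byte 0 holds the most significant bits. */
	return (unsigned int)((sizeof(uint32_t) - size - off) * 8);
#else
	/* Little endian: byte 0 holds the least significant bits. */
	return (unsigned int)(off * 8);
#endif
}

/* Mask covering the field inside the 32-bit word (size is 1 or 2). */
static uint32_t field_mask(size_t size, unsigned int bitoff)
{
	return ((UINT32_C(1) << (size * 8)) - 1) << bitoff;
}

/*
 * What the "and" followed by "or" does to the loaded word: keep the
 * bits selected by keep_mask and drop in the pre-shifted new value.
 */
static uint32_t merge_field(uint32_t word, uint32_t keep_mask,
			    uint32_t shifted_new)
{
	return (word & keep_mask) | shifted_new;
}
```

With a one-byte field at bit offset 8, the keep mask would be
~field_mask(1, 8), and merge_field() only touches bits 8..15 of the
word.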
> Regards,
> Boqun
>
>> [ 768.377102] use time[2826802]ns in xchg_u8_c
>>
>> I think this is because the C version does one more load.
>> If possible, we could move this code into asm-generic/.
>>
>> thanks
>> xinhui
>>
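
To make the extra load concrete, here is a minimal userspace sketch of
the cmpxchg-based approach. It is written under assumptions:
emu_xchg_u8() is a hypothetical name, it uses C11 atomics instead of
the kernel's __cmpxchg_u32(), and it takes the aligned word and the
bit offset directly instead of deriving them from a byte pointer.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Exchange the byte at bit offset bitoff inside *word and return the
 * previous byte. Hypothetical helper, for illustration only.
 */
static uint8_t emu_xchg_u8(_Atomic uint32_t *word, unsigned int bitoff,
			   uint8_t newval)
{
	uint32_t mask = UINT32_C(0xff) << bitoff;
	/* Initial plain load, done before the compare-and-swap loop. */
	uint32_t oldv = atomic_load_explicit(word, memory_order_relaxed);
	uint32_t newv;

	do {
		/* Keep the neighbouring bytes, replace only our byte. */
		newv = (oldv & ~mask) | ((uint32_t)newval << bitoff);
		/* On failure, oldv is refreshed with the current word. */
	} while (!atomic_compare_exchange_weak_explicit(word, &oldv, newv,
							memory_order_seq_cst,
							memory_order_relaxed));

	return (uint8_t)((oldv & mask) >> bitoff);
}
```

On an LL/SC machine the compare-and-swap is itself a load-reserve and
store-conditional loop, so this form loads the word once up front and
again inside each compare-and-swap, while the asm version above has
only the lwarx.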