From: Boqun Feng <boqun.feng@gmail.com>
To: Pan Xinhui <xinhui@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	benh@kernel.crashing.org, paulus@samba.org, mpe@ellerman.id.au,
	paulmck@linux.vnet.ibm.com, tglx@linutronix.de
Subject: Re: [PATCH V3] powerpc: Implement {cmp}xchg for u8 and u16
Date: Fri, 22 Apr 2016 11:16:24 +0800	[thread overview]
Message-ID: <20160422031624.GB20657@insomnia> (raw)
In-Reply-To: <5719857A.5080201@linux.vnet.ibm.com>

On Fri, Apr 22, 2016 at 09:59:22AM +0800, Pan Xinhui wrote:
> On 2016-04-21 23:52, Boqun Feng wrote:
> > On Thu, Apr 21, 2016 at 11:35:07PM +0800, Pan Xinhui wrote:
> >> On 2016-04-20 22:24, Peter Zijlstra wrote:
> >>> On Wed, Apr 20, 2016 at 09:24:00PM +0800, Pan Xinhui wrote:
> >>>
> >>>> +#define __XCHG_GEN(cmp, type, sfx, skip, v)				\
> >>>> +static __always_inline unsigned long					\
> >>>> +__cmpxchg_u32##sfx(v unsigned int *p, unsigned long old,		\
> >>>> +			 unsigned long new);				\
> >>>> +static __always_inline u32						\
> >>>> +__##cmp##xchg_##type##sfx(v void *ptr, u32 old, u32 new)		\
> >>>> +{									\
> >>>> +	int size = sizeof (type);					\
> >>>> +	int off = (unsigned long)ptr % sizeof(u32);			\
> >>>> +	volatile u32 *p = ptr - off;					\
> >>>> +	int bitoff = BITOFF_CAL(size, off);				\
> >>>> +	u32 bitmask = ((0x1 << size * BITS_PER_BYTE) - 1) << bitoff;	\
> >>>> +	u32 oldv, newv, tmp;						\
> >>>> +	u32 ret;							\
> >>>> +	oldv = READ_ONCE(*p);						\
> >>>> +	do {								\
> >>>> +		ret = (oldv & bitmask) >> bitoff;			\
> >>>> +		if (skip && ret != old)					\
> >>>> +			break;						\
> >>>> +		newv = (oldv & ~bitmask) | (new << bitoff);		\
> >>>> +		tmp = oldv;						\
> >>>> +		oldv = __cmpxchg_u32##sfx((v u32*)p, oldv, newv);	\
> >>>> +	} while (tmp != oldv);						\
> >>>> +	return ret;							\
> >>>> +}
> >>>
> >>> So for an LL/SC based arch using cmpxchg() like that is sub-optimal.
> >>>
> >>> Why did you choose to write it entirely in C?
> >>>
> >> Yes, you are right. The C version performs more loads and stores.
> >> However, xchg_u8/u16 is currently used only by qspinlock, and I did not see any performance regression.
> >> So I just wrote it in C, for simplicity. :)
> >>
> >> Of course I have done xchg tests.
> >> We ran code like xchg((u8*)&v, j++); in several threads,
> >> and the result is:
> >> [  768.374264] use time[1550072]ns in xchg_u8_asm
> > 
> > How was xchg_u8_asm() implemented, using lbarx or using a 32bit ll/sc
> > loop with shifting and masking in it?
> > 
> Yes, using a 32-bit ll/sc loop.
> 
> It looks like:
>         __asm__ __volatile__(
> "1:     lwarx   %0,0,%3\n"
> "       and %1,%0,%5\n"
> "       or %1,%1,%4\n"
>        PPC405_ERR77(0,%2)
> "       stwcx.  %1,0,%3\n"
> "       bne-    1b"
>         : "=&r" (_oldv), "=&r" (tmp), "+m" (*(volatile unsigned int *)_p)
>         : "r" (_p), "r" (_newv), "r" (_oldv_mask)
>         : "cc", "memory");
> 

Good, so this works for all ppc ISAs too.
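
For readers following along, here is a rough sketch of how the inputs to
that loop (the roles played by _newv and _oldv_mask) could be prepared.
This is only an illustration, not code from the patch: the helper names
field_bitoff(), field_clear_mask() and field_shifted() are made up, and
the byte-order handling is an assumption about what a BITOFF_CAL-style
macro has to do, since its definition is not quoted in this message.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): bit offset of a `size`-byte
 * field that starts `off` bytes into its aligned 32-bit word.
 * Assumption: on big-endian the lowest address holds the most
 * significant byte, so the shift counts down from the top of the word.
 */
static unsigned int field_bitoff(size_t size, size_t off)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return (unsigned int)((sizeof(uint32_t) - size - off) * 8);
#else
	return (unsigned int)(off * 8);
#endif
}

/* Mask that clears the field and keeps its neighbours (role of _oldv_mask). */
static uint32_t field_clear_mask(size_t size, size_t off)
{
	uint32_t field = (UINT32_C(1) << (size * 8)) - 1; /* size is 1 or 2 */

	return ~(field << field_bitoff(size, off));
}

/* New value truncated to the field width and shifted into place (role of _newv). */
static uint32_t field_shifted(uint32_t val, size_t size, size_t off)
{
	uint32_t field = (UINT32_C(1) << (size * 8)) - 1;

	return (val & field) << field_bitoff(size, off);
}
```

With these two values the loop body only needs one "and" and one "or"
between the load-reserve and the store-conditional, which matches the
asm quoted above.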

Given the performance benefit (maybe caused by the reason Peter
mentioned), I think we should use this as the implementation of u8/u16
{cmp}xchg for now. For Power7 and later, we can always switch to the
lbarx/lharx version if an observable performance benefit can be achieved.
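
To make the comparison concrete, below is a minimal userspace sketch of
the C-style approach, written with C11 <stdatomic.h> instead of the
kernel primitives. The function name xchg_u8_in_word() and its
parameters are hypothetical and the bit offset is passed in by the
caller. On an LL/SC machine the compare-exchange is itself a
load-reserve/store-conditional loop, so this shape nests one retry loop
inside another and needs a separate initial load, which is the overhead
being discussed.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical userspace analogue, not the kernel code.
 * `word` is the aligned 32-bit word containing the byte and
 * `bitoff` is the byte's bit offset inside that word.
 * Returns the previous value of the byte.
 */
static uint8_t xchg_u8_in_word(_Atomic uint32_t *word, unsigned int bitoff,
			       uint8_t newval)
{
	uint32_t mask = UINT32_C(0xff) << bitoff;
	/* The extra load that the single ll/sc loop does not need. */
	uint32_t oldv = atomic_load_explicit(word, memory_order_relaxed);
	uint32_t newv;

	do {
		/* Keep the neighbouring bytes, replace only the target byte. */
		newv = (oldv & ~mask) | ((uint32_t)newval << bitoff);
		/* On failure, oldv is refreshed with the current word contents. */
	} while (!atomic_compare_exchange_weak_explicit(word, &oldv, newv,
							memory_order_seq_cst,
							memory_order_relaxed));

	return (uint8_t)((oldv & mask) >> bitoff);
}
```

A caller would compute the aligned word pointer and bitoff from the byte
address, then use the return value as the old byte.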

But the choice is left to you. After all, as you said, qspinlock is the
only user ;-)

Regards,
Boqun

> 
> > Regards,
> > Boqun
> > 
> >> [  768.377102] use time[2826802]ns in xchg_u8_c
> >>
> >> I think this is because there is one more load in the C version.
> >> If possible, we can move such code into asm-generic/.
> >>
> >> thanks
> >> xinhui
> >>
> 