Re: [Possible BUG] count_lim_atomic.c fails on POWER8

Discussions of the Parallel Programming book
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: Junchang Wang <junchangwang@gmail.com>
Cc: Akira Yokosawa <akiyks@gmail.com>, perfbook@vger.kernel.org
Subject: Re: [Possible BUG] count_lim_atomic.c fails on POWER8
Date: Thu, 25 Oct 2018 08:24:50 -0700	[thread overview]
Message-ID: <20181025152450.GS4170@linux.ibm.com> (raw)
In-Reply-To: <CABoNC81fnRn3H3wB+Zx7WG7QPw7qLa13pVcUzxQ-zdX3vtaxuA@mail.gmail.com>

On Thu, Oct 25, 2018 at 10:09:22PM +0800, Junchang Wang wrote:
> On Thu, Oct 25, 2018 at 5:45 PM Paul E. McKenney <paulmck@linux.ibm.com> wrote:
> >
> > On Thu, Oct 25, 2018 at 10:11:18AM +0800, Junchang Wang wrote:
> > > Hi Akira,
> > >
> > > Thanks for the mail. My understanding is that PPC uses LL/SC to
> > > emulate CAS by using a tiny loop. Unfortunately, the LL/SC loop itself
> > > could fail (due to, for example, context switches) even if *ptr equals
> > > to old. In such a case, a CAS instruction in actually should return a
> > > success. I think this is what the term "spurious fail" describes. Here
> > > is a reference:
> > > http://liblfds.org/mediawiki/index.php?title=Article:CAS_and_LL/SC_Implementation_Details_by_Processor_family
> >
> > First, thank you both for your work on this!  And yes, my cmpxchg() code
> > is clearly quite broken.
> >
> > > It seems that __atomic_compare_exchange_n() provides option "weak" for
> > > performance. I tested these two solutions and got the following
> > > results:
> > >
> > >                            1      4      8     16     32    64
> > > my patch (ns)     35    34    37    73    142  281
> > > strong (ns)          39    39    41    79    158  313
> >
> > So strong is a bit slower, correct?
> >
> > > I tested the performance of count_lim_atomic by varying the number of
> > > updaters (./count_lim_atomic N uperf) on a 8-core PPC server. The
> > > first row in the table is the result when my patch is used, and the
> > > second row is the result when the 4th argument of the function is set
> > > to false(0). It seems performance improves slightly if option "weak"
> > > is used. However, there is no performance boost as we expected. So
> > > your solution sounds good if safety is one major concern because
> > > option "weak" seems risky to me :-)
> > >
> > > Another interesting observation is that the performance of LL/SC-based
> > > CAS instruction deteriorates dramatically when the number of working
> > > threads exceeds the number of CPU cores.
> >
> > If weak is faster, would it make sense to return (~o), that is,
> > the bitwise complement of the expected arguement, when the weak
> > __atomic_compare_exchange_n() fails?  This would get the improved
> > performance (if I understand your results above) while correctly handling
> > the strange (but possible) case where o==n.
> >
> > Does that make sense, or am I missing something?
> 
> Hi Paul and Akira,
> 
> Yes, the weak version is faster. The solution looks good. But when I
> tried to use the following patch
> 
> #define cmpxchg(ptr, o, n) \
> ({ \
>         typeof(*ptr) old = (o); \
>         (__atomic_compare_exchange_n(ptr, (void *)&old, (n), 1,
> __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))? \
>                                 (o) : (~o); \
> })
> 
> gcc complains of my use of complement symbol
> 
> ../api.h:769:12: error: wrong type argument to bit-complement
>      (o) : (~o); \
>               ^
> 
> Any suggestions?

You might need to do this for the macro argument: "(~(o))".

Another possibility is ((o) + 1), which would work for pointers as well
as for integers.

							Thanx, Paul

     prev parent reply	other threads:[~2018-10-25 23:58 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-20 15:53 [Possible BUG] count_lim_atomic.c fails on POWER8 Akira Yokosawa
2018-10-20 16:36 ` Paul E. McKenney
2018-10-24 15:53   ` Junchang Wang
2018-10-24 22:05     ` Akira Yokosawa
2018-10-24 22:29       ` Akira Yokosawa
2018-10-25  2:18         ` Junchang Wang
2018-10-25  2:11       ` Junchang Wang
2018-10-25  9:45         ` Paul E. McKenney
2018-10-25 12:23           ` Akira Yokosawa
2018-10-25 14:09           ` Junchang Wang
2018-10-25 15:17             ` Akira Yokosawa
2018-10-25 22:04               ` Akira Yokosawa
2018-10-26  0:58                 ` Junchang Wang
2018-10-27 14:56                   ` Akira Yokosawa
     [not found]                     ` <20181028001723.GJ4170@linux.ibm.com>
2018-10-28 12:08                       ` Junchang Wang
2018-10-28 13:19                         ` Paul E. McKenney
2018-10-28 13:22                         ` Akira Yokosawa
2018-10-28 14:24                       ` Akira Yokosawa
2018-10-28 16:43                         ` Paul E. McKenney
2018-10-29 14:45                           ` Akira Yokosawa
2018-10-29 15:30                             ` Paul E. McKenney
2018-10-26  1:12                 ` Akira Yokosawa
2018-10-26 11:34                   ` Akira Yokosawa
2018-10-26 16:06                     ` Junchang Wang
2018-10-25 15:24             ` Paul E. McKenney [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181025152450.GS4170@linux.ibm.com \
    --to=paulmck@linux.ibm.com \
    --cc=akiyks@gmail.com \
    --cc=junchangwang@gmail.com \
    --cc=perfbook@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox