All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Metcalf <cmetcalf@ezchip.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will.deacon@arm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@kernel.org>,
	"mtk.manpages@gmail.com" <mtk.manpages@gmail.com>,
	"dvhart@infradead.org" <dvhart@infradead.org>,
	"dave@stgolabs.net" <dave@stgolabs.net>,
	"Vineet.Gupta1@synopsys.com" <Vineet.Gupta1@synopsys.com>,
	"ralf@linux-mips.org" <ralf@linux-mips.org>,
	"ddaney@caviumnetworks.com" <ddaney@caviumnetworks.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	<linux@arm.linux.org.uk>, <rth@twiddle.net>
Subject: Re: futex atomic vs ordering constraints
Date: Wed, 2 Sep 2015 13:25:34 -0400	[thread overview]
Message-ID: <55E7310E.7030200@ezchip.com> (raw)
In-Reply-To: <20150902170008.GU19282@twins.programming.kicks-ass.net>

On 09/02/2015 01:00 PM, Peter Zijlstra wrote:
> On Wed, Sep 02, 2015 at 12:10:58PM -0400, Chris Metcalf wrote:
>> On 09/02/2015 08:55 AM, Peter Zijlstra wrote:
>>> So here goes..
>>>
>>> Chris, I'm awfully sorry, but I seem to be Tile challenged.
>>>
>>> TileGX seems to define:
>>>
>>> #define smp_mb__before_atomic()	smp_mb()
>>> #define smp_mb__after_atomic()	smp_mb()
>>>
>>> However, its atomic_add_return() implementation looks like:
>>>
>>> static inline int atomic_add_return(int i, atomic_t *v)
>>> {
>>> 	int val;
>>> 	smp_mb();  /* barrier for proper semantics */
>>> 	val = __insn_fetchadd4((void *)&v->counter, i) + i;
>>> 	barrier();  /* the "+ i" above will wait on memory */
>>> 	return val;
>>> }
>>>
>>> Which leaves me confused on smp_mb__after_atomic().
>> Are you concerned about whether it has proper memory
>> barrier semantics already, i.e. full barriers before and after?
>> In fact we do have a full barrier before, but then because of the
>> "+ i" / "barrier()", we know that the only other operation since
>> the previous mb(), namely the read of v->counter, has
>> completed after the atomic operation.  As a result we can
>> omit explicitly having a second barrier.
>>
>> It does seem like all the current memory-order semantics are
>> correct, unless I'm missing something!
> So I'm reading that code like:
>
> 	MB
>   [RmW]	ret = *val += i
>
>
> So what is stopping later memory ops like:
>
>     [R]	a = *foo
>     [S]	*bar = b
>
>  From getting reordered with the RmW, like:
>
> 	MB
>
>     [R]	a = *foo
>     [S]	*bar = b
>
>   [RmW]	ret = *val += i
>
> Are you saying Tile does not reorder things like that? If so, why then
> is smp_mb__after_atomic() a full mb(). If it does, I don't see how your
> add_return is correct.
>
> Alternatively I'm just confused..

Tile does not do out-of-order instruction issue, but it does have an
out-of-order memory subsystem, so in addition to stores becoming
unpredictably visible without a memory barrier, loads will also
potentially not read from memory predictably after issue.
As a result, later operations that use a register that was previously
loaded may stall instruction issue until the load value is available.
A memory fence instruction will cause the core to wait for all
stores to become visible and all load values to be available.

So [R] can't move up to before [RmW] due to the in-order issue
nature of the processor.  And smp_mb__after_atomic() has to
be a full mb() because that's the only barrier we have available
to guarantee that the load has read from memory.  (If the
value of the actual atomic was passed to smp_mb__after_atomic()
then we could just generate a fake use of the value, basically
generating something like "move r1, r1", which would cause
the instruction issue to halt until the value had been read.)

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


  reply	other threads:[~2015-09-02 17:25 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-26 18:16 futex atomic vs ordering constraints Peter Zijlstra
2015-08-29  1:33 ` Davidlohr Bueso
2015-09-01 16:38   ` Peter Zijlstra
2015-09-01 16:31 ` Will Deacon
2015-09-01 16:42   ` Peter Zijlstra
2015-09-01 16:47     ` Will Deacon
2015-09-01 19:05     ` Thomas Gleixner
2015-09-02 12:55       ` Peter Zijlstra
2015-09-02 16:10         ` Chris Metcalf
2015-09-02 17:00           ` Peter Zijlstra
2015-09-02 17:25             ` Chris Metcalf [this message]
2015-09-02 21:18             ` Linus Torvalds
2015-09-04 17:25               ` Chris Metcalf
2015-09-05 17:53               ` Peter Zijlstra
2015-09-07  9:30                 ` Will Deacon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55E7310E.7030200@ezchip.com \
    --to=cmetcalf@ezchip.com \
    --cc=Vineet.Gupta1@synopsys.com \
    --cc=dave@stgolabs.net \
    --cc=ddaney@caviumnetworks.com \
    --cc=dvhart@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@arm.linux.org.uk \
    --cc=mingo@kernel.org \
    --cc=mtk.manpages@gmail.com \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=ralf@linux-mips.org \
    --cc=rth@twiddle.net \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.