From: Paolo Bonzini
To: Alex Bennée <alex.bennee@linaro.org>
Cc: mttcg@greensocs.com, peter.maydell@linaro.org, mark.burton@greensocs.com,
    a.rigo@virtualopensystems.com, qemu-devel@nongnu.org, fred.konrad@greensocs.com
Subject: Re: [Qemu-devel] [RFC PATCH 3/4] include/qemu/atomic.h: default to __atomic functions
Date: Mon, 25 Jan 2016 23:02:59 +0100
Message-ID: <56A69B93.4030500@redhat.com>
In-Reply-To: <87twm1o6w1.fsf@linaro.org>

On 25/01/2016 19:23, Alex Bennée wrote:
>>> +    })
>>> +
>>> +#define atomic_mb_set(ptr, i)   ((void)atomic_xchg(ptr, i))
>>> +#define atomic_mb_read(ptr)                          \
>>> +    ({                                               \
>>> +        typeof(*ptr) _val;                           \
>>> +        __atomic_load(ptr, &_val, __ATOMIC_ACQUIRE); \
>>> +        _val;                                        \
>>> +    })
>>
>> Please leave these defined in terms of relaxed accesses and memory
>> barriers.
>
> May I ask why? Doesn't the __ATOMIC_ACQUIRE ensure the load barrier is
> in place? Or should it be a full CST barrier?

First, because they are defined like this in docs/atomics.txt. :)

Second, because the right definition would be __atomic_load(__ATOMIC_SEQ_CST)
and __atomic_store(__ATOMIC_SEQ_CST). These produce even better code than
the memory barriers on x86 but, unfortunately, on POWER they are implemented
in a way that makes things very slow. Basically, instead of

    mb_read -> load; rmb()
    mb_set  -> wmb(); store(); mb()

POWER uses this:

    mb_read -> load; mb()
    mb_set  -> wmb(); store(); mb()

There are reasons for this, and they are proved in
http://www.cl.cam.ac.uk/~pes20/cppppc/popl079-batty.pdf (I'm not pretending
I understand this paper except inasmuch as it was necessary to write
docs/atomics.txt). However, the cases where the difference actually matters
basically never arise. Again, this is documented in docs/atomics.txt around
line 90.

As an alternative to explicit memory barriers, you can use this on POWER:

    mb_read -> load-acquire
    mb_set  -> store-release + mb()

and seq-cst load/store everywhere else.

Paolo
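
P.S. To make the above concrete: the barrier-based definitions that
docs/atomics.txt describes would look roughly like this (a sketch, assuming
QEMU's relaxed atomic_read()/atomic_set() accessors and its
smp_rmb()/smp_wmb()/smp_mb() barrier macros):

    /* mb_read -> load; rmb() */
    #define atomic_mb_read(ptr)    ({               \
        typeof(*ptr) _val = atomic_read(ptr);       \
        smp_rmb();                                  \
        _val;                                       \
    })

    /* mb_set -> wmb(); store(); mb() */
    #define atomic_mb_set(ptr, i)  do {             \
        smp_wmb();                                  \
        atomic_set(ptr, i);                         \
        smp_mb();                                   \
    } while (0)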
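
The seq-cst definitions mentioned above would be something like this, using
the GCC __atomic_load_n/__atomic_store_n builtins:

    /* seq-cst load and store; fast on x86, slow as compiled on POWER */
    #define atomic_mb_read(ptr)    __atomic_load_n(ptr, __ATOMIC_SEQ_CST)
    #define atomic_mb_set(ptr, i)  __atomic_store_n(ptr, i, __ATOMIC_SEQ_CST)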
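
And the POWER-friendly alternative (load-acquire, plus store-release followed
by a full barrier) would be a sketch along these lines:

    /* mb_read -> load-acquire */
    #define atomic_mb_read(ptr)    __atomic_load_n(ptr, __ATOMIC_ACQUIRE)

    /* mb_set -> store-release + mb() */
    #define atomic_mb_set(ptr, i)  do {                  \
        __atomic_store_n(ptr, i, __ATOMIC_RELEASE);      \
        smp_mb();                                        \
    } while (0)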