From: Paolo Bonzini
To: Alex Bennée <alex.bennee@linaro.org>
Cc: mttcg@greensocs.com, peter.maydell@linaro.org, mark.burton@greensocs.com,
    a.rigo@virtualopensystems.com, qemu-devel@nongnu.org, fred.konrad@greensocs.com
Subject: Re: [Qemu-devel] [RFC PATCH 3/4] include/qemu/atomic.h: default to __atomic functions
Date: Mon, 25 Jan 2016 23:02:59 +0100
Message-ID: <56A69B93.4030500@redhat.com>
In-Reply-To: <87twm1o6w1.fsf@linaro.org>

On 25/01/2016 19:23, Alex Bennée wrote:
>>> +    })
>>> +
>>> +#define atomic_mb_set(ptr, i)   ((void)atomic_xchg(ptr, i))
>>> +#define atomic_mb_read(ptr)                          \
>>> +    ({                                               \
>>> +        typeof(*ptr) _val;                           \
>>> +        __atomic_load(ptr, &_val, __ATOMIC_ACQUIRE); \
>>> +        _val;                                        \
>>> +    })
>>
>> Please leave these defined in terms of relaxed accesses and memory
>> barriers.
>
> May I ask why? Doesn't the __ATOMIC_ACQUIRE ensure the load barrier is
> in place? Or should it be a full CST barrier?

First, because they are defined like this in docs/atomics.txt. :)

Second, because the right definition would be __atomic_load(__ATOMIC_SEQ_CST)
and __atomic_store(__ATOMIC_SEQ_CST). These produce even better code than
the memory barriers on x86 but, unfortunately, on POWER they are implemented
in a way that makes things very slow. Basically, instead of

    mb_read -> load; rmb()
    mb_set  -> wmb(); store(); mb()

POWER uses this:

    mb_read -> load; mb()
    mb_set  -> wmb(); store(); mb()

There are reasons for this, and they are proved in
http://www.cl.cam.ac.uk/~pes20/cppppc/popl079-batty.pdf (I'm not pretending
I understand this paper except inasmuch as it was necessary to write
docs/atomics.txt). However, the cases where the difference actually matters
basically never arise. Again, this is documented in docs/atomics.txt around
line 90.

As an alternative to explicit memory barriers, you can use this on POWER:

    mb_read -> load-acquire
    mb_set  -> store-release + mb()

and seq-cst load/store everywhere else.

Paolo
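
P.S. To make the above concrete: the barrier-based definitions that
docs/atomics.txt describes would look roughly like this (a sketch, assuming
QEMU's relaxed atomic_read()/atomic_set() accessors and its
smp_rmb()/smp_wmb()/smp_mb() barrier macros):

    /* mb_read -> load; rmb() */
    #define atomic_mb_read(ptr)    ({               \
        typeof(*ptr) _val = atomic_read(ptr);       \
        smp_rmb();                                  \
        _val;                                       \
    })

    /* mb_set -> wmb(); store(); mb() */
    #define atomic_mb_set(ptr, i)  do {             \
        smp_wmb();                                  \
        atomic_set(ptr, i);                         \
        smp_mb();                                   \
    } while (0)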
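
The seq-cst definitions mentioned above would be something like this, using
the GCC __atomic_load_n/__atomic_store_n builtins:

    /* seq-cst load and store; fast on x86, slow as compiled on POWER */
    #define atomic_mb_read(ptr)    __atomic_load_n(ptr, __ATOMIC_SEQ_CST)
    #define atomic_mb_set(ptr, i)  __atomic_store_n(ptr, i, __ATOMIC_SEQ_CST)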
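
And the POWER-friendly alternative (load-acquire, plus store-release followed
by a full barrier) would be a sketch along these lines:

    /* mb_read -> load-acquire */
    #define atomic_mb_read(ptr)    __atomic_load_n(ptr, __ATOMIC_ACQUIRE)

    /* mb_set -> store-release + mb() */
    #define atomic_mb_set(ptr, i)  do {                  \
        __atomic_store_n(ptr, i, __ATOMIC_RELEASE);      \
        smp_mb();                                        \
    } while (0)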