From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:41361)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <alex.bennee@linaro.org>) id 1b9xF4-0005eN-FY
	for qemu-devel@nongnu.org; Mon, 06 Jun 2016 12:19:10 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <alex.bennee@linaro.org>) id 1b9xEz-0006xM-4t
	for qemu-devel@nongnu.org; Mon, 06 Jun 2016 12:19:05 -0400
Received: from mail-wm0-x229.google.com ([2a00:1450:400c:c09::229]:36377)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <alex.bennee@linaro.org>) id 1b9xEy-0006xA-Uw
	for qemu-devel@nongnu.org; Mon, 06 Jun 2016 12:19:01 -0400
Received: by mail-wm0-x229.google.com with SMTP id n184so100199290wmn.1
	for <qemu-devel@nongnu.org>; Mon, 06 Jun 2016 09:19:00 -0700 (PDT)
From: Alex =?utf-8?Q?Benn=C3=A9e?= <alex.bennee@linaro.org>
In-reply-to: <5751DF2D.5040709@gmail.com>
Date: Mon, 06 Jun 2016 17:19:13 +0100
Message-ID: <874m96i932.fsf@linaro.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] [RFC v2 PATCH 01/13] Introduce TCGOpcode for
 memory barrier
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Sergey Fedorov <serge.fdrv@gmail.com>
Cc: Pranith Kumar <bobby.prani@gmail.com>, Richard Henderson <rth@twiddle.net>, "open list:All patches CC here" <qemu-devel@nongnu.org>


Sergey Fedorov <serge.fdrv@gmail.com> writes:

> On 03/06/16 21:30, Pranith Kumar wrote:
>> On Thu, Jun 2, 2016 at 9:08 PM, Richard Henderson <rth@twiddle.net> wrote:
>>> On 06/02/2016 02:37 PM, Sergey Fedorov wrote:
>>>>
>>>> It would give us three TCG operations for each memory operation instead
>>>> of one. But then we might like to combine these barrier operations back
>>>> with memory operations in each backend. If we propagate memory ordering
>>>> semantics up to the backend, it can decide itself what instructions are
>>>> best to generate.
>>>
>>> A strongly ordered target would generally only set BEFORE bits or AFTER
>>> bits, but not both (and I suggest we canonicalize on AFTER for all such
>>> targets). Thus a strongly ordered target would produce only 2 opcodes per
>>> memory op.
>>>
>>> I supplied both to make it easier to handle a weakly ordered target with
>>> acquire/release bits.
>>>
>>> I would *not* combine the barrier operations back with memory operations in
>>> the backend.  Only armv8 and ia64 can do that, and given the optimization
>>> level at which we generate code, I doubt it would really make much
>>> difference above separate barriers.
>>>
>> On armv8, using load_acquire/store_release instructions makes a
>> significant difference in performance when compared to plain
>> dmb+memory instruction sequence. So I would really like to keep the
>> option of generating acq/rel instructions(by combining barrier and
>> memory or some other way) open.
>
> I'm not so sure about acq/rel flags. Is there any architecture which has
> explicit acq/rel barriers? I suppose acq/rel memory access instructions
> are always load-link and store-conditional and thus rely on exclusive
> memory monitor to support that "conditional" behaviour.

Nope, you can have acq/rel memory operations without exclusive
operations (see ARMv8 ldar and stlr). The exclusive operations also come
in ordered (ldaxr, stlxr) and non-ordered (ldxr, stxr) variants.
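
To illustrate the first point, here is a minimal C11 sketch of
acquire/release message passing that needs no exclusive (ll/sc)
operations at all. The names (payload, ready, publish, try_consume) are
made up for the example and are not QEMU code. On AArch64 a compiler
will typically lower the release store to stlr and the acquire load to
ldar, though the exact instruction selection is up to the compiler and
target options.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; not taken from QEMU. */
static int payload;        /* plain data, published via the flag below */
static atomic_bool ready;  /* static storage, so it starts out false */

/* Producer: plain store to the data, then a store-release to the flag.
 * The release ordering keeps the payload store from being reordered
 * after the flag store. */
static void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: load-acquire on the flag, then a plain load of the data.
 * Returns true and fills *out if the payload has been published. */
static bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire)) {
        return false;
    }
    *out = payload;
    return true;
}
```

Neither function loops or retries, which is the difference from an
ll/sc sequence: the ordering comes from the access itself, not from an
exclusive monitor.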

> To emulate this
> behaviour we need something more special like "Slow-path for atomic
> instruction translation" [1].
>
> [1] http://thread.gmane.org/gmane.comp.emulators.qemu/407419
>
> Kind regards,
> Sergey


--
Alex Bennée