From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42442)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1b8v5g-0000LN-Gl
	for qemu-devel@nongnu.org; Fri, 03 Jun 2016 15:49:09 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1b8v5c-0007VF-BI
	for qemu-devel@nongnu.org; Fri, 03 Jun 2016 15:49:07 -0400
Received: from mail-lf0-x22c.google.com ([2a00:1450:4010:c07::22c]:32825)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1b8v5c-0007UH-1x
	for qemu-devel@nongnu.org; Fri, 03 Jun 2016 15:49:04 -0400
Received: by mail-lf0-x22c.google.com with SMTP id s64so61043994lfe.0
	for ; Fri, 03 Jun 2016 12:49:03 -0700 (PDT)
References: <20160531183928.29406-1-bobby.prani@gmail.com>
	<20160531183928.29406-2-bobby.prani@gmail.com>
	<57505F1A.3020808@gmail.com>
	<68c32d50-adc2-25b2-b136-2a486f6b3de7@twiddle.net>
	<5750995D.6030005@gmail.com>
	<8e9b8569-89a5-845a-a856-7f2fa4435659@twiddle.net>
	<5750A725.2050303@gmail.com>
	<8a253238-5718-10d4-a1b9-d9c0c890a457@twiddle.net>
From: Sergey Fedorov
Message-ID: <5751DF2D.5040709@gmail.com>
Date: Fri, 3 Jun 2016 22:49:01 +0300
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC v2 PATCH 01/13] Introduce TCGOpcode for memory barrier
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Pranith Kumar , Richard Henderson
Cc: "open list:All patches CC here" , Alex Bennée

On 03/06/16 21:30, Pranith Kumar wrote:
> On Thu, Jun 2, 2016 at 9:08 PM, Richard Henderson wrote:
>> On 06/02/2016 02:37 PM, Sergey Fedorov wrote:
>>>
>>> It would give us three TCG operations for each memory operation instead
>>> of one. But then we might like to combine these barrier operations back
>>> with memory operations in each backend. If we propagate memory ordering
>>> semantics up to the backend, it can decide itself what instructions are
>>> best to generate.
>>
>> A strongly ordered target would generally only set BEFORE bits or AFTER
>> bits, but not both (and I suggest we canonicalize on AFTER for all such
>> targets). Thus a strongly ordered target would produce only 2 opcodes per
>> memory op.
>>
>> I supplied both to make it easier to handle a weakly ordered target with
>> acquire/release bits.
>>
>> I would *not* combine the barrier operations back with memory operations in
>> the backend. Only armv8 and ia64 can do that, and given the optimization
>> level at which we generate code, I doubt it would really make much
>> difference above separate barriers.
>>
> On armv8, using load_acquire/store_release instructions makes a
> significant difference in performance when compared to plain
> dmb+memory instruction sequence. So I would really like to keep the
> option of generating acq/rel instructions(by combining barrier and
> memory or some other way) open.

I'm not so sure about acq/rel flags. Is there any architecture which has
explicit acq/rel barriers? I suppose acq/rel memory access instructions
are always load-link and store-conditional and thus rely on exclusive
memory monitor to support that "conditional" behaviour. To emulate this
behaviour we need something more special like "Slow-path for atomic
instruction translation" [1].

[1] http://thread.gmane.org/gmane.comp.emulators.qemu/407419

Kind regards,
Sergey
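
For reference, the two code shapes being compared in the thread (a
separate barrier next to a plain access, versus ordering attached to the
access itself) can be sketched in portable C11 atomics. This is only an
illustration under stated assumptions: the names mailbox, publish_fence,
publish_release and try_consume are hypothetical, and nothing here is
QEMU or TCG code. On AArch64, compilers typically lower the fence variant
to a dmb plus a plain store and the release/acquire variant to stlr/ldar,
but the exact output depends on the compiler and its options.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical message-passing cell; not taken from QEMU. */
typedef struct {
    uint32_t payload;   /* plain data being published */
    atomic_uint ready;  /* flag that publishes the payload */
} mailbox;

void mailbox_init(mailbox *m)
{
    m->payload = 0;
    atomic_init(&m->ready, 0);
}

/* Variant A: a standalone release barrier followed by a relaxed store.
 * This mirrors the "separate barrier op + memory op" shape. */
void publish_fence(mailbox *m, uint32_t v)
{
    m->payload = v;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
}

/* Variant B: the ordering is part of the store itself.
 * This mirrors the "combined acquire/release access" shape. */
void publish_release(mailbox *m, uint32_t v)
{
    m->payload = v;
    atomic_store_explicit(&m->ready, 1, memory_order_release);
}

/* Consumer: a load-acquire on the flag orders the payload read after it.
 * Returns 1 and fills *out if a value has been published, else 0. */
int try_consume(mailbox *m, uint32_t *out)
{
    if (atomic_load_explicit(&m->ready, memory_order_acquire) == 0) {
        return 0;
    }
    *out = m->payload;
    return 1;
}

/* Single-threaded functional check only: it shows both variants publish
 * the same value, it does not exercise cross-thread ordering. */
int mailbox_selfcheck(void)
{
    mailbox a, b;
    uint32_t got = 0;

    mailbox_init(&a);
    mailbox_init(&b);
    assert(!try_consume(&a, &got));

    publish_fence(&a, 42);
    publish_release(&b, 43);

    assert(try_consume(&a, &got) && got == 42);
    assert(try_consume(&b, &got) && got == 43);
    return 1;
}
```

Both variants give the consumer the same guarantee; they differ only in
whether the ordering is expressed as a separate barrier or carried by the
access, which is the choice the backend would be making.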