Re: [Qemu-devel] [RFC v2 PATCH 01/13] Introduce TCGOpcode for memory barrier

From: Sergey Fedorov <serge.fdrv@gmail.com>
To: Richard Henderson <rth@twiddle.net>,
	Pranith Kumar <bobby.prani@gmail.com>,
	"open list:All patches CC here" <qemu-devel@nongnu.org>
Cc: serge.fdrv@linaro.org, alex.bennee@linaro.org
Subject: Re: [Qemu-devel] [RFC v2 PATCH 01/13] Introduce TCGOpcode for memory barrier
Date: Fri, 3 Jun 2016 18:16:32 +0300
Message-ID: <57519F50.2090509@gmail.com>
In-Reply-To: <8a253238-5718-10d4-a1b9-d9c0c890a457@twiddle.net>

On 03/06/16 04:08, Richard Henderson wrote:
> On 06/02/2016 02:37 PM, Sergey Fedorov wrote:
>> On 03/06/16 00:18, Richard Henderson wrote:
>>> On 06/02/2016 01:38 PM, Sergey Fedorov wrote:
>>>> On 02/06/16 23:36, Richard Henderson wrote:
>>>>> On 06/02/2016 09:30 AM, Sergey Fedorov wrote:
>>>>>> I think we need to extend TCG load/store instruction attributes to
>>>>>> provide information about guest ordering requirements and leave
>>>>>> this TCG
>>>>>> operation only for explicit barrier instruction translation.
>>>>>
>>>>> I do not agree.  I think separate barriers are much cleaner and
>>>>> easier
>>>>> to manage and reason with.
>>>>>
>>>>
>>>> How are we going to emulate strongly-ordered guests on weakly-ordered
>>>> hosts then? I think if every load/store operation must specify which
>>>> ordering it implies then this task would be quite simple.
>>>
>>> Hum.  That does seem helpful-ish.  But I'm not certain how helpful it
>>> is to complicate the helper functions even further.
>>>
>>> What if we have tcg_canonicalize_memop (or some such) split off the
>>> barriers into separate opcodes.  E.g.
>>>
>>> MO_BAR_LD_B = 32    // prevent earlier loads from crossing current op
>>> MO_BAR_ST_B = 64    // prevent earlier stores from crossing current op
>>> MO_BAR_LD_A = 128    // prevent later loads from crossing current op
>>> MO_BAR_ST_A = 256    // prevent later stores from crossing current op
>>> MO_BAR_LDST_B = MO_BAR_LD_B | MO_BAR_ST_B
>>> MO_BAR_LDST_A = MO_BAR_LD_A | MO_BAR_ST_A
>>> MO_BAR_MASK = MO_BAR_LDST_B | MO_BAR_LDST_A
>>>
>>> // Match Sparc MEMBAR as the most flexible host.
>>> TCG_BAR_LD_LD = 1    // #LoadLoad barrier
>>> TCG_BAR_ST_LD = 2    // #StoreLoad barrier
>>> TCG_BAR_LD_ST = 4    // #LoadStore barrier
>>> TCG_BAR_ST_ST = 8    // #StoreStore barrier
>>> TCG_BAR_SYNC  = 64    // SEQ_CST barrier
>>>
>>> where
>>>
>>>   tcg_gen_qemu_ld_i32(x, y, i, m | MO_BAR_LD_BEFORE | MO_BAR_ST_AFTER)
>>>
>>> emits
>>>
>>>   mb        TCG_BAR_LD_LD
>>>   qemu_ld_i32    x, y, i, m
>>>   mb        TCG_BAR_LD_ST
>>>
>>> We can then add an optimization pass which folds barriers with no
>>> memory operations in between, so that duplicates are eliminated.
>>
>> It would give us three TCG operations for each memory operation instead
>> of one. But then we might like to combine these barrier operations back
>> with memory operations in each backend. If we propagate memory ordering
>> semantics up to the backend, it can decide itself what instructions are
>> best to generate.
>
> A strongly ordered target would generally only set BEFORE bits or
> AFTER bits, but not both (and I suggest we canonicalize on AFTER for
> all such targets). Thus a strongly ordered target would produce only 2
> opcodes per memory op.
>
> I supplied both to make it easier to handle a weakly ordered target
> with acquire/release bits.
>
> I would *not* combine the barrier operations back with memory
> operations in the backend.  Only armv8 and ia64 can do that, and given
> the optimization level at which we generate code, I doubt it would
> really make much difference above separate barriers.

So your suggestion is to generate different TCG opcode sequences
depending on the underlying target architecture? And you are against
pushing this task further down, into the backend code?

>
>> So I would just focus on translating only explicit memory barrier
>> operations for now.
>
> Then why did you bring it up?

I'm not sure I understood the question. I suggested that we avoid using
this TCG operation to emulate the guest's memory ordering requirements
for loads/stores. Instead, each load/store can carry its memory ordering
requirements, and each backend can decide how to translate them together
with the load/store (possibly just ignoring them, as is the case for
strongly-ordered hosts). I think for now we just need to translate
explicit memory barrier instructions.
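
A minimal sketch of that backend decision, assuming the ordering bits
take the TCG_BAR_* values quoted above. The names barriers_needed,
HOST_X86_GUARANTEES and HOST_WEAK_GUARANTEES are hypothetical and exist
only for illustration; this is not existing QEMU code.

```c
#include <assert.h>

/* Ordering bits with the same values as the TCG_BAR_* proposal quoted
 * above. BAR_X_Y means "an earlier X stays ordered before a later Y". */
enum {
    BAR_LD_LD = 1,
    BAR_ST_LD = 2,
    BAR_LD_ST = 4,
    BAR_ST_ST = 8,
    BAR_ALL   = BAR_LD_LD | BAR_ST_LD | BAR_LD_ST | BAR_ST_ST,
};

/* Hypothetical per-host masks: orderings that ordinary loads/stores
 * already get for free on that host. */
#define HOST_X86_GUARANTEES  (BAR_LD_LD | BAR_LD_ST | BAR_ST_ST)
#define HOST_WEAK_GUARANTEES 0u

/* Orderings the backend still has to enforce for one memory operation:
 * whatever the guest requires minus whatever the host provides anyway. */
static unsigned barriers_needed(unsigned guest_requires,
                                unsigned host_guarantees)
{
    assert((guest_requires & ~(unsigned)BAR_ALL) == 0);
    return guest_requires & ~host_guarantees;
}
```

With these masks, a load that needs BAR_LD_LD | BAR_LD_ST costs nothing
extra on the x86-like host, while the weakly-ordered host has to
enforce both orderings in whatever way suits it best.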

For example, emulating an ARM guest on an x86 host requires the ARM dmb
instruction to be translated to the x86 mfence instruction, to prevent a
later load from being reordered ahead of an earlier store (StoreLoad
reordering). At the same time, we don't have to generate anything
special for ordinary loads/stores, since x86 is a strongly-ordered
architecture.
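
As a rough sketch of the explicit-barrier case only: the helper below
picks the x86 instruction for a guest barrier described with the same
bit values as the TCG_BAR_* proposal. The function name
x86_insn_for_barrier is hypothetical, and a real backend would emit
machine code rather than return a mnemonic.

```c
#include <assert.h>
#include <stddef.h>

/* Same bit values as the TCG_BAR_* proposal quoted earlier. */
enum {
    BAR_LD_LD = 1,
    BAR_ST_LD = 2,
    BAR_LD_ST = 4,
    BAR_ST_ST = 8,
};

/* Hypothetical helper: mnemonic of the x86 instruction needed for an
 * explicit guest barrier, or NULL if no instruction is needed.
 * Only StoreLoad ordering needs a fence on x86; the other three
 * orderings are already provided by the hardware. */
static const char *x86_insn_for_barrier(unsigned bar)
{
    assert((bar & ~0xfu) == 0);
    return (bar & BAR_ST_LD) ? "mfence" : NULL;
}
```

A full barrier such as dmb sets all four bits and therefore maps to
mfence, while a barrier that only orders stores against stores would
need no x86 instruction at all.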

Kind regards,
Sergey
