From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52305)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1bhk80-0000Lk-U0 for qemu-devel@nongnu.org;
    Wed, 07 Sep 2016 17:11:29 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1bhk7z-0002j5-RL for qemu-devel@nongnu.org;
    Wed, 07 Sep 2016 17:11:28 -0400
Received: from mail-yw0-x244.google.com ([2607:f8b0:4002:c05::244]:34225)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1bhk7z-0002j0-Nl for qemu-devel@nongnu.org;
    Wed, 07 Sep 2016 17:11:27 -0400
Received: by mail-yw0-x244.google.com with SMTP id j1so1172016ywb.1 for ;
    Wed, 07 Sep 2016 14:11:27 -0700 (PDT)
Sender: Richard Henderson
From: Richard Henderson
Date: Wed, 7 Sep 2016 14:10:35 -0700
Message-Id: <1473282648-23487-6-git-send-email-rth@twiddle.net>
In-Reply-To: <1473282648-23487-1-git-send-email-rth@twiddle.net>
References: <1473282648-23487-1-git-send-email-rth@twiddle.net>
Subject: [Qemu-devel] [PULL 05/18] tcg/i386: Add support for fence
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org, Pranith Kumar

From: Pranith Kumar

Generate a 'lock orl $0,0(%esp)' instruction for ordering instead of
mfence which has similar ordering semantics.

Signed-off-by: Pranith Kumar
Message-Id: <20160714202026.9727-3-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson
---
 tcg/i386/tcg-target.inc.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index 1573e69..b4f3223 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -686,6 +686,18 @@ static inline void tcg_out_pushi(TCGContext *s, tcg_target_long val)
     }
 }
 
+static inline void tcg_out_mb(TCGContext *s, TCGArg a0)
+{
+    /* Given the strength of x86 memory ordering, we only need care for
+       store-load ordering.  Experimentally, "lock orl $0,0(%esp)" is
+       faster than "mfence", so don't bother with the sse insn.  */
+    if (a0 & TCG_MO_ST_LD) {
+        tcg_out8(s, 0xf0);
+        tcg_out_modrm_offset(s, OPC_ARITH_EvIb, ARITH_OR, TCG_REG_ESP, 0);
+        tcg_out8(s, 0);
+    }
+}
+
 static inline void tcg_out_push(TCGContext *s, int reg)
 {
     tcg_out_opc(s, OPC_PUSH_r32 + LOWREGMASK(reg), 0, reg, 0);
@@ -2130,6 +2142,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
+    case INDEX_op_mb:
+        tcg_out_mb(s, args[0]);
+        break;
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     case INDEX_op_movi_i32: /* Always emitted via tcg_out_movi.  */
@@ -2195,6 +2210,8 @@ static const TCGTargetOpDef x86_op_defs[] = {
     { INDEX_op_add2_i32, { "r", "r", "0", "1", "ri", "ri" } },
     { INDEX_op_sub2_i32, { "r", "r", "0", "1", "ri", "ri" } },
 
+    { INDEX_op_mb, { } },
+
 #if TCG_TARGET_REG_BITS == 32
     { INDEX_op_brcond2_i32, { "r", "r", "ri", "ri" } },
     { INDEX_op_setcond2_i32, { "r", "r", "r", "ri", "ri" } },
-- 
2.7.4
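
For readers who want to see what the new tcg_out_mb() path should put in the code buffer, here is a minimal standalone sketch, not QEMU code. It assumes the three emitter calls in the patch produce the usual five-byte encoding of "lock orl $0,0(%esp)" (LOCK prefix, opcode 0x83 with /1 for OR, ModRM, SIB, imm8). The names emit_mb and MO_* are hypothetical stand-ins, and the MO_* bit values are an assumption rather than something stated in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the TCG_MO_* barrier bits; the numeric
   values are assumed for illustration only.  */
enum {
    MO_LD_LD = 0x01,
    MO_ST_LD = 0x02,
    MO_LD_ST = 0x04,
    MO_ST_ST = 0x08,
};

/* Write the barrier encoding into buf and return the number of bytes
   written: 5 when store-load ordering is requested, otherwise 0.
   buf must have room for at least 5 bytes.  */
static size_t emit_mb(uint8_t *buf, unsigned mask)
{
    static const uint8_t lock_or[] = {
        0xf0,   /* LOCK prefix */
        0x83,   /* group-1 arithmetic, r/m32 with sign-extended imm8 */
        0x0c,   /* ModRM: mod=00, reg=001 (/1 = OR), rm=100 (SIB follows) */
        0x24,   /* SIB: scale=1, no index, base=ESP */
        0x00,   /* imm8 = 0 */
    };

    assert(buf != NULL);

    /* As in the patch, only store-load ordering needs an instruction;
       the other orderings emit nothing.  */
    if (!(mask & MO_ST_LD)) {
        return 0;
    }
    memcpy(buf, lock_or, sizeof(lock_or));
    return sizeof(lock_or);
}
```

Calling emit_mb(buf, MO_ST_LD) fills buf with f0 83 0c 24 00, while a mask without the store-load bit leaves buf untouched and returns 0.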