From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:39979) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WPcQb-00067Y-UL for qemu-devel@nongnu.org; Mon, 17 Mar 2014 14:38:31 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WPcQW-0001uS-2n for qemu-devel@nongnu.org; Mon, 17 Mar 2014 14:38:25 -0400
Received: from mail-qa0-x22a.google.com ([2607:f8b0:400d:c00::22a]:36964) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WPcQV-0001uG-Uz for qemu-devel@nongnu.org; Mon, 17 Mar 2014 14:38:20 -0400
Received: by mail-qa0-f42.google.com with SMTP id k15so5855977qaq.1 for ; Mon, 17 Mar 2014 11:38:19 -0700 (PDT)
Sender: Richard Henderson
From: Richard Henderson
Date: Mon, 17 Mar 2014 11:37:49 -0700
Message-Id: <1395081476-6038-8-git-send-email-rth@twiddle.net>
In-Reply-To: <1395081476-6038-1-git-send-email-rth@twiddle.net>
References: <1395081476-6038-1-git-send-email-rth@twiddle.net>
Subject: [Qemu-devel] [PATCH 07/14] tcg-sparc: Implement muls2_i32
List-Id: 
List-Unsubscribe: ,
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: ,
To: qemu-devel@nongnu.org
Cc: blauwirbel@gmail.com, aurelien@aurel32.net

Using the 32-bit SMUL is a tad more efficient than resorting to
extending and using the 64-bit MULX.

Signed-off-by: Richard Henderson
---
 tcg/sparc/tcg-target.c | 18 +++++++++++++++---
 tcg/sparc/tcg-target.h |  2 +-
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index d086c10..43ede5b 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -200,6 +200,7 @@ static const int tcg_target_call_oarg_regs[] = {
 #define ARITH_ADDX (INSN_OP(2) | INSN_OP3(0x08))
 #define ARITH_SUBX (INSN_OP(2) | INSN_OP3(0x0c))
 #define ARITH_UMUL (INSN_OP(2) | INSN_OP3(0x0a))
+#define ARITH_SMUL (INSN_OP(2) | INSN_OP3(0x0b))
 #define ARITH_UDIV (INSN_OP(2) | INSN_OP3(0x0e))
 #define ARITH_SDIV (INSN_OP(2) | INSN_OP3(0x0f))
 #define ARITH_MULX (INSN_OP(2) | INSN_OP3(0x09))
@@ -1284,9 +1285,19 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
                         ARITH_SUBCC, ARITH_SUBX);
         break;
     case INDEX_op_mulu2_i32:
-        tcg_out_arithc(s, args[0], args[2], args[3], const_args[3],
-                       ARITH_UMUL);
-        tcg_out_rdy(s, args[1]);
+        c = ARITH_UMUL;
+        goto do_mul2;
+    case INDEX_op_muls2_i32:
+        c = ARITH_SMUL;
+    do_mul2:
+        /* The 32-bit multiply insns produce a full 64-bit result.  If the
+           destination register can hold it, we can avoid the slower RDY.  */
+        tcg_out_arithc(s, args[0], args[2], args[3], const_args[3], c);
+        if (SPARC64 || args[0] <= TCG_REG_O7) {
+            tcg_out_arithi(s, args[1], args[0], 32, SHIFT_SRLX);
+        } else {
+            tcg_out_rdy(s, args[1]);
+        }
         break;
 
     case INDEX_op_qemu_ld_i32:
@@ -1418,6 +1429,7 @@ static const TCGTargetOpDef sparc_op_defs[] = {
     { INDEX_op_add2_i32, { "r", "r", "rZ", "rZ", "rJ", "rJ" } },
     { INDEX_op_sub2_i32, { "r", "r", "rZ", "rZ", "rJ", "rJ" } },
     { INDEX_op_mulu2_i32, { "r", "r", "rZ", "rJ" } },
+    { INDEX_op_muls2_i32, { "r", "r", "rZ", "rJ" } },
 
     { INDEX_op_mov_i64, { "R", "R" } },
     { INDEX_op_movi_i64, { "R" } },
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index 5442d45..091224c 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -108,7 +108,7 @@ typedef enum {
 #define TCG_TARGET_HAS_add2_i32 1
 #define TCG_TARGET_HAS_sub2_i32 1
 #define TCG_TARGET_HAS_mulu2_i32 1
-#define TCG_TARGET_HAS_muls2_i32 0
+#define TCG_TARGET_HAS_muls2_i32 1
 #define TCG_TARGET_HAS_muluh_i32 0
 #define TCG_TARGET_HAS_mulsh_i32 0
 
-- 
1.8.5.3
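
For reference, below is a minimal, self-contained C sketch (not part of the
patch; the helper name muls2_i32_ref is illustrative only, not a QEMU
function) of the contract the new muls2_i32 op provides: both 32-bit halves
of the signed 32x32->64 product. Per the comment in the patch, SMUL already
produces the full 64-bit result, so the high half only needs an SRLX by 32
when the destination register can hold it, falling back to RDY otherwise.

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

/* Reference model (hypothetical helper, for illustration only): *lo and
   *hi receive the low and high 32-bit halves of the signed 64-bit
   product of a and b, which is what muls2_i32 asks the backend for.  */
static void muls2_i32_ref(int32_t a, int32_t b, uint32_t *lo, uint32_t *hi)
{
    int64_t prod = (int64_t)a * (int64_t)b;  /* full 64-bit result, as SMUL produces */
    *lo = (uint32_t)prod;                    /* low word of the product */
    *hi = (uint32_t)((uint64_t)prod >> 32);  /* high word: SRLX by 32, or RDY */
}

int main(void)
{
    uint32_t lo, hi;
    muls2_i32_ref(-3, 0x40000000, &lo, &hi);
    printf("hi=0x%08" PRIx32 " lo=0x%08" PRIx32 "\n", hi, lo);
    return 0;
}

Running this prints hi=0xffffffff lo=0x40000000, the sign-extended
64-bit product of -3 and 0x40000000 split into the two output words.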