From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:58770)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VAptb-000857-J6
	for qemu-devel@nongnu.org; Sat, 17 Aug 2013 19:27:08 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1VAptT-0001V7-5A
	for qemu-devel@nongnu.org; Sat, 17 Aug 2013 19:26:59 -0400
Received: from mail-pb0-x233.google.com ([2607:f8b0:400e:c01::233]:64598)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VAptS-0001V2-U1
	for qemu-devel@nongnu.org; Sat, 17 Aug 2013 19:26:51 -0400
Received: by mail-pb0-f51.google.com with SMTP id jt11so3396838pbb.24
	for ; Sat, 17 Aug 2013 16:26:50 -0700 (PDT)
Sender: Richard Henderson
From: Richard Henderson
Date: Sat, 17 Aug 2013 16:26:42 -0700
Message-Id: <1376782006-31746-1-git-send-email-rth@twiddle.net>
Subject: [Qemu-devel] [PATCH 0/4] tcg: Add muluh and mulsh opcodes
To: qemu-devel@nongnu.org
Cc: aurelien@aurel32.net

We have -- or will have -- several targets which have a native
multiply-highpart instruction: ppc*, ia64, aarch64, alpha.  If we
leave only the mul[us]2 opcode with which to expose this, we have
to handle the register allocation bits in the backends.  Better, IMO,
to expose the two parts at the TCG opcode level, simplifying the
backends.

I've left tcg_gen_mul[us]_i{32,64} as the "public" interface to these
opcodes at the translator level.  If the guest does not need both
results, they can just be ignored.  If the host has a combined mult
insn (i386, arm) then one output is garbage; if the host has separate
mult insns, then the optimizer can delete the unused opcode.

Really only tested with x86_64 and ppc64.
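
For readers unfamiliar with multiply-highpart, the following is a
minimal sketch in portable C of the values the 64-bit muluh and mulsh
opcodes are expected to produce: the upper 64 bits of the full 128-bit
unsigned or signed product.  The helper names ref_muluh64 and
ref_mulsh64 are hypothetical and are not part of this series; the
sketch only illustrates the arithmetic, not how any backend emits it.

```c
#include <stdint.h>

/* Hypothetical reference helper (not from the patch series):
 * upper 64 bits of the unsigned 128-bit product a * b. */
static uint64_t ref_muluh64(uint64_t a, uint64_t b)
{
    /* Split each operand into 32-bit halves, held in 64-bit variables
     * so that every partial product is computed without overflow. */
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* contributes to bits   0..63  */
    uint64_t p1 = a_lo * b_hi;   /* contributes to bits  32..95  */
    uint64_t p2 = a_hi * b_lo;   /* contributes to bits  32..95  */
    uint64_t p3 = a_hi * b_hi;   /* contributes to bits  64..127 */

    /* Sum of everything landing in bits 32..63; its upper half is the
     * carry into bit 64.  The sum stays below 2^34, so it cannot wrap. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    return p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

/* Hypothetical reference helper (not from the patch series):
 * upper 64 bits of the signed 128-bit product a * b. */
static int64_t ref_mulsh64(int64_t a, int64_t b)
{
    /* Start from the unsigned high part of the two's-complement bit
     * patterns, then correct for each negative operand (mod 2^64). */
    uint64_t hi = ref_muluh64((uint64_t)a, (uint64_t)b);

    if (a < 0) {
        hi -= (uint64_t)b;
    }
    if (b < 0) {
        hi -= (uint64_t)a;
    }
    /* Assumes the usual two's-complement conversion back to int64_t. */
    return (int64_t)hi;
}
```

The low half of the product is the same for signed and unsigned
operands, which is why only the high part needs two variants.
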
The linux-user-test image for alpha sees:

IN:
0x0000004000814148:  umulh t5,t0,t0

OP:
 ld_i32 tmp0,env,$0xffffffffffffffa8
 movi_i32 tmp1,$0x0
 brcond_i32 tmp0,tmp1,ne,$0x0
 ---- 0x4000814148
 mul_i64 tmp3,ir6,ir1
 muluh_i64 ir1,ir6,ir1
 mov_i64 tmp2,tmp3
 movi_i64 pc,$0x400081414c
 exit_tb $0x0
 set_label $0x0
 exit_tb $0x3fff8c244483

OP after optimization and liveness analysis:
 ld_i32 tmp0,env,$0xffffffffffffffa8
 movi_i32 tmp1,$0x0
 brcond_i32 tmp0,tmp1,ne,$0x0
 ---- 0x4000814148
 nopn $0x3,$0xd,$0x3
 muluh_i64 ir1,ir1,ir6
 nopn $0x2,$0x2
 movi_i64 pc,$0x400081414c
 exit_tb $0x0
 set_label $0x0
 exit_tb $0x3fff8c244483
 end

OUT: [size=76]
0x6011b0f0:  lwz     r14,-88(r27)
0x6011b0f4:  cmpwi   cr7,r14,0
0x6011b0f8:  bne-    cr7,0x6011b128
0x6011b0fc:  ld      r14,8(r27)
0x6011b100:  ld      r15,48(r27)
0x6011b104:  mulhdu  r14,r14,r15
0x6011b108:  std     r14,8(r27)
...


r~


Richard Henderson (4):
  tcg: Add muluh and mulsh opcodes
  tcg-mips: Implement mulsh, muluh
  tcg-ppc64: Implement muluh, mulsh
  tcg: Constant fold div, rem

 tcg/aarch64/tcg-target.h |  4 ++++
 tcg/arm/tcg-target.h     |  2 ++
 tcg/hppa/tcg-target.h    |  2 ++
 tcg/i386/tcg-target.h    |  4 ++++
 tcg/ia64/tcg-target.h    |  4 ++++
 tcg/mips/tcg-target.c    | 10 ++++++++++
 tcg/mips/tcg-target.h    |  2 ++
 tcg/optimize.c           | 43 +++++++++++++++++++++++++++++++++++++++++++
 tcg/ppc/tcg-target.h     |  2 ++
 tcg/ppc64/tcg-target.c   | 32 +++++++-------------------------
 tcg/ppc64/tcg-target.h   |  8 ++++++--
 tcg/s390/tcg-target.h    |  4 ++++
 tcg/sparc/tcg-target.h   |  4 ++++
 tcg/tcg-op.h             | 40 ++++++++++++++++++++++++++++++++++++----
 tcg/tcg-opc.h            |  4 ++++
 tcg/tcg.c                | 36 ++++++++++++++++++++++++++++++------
 tcg/tcg.h                |  2 ++
 tcg/tci/tcg-target.h     |  5 ++++-
 18 files changed, 170 insertions(+), 38 deletions(-)

-- 
1.8.1.4