From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40086) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XY6wW-0002f8-Tv for qemu-devel@nongnu.org; Sun, 28 Sep 2014 01:22:53 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XY6wN-0003H4-Qn for qemu-devel@nongnu.org; Sun, 28 Sep 2014 01:22:44 -0400 Received: from mail-pa0-x233.google.com ([2607:f8b0:400e:c03::233]:47808) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XY6wN-0003Gt-IO for qemu-devel@nongnu.org; Sun, 28 Sep 2014 01:22:35 -0400 Received: by mail-pa0-f51.google.com with SMTP id lj1so711832pab.10 for ; Sat, 27 Sep 2014 22:22:29 -0700 (PDT) Sender: Richard Henderson Message-ID: <54279B11.2030907@twiddle.net> Date: Sat, 27 Sep 2014 22:22:25 -0700 From: Richard Henderson MIME-Version: 1.0 References: <1411829891-24866-1-git-send-email-kbastian@mail.uni-paderborn.de> <1411829891-24866-5-git-send-email-kbastian@mail.uni-paderborn.de> In-Reply-To: <1411829891-24866-5-git-send-email-kbastian@mail.uni-paderborn.de> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 4/5] target-tricore: Add instructions of BIT opcode format List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Bastian Koppelmann , qemu-devel@nongnu.org Cc: peter.maydell@linaro.org On 09/27/2014 07:58 AM, Bastian Koppelmann wrote: > +/* D[c] = D[c][0] op1 (D[a][pos1] op2 D[b][pos2]);*/ > +static inline void gen_bit_2op(TCGv ret, TCGv r1, TCGv r2, TCGv r3, > + int pos1, int pos2, > + void(*op1)(TCGv, TCGv, TCGv), > + void(*op2)(TCGv, TCGv, TCGv)) > +{ > + TCGv temp1, temp2, temp3; > + > + temp1 = tcg_temp_new(); > + temp2 = tcg_temp_new(); > + temp3 = tcg_temp_new(); > + > + tcg_gen_andi_tl(temp3, r3, 0x1); > + > + tcg_gen_andi_tl(temp2, r2 , (0x1u << pos2)); > + tcg_gen_shri_tl(temp2, temp2, pos2); > + > + tcg_gen_andi_tl(temp1, r1, (0x1u << pos1)); > + tcg_gen_shri_tl(temp1, temp1, pos1); > + > + (*op1)(temp1, temp1, temp2); > + (*op2)(ret , temp3, temp1); This incorrectly clobbers bits 1:31 of ret. You want shri tmp1, r2, pos2 shri tmp1, r1, pos1 op1(tmp1, tmp1, tmp2) op2(tmp1, r3, tmp1) deposit ret, ret, tmp1, 0, 1 > + TCGv temp1, temp2; > + > + temp1 = tcg_temp_new(); > + temp2 = tcg_temp_new(); > + > + tcg_gen_andi_tl(temp2, r2, (0x1u << pos2)); > + tcg_gen_shri_tl(temp2, temp2, pos2); > + > + tcg_gen_andi_tl(temp1, r1, (0x1u << pos1)); > + tcg_gen_shri_tl(temp1, temp1, pos1); > + > + (*op1)(ret, temp1, temp2); This one, though, does get to set the whole register. That said, I think you should *not* mask the two inputs, but instead mask the one output. That saves one operation, and allows NOR to not need a special case. > + case OPC2_32_BIT_AND_NOR_T: > + gen_bit_2op(temp, cpu_gpr_d[r1], cpu_gpr_d[r2], cpu_gpr_d[r3], > + pos1, pos2, &tcg_gen_or_tl, &tcg_gen_andc_tl); > + break; Without trying to take into account the properties of the tcg backend, this seems less than ideal. Yes, it's correct, but so is tcg_gen_nor_tl, tcg_gen_and_tl which matches the name of the instruction. If the tcg backend is sparc or ppc, it's more efficient too, since nor is actually present in the isa. But of course, arm and even haswell x86 have andc but don't have nor. This is stuff that a normal compiler optimizer could sort out, but we don't have that for tcg. I'd be willing to accept something conditionalized on TCG_TARGET_HAS_nor_i32, if you like, but otherwise just match the instruction. > +static void decode_bit_insert(CPUTriCoreState *env, DisasContext *ctx) It's probably better to implement this with one right-shift and one deposit. Certainly would be easier to read and follow. > + case OPC2_32_BIT_XNOR_T: > + gen_bit_1op(cpu_gpr_d[r3], cpu_gpr_d[r1], cpu_gpr_d[r2], > + pos1, pos2, &tcg_gen_xor_tl); tcg_gen_eqv_tl > + case OPC2_32_BIT_OR_NOR_T: > + gen_bit_2op(temp, cpu_gpr_d[r1], cpu_gpr_d[r2], cpu_gpr_d[r3], > + pos1, pos2, &tcg_gen_or_tl, &tcg_gen_orc_tl); > + break; Again, probably better with nor + or, or conditionalization. > + case OPC2_32_BIT_SH_XNOR_T: > + gen_bit_1op(temp, cpu_gpr_d[r1], cpu_gpr_d[r2], > + pos1, pos2, &tcg_gen_xor_tl); > + tcg_gen_not_tl(temp, temp); Again, eqv. r~