From: Richard Henderson
Date: Tue, 24 Dec 2013 06:44:54 -0800
Message-ID: <52B99DE6.5030200@twiddle.net>
To: Aurelien Jarno, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v2 6/9] target-sh4: split out Q and M from of SR and optimize div1
References: <1387713039-9584-1-git-send-email-aurelien@aurel32.net> <1387713039-9584-7-git-send-email-aurelien@aurel32.net>
In-Reply-To: <1387713039-9584-7-git-send-email-aurelien@aurel32.net>

On 12/22/2013 03:50 AM, Aurelien Jarno wrote:
>  static void gen_read_sr(TCGv dst)
>  {
> -    tcg_gen_andi_i32(dst, cpu_sr, ~(1u << SR_T));
> -    tcg_gen_or_i32(dst, dst, cpu_sr_t);
> +    TCGv t0 = tcg_temp_new();
> +    tcg_gen_andi_i32(dst, cpu_sr,
> +                     ~((1u << SR_Q) | (1u << SR_M) | (1u << SR_T)));
> +    tcg_gen_shli_i32(t0, cpu_sr_q, SR_Q);
> +    tcg_gen_or_i32(dst, dst, t0);
> +    tcg_gen_shli_i32(t0, cpu_sr_m, SR_M);
> +    tcg_gen_or_i32(dst, dst, t0);
> +    tcg_gen_shli_i32(t0, cpu_sr_t, SR_T);
> +    tcg_gen_or_i32(dst, dst, t0);
> +    tcg_temp_free_i32(t0);
>  }

Similar comments for SR_[QM] as for SR_T with respect to who clears the
relevant bits in env->sr.

>      case 0x2007: /* div0s Rm,Rn */
>          {
> -            gen_copy_bit_i32(cpu_sr, SR_Q, REG(B11_8), 31); /* SR_Q */
> -            gen_copy_bit_i32(cpu_sr, SR_M, REG(B7_4), 31); /* SR_M */
> +            tcg_gen_shri_i32(cpu_sr_q, REG(B11_8), 31); /* SR_Q */
> +            tcg_gen_mov_i32(cpu_sr_m, cpu_sr_q); /* SR_M */
>              TCGv val = tcg_temp_new();
>              tcg_gen_xor_i32(cpu_sr_t, REG(B7_4), REG(B11_8));
>              tcg_gen_shri_i32(cpu_sr_t, cpu_sr_t, 31); /* SR_T */

Error setting M.  Q and M are set from different source registers: the old
code took Q from REG(B11_8) and M from REG(B7_4), but the new code copies Q
into M.

And as a point of optimization, T no longer needs the shift if one uses the
extracted Q and M as inputs.

> +    /* add or subtract arg0 from arg1 depending if Q == M */
> +    tcg_gen_xor_i32(t1, cpu_sr_q, cpu_sr_m);
> +    tcg_gen_subi_i32(t1, t1, 1);
> +    tcg_gen_neg_i32(t2, REG(B7_4));
> +    tcg_gen_movcond_i32(TCG_COND_EQ, t2, t1, zero, REG(B7_4), t2);
> +    tcg_gen_add2_i32(REG(B11_8), t1, REG(B11_8), zero, t2, t1);

Why so complicated with the comparison?  I'd have expected

    tcg_gen_movcond_i32(TCG_COND_EQ, t2, cpu_sr_q, cpu_sr_m, REG(B7_4), t2);

Hmm... except I see you're re-using the condition as the high part of the
add2.  That's pretty tricky.  Perhaps expand upon the comment?

r~
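
For readers following the two points above, here is a minimal plain-C sketch of what the quoted TCG sequences compute. It is an illustration only, not code from the patch: the names `sh4_flags`, `model_div0s` and `model_div1_addsub` are hypothetical, ordinary C arithmetic stands in for the TCG ops, and the div0s model assumes the corrected form in which M is taken from Rm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three split-out flag values (each 0 or 1). */
typedef struct {
    uint32_t q, m, t;
} sh4_flags;

/* Model of div0s with M fixed: Q is the sign bit of Rn, M is the sign bit
 * of Rm.  Since q and m are already 0/1, T = Q ^ M needs no extra shift. */
sh4_flags model_div0s(uint32_t rn, uint32_t rm)
{
    sh4_flags f;
    f.q = rn >> 31;
    f.m = rm >> 31;
    f.t = f.q ^ f.m;
    return f;
}

/* Model of the quoted add-or-subtract step from the div1 hunk.
 * *lo receives the new Rn, *hi receives the high word produced by add2. */
void model_div1_addsub(uint32_t q, uint32_t m, uint32_t rn, uint32_t rm,
                       uint32_t *lo, uint32_t *hi)
{
    assert(q <= 1 && m <= 1);

    /* xor + subi: t1 is all ones when Q == M, zero when Q != M. */
    uint32_t t1 = (q ^ m) - 1;

    /* neg + movcond(EQ, t1, zero): t1 == 0 selects Rm (add),
     * otherwise -Rm (subtract). */
    uint32_t t2 = (t1 == 0) ? rm : (uint32_t)(0u - rm);

    /* add2: 64-bit add of (0:Rn) + (t1:t2).  t1 is re-used as the high
     * word of the addend, which is the tricky part noted in the review. */
    uint64_t sum = (uint64_t)rn + (((uint64_t)t1 << 32) | t2);
    *lo = (uint32_t)sum;
    *hi = (uint32_t)(sum >> 32);
}
```

In this model, when Q != M the high word comes back as the carry out of Rn + Rm. When Q == M and Rm is nonzero, it comes back as zero if Rn >= Rm and as all ones if the subtraction borrowed, which is why the all-ones condition value can double as the high part of the addend.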