From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
 id 1N1Svy-0007yG-Tx
 for qemu-devel@nongnu.org; Fri, 23 Oct 2009 18:48:34 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
 id 1N1Svy-0007xj-Ih
 for qemu-devel@nongnu.org; Fri, 23 Oct 2009 18:48:34 -0400
Received: from [199.232.76.173] (port=47670 helo=monty-python.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.43) id 1N1Svy-0007xY-8I
 for qemu-devel@nongnu.org; Fri, 23 Oct 2009 18:48:34 -0400
Received: from fg-out-1718.google.com ([72.14.220.154]:59171)
 by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from )
 id 1N1Svx-0003Tf-Qr
 for qemu-devel@nongnu.org; Fri, 23 Oct 2009 18:48:34 -0400
Received: by fg-out-1718.google.com with SMTP id d23so3652200fga.10
 for ; Fri, 23 Oct 2009 15:48:32 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <54D1051E-4C0D-4EB5-9396-4B094F08626F@nokia.com>
References: <54D1051E-4C0D-4EB5-9396-4B094F08626F@nokia.com>
Date: Sat, 24 Oct 2009 00:48:32 +0200
Message-ID: <761ea48b0910231548u2f014d49p67b363db644ef2b6@mail.gmail.com>
Subject: Re: [Qemu-devel] [PATCH 06/12] target-arm: optimize arm load/store multiple ops
From: Laurent Desnogues
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Juha.Riihimaki@nokia.com
Cc: qemu-devel@nongnu.org

On Wed, Oct 21, 2009 at 12:17 PM, wrote:
> ARM load/store multiple instructions can be slightly optimized by
> loading the register offset constant into a variable outside the
> register loop and using the preloaded variable inside the loop instead
> of reloading the offset value to a temporary variable on each loop
> iteration. This causes fewer TCG ops to be generated for an ARM
> load/store multiple instruction.
>
> Signed-off-by: Juha Riihimäki

Acked-by: Laurent Desnogues

> ---
> diff --git a/target-arm/translate.c b/target-arm/translate.c
> index e5a2881..bae1122 100644
> --- a/target-arm/translate.c
> +++ b/target-arm/translate.c
> @@ -6852,6 +6852,7 @@ static void disas_arm_insn(CPUState * env, DisasContext *s)
>                  }
>                  rn = (insn >> 16) & 0xf;
>                  addr = load_reg(s, rn);
> +                tmp2 = tcg_const_i32(4);
>
>                  /* compute total size */
>                  loaded_base = 0;
> @@ -6865,7 +6866,7 @@ static void disas_arm_insn(CPUState * env, DisasContext *s)
>                  if (insn & (1 << 23)) {
>                      if (insn & (1 << 24)) {
>                          /* pre increment */
> -                        tcg_gen_addi_i32(addr, addr, 4);
> +                        tcg_gen_add_i32(addr, addr, tmp2);
>                      } else {
>                          /* post increment */
>                      }
> @@ -6918,7 +6919,7 @@ static void disas_arm_insn(CPUState * env, DisasContext *s)
>                          j++;
>                          /* no need to add after the last transfer */
>                          if (j != n)
> -                            tcg_gen_addi_i32(addr, addr, 4);
> +                            tcg_gen_add_i32(addr, addr, tmp2);
>                      }
>                  }
>                  if (insn & (1 << 21)) {
> @@ -6928,7 +6929,7 @@ static void disas_arm_insn(CPUState * env, DisasContext *s)
>                              /* pre increment */
>                          } else {
>                              /* post increment */
> -                            tcg_gen_addi_i32(addr, addr, 4);
> +                            tcg_gen_add_i32(addr, addr, tmp2);
>                          }
>                      } else {
>                          if (insn & (1 << 24)) {
> @@ -6944,6 +6945,7 @@ static void disas_arm_insn(CPUState * env, DisasContext *s)
>                  } else {
>                      dead_tmp(addr);
>                  }
> +                tcg_temp_free_i32(tmp2);
>                  if (loaded_base) {
>                      store_reg(s, rn, loaded_var);
>                  }
>


Laurent
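
For readers who want to see the effect described in the quoted commit message, here is a minimal toy model in plain C. It is not QEMU code: `emit`, `toy_addi`, `toy_add` and the two counting functions are hypothetical names invented for this sketch. It assumes that an add-immediate helper materialises its constant into a temporary on every call (one "movi" plus one "add"), while the patched form materialises the constant once before the register loop and then emits only an "add" per step.

```c
#include <stdio.h>

/* Toy stand-in for a stream of generated ops; NOT the QEMU/TCG API. */
static int op_count;

/* Record one emitted op; the name is only there for readability. */
static void emit(const char *name)
{
    (void)name;
    op_count++;
}

/* Assumed behaviour of an add-immediate helper: load the constant
 * into a fresh temporary, then add it. Two ops per call. */
static void toy_addi(void)
{
    emit("movi");
    emit("add");
}

/* Add with a constant that was already loaded. One op per call. */
static void toy_add(void)
{
    emit("add");
}

/* Address increments for n transferred registers when the constant is
 * reloaded each time. As in the patch, no add follows the last transfer. */
int ops_reload_each_time(int n)
{
    op_count = 0;
    for (int j = 1; j < n; j++)
        toy_addi();
    return op_count;
}

/* Same loop with the constant loaded once, outside the loop. */
int ops_hoisted_constant(int n)
{
    op_count = 0;
    emit("movi");               /* constant 4 loaded once up front */
    for (int j = 1; j < n; j++)
        toy_add();
    return op_count;
}
```

In this model a transfer of n registers costs 2(n - 1) ops with the reloaded constant and n ops with the hoisted one, so the hoisted form wins once n is at least 3, ties at n = 2, and is one op larger for a single register. The real translator emits further ops around the loop (loads, stores, writeback), which this sketch deliberately leaves out.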