From: andrzej zaborowski
To: Aurelien Jarno
Cc: qemu-devel@nongnu.org
Date: Fri, 7 Jan 2011 16:56:32 +0100
Subject: Re: [Qemu-devel] [PATCH 3/3] tcg/arm: improve constant loading
In-Reply-To: <20110107144035.GA18176@hall.aurel32.net>
References: <1294350874-6885-1-git-send-email-aurelien@aurel32.net>
 <1294350874-6885-3-git-send-email-aurelien@aurel32.net>
 <20110107144035.GA18176@hall.aurel32.net>

On 7 January 2011 15:40, Aurelien Jarno wrote:
> On Fri, Jan 07, 2011 at 01:52:25PM +0100, andrzej zaborowski wrote:
>> Hi,
>>
>> On 6 January 2011 22:54, Aurelien Jarno wrote:
>> > Improve constant loading in two ways:
>> > - On all ARM versions, it's possible to load 0xffffff00 = -0x100 using
>> >   the mvn rd, #0. Fix the conditions.
>> > - On <= ARMv6 versions, where movw and movt are not available, load the
>> >   constants using mov and orr with rotations depending on the constant
>> >   to load. This is very useful for example to load constants where the
>> >   low byte is 0. This reduce the generated code size by about 7%.
>>
>> That's a nice improvement.  For some instructions using MVN and AND
>> could yield even shorter code and I think with that the optimisation
>> options (except loading from a constant pool) would be exhausted :)
>
> I also did something with MVN and BIC, it works well, but the problem is
> to find the right heuristic to choose between MOV/ORR and MVN/BIC. In my
> tries, it was making the code bigger.

I was thinking of running both without writing the instructions, then
comparing the lengths and then running the better method.  It's
possible that the cost of this outweighs the shorter code advantage
though.

>
>> ...
>> >         }
>> > +    } else {
>> > +        int opc = ARITH_MOV;
>> > +        int rn = 0;
>> > +
>> > +        do {
>> > +            int i, rot;
>> > +
>> > +            i = ctz32(arg) & ~1;
>> > +            rot = ((32 - i) << 7) & 0xf00;
>> > +            tcg_out_dat_imm(s, cond, opc, rd, rn, ((arg >> i) & 0xff) | rot);
>> > +            arg &= ~(0xff << i);
>> > +
>> > +            opc = ARITH_ORR;
>> > +            rn = rd;
>>
>> I think you could get rid of rn and just use rd from the start of the
>> loop.  Otherwise acked by me too.
>>
>
> What do you mean exactly? rn has to be 0 when opc is ARITH_MOV in order
> to generate a correct ARM instruction.

According to my ARM926 manual rn is ignored for MOV/MVN; perhaps it's
different in later revisions.

Cheers
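
A rough sketch of the "run both without writing the instructions, then
compare the lengths" idea from the reply above. This is not code from the
patch or from QEMU: the helper names (ctz32_sketch, count_chunks,
cost_mov_orr, cost_mvn_bic, prefer_mvn_bic) are hypothetical, and it
assumes the same greedy scan as the quoted loop (an 8-bit chunk starting
at the lowest set bit, rounded down to an even position). MOV/ORR has to
cover the set bits of the constant, while MVN/BIC has to cover the set
bits of its complement, so the dry run is just a chunk count on arg and
on ~arg.

```c
#include <stdint.h>

/* Stand-in for a count-trailing-zeros helper; returns 32 for zero. */
static int ctz32_sketch(uint32_t v)
{
    int n = 0;

    if (v == 0) {
        return 32;
    }
    while ((v & 1) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Count the 8-bit, even-aligned chunks the greedy scan needs to cover
 * every set bit of v.  Nothing is emitted; this is the dry run. */
static int count_chunks(uint32_t v)
{
    int n = 0;

    while (v != 0) {
        int i = ctz32_sketch(v) & ~1;   /* even rotation only */

        v &= ~((uint32_t)0xff << i);    /* bits handled by this insn */
        n++;
    }
    return n;
}

/* One MOV plus one ORR per further chunk; zero still needs a MOV. */
static int cost_mov_orr(uint32_t arg)
{
    int n = count_chunks(arg);

    return n ? n : 1;
}

/* One MVN plus one BIC per further chunk of the complement;
 * 0xffffffff still needs a single MVN. */
static int cost_mvn_bic(uint32_t arg)
{
    int n = count_chunks(~arg);

    return n ? n : 1;
}

/* Non-zero when the MVN/BIC sequence is strictly shorter. */
static int prefer_mvn_bic(uint32_t arg)
{
    return cost_mvn_bic(arg) < cost_mov_orr(arg);
}
```

With this counting, 0xffffff00 costs three instructions as MOV/ORR but
one as MVN, while 0x12345600 costs three as MOV/ORR and four as MVN/BIC.
Whether the extra counting pass pays for itself at translation time is
the open question raised above; the sketch only shows that the comparison
itself is cheap to express.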