From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from [140.186.70.92] (port=58258 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1Pc3vj-0004kI-22
	for qemu-devel@nongnu.org; Sun, 09 Jan 2011 17:40:07 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <aurelien@aurel32.net>) id 1Pc3vh-00068W-OP
	for qemu-devel@nongnu.org; Sun, 09 Jan 2011 17:40:06 -0500
Received: from hall.aurel32.net ([88.191.126.93]:47382)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <aurelien@aurel32.net>) id 1Pc3vh-00068N-Fo
	for qemu-devel@nongnu.org; Sun, 09 Jan 2011 17:40:05 -0500
Date: Sun, 9 Jan 2011 23:40:03 +0100
From: Aurelien Jarno <aurelien@aurel32.net>
Subject: Re: [Qemu-devel] [PATCH 3/3] tcg/arm: improve constant loading
Message-ID: <20110109224002.GC21189@volta.aurel32.net>
References: <1294350874-6885-1-git-send-email-aurelien@aurel32.net>
	<1294350874-6885-3-git-send-email-aurelien@aurel32.net>
	<AANLkTi=MYpxWWM90y73WrQby+iVPR3naYO0_soWV+99q@mail.gmail.com>
	<20110107144035.GA18176@hall.aurel32.net>
	<AANLkTinRct_QTF3j3u86rpAt42Z6GK+cAEFO27xxTf=S@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <AANLkTinRct_QTF3j3u86rpAt42Z6GK+cAEFO27xxTf=S@mail.gmail.com>
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: andrzej zaborowski <balrogg@gmail.com>
Cc: qemu-devel@nongnu.org

On Fri, Jan 07, 2011 at 04:56:32PM +0100, andrzej zaborowski wrote:
> On 7 January 2011 15:40, Aurelien Jarno <aurelien@aurel32.net> wrote:
> > On Fri, Jan 07, 2011 at 01:52:25PM +0100, andrzej zaborowski wrote:
> >> Hi,
> >>
> >> On 6 January 2011 22:54, Aurelien Jarno <aurelien@aurel32.net> wrote:
> >> > Improve constant loading in two ways:
> >> > - On all ARM versions, it's possible to load 0xffffff00 = -0x100 using
> >> >  the mvn rd, #0. Fix the conditions.
> >> > - On <= ARMv6 versions, where movw and movt are not available, load the
> >> >  constants using mov and orr with rotations depending on the constant
> >> >  to load. This is very useful for example to load constants where the
> >> >  low byte is 0. This reduce the generated code size by about 7%.
> >>
> >> That's a nice improvement.  For some instructions using MVN and AND
> >> could yield even shorter code and I think with that the optimisation
> >> options (except loading from a constant pool) would be exhausted :)
> >
> > I also did something with MVN and BIC, it works well, but the problem is
> > to find the right heuristic to choose between MOV/ORR and MVN/BIC. In my
> > tries, it was making the code bigger.
> 
> I was thinking of running both without writing the instructions, then
> comparing the lengths and then running the better method.  It's
> possible that the cost of this outweights the shorter code advantage
> though.
> 
> >
> >> ...
> >> >         }
> >> > +    } else {
> >> > +        int opc = ARITH_MOV;
> >> > +        int rn = 0;
> >> > +
> >> > +        do {
> >> > +            int i, rot;
> >> > +
> >> > +            i = ctz32(arg) & ~1;
> >> > +            rot = ((32 - i) << 7) & 0xf00;
> >> > +            tcg_out_dat_imm(s, cond, opc, rd, rn, ((arg >> i) & 0xff) | rot);
> >> > +            arg &= ~(0xff << i);
> >> > +
> >> > +            opc = ARITH_ORR;
> >> > +            rn = rd;
> >>
> >> I think you could get rid of rn and just use rd from the start of the
> >> loop.  Otherwise acked by me too.
> >>
> >
> > What do you mean exactly? rn has to be 0 when opc is ARITH_MOV in order
> > to generate a correct ARM instruction.
> 
> According to my ARM926 manual rn is ignored for MOV/MVN, perhaps it's
> different in later revisions.
> 

I have just tried, and it actually works (tried on ARMv5 and ARMv7). 
Note that binutils is not able to disassemble such an instruction and
outputs in qemu.log something like:
| 0x01000008:  e3aa50ff  undefined instruction 0xe3aa50ff

However what worries me the most is that the "ARM Architecture Reference
Manual ARMv7-A and ARMv7-R edition" defines this opcode with the rn field
as "(0)(0)(0)(0)". Looking at what it means:

| An instruction is UNPREDICTABLE if:
| [...]
| * the pseudocode for that encoding does not indicate that a different
|   special case applies, and a bit marked (0) or (1) in the encoding 
| diagram of an instruction is not 0 or 1 respectively.

In short is it still going to work on newer CPUs?

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net