From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:35882)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <aurelien@aurel32.net>) id 1QNVW5-0003ng-GY
	for qemu-devel@nongnu.org; Fri, 20 May 2011 15:37:46 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <aurelien@aurel32.net>) id 1QNVW4-0003RE-Cc
	for qemu-devel@nongnu.org; Fri, 20 May 2011 15:37:45 -0400
Received: from hall.aurel32.net ([88.191.126.93]:39719)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <aurelien@aurel32.net>) id 1QNVW4-0003RA-65
	for qemu-devel@nongnu.org; Fri, 20 May 2011 15:37:44 -0400
Date: Fri, 20 May 2011 21:37:41 +0200
From: Aurelien Jarno <aurelien@aurel32.net>
Message-ID: <20110520193741.GC27170@hall.aurel32.net>
References: <cover.1305889001.git.batuzovk@ispras.ru>
	<4DD6A9F9.7040805@twiddle.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To: <4DD6A9F9.7040805@twiddle.net>
Sender: Aurelien Jarno <aurelien@aurel32.net>
Subject: Re: [Qemu-devel] [PATCH 0/6] Implement constant folding and copy
 propagation in TCG
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Richard Henderson <rth@twiddle.net>
Cc: mj.mccormack@samsung.com, qemu-devel@nongnu.org, zhur@ispras.ru, Kirill Batuzov <batuzovk@ispras.ru>

On Fri, May 20, 2011 at 10:50:49AM -0700, Richard Henderson wrote:
> On 05/20/2011 05:39 AM, Kirill Batuzov wrote:
> > This series implements some basic machine-independent optimizations.  They
> > simplify code and allow liveness analysis to do its work better.
> > 
> > Suppose we have the following ARM code:
> > 
> >  movw    r12, #0xb6db
> >  movt    r12, #0xdb6d
> > 
> > In TCG before optimizations we'll have:
> > 
> >  movi_i32 tmp8,$0xb6db
> >  mov_i32 r12,tmp8
> >  mov_i32 tmp8,r12
> >  ext16u_i32 tmp8,tmp8
> >  movi_i32 tmp9,$0xdb6d0000
> >  or_i32 tmp8,tmp8,tmp9
> >  mov_i32 r12,tmp8
> > 
> > And after optimizations we'll have this:
> > 
> >  movi_i32 r12,$0xdb6db6db
> > 
> > Here are performance evaluation results on SPEC CPU2000 integer tests in
> > user-mode emulation on x86_64 host.  There were 5 runs of each test on
> > reference data set.  The tables below show runtime in seconds for all these
> > runs.
> 
> I totally agree that this sort of optimization is needed in TCG.  Essentially
> all RISC guests have the same problem.  When emulating one RISC upon another,
> the problem may be exacerbated.  E.g. Sparc on PPC -- sparc will use a 21/11
> bit split of the constant, ppc will use a 16/16 split of the constant, which
> results in 3 insns in the generated code where 2 would do.
> 
> You should be aware of prior work in this area by Aurelien Jarno:
> 
>   git://git.aurel32.net/qemu.git tcg-optimizations
> 
> Given that's now 2 years old, and doesn't seem to be progressing, I hope your
> patch series can get things going again...

I basically stopped working on constant propagation because, while the
TCG code looked nicer, the resulting code was always slower.

Since the discussion about TCG_AREG0, I have started working again on
register allocation (see the first patch series I sent about that).
I hope to have something ready by the end of the weekend.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net