From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:39466)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1TOWhA-0006dg-3L
	for qemu-devel@nongnu.org; Wed, 17 Oct 2012 12:42:16 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1TOWh8-0000Lw-Fw
	for qemu-devel@nongnu.org; Wed, 17 Oct 2012 12:42:12 -0400
Received: from hall.aurel32.net ([88.191.126.93]:57016)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1TOWh8-0000Kc-A8
	for qemu-devel@nongnu.org; Wed, 17 Oct 2012 12:42:10 -0400
Date: Wed, 17 Oct 2012 18:41:54 +0200
From: Aurelien Jarno
Message-ID: <20121017164154.GI14078@ohm.aurel32.net>
References: <1349202750-16815-1-git-send-email-rth@twiddle.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To: <1349202750-16815-1-git-send-email-rth@twiddle.net>
Subject: Re: [Qemu-devel] [PATCH v2 00/10] Double-word tcg/optimize improvements
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Richard Henderson
Cc: qemu-devel@nongnu.org

On Tue, Oct 02, 2012 at 11:32:20AM -0700, Richard Henderson wrote:
> Changes v1->v2:
>
> * Patch 1 changes the exact swap condition.  This helps add2 for e.g.
>
>   add2 tmp4,tmp5,tmp4,tmp5,c1,c2
>
>   where tmp5, c1, and c2 are all input constants.  Since tmp4 is variable,
>   we cannot constant fold this.  But the existing swap condition would give
>
>   add2 tmp4,tmp5,tmp4,c2,c1,tmp5
>
>   While not incorrect, we do want to prefer "adc $c2,tmp5" on i686.
>
> * Patch 2 drops the partial constant folding for add2/sub2.  It only
>   does the operand ordering for add2.
>
> * Patch 4 is new.  When writing the code for brcond2 et al, it did seem
>   silly to do all the gen_args[N] = args[N] copying by hand.  I think the
>   patch makes the code more readable.
>
> * Patch 5 has the operand typo fixed that Aurelien noticed.
>
> * Patch 8 is new, adding the extra nop into the opcode stream that
>   was suggested on the list.  With this we fully constant fold add2/sub2.
>
> * Patch 9 is new.  While looking at dumps from x86_64 bios boot, I noticed
>   that sequences of push/pop insns leave the high-part of %rsp dead.  And
>   in general any 32-bit addition in which the high-part isn't "consumed"
>   by cc_dst.
>
> * Patch 10 is new, treating mulu2 similarly to add2.  It triggers frequently
>   during the boot of seabios, and should not be expensive.
>
>
> r~
>
>
> Richard Henderson (10):
>   tcg: Split out swap_commutative as a subroutine
>   tcg: Canonicalize add2 operand ordering
>   tcg: Swap commutative double-word comparisons
>   tcg: Use common code when failing to optimize
>   tcg: Optimize double-word comparisons against zero
>   tcg: Split out subroutines from do_constant_folding_cond
>   tcg: Do constant folding on double-word comparisons
>   tcg: Constant fold add2 and sub2
>   tcg: Optimize half-dead add2/sub2
>   tcg: Optimize mulu2
>
>  tcg/optimize.c |  465 ++++++++++++++++++++++++++++++++++++++-------------------
>  tcg/tcg-op.h   |   11 ++
>  tcg/tcg.c      |   53 ++++++-
>  3 files changed, 377 insertions(+), 152 deletions(-)
>

All applied, after fixing the conflicts in patch 6.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net
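
For reference, the arithmetic behind fully constant folding add2 and mulu2
can be sketched in C roughly as below.  This is only an illustration, assuming
32-bit halves and operands ordered low part first as in the add2 examples
quoted above; fold_add2_32 and fold_mulu2_32 are hypothetical names, not
functions taken from tcg/optimize.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not QEMU code): fold a double-word add whose four
 * input halves (al, ah) + (bl, bh) are all known constants.  The halves are
 * joined into 64-bit values so the carry out of the low part propagates
 * into the high part automatically.
 */
static void fold_add2_32(uint32_t al, uint32_t ah,
                         uint32_t bl, uint32_t bh,
                         uint32_t *rl, uint32_t *rh)
{
    uint64_t a = ((uint64_t)ah << 32) | al;
    uint64_t b = ((uint64_t)bh << 32) | bl;
    uint64_t r = a + b;            /* wraps modulo 2^64, like add2 */

    *rl = (uint32_t)r;             /* low half of the result */
    *rh = (uint32_t)(r >> 32);     /* high half of the result */
}

/*
 * Hypothetical helper (not QEMU code): fold an unsigned 32x32 -> 64
 * multiply with two constant inputs into its low and high result halves.
 */
static void fold_mulu2_32(uint32_t a, uint32_t b,
                          uint32_t *rl, uint32_t *rh)
{
    uint64_t r = (uint64_t)a * b;  /* widen first so no bits are lost */

    *rl = (uint32_t)r;
    *rh = (uint32_t)(r >> 32);
}
```

When only the low half of the result is live, as in the half-dead case that
patch 9 describes, the same computation reduces to a plain 32-bit add of the
two low parts.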