From: Richard Henderson
To: Blue Swirl
Cc: qemu-devel@nongnu.org, Aurelien Jarno
Date: Mon, 01 Oct 2012 11:36:14 -0700
Subject: Re: [Qemu-devel] [PATCH 2/7] tcg: Optimize add2 + sub2
Message-ID: <5069E29E.4060001@twiddle.net>
References: <1348766397-20731-1-git-send-email-rth@twiddle.net>
 <1348766397-20731-3-git-send-email-rth@twiddle.net>

On 2012-09-30 00:04, Blue Swirl wrote:
>> We can't do complete constant folding because we lack "mov2",
>> or the ability to insert opcodes in the stream.  But we can
>> at least canonicalize add2 operand ordering and simplify
>> add2 to add when the lowpart adds a constant 0.
>
> Couldn't we introduce add2_part1 and add2_part2, the latter being nop
> for architectures that don't need it?

Possibly.  It certainly would be easy to model these as addcc + addx on
targets like sparc where CC never gets clobbered during moves.

I'm a bit worried about i386 though, since loading 0 wants to use xor
and clobber the flags.  We could possibly work around this by taking
care of the constant loading for add2_part2 manually.  E.g.

  { INDEX_op_add2_part2, { "r", "ri", "ri" } }

    if (args[2] == args[0] && !const_args[2]) {
      // swap arg1 arg2
    }
    if (const_args[1]) {
      mov $args[1], args[0]
    } else {
      mov args[1], args[0]
    }
    adcl args[2], args[0]

which means that tcg_out_movi will not have to be called in between.
It's all a bit fragile though.

That said, I do wonder if having synthetic mov2{rr,ri,ii} opcodes
isn't just easier.  That could be broken up into two moves by tcg.c
without the backends having to care about it.


r~
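
To make the mov2 idea a little more concrete, here is a rough sketch of
how a double-word move might be split into plain moves.  None of this
is existing TCG code: Arg, Mov, split_mov2, run_moves and the scratch
register "tmp" are all hypothetical names made up for illustration.
The only subtlety is ordering: if the low destination is also the high
source, the high move has to go first, and a full swap needs a scratch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical operand: a register number, or an immediate constant. */
typedef struct {
    bool is_const;
    uint32_t val;       /* register number if !is_const, else the constant */
} Arg;

/* Hypothetical single move: dst register <- src operand. */
typedef struct {
    int dst;
    Arg src;
} Mov;

/*
 * Split "mov2 dl, dh, sl, sh" into ordinary moves so that no source is
 * overwritten before it has been read.  Returns the number of moves
 * written to out[] (2, or 3 when the two halves are exchanged).
 * tmp is an assumed free scratch register, used only for the swap case.
 */
static int split_mov2(int dl, int dh, Arg sl, Arg sh, int tmp, Mov out[3])
{
    assert(dl != dh);   /* the two destination halves must be distinct */

    /* Would writing dl destroy the high source?  Would dh destroy the low? */
    bool lo_clobbers_hi = !sh.is_const && (int)sh.val == dl;
    bool hi_clobbers_lo = !sl.is_const && (int)sl.val == dh;

    if (lo_clobbers_hi && hi_clobbers_lo) {
        /* Full swap: save the high source in tmp first. */
        out[0] = (Mov){ .dst = tmp, .src = sh };
        out[1] = (Mov){ .dst = dl,  .src = sl };
        out[2] = (Mov){ .dst = dh,  .src = { false, (uint32_t)tmp } };
        return 3;
    }
    if (lo_clobbers_hi) {
        /* High half first, so its source is read before dl is written. */
        out[0] = (Mov){ .dst = dh, .src = sh };
        out[1] = (Mov){ .dst = dl, .src = sl };
    } else {
        /* Low half first; this also covers the constant (ri, ii) forms. */
        out[0] = (Mov){ .dst = dl, .src = sl };
        out[1] = (Mov){ .dst = dh, .src = sh };
    }
    return 2;
}

/* Tiny interpreter for checking the result against a register file. */
static void run_moves(const Mov *m, int n, uint32_t regs[])
{
    for (int i = 0; i < n; i++) {
        regs[m[i].dst] = m[i].src.is_const ? m[i].src.val
                                           : regs[m[i].src.val];
    }
}
```

Constant sources can never be clobbered, so the ri and ii forms always
come out as two moves in low-then-high order.  A real implementation
would presumably emit ordinary mov/movi ops in place of the Mov
records used here.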