From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:42063)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1TIk4w-0003YN-Ng for qemu-devel@nongnu.org;
	Mon, 01 Oct 2012 13:46:54 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1TIk4s-0006ko-Fb for qemu-devel@nongnu.org;
	Mon, 01 Oct 2012 13:46:50 -0400
Received: from hall.aurel32.net ([88.191.126.93]:34431)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1TIk4s-0006k7-9V for qemu-devel@nongnu.org;
	Mon, 01 Oct 2012 13:46:46 -0400
Date: Mon, 1 Oct 2012 19:46:45 +0200
From: Aurelien Jarno
Message-ID: <20121001174644.GA4623@ohm.aurel32.net>
References: <1348766397-20731-1-git-send-email-rth@twiddle.net>
	<1348766397-20731-3-git-send-email-rth@twiddle.net>
	<20120927232015.GN23819@ohm.aurel32.net>
	<5064E12F.5040308@twiddle.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To: <5064E12F.5040308@twiddle.net>
Subject: Re: [Qemu-devel] [PATCH 2/7] tcg: Optimize add2 + sub2
To: Richard Henderson
Cc: qemu-devel@nongnu.org

On Thu, Sep 27, 2012 at 04:28:47PM -0700, Richard Henderson wrote:
> On 09/27/2012 04:20 PM, Aurelien Jarno wrote:
> > I understand that we can't easily insert an instruction, so the
> > limitation comes from here, but is it really something happening often?
>
> It will certainly appear sometimes. E.g. s390x has an add immediate
> instruction that does exactly: r1 += imm16 << 32.
>
> Or did you mean specifically the full constant being folded? That
> would happen quite a bit more often. That you can see with most any
> 64-bit RISC guest when they attempt to generate a constant from
> addition primitives instead of logical primitives.
>
> For a 32-bit host, we've already decomposed logical primitives to 32-bit
> operations. And we can constant-fold through all of those.
> But when addition comes into play, we can't constant-fold through add2.
>

I tried this patch on an i386 host running an x86_64 target, but it even
fails to start seabios, so there is probably a logic error somewhere in
the patch.

For the first add2 ops, which seemed to have worked correctly, this patch
optimized 0.2% of them. I am not sure it is worth it as is.

I think optimizing add2, and in general all *2 ops, is a good idea, but
we should be able to do more aggressive optimization. Maybe, a bit like
Blue was suggesting, add2 should always be followed by a nop, so we can
do more optimizations?

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net
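
For readers following the thread, here is a minimal sketch of what "constant-folding through add2" means on a 32-bit host. It is a Python model for illustration only, not QEMU code and not the logic of the patch under review. The names `add2`, `fold_add2`, `movi`, `rl` and `rh` are hypothetical, and unknown (non-constant) inputs are represented as `None` by assumption.

```python
MASK32 = 0xFFFFFFFF


def add2(al, ah, bl, bh):
    """Model a 64-bit add carried out on 32-bit halves.

    (al, ah) and (bl, bh) are the low/high halves of the two operands.
    Returns the (low, high) halves of the 64-bit sum, wrapping at 64 bits.
    """
    low_sum = al + bl
    low = low_sum & MASK32
    carry = low_sum >> 32          # 0 or 1, propagated into the high half
    high = (ah + bh + carry) & MASK32
    return low, high


def fold_add2(al, ah, bl, bh):
    """Fold an add2 whose four inputs are all known constants.

    Each input is an int (known constant) or None (unknown at
    translation time). Returns two hypothetical constant-move ops,
    or None when the op cannot be folded.
    """
    if any(x is None for x in (al, ah, bl, bh)):
        return None
    low, high = add2(al, ah, bl, bh)
    return [("movi", "rl", low), ("movi", "rh", high)]


if __name__ == "__main__":
    # The low halves overflow, so a carry reaches the high half.
    print(fold_add2(0xFFFFFFFF, 0x1, 0x1, 0x2))
    # One unknown input: nothing can be folded in this simple model.
    print(fold_add2(None, 0x1, 0x1, 0x2))
```

Note that the folded form in this model uses two ops where there was one, which is presumably where the difficulty of inserting an instruction comes in.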