From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1NLJpH-0007dA-Ko
	for qemu-devel@nongnu.org; Thu, 17 Dec 2009 12:07:43 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1NLJpG-0007bh-2b
	for qemu-devel@nongnu.org; Thu, 17 Dec 2009 12:07:43 -0500
Received: from [199.232.76.173] (port=55535 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1NLJpF-0007bN-QT
	for qemu-devel@nongnu.org; Thu, 17 Dec 2009 12:07:41 -0500
Received: from are.twiddle.net ([75.149.56.221]:48966)
	by monty-python.gnu.org with esmtp (Exim 4.60)
	(envelope-from )
	id 1NLJpF-0002hO-DO
	for qemu-devel@nongnu.org; Thu, 17 Dec 2009 12:07:41 -0500
Message-ID: <4B2A655A.3050406@twiddle.net>
Date: Thu, 17 Dec 2009 09:07:38 -0800
From: Richard Henderson
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [PATCH 0/7] tcg: conditional set and move opcodes
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
To: malc
Cc: qemu-devel@nongnu.org

On 12/17/2009 07:32 AM, malc wrote:
>> These new opcodes are considered "required" by the backend,
>> because expanding them at the tcg level breaks the basic block.
>> There might be some way to emulate within tcg internals, but
>> that doesn't seem worthwhile, as essentially all hosts have
>> some form of support for these.
...
> c. Historically things like that were made conditional with
> a generic fallback (bswap, neg, not, rot, etc)

I answered this one above. A generic fallback would break the basic
block, which would break TCG's simple register allocation.

> b. Documentation for movcond has a typo, t0 is assigned not t1

Oops. Will fix.

> d. Documentation for setcond2 is missing

Ah, I see that brcond2 is missing as well; I'll fix that too.

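To make the basic-block point concrete, here is a rough C sketch of
what the two opcodes compute. The enum, the helper names, and the
movcond operand order are assumptions made up for illustration, not
the definitions from the patch. The comment shows the kind of
branch-based expansion a generic fallback would need.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical condition codes, for illustration only. */
typedef enum { COND_EQ, COND_NE, COND_LT, COND_GE, COND_LTU, COND_GEU } cond_t;

/* Evaluate a comparison on 32-bit values; LT/GE are signed, LTU/GEU unsigned. */
static int eval_cond(cond_t c, uint32_t a, uint32_t b)
{
    switch (c) {
    case COND_EQ:  return a == b;
    case COND_NE:  return a != b;
    case COND_LT:  return (int32_t)a <  (int32_t)b;
    case COND_GE:  return (int32_t)a >= (int32_t)b;
    case COND_LTU: return a <  b;
    case COND_GEU: return a >= b;
    }
    return 0;
}

/* setcond: the destination receives 1 if the comparison holds, else 0. */
uint32_t setcond(cond_t c, uint32_t a, uint32_t b)
{
    return (uint32_t)eval_cond(c, a, b);
}

/* movcond: the destination receives v1 if the comparison holds, else v2.
 * (Operand order is assumed here.) */
uint32_t movcond(cond_t c, uint32_t c1, uint32_t c2, uint32_t v1, uint32_t v2)
{
    return eval_cond(c, c1, c2) ? v1 : v2;
}

/*
 * A generic fallback for setcond would have to be spelled with a branch,
 * roughly:
 *
 *     movi   t0, 1
 *     brcond a, b, cond, L
 *     movi   t0, 0
 *   L:
 *
 * and the brcond ends the basic block in the middle of the expansion.
 */
```

For example, setcond(COND_NE, 1, 2) gives 1, and
movcond(COND_LT, 0xffffffff, 0, 10, 20) gives 10, since the first
operand is -1 when compared as signed.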
> It would also be interesting to learn what impact adding those two
> has on performance, any results?

Hmph, not as much as I would have liked. I suppose Intel is getting
pretty darned good with its branch prediction. It shaved about 3
minutes off 183.equake from what I posted earlier this week; that's
something around a 7% improvement, assuming it's not just all noise
(I haven't run that test enough times to see what the variation is).

> +    case TCG_COND_NE:
> +        if (const_arg2) {
> +            if ((uint16_t) arg2 == arg2) {
> +                tcg_out32 (s, XORI | RS (arg1) | RA (0) | arg2);
> +            }
> +            else {
> +                tcg_out_movi (s, TCG_TYPE_I32, 0, arg2);
> +                tcg_out32 (s, XOR | SAB (arg1, 0, 0));
> +            }
> +        }
> +        else {
> +            tcg_out32 (s, XOR | SAB (arg1, 0, arg2));
> +        }
> +
> +        tcg_out32 (s, ADDIC | RT (arg0) | RA (0) | 0xffff);
> +        tcg_out32 (s, SUBFE | TAB (arg0, arg0, 0));
> +        return;

Heh, you know a trick that gcc doesn't for powerpc. It just adds an
xor at the end of the EQ sequence.


r~
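
For anyone following along, the NE sequence quoted above can be
modelled in plain C. This is only a sketch: addic and subfe below are
hand-written 32-bit models of the PowerPC instructions with the carry
bit kept in an explicit variable, and all function names are made up
for illustration. After the xor, r0 is nonzero exactly when the
operands differ; adding -1 carries out exactly when r0 is nonzero; and
subfe computes ~(r0 - 1) + r0 + CA, which collapses to CA.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "addic rt, ra, simm": rt = ra + simm, CA = carry out of bit 31. */
static uint32_t addic(uint32_t ra, int16_t simm, unsigned *ca)
{
    uint64_t wide = (uint64_t)ra + (uint32_t)(int32_t)simm;
    *ca = (unsigned)(wide >> 32);
    return (uint32_t)wide;
}

/* Model of "subfe rt, ra, rb": rt = ~ra + rb + CA (carry out not modelled). */
static uint32_t subfe(uint32_t ra, uint32_t rb, unsigned ca)
{
    return ~ra + rb + ca;
}

/* Branch-free (arg1 != arg2), following the quoted instruction sequence. */
uint32_t setcond_ne(uint32_t arg1, uint32_t arg2)
{
    unsigned ca;
    uint32_t r0 = arg1 ^ arg2;        /* xor:   r0 != 0 iff operands differ */
    uint32_t t  = addic(r0, -1, &ca); /* addic: CA = (r0 != 0), t = r0 - 1  */
    return subfe(t, r0, ca);          /* subfe: ~(r0 - 1) + r0 + CA == CA   */
}

/* Compare the model against the C != operator over a small range. */
void check_small_range(void)
{
    for (uint32_t a = 0; a < 64; a++) {
        for (uint32_t b = 0; b < 64; b++) {
            assert(setcond_ne(a, b) == (uint32_t)(a != b));
        }
    }
}
```

Calling check_small_range() exercises the model against != for small
inputs; the identity itself holds for any 32-bit values, because
~(r0 - 1) equals -r0 in modular arithmetic.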