From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:42899) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UReYG-00033Y-Dm for qemu-devel@nongnu.org; Mon, 15 Apr 2013 04:14:14 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UReYD-00043J-RC for qemu-devel@nongnu.org; Mon, 15 Apr 2013 04:14:12 -0400 Received: from hall.aurel32.net ([2001:470:1f15:c4f::1]:49001) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UReYD-000437-Jf for qemu-devel@nongnu.org; Mon, 15 Apr 2013 04:14:09 -0400 Date: Mon, 15 Apr 2013 10:14:03 +0200 From: Aurelien Jarno Message-ID: <20130415081403.GA11917@ohm.aurel32.net> References: <1365116186-19382-1-git-send-email-rth@twiddle.net> <1365116186-19382-34-git-send-email-rth@twiddle.net> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <1365116186-19382-34-git-send-email-rth@twiddle.net> Subject: Re: [Qemu-devel] [PATCH v4 33/33] tcg-ppc64: Handle deposit of zero List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Richard Henderson Cc: av1474@comtv.ru, qemu-devel@nongnu.org On Thu, Apr 04, 2013 at 05:56:26PM -0500, Richard Henderson wrote: > The TCG optimizer does great work when inserting constants, being able > to fold the open-coded deposit expansion to just an AND or an OR. Avoid > a bit the regression caused by having the deposit opcode by expanding > deposit of zero as an AND. > > Signed-off-by: Richard Henderson > --- > tcg/ppc64/tcg-target.c | 22 ++++++++++++++++------ > 1 file changed, 16 insertions(+), 6 deletions(-) > > diff --git a/tcg/ppc64/tcg-target.c b/tcg/ppc64/tcg-target.c > index 9583cf9..772f4ac 100644 > --- a/tcg/ppc64/tcg-target.c > +++ b/tcg/ppc64/tcg-target.c > @@ -1928,12 +1928,22 @@ static void tcg_out_op (TCGContext *s, TCGOpcode opc, const TCGArg *args, > break; > > case INDEX_op_deposit_i32: > - tcg_out_rlw(s, RLWIMI, args[0], args[2], args[3], > - 32 - args[3] - args[4], 31 - args[3]); > + if (const_args[2]) { > + uint32_t mask = ((2u << (args[4] - 1)) - 1) << args[3]; > + tcg_out_andi32(s, args[0], args[0], ~mask); > + } else { > + tcg_out_rlw(s, RLWIMI, args[0], args[2], args[3], > + 32 - args[3] - args[4], 31 - args[3]); > + } > break; > case INDEX_op_deposit_i64: > - tcg_out_rld(s, RLDIMI, args[0], args[2], args[3], > - 64 - args[3] - args[4]); > + if (const_args[2]) { > + uint64_t mask = ((2ull << (args[4] - 1)) - 1) << args[3]; > + tcg_out_andi64(s, args[0], args[0], ~mask); > + } else { > + tcg_out_rld(s, RLDIMI, args[0], args[2], args[3], > + 64 - args[3] - args[4]); > + } > break; > > case INDEX_op_movcond_i32: > @@ -2136,8 +2146,8 @@ static const TCGTargetOpDef ppc_op_defs[] = { > { INDEX_op_bswap32_i64, { "r", "r" } }, > { INDEX_op_bswap64_i64, { "r", "r" } }, > > - { INDEX_op_deposit_i32, { "r", "0", "r" } }, > - { INDEX_op_deposit_i64, { "r", "0", "r" } }, > + { INDEX_op_deposit_i32, { "r", "0", "rZ" } }, > + { INDEX_op_deposit_i64, { "r", "0", "rZ" } }, > > { INDEX_op_add2_i64, { "r", "r", "r", "rI", "r", "rZM" } }, > { INDEX_op_sub2_i64, { "r", "r", "rI", "r", "rZM", "r" } }, I first thought this should go into the middle end, but OTOH it will de-optimize some TCG targets which have a zero register like MIPS. In the long term I think we should allow the middle end to query the backend for constraints, and take the right decision (which would also improve constant propagation on some hosts). In the meantime it looks like the right thing to do, so: Reviewed-by: Aurelien Jarno -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurelien@aurel32.net http://www.aurel32.net