From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1KhRvG-0003oU-HB for qemu-devel@nongnu.org; Sun, 21 Sep 2008 12:36:34 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1KhRvD-0003o6-VS for qemu-devel@nongnu.org; Sun, 21 Sep 2008 12:36:33 -0400
Received: from [199.232.76.173] (port=57532 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1KhRvD-0003o3-PF for qemu-devel@nongnu.org; Sun, 21 Sep 2008 12:36:31 -0400
Received: from mail.codesourcery.com ([65.74.133.4]:38331) by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from ) id 1KhRvD-0001FS-2A for qemu-devel@nongnu.org; Sun, 21 Sep 2008 12:36:31 -0400
From: Paul Brook
Subject: Re: [Qemu-devel] [5281] Use the new concat_i32_i64 op for std and stda
Date: Sun, 21 Sep 2008 17:36:25 +0100
References: <200809211601.35191.paul@codesourcery.com>
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200809211736.25590.paul@codesourcery.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: qemu-devel@nongnu.org
Cc: Blue Swirl

> > It's more efficient to not use concat_i32_i64 on 64-bit targets.
> > See patch below. I'll let you decide if you want to apply or ignore it.
>
> Good point. How about adding instead a concat_i64_i64 starting from this
> piece:

I'd call it concat32_i64 (cf. extu_i32_i64 vs. ext32u_i64).

> > + tcg_gen_shli_i64(dest, high, 32);
> > + tcg_gen_ext32u_i64(low, low);
> > + tcg_gen_or_i64(dest, dest, low);
>
> and also add defines for concat_tl_i64? That would be both clean and
> efficient.

Sounds reasonable. The cheat for tl->i64 on 32-bit hosts is ok if it's
confined to tcg-opc.h.

> I think ldd on i386 would benefit from a reverse operation (64 to two
> 32/64 bit words).

64 to 32 should be much easier for TCG to optimize automatically. It's just
a series of copies, and can often be implemented without requiring any extra
temporaries. The 32 to 64 case is trickier because you generally need
additional temporaries for type correctness and have to propagate all the
zeros through the 64-bit logical or.

Paul
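
For reference, the value semantics of the two directions discussed above can be sketched in plain C. This is only a model of the arithmetic, not TCG code: concat32_i64 follows the three-op sequence quoted in the thread (shift, zero-extend, or), while split_i64 is a hypothetical name for the reverse operation, which the thread does not name.

```c
#include <stdint.h>

/*
 * Model of the proposed concat32_i64: both inputs are 64-bit values of
 * which only the low 32 bits are meaningful. Each statement mirrors one
 * op from the quoted patch fragment.
 */
static uint64_t concat32_i64(uint64_t low, uint64_t high)
{
    uint64_t dest = high << 32;   /* tcg_gen_shli_i64(dest, high, 32) */
    low = (uint32_t)low;          /* tcg_gen_ext32u_i64(low, low) */
    return dest | low;            /* tcg_gen_or_i64(dest, dest, low) */
}

/*
 * Hypothetical reverse operation (64 to two 32-bit words). Each half is
 * a plain copy or shift of the source, with no masking or or step.
 */
static void split_i64(uint64_t val, uint32_t *low, uint32_t *high)
{
    *low = (uint32_t)val;
    *high = (uint32_t)(val >> 32);
}
```

The zero-extension in concat32_i64 is what keeps stray upper bits of low out of the result; the split direction needs no such step, which fits the point that it is the easier case.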