From mboxrd@z Thu Jan  1 00:00:00 1970
Sender: Richard Henderson
References: <55BF9975.7020002@twiddle.net>
From: Richard Henderson
Message-ID: <55C0D480.2070103@twiddle.net>
Date: Tue, 4 Aug 2015 08:04:32 -0700
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [Consult] tilegx: About floating point instructions
To: Chen Gang, Chris Metcalf, Peter Maydell, Andreas Färber, "walt@tilera.com"
Cc: qemu-devel

On 08/04/2015 06:56 AM, Chen Gang wrote:
>
> On 8/4/15 04:47, Chen Gang wrote:
>> On 8/4/15 00:40, Richard Henderson wrote:
>>> On 08/01/2015 02:47 AM, Chen Gang wrote:
>>>> I am just adding floating point instructions (e.g. fsingle_add1),
>>>> but for me, I can not find any details about them (the ISA
>>>> documents only give a summary description, but not details), e.g.
>>>
>>> The tilegx splits the four/six cycle arithmetic into multiple
>>> black-box instructions. You need only really implement one of the
>>> four, with the rest of them being implemented as nops or moves.
>>>
>>> Looking at what gcc produces gives the hints:
>>>
>>>   fdouble_unpack_min    min, srca, srcb
>>>   fdouble_unpack_max    max, srca, srcb
>>>   fdouble_add_flags     flg, srca, srcb
>>>   fdouble_addsub        max, min, flg
>>>   fdouble_pack1         dst, max, flg
>>>   fdouble_pack2         dst, max, zero
>>>
>>> The unpack, addsub, and pack2 insns can be ignored, the add_flags
>>> insn can perform the whole operation, the pack1 insn performs a move
>>> from "flg" to "dst".
>>>
>>> Similarly for the single-precision:
>>>
>>>   fsingle_add1          tmp, srca, srcb
>>>   fsingle_addsub2       tmp, srca, srcb
>>>   fsingle_pack1         flg, tmp
>>>   fsingle_pack2         dst, tmp, flg
>>>
>>> The add1 insn performs the whole operation, the addsub2 and pack1
>>> insns are ignored, and the pack2 insn is a move from tmp to dst.
>>>
>
> After check the tilegx.md completely, for me, we still need implement
> each of them precisely, or we can not emulate all cases (e.g. muldf3).

No, you can still implement all of muldf3 in fdouble_mul_flags. Again,
the fdouble_pack1 copies from the flag input to the output.

Yes, there is a 64-bit multiply in there, but the tcg optimizer should
be able to delete all of that as unused. Especially if you have the
fdouble_unpack* insns store zero into their destinations.

Don't get me wrong -- more accurate implementation of the actual insns
would be nice, especially for debugging. But if the insns aren't
accurately documented I don't see what choice we have.

On the good side, implementing the entire operation as part of the
"flags" step probably results in faster emulation.


r~
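
For illustration only, here is a minimal standalone sketch of the scheme
described above, applied to the double-precision add sequence. It is not
QEMU code: the ToyRegs register file, the R_* register numbers and the
toy_adddf3() driver are all hypothetical names made up for this example,
and it uses host double arithmetic where a real emulator would
presumably go through its soft-float routines. The "flags" step does the
whole add, the unpack steps store zero, addsub and pack2 do nothing, and
pack1 is a plain move from "flg" to "dst".

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical register numbers, named after the operands in the mail. */
enum { R_SRCA, R_SRCB, R_MIN, R_MAX, R_FLG, R_DST, R_ZERO, R_COUNT };

/* Hypothetical register file for this sketch; not QEMU's CPU state. */
typedef struct {
    uint64_t r[R_COUNT];
} ToyRegs;

static uint64_t bits_from_double(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof(u));
    return u;
}

static double double_from_bits(uint64_t u)
{
    double d;
    memcpy(&d, &u, sizeof(d));
    return d;
}

/* Ignored step (unpack_min/unpack_max): just store zero in the output. */
static void fdouble_unpack(ToyRegs *s, int dst)
{
    s->r[dst] = 0;
}

/* The one step doing real work: the whole add, result left in "flg". */
static void fdouble_add_flags(ToyRegs *s, int flg, int srca, int srcb)
{
    double sum = double_from_bits(s->r[srca]) + double_from_bits(s->r[srcb]);
    s->r[flg] = bits_from_double(sum);
}

/* Ignored step: deliberately a nop. */
static void fdouble_addsub(ToyRegs *s, int max, int min, int flg)
{
    (void)s; (void)max; (void)min; (void)flg;
}

/* pack1 is a move from the flag input to the output. */
static void fdouble_pack1(ToyRegs *s, int dst, int max, int flg)
{
    (void)max;
    s->r[dst] = s->r[flg];
}

/* Ignored step: deliberately a nop, so "dst" keeps what pack1 wrote. */
static void fdouble_pack2(ToyRegs *s, int dst, int max, int zero)
{
    (void)s; (void)dst; (void)max; (void)zero;
}

/* Run the six-insn sequence that gcc emits for a double-precision add. */
double toy_adddf3(double a, double b)
{
    ToyRegs s;
    memset(&s, 0, sizeof(s));
    s.r[R_SRCA] = bits_from_double(a);
    s.r[R_SRCB] = bits_from_double(b);

    fdouble_unpack(&s, R_MIN);
    fdouble_unpack(&s, R_MAX);
    fdouble_add_flags(&s, R_FLG, R_SRCA, R_SRCB);
    fdouble_addsub(&s, R_MAX, R_MIN, R_FLG);
    fdouble_pack1(&s, R_DST, R_MAX, R_FLG);
    fdouble_pack2(&s, R_DST, R_MAX, R_ZERO);

    return double_from_bits(s.r[R_DST]);
}
```

Calling toy_adddf3(a, b) walks the same insn order as the gcc output
quoted above. The same shape should carry over to muldf3 by swapping the
"flags" step for one that multiplies.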