Message-ID: <546C9CB3.4030509@mail.uni-paderborn.de>
Date: Wed, 19 Nov 2014 13:35:47 +0000
From: Bastian Koppelmann
References: <1415898767-20461-1-git-send-email-kbastian@mail.uni-paderborn.de> <1415898767-20461-5-git-send-email-kbastian@mail.uni-paderborn.de> <54660622.1020106@twiddle.net>
In-Reply-To: <54660622.1020106@twiddle.net>
Subject: Re: [Qemu-devel] [PATCH v2 4/4] target-tricore: Add instructions of RCR opcode format
To: Richard Henderson, qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org

On 11/14/2014 01:39 PM, Richard Henderson wrote:
> On 11/13/2014 06:12 PM, Bastian Koppelmann wrote:
>> +    tcg_gen_ext_i32_i64(t3, r3);
>> +    tcg_gen_concat_i32_i64(t2, r2_low, r2_high);
>> +    /* extend the sign for r2 to high 64 bits */
>> +    tcg_gen_sari_i64(t4, t2, 63);
>> +    tcg_gen_ext_i32_i64(t1, r1);
>> +
>> +    tcg_gen_muls2_i64(t1, t3, t1, t3);
>> +    tcg_gen_add2_i64(t1, t3, t2, t4, t1, t3);
>> +
> I don't believe that you need 128 bit arithmetic for multiply-accumulate,
> either here or elsewhere (e.g. msub).
>
> Looking at unsigned, the maximum result of the multiply is 2*(2^n-1), or 2^(2n)
> - 2^(n+1).
> Which means that the accumulate with a 2^n-1 value cannot overflow
> a double-word intermediate result.

Madd.u has the signature 64 + (32 * 32) --> 64, as far as I read the
documentation. As you described, the multiplication yields at most
2^(2n) - 2^(n+1), but the product is then accumulated with a value of up
to 2^(2n) - 1, which can definitely overflow for n = 32.

For signed multiply-accumulate, however, I don't need 128 bit
arithmetic, because only the add/sub operation of those two can
overflow, not the multiply.

Thanks for the tip!

Cheers,
Bastian
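
To make the two cases above concrete, here is a minimal C sketch of the arithmetic. It is not QEMU or TCG code: the helper names madd_u64_overflows and madd_s64_overflows are made up for illustration, and the 64 + (32 * 32) --> 64 shape is taken from the reading of the documentation above. Note that the exact maximum unsigned product is (2^32 - 1)^2 = 2^64 - 2^33 + 1, which still fits in 64 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not QEMU code): models an unsigned
 * 64 + (32 * 32) --> 64 multiply-accumulate and reports whether the
 * exact result needs more than 64 bits. */
static bool madd_u64_overflows(uint64_t acc, uint32_t a, uint32_t b,
                               uint64_t *result)
{
    /* The 32 x 32 product always fits in 64 bits:
     * (2^32 - 1)^2 = 2^64 - 2^33 + 1. */
    uint64_t prod = (uint64_t)a * (uint64_t)b;

    /* Unsigned addition wraps modulo 2^64; a wrapped sum is smaller
     * than the accumulator, which exposes the carry out of bit 63. */
    *result = acc + prod;
    return *result < acc;
}

/* Hypothetical helper (not QEMU code): signed variant. The multiply
 * cannot overflow (|prod| <= 2^62), so only the add needs checking. */
static bool madd_s64_overflows(int64_t acc, int32_t a, int32_t b)
{
    int64_t prod = (int64_t)a * (int64_t)b;

    if (prod > 0) {
        return acc > INT64_MAX - prod;  /* would exceed INT64_MAX */
    }
    return acc < INT64_MIN - prod;      /* would go below INT64_MIN */
}
```

With a 32-bit accumulator value such as UINT32_MAX, the unsigned helper reports no overflow, which matches the quoted argument. With a full 64-bit accumulator such as UINT64_MAX and both factors at UINT32_MAX, it reports a carry out of 64 bits.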