From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9C8BCC67861 for ; Mon, 8 Apr 2024 08:53:26 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rtkkX-0008Dc-Df; Mon, 08 Apr 2024 04:53:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rtkkV-0008DC-4y for qemu-devel@nongnu.org; Mon, 08 Apr 2024 04:53:07 -0400 Received: from ik1-413-38519.vs.sakura.ne.jp ([153.127.30.23] helo=sakura.ysato.name) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rtkkS-0005P0-UZ for qemu-devel@nongnu.org; Mon, 08 Apr 2024 04:53:06 -0400 Received: from SIOS1075.ysato.ml (ZM005235.ppp.dion.ne.jp [222.8.5.235]) by sakura.ysato.name (Postfix) with ESMTPSA id 12E5D1C102D; Mon, 8 Apr 2024 17:52:59 +0900 (JST) Date: Mon, 08 Apr 2024 17:52:55 +0900 Message-ID: <878r1ock2g.wl-ysato@users.sourceforge.jp> From: Yoshinori Sato To: Zack Buhman Cc: qemu-devel@nongnu.org, peter.maydell@linaro.org Subject: Re: [PATCH v2] sh4: mac.w: implement saturation arithmetic logic In-Reply-To: <20240405233802.29128-3-zack@buhman.org> References: <20240405233802.29128-3-zack@buhman.org> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/28.2 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII Received-SPF: softfail client-ip=153.127.30.23; envelope-from=ysato@users.sourceforge.jp; helo=sakura.ysato.name X-Spam_score_int: -11 X-Spam_score: -1.2 X-Spam_bar: - X-Spam_report: (-1.2 / 5.0 requ) BAYES_00=-1.9, KHOP_HELO_FCRDNS=0.001, SPF_HELO_NONE=0.001, SPF_SOFTFAIL=0.665 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org On Sat, 06 Apr 2024 08:38:04 +0900, Zack Buhman wrote: > > The saturation arithmetic logic in helper_macw is not correct. > > I tested and verified this behavior on a SH7091, the general pattern > is a code sequence such as: > > sets > > mov.l _mach,r2 > lds r2,mach > mov.l _macl,r2 > lds r2,macl > > mova _n,r0 > mov r0,r1 > mova _m,r0 > mac.w @r0+,@r1+ > > _mach: .long 0xffffffff > _macl: .long 0xfffffffe > _m: .word 0x0002 > .word 0 > _n: .word 0x0003 > .word 0 > > test 0: > (mach should not be modified if an overflow did not occur) > > given, prior to saturation mac.l: > mach = 0xffffffff ; macl = 0xfffffffe > @r0 = 0x0002 ; @r1 = 0x0003 > > expected saturation mac.w result: > mach = 0xffffffff (unchanged) > macl = 0x00000004 > > qemu saturation mac.w result (before this commit): > mach = 0x00000001 > macl = 0x80000000 > > In the context of the helper_macw implementation prior to this > commit, initially this appears to be a surprising result. This is > because (prior to unary negation) the C literal `0x80000000` (due to > being outside the range of a `signed int`) is evaluated as an > `unsigned int` whereas the literal `1` (due to being inside the > range of `signed int`) is evaluated as `signed int`, as in: > > static_assert(1 < -0x80000000 == 1); > static_assert(1 < -1 == 0); > > This is because the unary negation of an unsigned int is an > unsigned int. > > In other words, if the `res < -0x80000000` comparison used > infinite-precision literals, the saturation mac.w result would have > been: > > mach = 0x00000000 > macl = 0x00000004 > > Due to this (forgivable) misunderstanding of C literals, the > following behavior also occurs: > > test 1: > (`2 * 3 + 0` is not an overflow) > > given, prior to saturation mac.l: > mach = 0x00000000 ; macl = 0x00000000 > @r0 = 0x0002 ; @r1 = 0x0003 > > expected saturation mac.w result: > mach = 0x00000000 (unchanged) > macl = 0x00000006 > > qemu saturation mac.w result (before this commit): > mach = 0x00000001 > macl = 0x80000000 > > test 2: > (mach should not be accumulated in saturation mode) > (16-bit operands are sign-extended) > > given, prior to saturation mac.l: > mach = 0x12345678 ; macl = 0x7ffffffe > @r0 = 0x0002 ; @r1 = 0xfffd > > expected saturation mac.w result: > mach = 0x12345678 (unchanged) > macl = 0x7ffffff8 > > qemu saturation mac.w result (before this commit): > mach = 0x00000001 > macl = 0x7fffffff > > test 3: > (macl should have the correct saturation value) > > given, prior to saturation mac.l: > mach = 0xabcdef12 ; macl = 0x7ffffffa > @r0 = 0x0002 ; @r1 = 0x0003 > > expected saturation mac.w result: > mach = 0x00000001 (overwritten) > macl = 0x7fffffff > > qemu saturation mac.w result (before this commit): > mach = 0x00000001 > macl = 0x80000000 > > All of the above also matches the description of MAC.W as documented > in cd00147165-sh-4-32-bit-cpu-core-architecture-stmicroelectronics.pdf > > Signed-off-by: Zack Buhman > --- > target/sh4/op_helper.c | 45 ++++++++++++++++++++++++++++++++---------- > 1 file changed, 35 insertions(+), 10 deletions(-) > > diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c > index ee16524083..07ff2cf53d 100644 > --- a/target/sh4/op_helper.c > +++ b/target/sh4/op_helper.c > @@ -187,20 +187,45 @@ void helper_macl(CPUSH4State *env, uint32_t arg0, uint32_t arg1) > > void helper_macw(CPUSH4State *env, uint32_t arg0, uint32_t arg1) > { > - int64_t res; > + int16_t value0 = (int16_t)arg0; > + int16_t value1 = (int16_t)arg1; > + int32_t mul = ((int32_t)value0) * ((int32_t)value1); > > - res = ((uint64_t) env->mach << 32) | env->macl; > - res += (int64_t) (int16_t) arg0 *(int64_t) (int16_t) arg1; > - env->mach = (res >> 32) & 0xffffffff; > - env->macl = res & 0xffffffff; > + /* Perform 32-bit saturation arithmetic if the S flag is set */ > if (env->sr & (1u << SR_S)) { > - if (res < -0x80000000) { > - env->mach = 1; > - env->macl = 0x80000000; > - } else if (res > 0x000000007fffffff) { > + const int32_t upper_bound = ((1u << 31) - 1); > + const int32_t lower_bound = -((1u << 31) - 0); > + > + /* > + * In saturation arithmetic mode, the accumulator is 32-bit > + * with carry. MACH is not considered during the addition > + * operation nor the 32-bit saturation logic. > + */ > + int32_t mac = env->macl; > + int32_t result; > + bool overflow = sadd32_overflow(mac, mul, &result); > + if (overflow) { > + result = (mac < 0) ? lower_bound : upper_bound; > + /* MACH is set to 1 to denote overflow */ > + env->macl = result; > env->mach = 1; > - env->macl = 0x7fffffff; > + } else { > + /* > + * If there is no overflow, the result is already inside > + * the saturation bounds. > + * > + * If there was no overflow, MACH is unchanged. > + */ > + env->macl = result; > } > + } else { > + /* In non-saturation arithmetic mode, the accumulator is 64-bit */ > + int64_t mac = (((uint64_t)env->mach) << 32) | env->macl; > + > + /* The carry bit of the 64-bit addition is discarded */ > + int64_t result = mac + (int64_t)mul; > + env->macl = result; > + env->mach = result >> 32; > } > } > > -- > 2.41.0 > Reviewd-by: Yoshinori Sato -- Yosinori Sato