From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2AAF6C433EF for ; Fri, 12 Nov 2021 02:32:46 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8252560EE7 for ; Fri, 12 Nov 2021 02:32:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 8252560EE7 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:48660 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mlMMt-0003Xl-Jj for qemu-devel@archiver.kernel.org; Thu, 11 Nov 2021 21:32:43 -0500 Received: from eggs.gnu.org ([209.51.188.92]:58052) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mlMLy-0001VO-Di; Thu, 11 Nov 2021 21:31:46 -0500 Received: from [2607:f8b0:4864:20::92d] (port=46912 helo=mail-ua1-x92d.google.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1mlMLv-0007le-GR; Thu, 11 Nov 2021 21:31:46 -0500 Received: by mail-ua1-x92d.google.com with SMTP id az37so15902086uab.13; Thu, 11 Nov 2021 18:31:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:reply-to:from:date:message-id :subject:to:cc; bh=h8hqWgSxiOyVQuIgEVInFGYrEc74065H7+LBeLGubwg=; b=TCIOcIwhUVPWC0ftT1zbCBoHoiKKXVWCuFY9aUlM6ITkUerNsj3pidj0l8AY0p5XKJ Bh8YaEL6Mp8CT+SChVe56B5pg8z/7exTMErW33ddz+XxbTlCxj25lPXl84su1FmypdS8 UOIg9T3fx3BZJrjkyLf//mNdUpE/B2UawPmZWNYBpwbVkUeoXQMGF63Izk7+I6uBCN4e QzU72HnA4H5fdHYUDXgCPPmPc9Eol9eLGJX7uWfD+BSlepTY5X9Texf88BUa8mQ2ZI6Y ryMZiO54j3ZGX43i/AbcxO7vzzj7zCzo0h7/uMuIR3IQWeAzikAEbI9PNgAu0sQ6O/wU C0bw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:reply-to :from:date:message-id:subject:to:cc; bh=h8hqWgSxiOyVQuIgEVInFGYrEc74065H7+LBeLGubwg=; b=pMKgNK04UPob0O1ooqWzq6ufVBH+ChJJQ2lOO8au79OhvWLiBajJrqfcHWOZlcQGbW 0SCR7jFc2y7elOUwxPk4Q0R1ENsv88Pqe79+am5d0p4t91IvDjSgqRU4/HuAxUL8rDof +Md23MsVMX62odQrU8jgpUrkgAOLjP8qIvKdJZZbGkFbZhHciVLXdryAOs3TS7J7K9Xw OkJuww7mDiQj1blXMOYb7TDPVkWJUXxC72U3c/h5kU9DuTEc9fgDt1+4d1wS4u4H4gkS 3qE8DwbQjYsn1PyCPUTlFwn8gF8YHOS8GoJ5AKk0/zsp+Ujmyk4HAnc4UIDZ3Z1wAAb4 qoOQ== X-Gm-Message-State: AOAM5307OHvli0IvmDULyz5Kw947OUBzLjBbi2MqY/gu2Ey72lmXJA+m VSbjIdr58ema3sMyWJZKdCTizYNw4xCSmZW8Zjs= X-Google-Smtp-Source: ABdhPJwfZtrfFHC6GINb16vgl+uV4v/YcluRzE0KqzivKk6p94bZD8KOPXLyD5VWP5kWY2HS1Cjbus5d/bAYSZL9C4M= X-Received: by 2002:a05:6102:2924:: with SMTP id cz36mr5239510vsb.33.1636684301512; Thu, 11 Nov 2021 18:31:41 -0800 (PST) MIME-Version: 1.0 References: <20211020125724.78028-1-lucas.araujo@eldorado.org.br> <20211020125724.78028-2-lucas.araujo@eldorado.org.br> In-Reply-To: From: =?UTF-8?B?572X5YuH5YiaKFlvbmdnYW5nIEx1byk=?= Date: Fri, 12 Nov 2021 10:31:31 +0800 Message-ID: Subject: Re: [PATCH 1/2] target/ppc: Fixed call to deferred exception To: BALATON Zoltan Content-Type: multipart/alternative; boundary="00000000000066b4b705d08e4273" X-Host-Lookup-Failed: Reverse DNS lookup failed for 2607:f8b0:4864:20::92d (failed) Received-SPF: pass client-ip=2607:f8b0:4864:20::92d; envelope-from=luoyonggang@gmail.com; helo=mail-ua1-x92d.google.com X-Spam_score_int: -12 X-Spam_score: -1.3 X-Spam_bar: - X-Spam_report: (-1.3 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, PDS_HP_HELO_NORDNS=0.001, RCVD_IN_DNSWL_NONE=-0.0001, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: luoyonggang@gmail.com Cc: "Lucas Mateus Castro \(alqotel\)" , Mark Cave-Ayland , danielhb413@gmail.com, Richard Henderson , qemu-level , Lucas Mateus Martins Araujo e Castro , qemu-ppc@nongnu.org, David Gibson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" --00000000000066b4b705d08e4273 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, Nov 11, 2021 at 4:42 AM BALATON Zoltan wrote: > > On Wed, 10 Nov 2021, Lucas Mateus Martins Araujo e Castro wrote: > > On 10/11/2021 05:19, Mark Cave-Ayland wrote: > >> On 20/10/2021 13:57, Lucas Mateus Castro (alqotel) wrote: > >>> From: "Lucas Mateus Castro (alqotel)" > >>> > >>> mtfsf, mtfsfi and mtfsb1 instructions call helper_float_check_status > >>> after updating the value of FPSCR, but helper_float_check_status > >>> checks fp_status and fp_status isn't updated based on FPSCR and > >>> since the value of fp_status is reset earlier in the instruction, > >>> it's always 0. > >>> > >>> Because of this helper_float_check_status would change the FI bit to = 0 > >>> as this bit checks if the last operation was inexact and > >>> float_flag_inexact is always 0. > >>> > >>> These instructions also don't throw exceptions correctly since > >>> helper_float_check_status throw exceptions based on fp_status. > >>> > >>> This commit created a new helper, helper_fpscr_check_status that checks > >>> FPSCR value instead of fp_status and checks for a larger variety of > >>> exceptions than do_float_check_status. > >>> > >>> The hardware used to compare QEMU's behavior to, was a Power9. > >>> > >>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/266 > >>> Signed-off-by: Lucas Mateus Castro (alqotel) > >>> > >>> --- > >>> target/ppc/fpu_helper.c | 41 ++++++++++++++++++++++++++++++ > >>> target/ppc/helper.h | 1 + > >>> target/ppc/translate/fp-impl.c.inc | 6 ++--- > >>> 3 files changed, 45 insertions(+), 3 deletions(-) > >>> > >>> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c > >>> index c4896cecc8..f086cb503f 100644 > >>> --- a/target/ppc/fpu_helper.c > >>> +++ b/target/ppc/fpu_helper.c > >>> @@ -414,6 +414,47 @@ void helper_store_fpscr(CPUPPCState *env, uint64_t > >>> val, uint32_t nibbles) > >>> ppc_store_fpscr(env, val); > >>> } > >>> > >>> +void helper_fpscr_check_status(CPUPPCState *env) > >>> +{ > >>> + CPUState *cs =3D env_cpu(env); > >>> + target_ulong fpscr =3D env->fpscr; > >>> + int error =3D 0; > >>> + > >>> + if ((fpscr & FP_VXSOFT) && (fpscr_ve !=3D 0)) { > >>> + error =3D POWERPC_EXCP_FP_VXSOFT; > >>> + } else if ((fpscr & FP_OX) && (fpscr & FP_OE)) { > >>> + error =3D POWERPC_EXCP_FP_OX; > >>> + } else if ((fpscr & FP_UX) && (fpscr & FP_UE)) { > >>> + error =3D POWERPC_EXCP_FP_UX; > >>> + } else if ((fpscr & FP_XX) && (fpscr & FP_XE)) { > >>> + error =3D POWERPC_EXCP_FP_XX; > >>> + } else if ((fpscr & FP_ZX) && (fpscr & FP_ZE)) { > >>> + error =3D POWERPC_EXCP_FP_ZX; > >>> + } else if ((fpscr & FP_VXSNAN) && (fpscr_ve !=3D 0)) { > >>> + error =3D POWERPC_EXCP_FP_VXSNAN; > >>> + } else if ((fpscr & FP_VXISI) && (fpscr_ve !=3D 0)) { > >>> + error =3D POWERPC_EXCP_FP_VXISI; > >>> + } else if ((fpscr & FP_VXIDI) && (fpscr_ve !=3D 0)) { > >>> + error =3D POWERPC_EXCP_FP_VXIDI; > >>> + } else if ((fpscr & FP_VXZDZ) && (fpscr_ve !=3D 0)) { > >>> + error =3D POWERPC_EXCP_FP_VXZDZ; > >>> + } else if ((fpscr & FP_VXIMZ) && (fpscr_ve !=3D 0)) { > >>> + error =3D POWERPC_EXCP_FP_VXIMZ; > >>> + } else if ((fpscr & FP_VXVC) && (fpscr_ve !=3D 0)) { > >>> + error =3D POWERPC_EXCP_FP_VXVC; > >>> + } > >>> + > >>> + if (error) { > >>> + cs->exception_index =3D POWERPC_EXCP_PROGRAM; > >>> + env->error_code =3D error | POWERPC_EXCP_FP; > >>> + /* Deferred floating-point exception after target FPSCR update */ > >>> + if (fp_exceptions_enabled(env)) { > >>> + raise_exception_err_ra(env, cs->exception_index, > >>> + env->error_code, GETPC()); > >>> + } > >>> + } > >>> +} > >>> + > >>> static void do_float_check_status(CPUPPCState *env, uintptr_t raddr= ) > >>> { > >>> CPUState *cs =3D env_cpu(env); > >>> diff --git a/target/ppc/helper.h b/target/ppc/helper.h > >>> index 4076aa281e..baa3715e73 100644 > >>> --- a/target/ppc/helper.h > >>> +++ b/target/ppc/helper.h > >>> @@ -61,6 +61,7 @@ DEF_HELPER_FLAGS_1(cntlzw32, TCG_CALL_NO_RWG_SE, i32, > >>> i32) > >>> DEF_HELPER_FLAGS_2(brinc, TCG_CALL_NO_RWG_SE, tl, tl, tl) > >>> > >>> DEF_HELPER_1(float_check_status, void, env) > >>> +DEF_HELPER_1(fpscr_check_status, void, env) > >>> DEF_HELPER_1(reset_fpstatus, void, env) > >>> DEF_HELPER_2(compute_fprf_float64, void, env, i64) > >>> DEF_HELPER_3(store_fpscr, void, env, i64, i32) > >>> diff --git a/target/ppc/translate/fp-impl.c.inc > >>> b/target/ppc/translate/fp-impl.c.inc > >>> index 9f7868ee28..0a9b1ecc60 100644 > >>> --- a/target/ppc/translate/fp-impl.c.inc > >>> +++ b/target/ppc/translate/fp-impl.c.inc > >>> @@ -782,7 +782,7 @@ static void gen_mtfsb1(DisasContext *ctx) > >>> tcg_gen_shri_i32(cpu_crf[1], cpu_crf[1], FPSCR_OX); > >>> } > >>> /* We can raise a deferred exception */ > >>> - gen_helper_float_check_status(cpu_env); > >>> + gen_helper_fpscr_check_status(cpu_env); > >>> } > >>> > >>> /* mtfsf */ > >>> @@ -818,7 +818,7 @@ static void gen_mtfsf(DisasContext *ctx) > >>> tcg_gen_shri_i32(cpu_crf[1], cpu_crf[1], FPSCR_OX); > >>> } > >>> /* We can raise a deferred exception */ > >>> - gen_helper_float_check_status(cpu_env); > >>> + gen_helper_fpscr_check_status(cpu_env); > >>> tcg_temp_free_i64(t1); > >>> } > >>> > >>> @@ -851,7 +851,7 @@ static void gen_mtfsfi(DisasContext *ctx) > >>> tcg_gen_shri_i32(cpu_crf[1], cpu_crf[1], FPSCR_OX); > >>> } > >>> /* We can raise a deferred exception */ > >>> - gen_helper_float_check_status(cpu_env); > >>> + gen_helper_fpscr_check_status(cpu_env); > >>> } > >>> > >>> /*** Floating-point > >>> load ***/ > >> > >> FWIW the real issue here is that gen_helper_reset_fpstatus() even exists at > >> all: see > >> the comments around enabling hardfloat in the PPC target by Emilio and > >> Richard at > >> https://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg04974.html and > >> https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg00064.html. > >> > >> I have tried a few informal experiments on my MacOS images by completely > >> removing all > >> calls to gen_reset_fpstatus(), and whilst there were a few odd behaviours I > >> was > >> surprised to find that the basic OS was usable. The main issue I had was > >> trying to > >> come up with suitable test cases for the various instructions when my only > >> available > >> hardware is a G4 Mac Mini. > >> > >> So yes this patch fixes one particular use case, but the real issue is that > >> the PPC > >> target floating point flags need a bit of work: however once this is done > >> it should > >> be possible for hardfloat to be enabled via a CPU option on suitable hosts > >> which will > >> bring a noticeable improvement in floating point performance. > >> > > In this case I don't think gen_helper_reset_fpstatus() is the problem, > > fp_status is not updated in the instruction but its value is used in > > helper_float_check_status(), so if the values have not been reset since the > > last instruction it'll contain last instruction's information and if it has > > (either by calling gen_helper_reset_fpstatus(), by automatically doing it > > every instruction or by having every instruction reset it in the end) it'll > > have 0. So there are 3 alternatives to solve this that I can think of: > > > > * Update FPSCR directly, then update fp_status based on FPSCR, for this > > you would either have to call a new helper to do this or update > > helper_store_fpscr to do this, and then expand do_float_check_status to throw > > more exceptions (or create a new helper to do this if expanding > > do_float_check_status could cause problems), > > > > * Just don't use fp_status, update FPSCR directly and do the deferred > > exception using only information from FPSCR (the one I used this patch)= , > > > > * Update only fp_status directly and call either a modified > > do_float_check_status or a new helper that would update FPSCR and throw the > > correct exception based on fp_status, this one I don't see how it would > > feasible in the current implementation as FPSCR has many bits without a= n > > equivalent in fp_status. > > > > So with this I can see how to implement the 1st and 2nd option, I chose not > > to use the 1st one as do_float_check_status updates the FPSCR then throw the > > exception, which seemed unnecessary. Also looking back I should've removed > > gen_reset_fpstatus() as in the way it ended implemented these instructions > > don't interact with fp_status anywhere else, so I'll remove it in the next > > version. > > > > And looking at the suggestions the current implementation could be changed to > > take advantage of the optimization suggested in the discussion you linked, > > specially the parts about checking when exception bits aren't set (but in > > this case it would've to be the MSR exception bits) and the part about > > skipping calculating a flag when marked to 1. > > I haven't followed the discussion but here's another message with some > links I've collected when FPU came up that may be relevant to the topic: > > https://lists.nongnu.org/archive/html/qemu-ppc/2020-04/msg00387.html > > among those is a long thread on patchwork that has some info on the > current situation. As far as I remember the oddity in handling FPU > exceptions is partly because of two bits FI and FR in FPSCR that should That's correct, the most hard part is to simulate FI and FR bits in FPSCR a= s that was not provided on other CPU. > reflect the result of the previous FPU op so has to be updated after ever= y > op which makes it hard to emulate as other CPUs usually don't do this. (W= e > could easily improve it if we did not emulate those bits, most guest code > don't use them anyway, but QEMU prefers accuracy so that way was ruled > out.) Other than that the current code maybe also can be simplified and > maybe optimised via some other ways which were discussed in those threads > but nobody implemented any of the ideas so far. May worth reading through > what was said before as there might be sume useful ideas in there. > > Regards, > BALATON Zoltan -- =E6=AD=A4=E8=87=B4 =E7=A4=BC =E7=BD=97=E5=8B=87=E5=88=9A Yours sincerely, Yonggang Luo --00000000000066b4b705d08e4273 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


On Thu, Nov 11, 2021 at 4:42 AM BALATON Zoltan <= ;balaton@eik.bme.hu> wrote:>
> On Wed, 10 Nov 2021, Lucas Mateus Martins Araujo e Castro wro= te:
> > On 10/11/2021 05:19, Mark Cave-Ayland wrote:
> >&= gt; On 20/10/2021 13:57, Lucas Mateus Castro (alqotel) wrote:
> >&= gt;> From: "Lucas Mateus Castro (alqotel)" <lucas.castro@eldorado.org.br>
&g= t; >>>
> >>> mtfsf, mtfsfi and mtfsb1 instructions = call helper_float_check_status
> >>> after updating the valu= e of FPSCR, but helper_float_check_status
> >>> checks fp_st= atus and fp_status isn't updated based on FPSCR and
> >>>= ; since the value of fp_status is reset earlier in the instruction,
>= >>> it's always 0.
> >>>
> >>> = Because of this helper_float_check_status would change the FI bit to 0
&= gt; >>> as this bit checks if the last operation was inexact and> >>> float_flag_inexact is always 0.
> >>>> >>> These instructions also don't throw exceptions corre= ctly since
> >>> helper_float_check_status throw exceptions = based on fp_status.
> >>>
> >>> This commit c= reated a new helper, helper_fpscr_check_status that checks
> >>= > FPSCR value instead of fp_status and checks for a larger variety of> >>> exceptions than do_float_check_status.
> >>&= gt;
> >>> The hardware used to compare QEMU's behavior t= o, was a Power9.
> >>>
> >>> Resolves: https://gitlab.com/= qemu-project/qemu/-/issues/266
> >>> Signed-off-by: Luca= s Mateus Castro (alqotel)
> >>> <lucas.castro@eldorado.org.br>
> >&g= t;> ---
> >>> =C2=A0 target/ppc/fpu_helper.c =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| 41 ++++++++++++++++++++++++++++++
> = >>> =C2=A0 target/ppc/helper.h =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0| =C2=A01 +
> >>> =C2=A0 target/ppc/tran= slate/fp-impl.c.inc | =C2=A06 ++---
> >>> =C2=A0 3 files cha= nged, 45 insertions(+), 3 deletions(-)
> >>>
> >>= ;> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
>= ; >>> index c4896cecc8..f086cb503f 100644
> >>> ---= a/target/ppc/fpu_helper.c
> >>> +++ b/target/ppc/fpu_helper= .c
> >>> @@ -414,6 +414,47 @@ void helper_store_fpscr(CPUPPC= State *env, uint64_t
> >>> val, uint32_t nibbles)
> &g= t;>> =C2=A0 =C2=A0 =C2=A0 ppc_store_fpscr(env, val);
> >>= > =C2=A0 }
> >>>
> >>> +void helper_fpscr_= check_status(CPUPPCState *env)
> >>> +{
> >>>= + =C2=A0 =C2=A0CPUState *cs =3D env_cpu(env);
> >>> + =C2= =A0 =C2=A0target_ulong fpscr =3D env->fpscr;
> >>> + =C2= =A0 =C2=A0int error =3D 0;
> >>> +
> >>> + = =C2=A0 =C2=A0if ((fpscr & FP_VXSOFT) && (fpscr_ve !=3D 0)) {> >>> + =C2=A0 =C2=A0 =C2=A0 =C2=A0error =3D POWERPC_EXCP_FP_V= XSOFT;
> >>> + =C2=A0 =C2=A0} else if ((fpscr & FP_OX) &= amp;& (fpscr & FP_OE)) {
> >>> + =C2=A0 =C2=A0 =C2= =A0 =C2=A0error =3D POWERPC_EXCP_FP_OX;
> >>> + =C2=A0 =C2= =A0} else if ((fpscr & FP_UX) && (fpscr & FP_UE)) {
>= >>> + =C2=A0 =C2=A0 =C2=A0 =C2=A0error =3D POWERPC_EXCP_FP_UX;> >>> + =C2=A0 =C2=A0} else if ((fpscr & FP_XX) &&= (fpscr & FP_XE)) {
> >>> + =C2=A0 =C2=A0 =C2=A0 =C2=A0e= rror =3D POWERPC_EXCP_FP_XX;
> >>> + =C2=A0 =C2=A0} else if = ((fpscr & FP_ZX) && (fpscr & FP_ZE)) {
> >>>= + =C2=A0 =C2=A0 =C2=A0 =C2=A0error =3D POWERPC_EXCP_FP_ZX;
> >>= ;> + =C2=A0 =C2=A0} else if ((fpscr & FP_VXSNAN) && (fpscr_v= e !=3D 0)) {
> >>> + =C2=A0 =C2=A0 =C2=A0 =C2=A0error =3D PO= WERPC_EXCP_FP_VXSNAN;
> >>> + =C2=A0 =C2=A0} else if ((fpscr= & FP_VXISI) && (fpscr_ve !=3D 0)) {
> >>> + =C2= =A0 =C2=A0 =C2=A0 =C2=A0error =3D POWERPC_EXCP_FP_VXISI;
> >>&g= t; + =C2=A0 =C2=A0} else if ((fpscr & FP_VXIDI) && (fpscr_ve != =3D 0)) {
> >>> + =C2=A0 =C2=A0 =C2=A0 =C2=A0error =3D POWER= PC_EXCP_FP_VXIDI;
> >>> + =C2=A0 =C2=A0} else if ((fpscr &am= p; FP_VXZDZ) && (fpscr_ve !=3D 0)) {
> >>> + =C2=A0 = =C2=A0 =C2=A0 =C2=A0error =3D POWERPC_EXCP_FP_VXZDZ;
> >>> += =C2=A0 =C2=A0} else if ((fpscr & FP_VXIMZ) && (fpscr_ve !=3D 0= )) {
> >>> + =C2=A0 =C2=A0 =C2=A0 =C2=A0error =3D POWERPC_EX= CP_FP_VXIMZ;
> >>> + =C2=A0 =C2=A0} else if ((fpscr & FP= _VXVC) && (fpscr_ve !=3D 0)) {
> >>> + =C2=A0 =C2=A0= =C2=A0 =C2=A0error =3D POWERPC_EXCP_FP_VXVC;
> >>> + =C2=A0= =C2=A0}
> >>> +
> >>> + =C2=A0 =C2=A0if (err= or) {
> >>> + =C2=A0 =C2=A0 =C2=A0 =C2=A0cs->exception_in= dex =3D POWERPC_EXCP_PROGRAM;
> >>> + =C2=A0 =C2=A0 =C2=A0 = =C2=A0env->error_code =3D error | POWERPC_EXCP_FP;
> >>> = + =C2=A0 =C2=A0 =C2=A0 =C2=A0/* Deferred floating-point exception after tar= get FPSCR update */
> >>> + =C2=A0 =C2=A0 =C2=A0 =C2=A0if (f= p_exceptions_enabled(env)) {
> >>> + =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0raise_exception_err_ra(env, cs->exception_index,
= > >>> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 env->err= or_code, GETPC());
> >>> + =C2=A0 =C2=A0 =C2=A0 =C2=A0}
&= gt; >>> + =C2=A0 =C2=A0}
> >>> +}
> >>&= gt; +
> >>> =C2=A0 static void do_float_check_status(CPUPPCS= tate *env, uintptr_t raddr)
> >>> =C2=A0 {
> >>&= gt; =C2=A0 =C2=A0 =C2=A0 CPUState *cs =3D env_cpu(env);
> >>>= ; diff --git a/target/ppc/helper.h b/target/ppc/helper.h
> >>&g= t; index 4076aa281e..baa3715e73 100644
> >>> --- a/target/pp= c/helper.h
> >>> +++ b/target/ppc/helper.h
> >>&= gt; @@ -61,6 +61,7 @@ DEF_HELPER_FLAGS_1(cntlzw32, TCG_CALL_NO_RWG_SE, i32,=
> >>> i32)
> >>> =C2=A0 DEF_HELPER_FLAGS_2(b= rinc, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> >>>
> >>= > =C2=A0 DEF_HELPER_1(float_check_status, void, env)
> >>>= ; +DEF_HELPER_1(fpscr_check_status, void, env)
> >>> =C2=A0 = DEF_HELPER_1(reset_fpstatus, void, env)
> >>> =C2=A0 DEF_HEL= PER_2(compute_fprf_float64, void, env, i64)
> >>> =C2=A0 DEF= _HELPER_3(store_fpscr, void, env, i64, i32)
> >>> diff --git= a/target/ppc/translate/fp-impl.c.inc
> >>> b/target/ppc/tra= nslate/fp-impl.c.inc
> >>> index 9f7868ee28..0a9b1ecc60 1006= 44
> >>> --- a/target/ppc/translate/fp-impl.c.inc
> &g= t;>> +++ b/target/ppc/translate/fp-impl.c.inc
> >>> @@= -782,7 +782,7 @@ static void gen_mtfsb1(DisasContext *ctx)
> >>= ;> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 tcg_gen_shri_i32(cpu_crf[1], cpu_c= rf[1], FPSCR_OX);
> >>> =C2=A0 =C2=A0 =C2=A0 }
> >&= gt;> =C2=A0 =C2=A0 =C2=A0 /* We can raise a deferred exception */
>= ; >>> - =C2=A0 =C2=A0gen_helper_float_check_status(cpu_env);
&g= t; >>> + =C2=A0 =C2=A0gen_helper_fpscr_check_status(cpu_env);
&= gt; >>> =C2=A0 }
> >>>
> >>> =C2=A0 = /* mtfsf */
> >>> @@ -818,7 +818,7 @@ static void gen_mtfsf(= DisasContext *ctx)
> >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = tcg_gen_shri_i32(cpu_crf[1], cpu_crf[1], FPSCR_OX);
> >>> = =C2=A0 =C2=A0 =C2=A0 }
> >>> =C2=A0 =C2=A0 =C2=A0 /* We can = raise a deferred exception */
> >>> - =C2=A0 =C2=A0gen_helpe= r_float_check_status(cpu_env);
> >>> + =C2=A0 =C2=A0gen_help= er_fpscr_check_status(cpu_env);
> >>> =C2=A0 =C2=A0 =C2=A0 t= cg_temp_free_i64(t1);
> >>> =C2=A0 }
> >>>> >>> @@ -851,7 +851,7 @@ static void gen_mtfsfi(DisasContext = *ctx)
> >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 tcg_gen_shri_= i32(cpu_crf[1], cpu_crf[1], FPSCR_OX);
> >>> =C2=A0 =C2=A0 = =C2=A0 }
> >>> =C2=A0 =C2=A0 =C2=A0 /* We can raise a deferr= ed exception */
> >>> - =C2=A0 =C2=A0gen_helper_float_check_= status(cpu_env);
> >>> + =C2=A0 =C2=A0gen_helper_fpscr_check= _status(cpu_env);
> >>> =C2=A0 }
> >>>
>= ; >>> =C2=A0 /*** =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Floating-point
> >>> loa= d =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 ***/
> >>
> >> FWIW the real issu= e here is that gen_helper_reset_fpstatus() even exists at
> >> = all: see
> >> the comments around enabling hardfloat in the PPC= target by Emilio and
> >> Richard at
> >> https://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg04974.html= and
> >> https://lists.gnu.org/archive/html/qemu-devel/20= 20-05/msg00064.html.
> >>
> >> I have tried a f= ew informal experiments on my MacOS images by completely
> >> r= emoving all
> >> calls to gen_reset_fpstatus(), and whilst ther= e were a few odd behaviours I
> >> was
> >> surpris= ed to find that the basic OS was usable. The main issue I had was
> &= gt;> trying to
> >> come up with suitable test cases for the= various instructions when my only
> >> available
> >&= gt; hardware is a G4 Mac Mini.
> >>
> >> So yes thi= s patch fixes one particular use case, but the real issue is that
> &= gt;> the PPC
> >> target floating point flags need a bit of = work: however once this is done
> >> it should
> >>= be possible for hardfloat to be enabled via a CPU option on suitable hosts=
> >> which will
> >> bring a noticeable improvemen= t in floating point performance.
> >>
> > In this case= I don't think gen_helper_reset_fpstatus() is the problem,
> >= fp_status is not updated in the instruction but its value is used in
&g= t; > helper_float_check_status(), so if the values have not been reset s= ince the
> > last instruction it'll contain last instruction&#= 39;s information and if it has
> > (either by calling gen_helper_r= eset_fpstatus(), by automatically doing it
> > every instruction o= r by having every instruction reset it in the end) it'll
> > h= ave 0. So there are 3 alternatives to solve this that I can think of:
&g= t; >
> > =C2=A0 =C2=A0 * Update FPSCR directly, then update fp_= status based on FPSCR, for this
> > you would either have to call = a new helper to do this or update
> > helper_store_fpscr to do thi= s, and then expand do_float_check_status to throw
> > more excepti= ons (or create a new helper to do this if expanding
> > do_float_c= heck_status could cause problems),
> >
> > =C2=A0 =C2=A0 = * Just don't use fp_status, update FPSCR directly and do the deferred> > exception using only information from FPSCR (the one I used thi= s patch),
> >
> > =C2=A0 =C2=A0 * Update only fp_status d= irectly and call either a modified
> > do_float_check_status or a = new helper that would update FPSCR and throw the
> > correct excep= tion based on fp_status, this one I don't see how it would
> >= feasible in the current implementation as FPSCR has many bits without an> > equivalent in fp_status.
> >
> > So with this= I can see how to implement the 1st and 2nd option, I chose not
> >= ; to use the 1st one as do_float_check_status updates the FPSCR then throw = the
> > exception, which seemed unnecessary. Also looking back I s= hould've removed
> > gen_reset_fpstatus() as in the way it end= ed implemented these instructions
> > don't interact with fp_s= tatus anywhere else, so I'll remove it in the next
> > version= .
> >
> > And looking at the suggestions the current impl= ementation could be changed to
> > take advantage of the optimizat= ion suggested in the discussion you linked,
> > specially the part= s about checking when exception bits aren't set (but in
> > th= is case it would've to be the MSR exception bits) and the part about> > skipping calculating a flag when marked to 1.
>
> I = haven't followed the discussion but here's another message with som= e
> links I've collected when FPU came up that may be relevant to= the topic:
>
> https://lists.nongnu.org/archive/html/qem= u-ppc/2020-04/msg00387.html
>
> among those is a long threa= d on patchwork that has some info on the
> current situation. As far = as I remember the oddity in handling FPU
> exceptions is partly becau= se of two bits FI and FR in FPSCR that should

That's= correct, the most hard part is to simulate FI and FR bits in FPSCR as
that was not provided on other CPU.

> reflect the re= sult of the previous FPU op so has to be updated after every
> op whi= ch makes it hard to emulate as other CPUs usually don't do this. (We> could easily improve it if we did not emulate those bits, most guest = code
> don't use them anyway, but QEMU prefers accuracy so that w= ay was ruled
> out.) Other than that the current code maybe also can = be simplified and
> maybe optimised via some other ways which were di= scussed in those threads
> but nobody implemented any of the ideas so= far. May worth reading through
> what was said before as there might= be sume useful ideas in there.
>
> Regards,
> BALATON Zo= ltan



--
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=E6=AD=A4=E8=87= =B4
=E7=A4=BC
=E7=BD=97=E5=8B=87=E5=88=9A
Yours
=C2=A0 =C2=A0 s= incerely,
Yonggang Luo
--00000000000066b4b705d08e4273--