From: Richard Henderson <richard.henderson@linaro.org>
To: "Paolo Bonzini" <pbonzini@redhat.com>,
"BALATON Zoltan" <balaton@eik.bme.hu>,
"Alex Bennée" <alex.bennee@linaro.org>
Cc: qemu-devel@nongnu.org, Aurelien Jarno <aurelien@aurel32.net>,
Peter Maydell <peter.maydell@linaro.org>
Subject: Re: [RFC PATCH] softfloat: use QEMU_FLATTEN to avoid mistaken isra inlining
Date: Thu, 25 May 2023 06:30:31 -0700 [thread overview]
Message-ID: <085fece3-862c-60d4-0baf-82cc5267e7fd@linaro.org> (raw)
In-Reply-To: <77189848-d618-9f55-4ae9-92756e635371@redhat.com>
On 5/25/23 06:22, Paolo Bonzini wrote:
> On 5/23/23 16:33, Richard Henderson wrote:
>>
>>
>> The tests are poorly ordered, testing many unlikely things before the most likely thing
>> (normal). A better ordering would be
>>
>> if (likely(tp##_is_normal(arg))) {
>> } else if (tp##_is_zero(arg)) {
>> } else if (tp##_is_zero_or_denormal(arg)) {
>> } else if (tp##_is_infinity(arg)) {
>> } else {
>> // nan case
>> }
>>
>> Secondly, we compute the classify bitmask, and then deconstruct the mask again in
>> set_fprf_from_class. Since we don't use the classify bitmask for anything else, better
>> would be to compute the fprf value directly in the if-ladder.
>
> So something like this:
>
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index a66e16c2128c..daed97ca178e 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -141,62 +141,30 @@ static inline int ppc_float64_get_unbiased_exp(float64 f)
> return ((f >> 52) & 0x7FF) - 1023;
> }
>
> -/* Classify a floating-point number. */
> -enum {
> - is_normal = 1,
> - is_zero = 2,
> - is_denormal = 4,
> - is_inf = 8,
> - is_qnan = 16,
> - is_snan = 32,
> - is_neg = 64,
> -};
> -
> -#define COMPUTE_CLASS(tp) \
> -static int tp##_classify(tp arg) \
> -{ \
> - int ret = tp##_is_neg(arg) * is_neg; \
> - if (unlikely(tp##_is_any_nan(arg))) { \
> - float_status dummy = { }; /* snan_bit_is_one = 0 */ \
> - ret |= (tp##_is_signaling_nan(arg, &dummy) \
> - ? is_snan : is_qnan); \
> - } else if (unlikely(tp##_is_infinity(arg))) { \
> - ret |= is_inf; \
> - } else if (tp##_is_zero(arg)) { \
> - ret |= is_zero; \
> - } else if (tp##_is_zero_or_denormal(arg)) { \
> - ret |= is_denormal; \
> - } else { \
> - ret |= is_normal; \
> - } \
> - return ret; \
> -}
> -
> -COMPUTE_CLASS(float16)
> -COMPUTE_CLASS(float32)
> -COMPUTE_CLASS(float64)
> -COMPUTE_CLASS(float128)
> -
> -static void set_fprf_from_class(CPUPPCState *env, int class)
> +static void set_fprf(CPUPPCState *env, uint8_t ret)
> {
> - static const uint8_t fprf[6][2] = {
> - { 0x04, 0x08 }, /* normalized */
> - { 0x02, 0x12 }, /* zero */
> - { 0x14, 0x18 }, /* denormalized */
> - { 0x05, 0x09 }, /* infinity */
> - { 0x11, 0x11 }, /* qnan */
> - { 0x00, 0x00 }, /* snan -- flags are undefined */
> - };
> - bool isneg = class & is_neg;
> -
> env->fpscr &= ~FP_FPRF;
> - env->fpscr |= fprf[ctz32(class)][isneg] << FPSCR_FPRF;
> + env->fpscr |= ret << FPSCR_FPRF;
> }
>
> -#define COMPUTE_FPRF(tp) \
> -void helper_compute_fprf_##tp(CPUPPCState *env, tp arg) \
> -{ \
> - set_fprf_from_class(env, tp##_classify(arg)); \
> +#define COMPUTE_FPRF(tp) \
> +void helper_compute_fprf_##tp(CPUPPCState *env, tp arg) \
> +{ \
> + int ret; \
> + if (tp##_is_normal(arg)) { \
> + ret = 0x0408; \
> + } else if (tp##_is_zero(arg)) { \
> + ret = 0x0212; \
> + } else if (tp##_is_zero_or_denormal(arg)) { \
> + ret = 0x1418; \
> + } else if (unlikely(tp##_is_infinity(arg))) { \
> + ret = 0x0509; \
> + } else { \
> + float_status dummy = { }; /* snan_bit_is_one = 0 */ \
> + ret = (tp##_is_signaling_nan(arg, &dummy) \
> + ? 0x0000 : 0x1111); \
> + } \
> + set_fprf(env, tp##_is_neg(arg) ? (uint8_t)ret : ret >> 8); \
> }
>
> COMPUTE_FPRF(float16)
>
>
> Not tested beyond compilation, but if Zoltan reports that it helps
> I can write a commit message and submit it.
See
https://patchew.org/QEMU/20230523202507.688859-1-richard.henderson@linaro.org/
and follow-up. It drops this one function in the profile, but only helps very slightly vs
wall clock.
r~
next prev parent reply other threads:[~2023-05-25 13:31 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-23 13:11 [RFC PATCH] softfloat: use QEMU_FLATTEN to avoid mistaken isra inlining Alex Bennée
2023-05-23 13:57 ` BALATON Zoltan
2023-05-23 14:33 ` Richard Henderson
2023-05-23 17:51 ` BALATON Zoltan
2023-05-25 13:22 ` Paolo Bonzini
2023-05-25 13:30 ` Richard Henderson [this message]
2023-05-25 23:15 ` BALATON Zoltan
2023-05-25 13:30 ` Paolo Bonzini
2023-05-25 23:19 ` BALATON Zoltan
2023-05-26 11:56 ` BALATON Zoltan
2023-06-22 20:55 ` BALATON Zoltan
2023-06-23 5:50 ` Richard Henderson
2023-05-23 14:18 ` Philippe Mathieu-Daudé
2023-05-23 15:34 ` Richard Henderson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=085fece3-862c-60d4-0baf-82cc5267e7fd@linaro.org \
--to=richard.henderson@linaro.org \
--cc=alex.bennee@linaro.org \
--cc=aurelien@aurel32.net \
--cc=balaton@eik.bme.hu \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).