From: Richard Henderson <richard.henderson@linaro.org>
To: "BALATON Zoltan" <balaton@eik.bme.hu>,
"Alex Bennée" <alex.bennee@linaro.org>
Cc: qemu-devel@nongnu.org, Aurelien Jarno <aurelien@aurel32.net>,
Peter Maydell <peter.maydell@linaro.org>
Subject: Re: [RFC PATCH] softfloat: use QEMU_FLATTEN to avoid mistaken isra inlining
Date: Fri, 23 Jun 2023 07:50:39 +0200
Message-ID: <644f6d2e-0c6c-e97e-6930-706d36af24f6@linaro.org>
In-Reply-To: <5082a19d-0fc2-a140-eeb7-8c608b33e410@eik.bme.hu>
On 6/22/23 22:55, BALATON Zoltan wrote:
> Hello,
>
> What happened to this patch? Will this be merged by somebody?
Thanks for the reminder. Queued to tcg-next.
r~
>
> Regards,
> BALATON Zoltan
>
> On Tue, 23 May 2023, BALATON Zoltan wrote:
>> On Tue, 23 May 2023, Alex Bennée wrote:
>>> Balton discovered that asserts for the extract/deposit calls had a
>>
>> There is an "a" missing from my name, and my given name is Zoltan. (In Hungarian the
>> family name comes first and the given name last.) Maybe just add a Reported-by tag
>> instead of mentioning me here, if you want to record it.
>>
>>> significant impact on a lame benchmark on qemu-ppc. Replicating with:
>>>
>>> ./qemu-ppc64 ~/lsrc/tests/lame.git-svn/builds/ppc64/frontend/lame \
>>> -h pts-trondheim-3.wav pts-trondheim-3.mp3
>>>
>>> showed that the pack/unpack routines were not eliding the assert checks as they
>>> should have, causing them to figure prominently in the profile:
>>>
>>> 11.44% qemu-ppc64 qemu-ppc64 [.] unpack_raw64.isra.0
>>> 11.03% qemu-ppc64 qemu-ppc64 [.] parts64_uncanon_normal
>>> 8.26% qemu-ppc64 qemu-ppc64 [.] helper_compute_fprf_float64
>>> 6.75% qemu-ppc64 qemu-ppc64 [.] do_float_check_status
>>> 5.34% qemu-ppc64 qemu-ppc64 [.] parts64_muladd
>>> 4.75% qemu-ppc64 qemu-ppc64 [.] pack_raw64.isra.0
>>> 4.38% qemu-ppc64 qemu-ppc64 [.] parts64_canonicalize
>>> 3.62% qemu-ppc64 qemu-ppc64 [.] float64r32_round_pack_canonical
>>>
>>> After this patch the same test runs 31 seconds faster with a profile
>>> where the generated code dominates more:
>>>
>>> + 14.12% 0.00% qemu-ppc64 [unknown] [.] 0x0000004000619420
>>> + 13.30% 0.00% qemu-ppc64 [unknown] [.] 0x0000004000616850
>>> + 12.58% 12.19% qemu-ppc64 qemu-ppc64 [.] parts64_uncanon_normal
>>> + 10.62% 0.00% qemu-ppc64 [unknown] [.] 0x000000400061bf70
>>> + 9.91% 9.73% qemu-ppc64 qemu-ppc64 [.] helper_compute_fprf_float64
>>> + 7.84% 7.82% qemu-ppc64 qemu-ppc64 [.] do_float_check_status
>>> + 6.47% 5.78% qemu-ppc64 qemu-ppc64 [.] parts64_canonicalize.constprop.0
>>> + 6.46% 0.00% qemu-ppc64 [unknown] [.] 0x0000004000620130
>>> + 6.42% 0.00% qemu-ppc64 [unknown] [.] 0x0000004000619400
>>> + 6.17% 6.04% qemu-ppc64 qemu-ppc64 [.] parts64_muladd
>>> + 5.85% 0.00% qemu-ppc64 [unknown] [.] 0x00000040006167e0
>>> + 5.74% 0.00% qemu-ppc64 [unknown] [.] 0x0000b693fcffffd3
>>> + 5.45% 4.78% qemu-ppc64 qemu-ppc64 [.] float64r32_round_pack_canonical
>>>
>>> Suggested-by: Richard Henderson <richard.henderson@linaro.org>
>>> Message-Id: <ec9cfe5a-d5f2-466d-34dc-c35817e7e010@linaro.org>
>>> [AJB: Patchified rth's suggestion]
>>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>>> Cc: BALATON Zoltan <balaton@eik.bme.hu>
>>
>> Replace Cc: with
>> Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
>>
>> This solves the softfloat-related usages. The rest are probably lower overhead: I could
>> not measure any further improvement from removing asserts on top of this patch. I still
>> have these functions high in my profiling result:
>>
>> children self command symbol
>> 11.40% 10.86% qemu-system-ppc helper_compute_fprf_float64
>> 11.25% 0.61% qemu-system-ppc helper_fmadds
>> 10.01% 3.23% qemu-system-ppc float64r32_round_pack_canonical
>> 8.59% 1.80% qemu-system-ppc helper_float_check_status
>> 8.34% 7.23% qemu-system-ppc parts64_muladd
>> 8.16% 0.67% qemu-system-ppc helper_fmuls
>> 8.08% 0.43% qemu-system-ppc parts64_uncanon
>> 7.49% 1.78% qemu-system-ppc float64r32_mul
>> 7.32% 7.32% qemu-system-ppc parts64_uncanon_normal
>> 6.48% 0.52% qemu-system-ppc helper_fadds
>> 6.31% 6.31% qemu-system-ppc do_float_check_status
>> 5.99% 1.14% qemu-system-ppc float64r32_add
>>
>> Any idea on those?
>>
>> Unrelated to this patch, I have also started to see random crashes with a DSI on a dcbz
>> instruction, which did not happen before (or not frequently enough for me to notice).
>> I did not bisect it as it happens randomly, but I wonder if it could be related to the
>> recent unaligned access changes or some other TCG change. Any idea what to check?
>>
>> Regards,
>> BALATON Zoltan
>>
>>> ---
>>> fpu/softfloat.c | 22 +++++++++++-----------
>>> 1 file changed, 11 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
>>> index 108f9cb224..42e6c188b4 100644
>>> --- a/fpu/softfloat.c
>>> +++ b/fpu/softfloat.c
>>> @@ -593,27 +593,27 @@ static void unpack_raw64(FloatParts64 *r, const FloatFmt *fmt, uint64_t raw)
>>> };
>>> }
>>>
>>> -static inline void float16_unpack_raw(FloatParts64 *p, float16 f)
>>> +static void QEMU_FLATTEN float16_unpack_raw(FloatParts64 *p, float16 f)
>>> {
>>> unpack_raw64(p, &float16_params, f);
>>> }
>>>
>>> -static inline void bfloat16_unpack_raw(FloatParts64 *p, bfloat16 f)
>>> +static void QEMU_FLATTEN bfloat16_unpack_raw(FloatParts64 *p, bfloat16 f)
>>> {
>>> unpack_raw64(p, &bfloat16_params, f);
>>> }
>>>
>>> -static inline void float32_unpack_raw(FloatParts64 *p, float32 f)
>>> +static void QEMU_FLATTEN float32_unpack_raw(FloatParts64 *p, float32 f)
>>> {
>>> unpack_raw64(p, &float32_params, f);
>>> }
>>>
>>> -static inline void float64_unpack_raw(FloatParts64 *p, float64 f)
>>> +static void QEMU_FLATTEN float64_unpack_raw(FloatParts64 *p, float64 f)
>>> {
>>> unpack_raw64(p, &float64_params, f);
>>> }
>>>
>>> -static void floatx80_unpack_raw(FloatParts128 *p, floatx80 f)
>>> +static void QEMU_FLATTEN floatx80_unpack_raw(FloatParts128 *p, floatx80 f)
>>> {
>>> *p = (FloatParts128) {
>>> .cls = float_class_unclassified,
>>> @@ -623,7 +623,7 @@ static void floatx80_unpack_raw(FloatParts128 *p, floatx80 f)
>>> };
>>> }
>>>
>>> -static void float128_unpack_raw(FloatParts128 *p, float128 f)
>>> +static void QEMU_FLATTEN float128_unpack_raw(FloatParts128 *p, float128 f)
>>> {
>>> const int f_size = float128_params.frac_size - 64;
>>> const int e_size = float128_params.exp_size;
>>> @@ -650,27 +650,27 @@ static uint64_t pack_raw64(const FloatParts64 *p, const FloatFmt *fmt)
>>> return ret;
>>> }
>>>
>>> -static inline float16 float16_pack_raw(const FloatParts64 *p)
>>> +static float16 QEMU_FLATTEN float16_pack_raw(const FloatParts64 *p)
>>> {
>>> return make_float16(pack_raw64(p, &float16_params));
>>> }
>>>
>>> -static inline bfloat16 bfloat16_pack_raw(const FloatParts64 *p)
>>> +static bfloat16 QEMU_FLATTEN bfloat16_pack_raw(const FloatParts64 *p)
>>> {
>>> return pack_raw64(p, &bfloat16_params);
>>> }
>>>
>>> -static inline float32 float32_pack_raw(const FloatParts64 *p)
>>> +static float32 QEMU_FLATTEN float32_pack_raw(const FloatParts64 *p)
>>> {
>>> return make_float32(pack_raw64(p, &float32_params));
>>> }
>>>
>>> -static inline float64 float64_pack_raw(const FloatParts64 *p)
>>> +static float64 QEMU_FLATTEN float64_pack_raw(const FloatParts64 *p)
>>> {
>>> return make_float64(pack_raw64(p, &float64_params));
>>> }
>>>
>>> -static float128 float128_pack_raw(const FloatParts128 *p)
>>> +static float128 QEMU_FLATTEN float128_pack_raw(const FloatParts128 *p)
>>> {
>>> const int f_size = float128_params.frac_size - 64;
>>> const int e_size = float128_params.exp_size;
>>
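For anyone following the thread, here is a minimal sketch of the pattern the quoted patch relies on. It is not the QEMU code: every `sketch_`/`Sketch` name is hypothetical, and it assumes QEMU_FLATTEN maps to the GCC/Clang `flatten` function attribute. The idea is that a thin per-format wrapper passes a constant format descriptor to a generic unpack routine. If the generic routine is inlined into the wrapper, the compiler can see the constant field widths and may drop the range asserts in the bit-extract helper. If it is instead split off as an `.isra` clone, the asserts stay in the hot path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU_FLATTEN; assumed to be the flatten attribute. */
#if defined(__GNUC__)
#define SKETCH_FLATTEN __attribute__((flatten))
#else
#define SKETCH_FLATTEN
#endif

/* Hypothetical, simplified format descriptor and unpacked representation. */
typedef struct {
    int exp_size;
    int frac_size;
} SketchFmt;

typedef struct {
    uint64_t frac;
    int32_t exp;
    int sign;
} SketchParts;

/* Bit-field extract with a range assert, in the spirit of an extract64 helper. */
static inline uint64_t sketch_extract64(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/* Generic routine: the field widths are only known through *fmt. */
static void sketch_unpack_raw64(SketchParts *r, const SketchFmt *fmt,
                                uint64_t raw)
{
    const int f_size = fmt->frac_size;
    const int e_size = fmt->exp_size;

    r->sign = (int)sketch_extract64(raw, f_size + e_size, 1);
    r->exp = (int32_t)sketch_extract64(raw, f_size, e_size);
    r->frac = sketch_extract64(raw, 0, f_size);
}

static const SketchFmt sketch_float64_params = {
    .exp_size = 11,
    .frac_size = 52,
};

/*
 * Per-format wrapper. With the flatten attribute the compiler is asked to
 * inline the calls made in this body, so the constant widths from
 * sketch_float64_params are visible at the assert sites.
 */
static void SKETCH_FLATTEN sketch_float64_unpack_raw(SketchParts *p, uint64_t f)
{
    sketch_unpack_raw64(p, &sketch_float64_params, f);
}
```

Calling `sketch_float64_unpack_raw(&p, raw)` with the raw bits of a double fills in the sign, biased exponent and fraction. Whether the asserts actually disappear depends on the compiler and optimisation level, so it is worth checking the disassembly or a profile rather than assuming it.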