From: Richard Henderson <richard.henderson@linaro.org>
To: "Lucas Mateus Castro(alqotel)" <lucas.araujo@eldorado.org.br>,
	qemu-ppc@nongnu.org
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
	"Daniel Henrique Barboza" <danielhb413@gmail.com>,
	"Greg Kurz" <groug@kaod.org>,
	"open list:All patches CC here" <qemu-devel@nongnu.org>,
	"Cédric Le Goater" <clg@kaod.org>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Aurelien Jarno" <aurelien@aurel32.net>,
	"David Gibson" <david@gibson.dropbear.id.au>
Subject: Re: [RFC PATCH 5/7] target/ppc: Implemented xvf16ger*
Date: Tue, 26 Apr 2022 17:26:25 -0700
Message-ID: <e5abeb2f-892f-af8d-0457-c9f8e66ddeb6@linaro.org>
In-Reply-To: <20220426125028.18844-6-lucas.araujo@eldorado.org.br>

On 4/26/22 05:50, Lucas Mateus Castro(alqotel) wrote:
> +#define VSXGER16(NAME, ORIG_T, OR_EL)                                   \
> +    void NAME(CPUPPCState *env, uint32_t a_r, uint32_t b_r,             \
> +              uint32_t  at_r, uint32_t mask, uint32_t packed_flags)     \
> +    {                                                                   \
> +        ppc_vsr_t *at;                                                  \
> +        float32 psum, aux_acc, va, vb, vc, vd;                          \
> +        int i, j, xmsk_bit, ymsk_bit;                                   \
> +        uint8_t xmsk = mask & 0x0F;                                     \
> +        uint8_t ymsk = (mask >> 4) & 0x0F;                              \
> +        uint8_t pmsk = (mask >> 8) & 0x3;                               \
> +        ppc_vsr_t *b = cpu_vsr_ptr(env, b_r);                           \
> +        ppc_vsr_t *a = cpu_vsr_ptr(env, a_r);                           \
> +        float_status *excp_ptr = &env->fp_status;                       \
> +        bool acc = ger_acc_flag(packed_flags);                          \
> +        bool neg_acc = ger_neg_acc_flag(packed_flags);                  \
> +        bool neg_mul = ger_neg_mul_flag(packed_flags);                  \
> +        for (i = 0, xmsk_bit = 1 << 3; i < 4; i++, xmsk_bit >>= 1) {    \
> +            at = cpu_vsr_ptr(env, at_r + i);                            \
> +            for (j = 0, ymsk_bit = 1 << 3; j < 4; j++, ymsk_bit >>= 1) {\
> +                if ((xmsk_bit & xmsk) && (ymsk_bit & ymsk)) {           \
> +                    va = !(pmsk & 2) ? float32_zero :                   \
> +                                       GET_VSR(Vsr##OR_EL, a,           \
> +                                               2 * i, ORIG_T, float32); \
> +                    vb = !(pmsk & 2) ? float32_zero :                   \
> +                                       GET_VSR(Vsr##OR_EL, b,           \
> +                                               2 * j, ORIG_T, float32); \
> +                    vc = !(pmsk & 1) ? float32_zero :                   \
> +                                       GET_VSR(Vsr##OR_EL, a,           \
> +                                            2 * i + 1, ORIG_T, float32);\
> +                    vd = !(pmsk & 1) ? float32_zero :                   \
> +                                       GET_VSR(Vsr##OR_EL, b,           \
> +                                            2 * j + 1, ORIG_T, float32);\
> +                    psum = float32_mul(va, vb, excp_ptr);               \
> +                    psum = float32_muladd(vc, vd, psum, 0, excp_ptr);   \

This isn't correct -- the intermediate 'prod' (the first multiply) must not be rounded,
but float32_mul rounds it to float32.  I think the correct way to implement this (barring
new softfloat functions) is to compute the intermediate product as float64 with
float_round_to_odd, then finish with float64r32_muladd in the correct rounding mode.
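
A minimal sketch of why rounding the first product matters, written in plain C
rather than QEMU softfloat. The helper names are hypothetical, the inputs are
floats assumed to hold values representable in bfloat16 or float16 (so each
product is exact in double), and the final conversion to float may round twice,
which float64r32_muladd would avoid. Exception flags are not modelled.

```c
/*
 * Illustrative only: plain C, not QEMU softfloat. Both helper names are
 * hypothetical. Inputs are floats holding bfloat16/float16 values.
 */

/* Mirrors the quoted patch: the first product is rounded to float32. */
static float dot2_rounded_first(float va, float vb, float vc, float vd)
{
    volatile float prod = va * vb;   /* rounds, and can overflow, here */

    /* Add the second product without rounding it separately. */
    return (float)((double)prod + (double)vc * (double)vd);
}

/* Keeps the first product in double, where it is exact for such inputs. */
static float dot2_wide_first(float va, float vb, float vc, float vd)
{
    double prod = (double)va * (double)vb;

    return (float)(prod + (double)vc * (double)vd);
}
```

With va = vb = vd = 0x1p64f and vc = -0x1p64f the exact result is 0.
dot2_rounded_first returns +inf because the first product overflows float32,
while dot2_wide_first returns 0.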

> +                    if (acc) {                                          \
> +                        if (neg_mul) {                                  \
> +                            psum = float32_neg(psum);                   \
> +                        }                                               \
> +                        if (neg_acc) {                                  \
> +                            aux_acc = float32_neg(at->VsrSF(j));        \
> +                        } else {                                        \
> +                            aux_acc = at->VsrSF(j);                     \
> +                        }                                               \
> +                        at->VsrSF(j) = float32_add(psum, aux_acc,       \
> +                                                   excp_ptr);           \

This one, thankfully, uses the rounded intermediate result 'msum', so it is ok.

Please do convert this from a macro to a function.  Given that float16 and bfloat16 are
addressed the same, I think the only callback you need is the conversion to float64
(e.g. float16_to_float64).  Drop the bf16 accessor to ppc_vsr_t.
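
One way the macro-to-function conversion could look, as a standalone sketch
rather than QEMU code: a single function reads the raw 16-bit elements and takes
the conversion as its only callback. All names here (extract_fn, f16_to_f64,
bf16_to_f64, ger16_pair_sum) are hypothetical, and the converters do not quiet
signaling NaNs or raise flags the way the softfloat conversions do.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical callback type: raw 16-bit element -> double. */
typedef double (*extract_fn)(uint16_t raw);

/* bfloat16 is the top half of a float32 bit pattern. */
static double bf16_to_f64(uint16_t raw)
{
    uint32_t bits = (uint32_t)raw << 16;
    float f;

    memcpy(&f, &bits, sizeof(f));
    return (double)f;
}

/* IEEE half precision: 1 sign, 5 exponent (bias 15), 10 fraction bits. */
static double f16_to_f64(uint16_t raw)
{
    uint64_t sign = (uint64_t)(raw >> 15) << 63;
    uint32_t exp = (raw >> 10) & 0x1f;
    uint64_t frac = raw & 0x3ff;
    uint64_t bits;
    double d;

    if (exp == 0) {
        /* Zero or subnormal: frac * 2^-24, exact in double. */
        d = (double)frac * 0x1p-24;
        return sign ? -d : d;
    }
    /* Inf/NaN keep an all-ones exponent; normals are rebiased 15 -> 1023. */
    bits = sign | ((uint64_t)(exp == 0x1f ? 0x7ff : exp + 1008) << 52)
                | (frac << 42);
    memcpy(&d, &bits, sizeof(d));
    return d;
}

/* Shared body: the element format only enters through the callback. */
static double ger16_pair_sum(const uint16_t *a, const uint16_t *b,
                             int i, int j, extract_fn extract)
{
    double va = extract(a[2 * i]);
    double vb = extract(b[2 * j]);
    double vc = extract(a[2 * i + 1]);
    double vd = extract(b[2 * j + 1]);

    return va * vb + vc * vd;
}
```

The same ger16_pair_sum serves both formats: pass f16_to_f64 for float16
elements and bf16_to_f64 for bfloat16 elements.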


r~


