Re: [RFC PATCH 4/7] target/ppc: Implemented xvf*ger*

From: Richard Henderson <richard.henderson@linaro.org>
To: "Lucas Mateus Castro(alqotel)" <lucas.araujo@eldorado.org.br>,
	qemu-ppc@nongnu.org
Cc: "open list:All patches CC here" <qemu-devel@nongnu.org>,
	"Greg Kurz" <groug@kaod.org>,
	"Daniel Henrique Barboza" <danielhb413@gmail.com>,
	"Cédric Le Goater" <clg@kaod.org>,
	"David Gibson" <david@gibson.dropbear.id.au>
Subject: Re: [RFC PATCH 4/7] target/ppc: Implemented xvf*ger*
Date: Tue, 26 Apr 2022 17:09:44 -0700
Message-ID: <1570d433-5e96-54e9-b4aa-db39c39b5691@linaro.org>
In-Reply-To: <20220426125028.18844-5-lucas.araujo@eldorado.org.br>

On 4/26/22 05:50, Lucas Mateus Castro(alqotel) wrote:
> +#define VSXGER(NAME, TYPE, EL)                                          \
> +    void NAME(CPUPPCState *env, uint32_t a_r, uint32_t b_r,             \
> +              uint32_t  at_r, uint32_t mask, uint32_t packed_flags)     \
> +    {                                                                   \
> +        ppc_vsr_t *a, *b, *at;                                          \
> +        TYPE aux_acc, va, vb;                                           \
> +        int i, j, xmsk_bit, ymsk_bit, op_flags;                         \
> +        uint8_t xmsk = mask & 0x0F;                                     \
> +        uint8_t ymsk = (mask >> 4) & 0x0F;                              \
> +        int ymax = MIN(4, 128 / (sizeof(TYPE) * 8));                    \
> +        b = cpu_vsr_ptr(env, b_r);                                      \
> +        float_status *excp_ptr = &env->fp_status;                       \
> +        bool acc = ger_acc_flag(packed_flags);                          \
> +        bool neg_acc = ger_neg_acc_flag(packed_flags);                  \
> +        bool neg_mul = ger_neg_mul_flag(packed_flags);                  \
> +        helper_reset_fpstatus(env);                                     \
> +        for (i = 0, xmsk_bit = 1 << 3; i < 4; i++, xmsk_bit >>= 1) {    \
> +            a = cpu_vsr_ptr(env, a_r + i / ymax);                       \
> +            at = cpu_vsr_ptr(env, at_r + i);                            \
> +            for (j = 0, ymsk_bit = 1 << (ymax - 1); j < ymax;           \
> +                 j++, ymsk_bit >>= 1) {                                 \
> +                if ((xmsk_bit & xmsk) && (ymsk_bit & ymsk)) {           \
> +                    op_flags = (neg_acc ^ neg_mul) ?                    \
> +                                          float_muladd_negate_c : 0;    \
> +                    op_flags |= (neg_mul) ?                             \
> +                                     float_muladd_negate_result : 0;    \

There's no need to compute op_flags in the inner loop: it depends only on
neg_acc and neg_mul, which are fixed for the whole call.
Indeed, it is probably better to compute it at translation time.
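
As a minimal sketch of that hoisting, the flag computation can live in a small
helper that is called once, before the loops (or its result can be passed in
from translation). The helper name ger_muladd_flags and the MULADD_* constants
below are hypothetical stand-ins so the snippet compiles on its own; in QEMU
the float_muladd_* flags come from the softfloat headers.

```c
#include <stdbool.h>

/*
 * Stand-in flag values, for illustration only. In QEMU these would be
 * float_muladd_negate_c and float_muladd_negate_result.
 */
enum {
    MULADD_NEGATE_C      = 1 << 1,
    MULADD_NEGATE_RESULT = 1 << 2,
};

/*
 * Hypothetical helper: fold the two negate bits into muladd flags once,
 * using the same logic as the quoted macro's inner loop.
 */
static int ger_muladd_flags(bool neg_acc, bool neg_mul)
{
    int flags = 0;

    if (neg_acc != neg_mul) {
        flags |= MULADD_NEGATE_C;
    }
    if (neg_mul) {
        flags |= MULADD_NEGATE_RESULT;
    }
    return flags;
}
```

The result would then be read, not recomputed, for every element inside the
i/j loops.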

This macro is trickier to turn into a function than the integer one; however,

> +                    va = a->Vsr##EL(i % ymax);                          \
> +                    vb = b->Vsr##EL(j);                                 \
> +                    aux_acc = at->Vsr##EL(j);                           \
> +                    if (acc) {                                          \
> +                        at->Vsr##EL(j) = TYPE##_muladd(va, vb, aux_acc, \
> +                                                       op_flags,        \
> +                                                       excp_ptr);       \
> +                    } else {                                            \
> +                        at->Vsr##EL(j) = TYPE##_mul(va, vb, excp_ptr);  \
> +                    }                                                   \
> +                } else {                                                \
> +                    at->Vsr##EL(j) = 0;                                 \
> +                }                                                       \

/* Zero element j of the accumulator row. */
static void vsxger_zero_f(ppc_vsr_t *a, int j)
{
     a->VsrSF(j) = float32_zero;
}

/* d[j] = a[i] * b[j]; flags is unused but keeps the signatures uniform. */
static void vsxger_mul_f(ppc_vsr_t *d, ppc_vsr_t *a, ppc_vsr_t *b,
                          int i, int j, int flags, float_status *s)
{
     float32 af = a->VsrSF(i);
     float32 bf = b->VsrSF(j);
     d->VsrSF(j) = float32_mul(af, bf, s);
}

/* d[j] = muladd(a[i], b[j], d[j]), with negation selected by flags. */
static void vsxger_mac_f(ppc_vsr_t *d, ppc_vsr_t *a, ppc_vsr_t *b,
                          int i, int j, int flags, float_status *s)
{
     float32 af = a->VsrSF(i);
     float32 bf = b->VsrSF(j);
     float32 cf = d->VsrSF(j);
     d->VsrSF(j) = float32_muladd(af, bf, cf, flags, s);
}

is probably a good place to start for callbacks.
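
For illustration, one way such callbacks could be driven from a single shared
loop is sketched below. This is a simplified, standalone model rather than
QEMU code: vsr_t stands in for ppc_vsr_t, plain C float arithmetic replaces
the softfloat calls, the flags and float_status arguments are dropped, and the
names ger_loop, ger_op_fn and ger_zero_fn are hypothetical. The mask bit order
follows the quoted macro, with bit 3 selecting index 0.

```c
#include <stdint.h>

/* Stand-in for ppc_vsr_t: four 32-bit float elements, nothing more. */
typedef struct {
    float f32[4];
} vsr_t;

/* Hypothetical callback types, modelled on the helpers above. */
typedef void ger_zero_fn(vsr_t *d, int j);
typedef void ger_op_fn(vsr_t *d, const vsr_t *a, const vsr_t *b, int i, int j);

static void zero_f(vsr_t *d, int j)
{
    d->f32[j] = 0.0f;
}

/* Non-accumulating form: overwrite d[j] with the product. */
static void mul_f(vsr_t *d, const vsr_t *a, const vsr_t *b, int i, int j)
{
    d->f32[j] = a->f32[i] * b->f32[j];
}

/* Accumulating form: add the product to the existing d[j]. */
static void mac_f(vsr_t *d, const vsr_t *a, const vsr_t *b, int i, int j)
{
    d->f32[j] += a->f32[i] * b->f32[j];
}

/*
 * Shared outer-product loop: row i, column j is computed by op when both
 * mask bits are set, and cleared by zero otherwise.
 */
static void ger_loop(vsr_t at[4], const vsr_t *a, const vsr_t *b,
                     uint8_t xmsk, uint8_t ymsk,
                     ger_op_fn *op, ger_zero_fn *zero)
{
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            if ((xmsk & (8 >> i)) && (ymsk & (8 >> j))) {
                op(&at[i], a, b, i, j);
            } else {
                zero(&at[i], j);
            }
        }
    }
}
```

With this shape the choice between multiply and multiply-accumulate is made
once, by the caller picking the callback, instead of per element inside the
loop.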


r~
