[PATCH] target/ppc: Change VSX instructions behavior to fill with zeros

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] target/ppc: Change VSX instructions behavior to fill with zeros
@ 2022-02-04 18:19 Víctor Colombo
  2022-02-06  0:36 ` Richard Henderson
  2022-02-09  7:45 ` Cédric Le Goater
  0 siblings, 2 replies; 3+ messages in thread
From: Víctor Colombo @ 2022-02-04 18:19 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: danielhb413, groug, victor.colombo, clg, matheus.ferst, david

ISA v3.1 changed some VSX instructions behavior by changing what the
other words/doubleword in the result should contain when the result is
only one word/doubleword. e.g. xsmaxdp operates on doubleword 0 and
saves the result also in doubleword 0.
Before, the second doubleword result was undefined according to the
ISA, but now it's stated that it should be zeroed.

Even tough the result was undefined before, hardware implementing these
instructions already filled these fields with 0s. Changing every ISA
version in QEMU to this behavior makes the results match what happens
in hardware.

Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>

---

This patch is a proposal on a previous RFC I sent on this topic [1].
I preferred this approach because it makes QEMU behavior closer to the
real hardware.

The affected instructions have the following note in the ISA:

"""
Previous versions of the architecture allowed the
contents of doubleword 1 of the result register to be
undefined. However, all processors that support
this instruction write 0s into doubleword 1 of the
result register, as is required by this version of the
architecture.
"""

This patch is not exhaustive as
1. some instructions that had its behavior changed are not 'corrected'
    by this patch (mostly multiply-add instructions);
2. some instructions changed the behavior to also replicate the result
    in the other word from the doubleword if the result is a single
    word e.g. xscvdpuxws. So, this patch only focus on the 'zeroing'
    part, not the replication;

Best regards,

-- Víctor

[1] https://lists.gnu.org/archive/html/qemu-ppc/2021-12/msg00198.html
---
 target/ppc/fpu_helper.c             | 26 +++++++++++++-------------
 target/ppc/translate/vsx-impl.c.inc |  4 +++-
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index e5c29b53b8..bd76bee7f1 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -1696,7 +1696,7 @@ uint32_t helper_efdcmpeq(CPUPPCState *env, uint64_t op1, uint64_t op2)
 void helper_##name(CPUPPCState *env, ppc_vsr_t *xt,                          \
                    ppc_vsr_t *xa, ppc_vsr_t *xb)                             \
 {                                                                            \
-    ppc_vsr_t t = *xt;                                                       \
+    ppc_vsr_t t = { };                                                       \
     int i;                                                                   \
                                                                              \
     helper_reset_fpstatus(env);                                              \
@@ -1772,7 +1772,7 @@ void helper_xsaddqp(CPUPPCState *env, uint32_t opcode,
 void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,                            \
                  ppc_vsr_t *xa, ppc_vsr_t *xb)                               \
 {                                                                            \
-    ppc_vsr_t t = *xt;                                                       \
+    ppc_vsr_t t = { };                                                       \
     int i;                                                                   \
                                                                              \
     helper_reset_fpstatus(env);                                              \
@@ -1843,7 +1843,7 @@ void helper_xsmulqp(CPUPPCState *env, uint32_t opcode,
 void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,                             \
                  ppc_vsr_t *xa, ppc_vsr_t *xb)                                \
 {                                                                             \
-    ppc_vsr_t t = *xt;                                                        \
+    ppc_vsr_t t = { };                                                        \
     int i;                                                                    \
                                                                               \
     helper_reset_fpstatus(env);                                               \
@@ -1919,7 +1919,7 @@ void helper_xsdivqp(CPUPPCState *env, uint32_t opcode,
 #define VSX_RE(op, nels, tp, fld, sfprf, r2sp)                                \
 void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)              \
 {                                                                             \
-    ppc_vsr_t t = *xt;                                                        \
+    ppc_vsr_t t = { };                                                        \
     int i;                                                                    \
                                                                               \
     helper_reset_fpstatus(env);                                               \
@@ -1959,7 +1959,7 @@ VSX_RE(xvresp, 4, float32, VsrW(i), 0, 0)
 #define VSX_SQRT(op, nels, tp, fld, sfprf, r2sp)                             \
 void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)             \
 {                                                                            \
-    ppc_vsr_t t = *xt;                                                       \
+    ppc_vsr_t t = { };                                                       \
     int i;                                                                   \
                                                                              \
     helper_reset_fpstatus(env);                                              \
@@ -2004,7 +2004,7 @@ VSX_SQRT(xvsqrtsp, 4, float32, VsrW(i), 0, 0)
 #define VSX_RSQRTE(op, nels, tp, fld, sfprf, r2sp)                           \
 void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)             \
 {                                                                            \
-    ppc_vsr_t t = *xt;                                                       \
+    ppc_vsr_t t = { };                                                       \
     int i;                                                                   \
                                                                              \
     helper_reset_fpstatus(env);                                              \
@@ -2472,7 +2472,7 @@ void helper_xscmpuqp(CPUPPCState *env, uint32_t opcode, ppc_vsr_t *xa,
 void helper_##name(CPUPPCState *env, ppc_vsr_t *xt,                           \
                    ppc_vsr_t *xa, ppc_vsr_t *xb)                              \
 {                                                                             \
-    ppc_vsr_t t = *xt;                                                        \
+    ppc_vsr_t t = { };                                                        \
     int i;                                                                    \
                                                                               \
     for (i = 0; i < nels; i++) {                                              \
@@ -2498,7 +2498,7 @@ VSX_MAX_MIN(xvminsp, minnum, 4, float32, VsrW(i))
 void helper_##name(CPUPPCState *env,                                          \
                    ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)               \
 {                                                                             \
-    ppc_vsr_t t = *xt;                                                        \
+    ppc_vsr_t t = { };                                                        \
     bool vxsnan_flag = false, vex_flag = false;                               \
                                                                               \
     if (unlikely(float64_is_any_nan(xa->VsrD(0)) ||                           \
@@ -2533,7 +2533,7 @@ VSX_MAX_MINC(xsmincdp, 0);
 void helper_##name(CPUPPCState *env,                                          \
                    ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)               \
 {                                                                             \
-    ppc_vsr_t t = *xt;                                                        \
+    ppc_vsr_t t = { };                                                        \
     bool vxsnan_flag = false, vex_flag = false;                               \
                                                                               \
     if (unlikely(float64_is_any_nan(xa->VsrD(0)))) {                          \
@@ -2654,7 +2654,7 @@ VSX_CMP(xvcmpnesp, 4, float32, VsrW(i), eq, 0, 0)
 #define VSX_CVT_FP_TO_FP(op, nels, stp, ttp, sfld, tfld, sfprf)    \
 void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)   \
 {                                                                  \
-    ppc_vsr_t t = *xt;                                             \
+    ppc_vsr_t t = { };                                             \
     int i;                                                         \
                                                                    \
     for (i = 0; i < nels; i++) {                                   \
@@ -2833,7 +2833,7 @@ uint64_t helper_xscvspdpn(CPUPPCState *env, uint64_t xb)
 void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)             \
 {                                                                            \
     int all_flags = env->fp_status.float_exception_flags, flags;             \
-    ppc_vsr_t t = *xt;                                                       \
+    ppc_vsr_t t = { };                                                       \
     int i;                                                                   \
                                                                              \
     for (i = 0; i < nels; i++) {                                             \
@@ -2917,7 +2917,7 @@ VSX_CVT_FP_TO_INT_VECTOR(xscvqpuwz, float128, uint32, f128, VsrD(0), 0x0ULL)
 #define VSX_CVT_INT_TO_FP(op, nels, stp, ttp, sfld, tfld, sfprf, r2sp)  \
 void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)        \
 {                                                                       \
-    ppc_vsr_t t = *xt;                                                  \
+    ppc_vsr_t t = { };                                                  \
     int i;                                                              \
                                                                         \
     for (i = 0; i < nels; i++) {                                        \
@@ -2990,7 +2990,7 @@ VSX_CVT_INT_TO_FP_VECTOR(xscvudqp, uint64, float128, VsrD(0), f128)
 #define VSX_ROUND(op, nels, tp, fld, rmode, sfprf)                     \
 void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)       \
 {                                                                      \
-    ppc_vsr_t t = *xt;                                                 \
+    ppc_vsr_t t = { };                                                 \
     int i;                                                             \
     FloatRoundMode curr_rounding_mode;                                 \
                                                                        \
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index c636e38164..128968b5e7 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -747,6 +747,7 @@ static void glue(gen_, name)(DisasContext *ctx)                   \
             }                                                     \
         }                                                         \
         set_cpu_vsr(xT(ctx->opcode), xb, true);                   \
+        set_cpu_vsr(xT(ctx->opcode), tcg_constant_i64(0), false); \
         tcg_temp_free_i64(xb);                                    \
         tcg_temp_free_i64(sgm);                                   \
     }
@@ -1073,6 +1074,7 @@ static void gen_##name(DisasContext *ctx)                     \
     get_cpu_vsr(t0, xB(ctx->opcode), true);                   \
     gen_helper_##name(t1, cpu_env, t0);                       \
     set_cpu_vsr(xT(ctx->opcode), t1, true);                   \
+    set_cpu_vsr(xT(ctx->opcode), tcg_constant_i64(0), false); \
     tcg_temp_free_i64(t0);                                    \
     tcg_temp_free_i64(t1);                                    \
 }
@@ -1700,7 +1702,7 @@ static void gen_xsiexpdp(DisasContext *ctx)
     tcg_gen_shli_i64(t0, t0, 52);
     tcg_gen_or_i64(xth, xth, t0);
     set_cpu_vsr(xT(ctx->opcode), xth, true);
-    /* dword[1] is undefined */
+    set_cpu_vsr(xT(ctx->opcode), tcg_constant_i64(0), false);
     tcg_temp_free_i64(t0);
     tcg_temp_free_i64(xth);
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] target/ppc: Change VSX instructions behavior to fill with zeros
  2022-02-04 18:19 [PATCH] target/ppc: Change VSX instructions behavior to fill with zeros Víctor Colombo
@ 2022-02-06  0:36 ` Richard Henderson
  2022-02-09  7:45 ` Cédric Le Goater
  1 sibling, 0 replies; 3+ messages in thread
From: Richard Henderson @ 2022-02-06  0:36 UTC (permalink / raw)
  To: Víctor Colombo, qemu-devel, qemu-ppc
  Cc: danielhb413, matheus.ferst, groug, david, clg

On 2/5/22 05:19, Víctor Colombo wrote:
> ISA v3.1 changed some VSX instructions behavior by changing what the
> other words/doubleword in the result should contain when the result is
> only one word/doubleword. e.g. xsmaxdp operates on doubleword 0 and
> saves the result also in doubleword 0.
> Before, the second doubleword result was undefined according to the
> ISA, but now it's stated that it should be zeroed.
> 
> Even tough the result was undefined before, hardware implementing these
> instructions already filled these fields with 0s. Changing every ISA
> version in QEMU to this behavior makes the results match what happens
> in hardware.
> 
> Signed-off-by: Víctor Colombo<victor.colombo@eldorado.org.br>
> 
> ---
> 
> This patch is a proposal on a previous RFC I sent on this topic [1].
> I preferred this approach because it makes QEMU behavior closer to the
> real hardware.
> 
> The affected instructions have the following note in the ISA:
> 
> """
> Previous versions of the architecture allowed the
> contents of doubleword 1 of the result register to be
> undefined. However, all processors that support
> this instruction write 0s into doubleword 1 of the
> result register, as is required by this version of the
> architecture.
> """
> 
> This patch is not exhaustive as
> 1. some instructions that had its behavior changed are not 'corrected'
>      by this patch (mostly multiply-add instructions);
> 2. some instructions changed the behavior to also replicate the result
>      in the other word from the doubleword if the result is a single
>      word e.g. xscvdpuxws. So, this patch only focus on the 'zeroing'
>      part, not the replication;
> 
> Best regards,
> 
> -- Víctor
> 
> [1]https://lists.gnu.org/archive/html/qemu-ppc/2021-12/msg00198.html
> ---
>   target/ppc/fpu_helper.c             | 26 +++++++++++++-------------
>   target/ppc/translate/vsx-impl.c.inc |  4 +++-
>   2 files changed, 16 insertions(+), 14 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] target/ppc: Change VSX instructions behavior to fill with zeros
  2022-02-04 18:19 [PATCH] target/ppc: Change VSX instructions behavior to fill with zeros Víctor Colombo
  2022-02-06  0:36 ` Richard Henderson
@ 2022-02-09  7:45 ` Cédric Le Goater
  1 sibling, 0 replies; 3+ messages in thread
From: Cédric Le Goater @ 2022-02-09  7:45 UTC (permalink / raw)
  To: Víctor Colombo, qemu-devel, qemu-ppc
  Cc: danielhb413, matheus.ferst, groug, david

On 2/4/22 19:19, Víctor Colombo wrote:
> ISA v3.1 changed some VSX instructions behavior by changing what the
> other words/doubleword in the result should contain when the result is
> only one word/doubleword. e.g. xsmaxdp operates on doubleword 0 and
> saves the result also in doubleword 0.
> Before, the second doubleword result was undefined according to the
> ISA, but now it's stated that it should be zeroed.
> 
> Even tough the result was undefined before, hardware implementing these
> instructions already filled these fields with 0s. Changing every ISA
> version in QEMU to this behavior makes the results match what happens
> in hardware.
> 
> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>


Applied to ppc-7.0.

Thanks,

C.




^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-02-09  9:13 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-02-04 18:19 [PATCH] target/ppc: Change VSX instructions behavior to fill with zeros Víctor Colombo
2022-02-06  0:36 ` Richard Henderson
2022-02-09  7:45 ` Cédric Le Goater

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).