[Qemu-devel] [PATCH 0/4] target-i386: Fix regressions introduced by the switch to softfloat

* [Qemu-devel] [PATCH 0/4] target-i386: Fix regressions introduced by the switch to softfloat
@ 2012-01-07 20:09 Aurelien Jarno
  2012-01-07 20:09 ` [Qemu-devel] [PATCH 1/4] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions Aurelien Jarno
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Aurelien Jarno @ 2012-01-07 20:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Aurelien Jarno

Since commit 347ac8e35661eff1c2b5ec74d11ee152f2a61856, which switched
target-i386 to softfloat, a few SSE instructions no longer work
correctly. This is especially noticeable on linux/x86-64, where SSE is
used by default for floating point computation. For example, GDM from
Debian Lenny is no longer usable: it displays all the graphical elements
in the wrong place.

This patch series is an attempt to fix that, and it's probably a good
idea to apply it to the stable branch.

Aurelien Jarno (4):
  target-i386: fix {min,max}{pd,ps,sd,ss} SSE2 instructions
  target-i386: fix round{pd,ps,sd,ss} SSE2 instructions
  target-i386: fix dpps and dppd SSE2 instructions
  target-i386: fix SSE rounding and flush to zero

 target-i386/TODO        |    1 -
 target-i386/helper.h    |    1 +
 target-i386/op_helper.c |   64 +++++++++++++++++++++++++++++++++++++++-------
 target-i386/ops_sse.h   |   48 +++++++++++++++++-----------------
 target-i386/translate.c |    2 +-
 5 files changed, 80 insertions(+), 36 deletions(-)

-- 
1.7.7.3


* [Qemu-devel] [PATCH 1/4] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions
  2012-01-07 20:09 [Qemu-devel] [PATCH 0/4] target-i386: Fix regressions introduced by the switch to softfloat Aurelien Jarno
@ 2012-01-07 20:09 ` Aurelien Jarno
  2012-01-07 20:22   ` Peter Maydell
  2012-01-07 20:09 ` [Qemu-devel] [PATCH 2/4] target-i386: fix round{pd, " Aurelien Jarno
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Aurelien Jarno @ 2012-01-07 20:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Aurelien Jarno

The minpd, minps, minsd, minss and maxpd, maxps, maxsd, maxss SSE2
instructions have been broken since target-i386 switched to softfloat.
It's no longer possible to use C comparison operators on float types
with softfloat, so use the floatXX_min and floatXX_max functions instead.

As a bonus it implements the correct NaN behaviour, so let's remove
this from the TODO.

It fixes GDM screen display on Debian Lenny.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/TODO      |    1 -
 target-i386/ops_sse.h |    4 ++--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/target-i386/TODO b/target-i386/TODO
index c8ada07..a8d69cf 100644
--- a/target-i386/TODO
+++ b/target-i386/TODO
@@ -15,7 +15,6 @@ Correctness issues:
 - DRx register support
 - CR0.AC emulation
 - SSE alignment checks
-- fix SSE min/max with nans
 
 Optimizations/Features:
 
diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index 47dde78..a743c85 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -584,8 +584,8 @@ void helper_ ## name ## sd (Reg *d, Reg *s)\
 #define FPU_SUB(size, a, b) float ## size ## _sub(a, b, &env->sse_status)
 #define FPU_MUL(size, a, b) float ## size ## _mul(a, b, &env->sse_status)
 #define FPU_DIV(size, a, b) float ## size ## _div(a, b, &env->sse_status)
-#define FPU_MIN(size, a, b) (a) < (b) ? (a) : (b)
-#define FPU_MAX(size, a, b) (a) > (b) ? (a) : (b)
+#define FPU_MIN(size, a, b) float ## size ## _min(a, b, &env->sse_status)
+#define FPU_MAX(size, a, b) float ## size ## _max(a, b, &env->sse_status)
 #define FPU_SQRT(size, a, b) float ## size ## _sqrt(b, &env->sse_status)
 
 SSE_HELPER_S(add, FPU_ADD)
-- 
1.7.7.3


* [Qemu-devel] [PATCH 2/4] target-i386: fix round{pd, ps, sd, ss} SSE2 instructions
  2012-01-07 20:09 [Qemu-devel] [PATCH 0/4] target-i386: Fix regressions introduced by the switch to softfloat Aurelien Jarno
  2012-01-07 20:09 ` [Qemu-devel] [PATCH 1/4] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions Aurelien Jarno
@ 2012-01-07 20:09 ` Aurelien Jarno
  2012-01-07 20:09 ` [Qemu-devel] [PATCH 3/4] target-i386: fix dpps and dppd " Aurelien Jarno
  2012-01-07 20:09 ` [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero Aurelien Jarno
  3 siblings, 0 replies; 11+ messages in thread
From: Aurelien Jarno @ 2012-01-07 20:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Aurelien Jarno

The roundps and roundss SSE2 instructions have been broken since
target-i386 switched to softfloat. They use float64_round_to_int to round
a float32, and while the implicit conversion from float32 to float64 was
correct for softfloat-native, it is not for pure softfloat. Fix that by
using the correct registers and the correct functions.

Also fix the roundpd and roundsd implementations at the same time, even
though these functions already behave correctly.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/ops_sse.h |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index a743c85..a185bfb 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -1648,10 +1648,10 @@ void glue(helper_roundps, SUFFIX) (Reg *d, Reg *s, uint32_t mode)
             break;
         }
 
-    d->L(0) = float64_round_to_int(s->L(0), &env->sse_status);
-    d->L(1) = float64_round_to_int(s->L(1), &env->sse_status);
-    d->L(2) = float64_round_to_int(s->L(2), &env->sse_status);
-    d->L(3) = float64_round_to_int(s->L(3), &env->sse_status);
+    d->XMM_S(0) = float32_round_to_int(s->XMM_S(0), &env->sse_status);
+    d->XMM_S(1) = float32_round_to_int(s->XMM_S(1), &env->sse_status);
+    d->XMM_S(2) = float32_round_to_int(s->XMM_S(2), &env->sse_status);
+    d->XMM_S(3) = float32_round_to_int(s->XMM_S(3), &env->sse_status);
 
 #if 0 /* TODO */
     if (mode & (1 << 3))
@@ -1684,8 +1684,8 @@ void glue(helper_roundpd, SUFFIX) (Reg *d, Reg *s, uint32_t mode)
             break;
         }
 
-    d->Q(0) = float64_round_to_int(s->Q(0), &env->sse_status);
-    d->Q(1) = float64_round_to_int(s->Q(1), &env->sse_status);
+    d->XMM_D(0) = float64_round_to_int(s->XMM_D(0), &env->sse_status);
+    d->XMM_D(1) = float64_round_to_int(s->XMM_D(1), &env->sse_status);
 
 #if 0 /* TODO */
     if (mode & (1 << 3))
@@ -1718,7 +1718,7 @@ void glue(helper_roundss, SUFFIX) (Reg *d, Reg *s, uint32_t mode)
             break;
         }
 
-    d->L(0) = float64_round_to_int(s->L(0), &env->sse_status);
+    d->XMM_S(0) = float32_round_to_int(s->XMM_S(0), &env->sse_status);
 
 #if 0 /* TODO */
     if (mode & (1 << 3))
@@ -1751,7 +1751,7 @@ void glue(helper_roundsd, SUFFIX) (Reg *d, Reg *s, uint32_t mode)
             break;
         }
 
-    d->Q(0) = float64_round_to_int(s->Q(0), &env->sse_status);
+    d->XMM_D(0) = float64_round_to_int(s->XMM_D(0), &env->sse_status);
 
 #if 0 /* TODO */
     if (mode & (1 << 3))
-- 
1.7.7.3
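
The failure mode described in this patch can be sketched outside QEMU.
This is only an illustration, assuming that pure softfloat represents
float32 as a plain 32-bit integer holding the IEEE bit pattern, and that
the host float and double are IEEE binary32 and binary64. The helper
names below are made up for this example and are not QEMU functions.
Handing such a value to a float64 routine zero-extends the bits instead
of converting the number:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the IEEE-754 bit pattern of a float. */
static uint32_t f32_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Hypothetical helper: reinterpret a 64-bit pattern as a double. */
static double f64_from_bits(uint64_t u)
{
    double d;
    memcpy(&d, &u, sizeof(d));
    return d;
}

/* What a float64 routine effectively sees when it is handed the
 * 32-bit pattern of a float32: the bits are zero-extended, so the
 * value is not converted at all. */
static double misread_f32_as_f64(float f)
{
    return f64_from_bits((uint64_t)f32_bits(f));
}
```

For 1.5f the bit pattern is 0x3fc00000. Read as a float64, that pattern
is a tiny positive denormal, so rounding it to an integer cannot give
the expected result.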


* [Qemu-devel] [PATCH 3/4] target-i386: fix dpps and dppd SSE2 instructions
  2012-01-07 20:09 [Qemu-devel] [PATCH 0/4] target-i386: Fix regressions introduced by the switch to softfloat Aurelien Jarno
  2012-01-07 20:09 ` [Qemu-devel] [PATCH 1/4] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions Aurelien Jarno
  2012-01-07 20:09 ` [Qemu-devel] [PATCH 2/4] target-i386: fix round{pd, " Aurelien Jarno
@ 2012-01-07 20:09 ` Aurelien Jarno
  2012-01-07 20:09 ` [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero Aurelien Jarno
  3 siblings, 0 replies; 11+ messages in thread
From: Aurelien Jarno @ 2012-01-07 20:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Aurelien Jarno

The helpers implementing the dpps and dppd SSE instructions do not pass
the correct argument types to the softfloat functions. Although they
happen to behave correctly anyway, this patch fixes the types.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/ops_sse.h |   28 ++++++++++++++--------------
 1 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index a185bfb..adfe822 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -1770,44 +1770,44 @@ SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP)
 
 void glue(helper_dpps, SUFFIX) (Reg *d, Reg *s, uint32_t mask)
 {
-    float32 iresult = 0 /*float32_zero*/;
+    float32 iresult = float32_zero;
 
     if (mask & (1 << 4))
         iresult = float32_add(iresult,
-                        float32_mul(d->L(0), s->L(0), &env->sse_status),
+                        float32_mul(d->XMM_S(0), s->XMM_S(0), &env->sse_status),
                         &env->sse_status);
     if (mask & (1 << 5))
         iresult = float32_add(iresult,
-                        float32_mul(d->L(1), s->L(1), &env->sse_status),
+                        float32_mul(d->XMM_S(1), s->XMM_S(1), &env->sse_status),
                         &env->sse_status);
     if (mask & (1 << 6))
         iresult = float32_add(iresult,
-                        float32_mul(d->L(2), s->L(2), &env->sse_status),
+                        float32_mul(d->XMM_S(2), s->XMM_S(2), &env->sse_status),
                         &env->sse_status);
     if (mask & (1 << 7))
         iresult = float32_add(iresult,
-                        float32_mul(d->L(3), s->L(3), &env->sse_status),
+                        float32_mul(d->XMM_S(3), s->XMM_S(3), &env->sse_status),
                         &env->sse_status);
-    d->L(0) = (mask & (1 << 0)) ? iresult : 0 /*float32_zero*/;
-    d->L(1) = (mask & (1 << 1)) ? iresult : 0 /*float32_zero*/;
-    d->L(2) = (mask & (1 << 2)) ? iresult : 0 /*float32_zero*/;
-    d->L(3) = (mask & (1 << 3)) ? iresult : 0 /*float32_zero*/;
+    d->XMM_S(0) = (mask & (1 << 0)) ? iresult : float32_zero;
+    d->XMM_S(1) = (mask & (1 << 1)) ? iresult : float32_zero;
+    d->XMM_S(2) = (mask & (1 << 2)) ? iresult : float32_zero;
+    d->XMM_S(3) = (mask & (1 << 3)) ? iresult : float32_zero;
 }
 
 void glue(helper_dppd, SUFFIX) (Reg *d, Reg *s, uint32_t mask)
 {
-    float64 iresult = 0 /*float64_zero*/;
+    float64 iresult = float64_zero;
 
     if (mask & (1 << 4))
         iresult = float64_add(iresult,
-                        float64_mul(d->Q(0), s->Q(0), &env->sse_status),
+                        float64_mul(d->XMM_D(0), s->XMM_D(0), &env->sse_status),
                         &env->sse_status);
     if (mask & (1 << 5))
         iresult = float64_add(iresult,
-                        float64_mul(d->Q(1), s->Q(1), &env->sse_status),
+                        float64_mul(d->XMM_D(1), s->XMM_D(1), &env->sse_status),
                         &env->sse_status);
-    d->Q(0) = (mask & (1 << 0)) ? iresult : 0 /*float64_zero*/;
-    d->Q(1) = (mask & (1 << 1)) ? iresult : 0 /*float64_zero*/;
+    d->XMM_D(0) = (mask & (1 << 0)) ? iresult : float64_zero;
+    d->XMM_D(1) = (mask & (1 << 1)) ? iresult : float64_zero;
 }
 
 void glue(helper_mpsadbw, SUFFIX) (Reg *d, Reg *s, uint32_t offset)
-- 
1.7.7.3
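
For readers unfamiliar with the mask handling in helper_dpps, here is a
rough reference model on native floats. It is a sketch only: dpps_ref
is a hypothetical name, and it ignores sse_status (rounding mode,
exception flags), which the real helper honours through softfloat. The
high nibble of the mask selects which products are summed, and the low
nibble selects which destination lanes receive the sum:

```c
#include <stdint.h>

/* Hypothetical reference model of the dpps helper on native floats. */
static void dpps_ref(float d[4], const float s[4], uint8_t mask)
{
    float sum = 0.0f;
    int i;

    /* Bits 4..7 select which lane products take part in the sum. */
    for (i = 0; i < 4; i++) {
        if (mask & (1u << (4 + i))) {
            sum += d[i] * s[i];
        }
    }

    /* Bits 0..3 select which destination lanes get the sum;
     * the other lanes are cleared to zero. */
    for (i = 0; i < 4; i++) {
        d[i] = (mask & (1u << i)) ? sum : 0.0f;
    }
}
```

With d = {1, 2, 3, 4}, s = {5, 6, 7, 8} and mask 0xf1, all four products
are summed and only lane 0 receives the result.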


* [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero
  2012-01-07 20:09 [Qemu-devel] [PATCH 0/4] target-i386: Fix regressions introduced by the switch to softfloat Aurelien Jarno
                   ` (2 preceding siblings ...)
  2012-01-07 20:09 ` [Qemu-devel] [PATCH 3/4] target-i386: fix dpps and dppd " Aurelien Jarno
@ 2012-01-07 20:09 ` Aurelien Jarno
  2012-01-12  5:37   ` Dong Xu Wang
  3 siblings, 1 reply; 11+ messages in thread
From: Aurelien Jarno @ 2012-01-07 20:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-stable, Aurelien Jarno

SSE rounding and flush to zero control have never been implemented.
However, given that softfloat-native was using a single state for FPU
and SSE, and given that glibc sets both the FPU and SSE state in
fesetround(), this was working correctly up to the switch to softfloat.

Fix that by adding an update_sse_status() function similar to
update_fp_status(), and calling it on writes to mxcsr.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/helper.h    |    1 +
 target-i386/op_helper.c |   64 +++++++++++++++++++++++++++++++++++++++-------
 target-i386/translate.c |    2 +-
 3 files changed, 56 insertions(+), 11 deletions(-)

diff --git a/target-i386/helper.h b/target-i386/helper.h
index 6b518ad..761954e 100644
--- a/target-i386/helper.h
+++ b/target-i386/helper.h
@@ -197,6 +197,7 @@ DEF_HELPER_2(lzcnt, tl, tl, int)
 
 /* MMX/SSE */
 
+DEF_HELPER_1(ldmxcsr, void, i32)
 DEF_HELPER_0(enter_mmx, void)
 DEF_HELPER_0(emms, void)
 DEF_HELPER_2(movq, void, ptr, ptr)
diff --git a/target-i386/op_helper.c b/target-i386/op_helper.c
index c89e4a4..2aea71b 100644
--- a/target-i386/op_helper.c
+++ b/target-i386/op_helper.c
@@ -52,11 +52,11 @@ static inline target_long lshift(target_long x, int n)
     }
 }
 
-#define RC_MASK         0xc00
-#define RC_NEAR         0x000
-#define RC_DOWN         0x400
-#define RC_UP           0x800
-#define RC_CHOP         0xc00
+#define FPU_RC_MASK         0xc00
+#define FPU_RC_NEAR         0x000
+#define FPU_RC_DOWN         0x400
+#define FPU_RC_UP           0x800
+#define FPU_RC_CHOP         0xc00
 
 #define MAXTAN 9223372036854775808.0
 
@@ -4024,18 +4024,18 @@ static void update_fp_status(void)
     int rnd_type;
 
     /* set rounding mode */
-    switch(env->fpuc & RC_MASK) {
+    switch(env->fpuc & FPU_RC_MASK) {
     default:
-    case RC_NEAR:
+    case FPU_RC_NEAR:
         rnd_type = float_round_nearest_even;
         break;
-    case RC_DOWN:
+    case FPU_RC_DOWN:
         rnd_type = float_round_down;
         break;
-    case RC_UP:
+    case FPU_RC_UP:
         rnd_type = float_round_up;
         break;
-    case RC_CHOP:
+    case FPU_RC_CHOP:
         rnd_type = float_round_to_zero;
         break;
     }
@@ -5629,6 +5629,50 @@ void helper_vmexit(uint32_t exit_code, uint64_t exit_info_1)
 
 /* MMX/SSE */
 /* XXX: optimize by storing fptt and fptags in the static cpu state */
+
+#define SSE_DAZ             0x0040
+#define SSE_RC_MASK         0x6000
+#define SSE_RC_NEAR         0x0000
+#define SSE_RC_DOWN         0x2000
+#define SSE_RC_UP           0x4000
+#define SSE_RC_CHOP         0x6000
+#define SSE_FZ              0x8000
+
+static void update_sse_status(void)
+{
+    int rnd_type;
+
+    /* set rounding mode */
+    switch(env->mxcsr & SSE_RC_MASK) {
+    default:
+    case SSE_RC_NEAR:
+        rnd_type = float_round_nearest_even;
+        break;
+    case SSE_RC_DOWN:
+        rnd_type = float_round_down;
+        break;
+    case SSE_RC_UP:
+        rnd_type = float_round_up;
+        break;
+    case SSE_RC_CHOP:
+        rnd_type = float_round_to_zero;
+        break;
+    }
+    set_float_rounding_mode(rnd_type, &env->sse_status);
+
+    /* set denormals are zero */
+    set_flush_inputs_to_zero((env->mxcsr & SSE_DAZ) ? 1 : 0, &env->sse_status);
+
+    /* set flush to zero */
+    set_flush_to_zero((env->mxcsr & SSE_FZ) ? 1 : 0, &env->sse_status);
+}
+
+void helper_ldmxcsr(uint32_t val)
+{
+    env->mxcsr = val;
+    update_sse_status();
+}
+
 void helper_enter_mmx(void)
 {
     env->fpstt = 0;
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 8321bf3..b9839c5 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7544,7 +7544,7 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
             gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
             if (op == 2) {
                 gen_op_ld_T0_A0(OT_LONG + s->mem_index);
-                tcg_gen_st32_tl(cpu_T[0], cpu_env, offsetof(CPUX86State, mxcsr));
+                gen_helper_ldmxcsr(cpu_T[0]);
             } else {
                 tcg_gen_ld32u_tl(cpu_T[0], cpu_env, offsetof(CPUX86State, mxcsr));
                 gen_op_st_T0_A0(OT_LONG + s->mem_index);
-- 
1.7.7.3
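
The MXCSR decoding done by update_sse_status() can be tried in
isolation with the sketch below. The bit values are the ones defined in
the patch; the enum, the struct and decode_mxcsr() are hypothetical
names introduced for this example and do not exist in QEMU:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in the patch. */
#define SSE_DAZ      0x0040
#define SSE_RC_MASK  0x6000
#define SSE_RC_NEAR  0x0000
#define SSE_RC_DOWN  0x2000
#define SSE_RC_UP    0x4000
#define SSE_RC_CHOP  0x6000
#define SSE_FZ       0x8000

/* Hypothetical stand-ins for the softfloat rounding modes. */
enum sse_round {
    SSE_ROUND_NEAREST,
    SSE_ROUND_DOWN,
    SSE_ROUND_UP,
    SSE_ROUND_ZERO
};

/* Hypothetical container for the decoded control bits. */
struct sse_ctl {
    enum sse_round round;
    bool daz;   /* denormals are zero (flush inputs) */
    bool fz;    /* flush to zero (flush results) */
};

static struct sse_ctl decode_mxcsr(uint32_t mxcsr)
{
    struct sse_ctl c;

    switch (mxcsr & SSE_RC_MASK) {
    default:
    case SSE_RC_NEAR:
        c.round = SSE_ROUND_NEAREST;
        break;
    case SSE_RC_DOWN:
        c.round = SSE_ROUND_DOWN;
        break;
    case SSE_RC_UP:
        c.round = SSE_ROUND_UP;
        break;
    case SSE_RC_CHOP:
        c.round = SSE_ROUND_ZERO;
        break;
    }
    c.daz = (mxcsr & SSE_DAZ) != 0;
    c.fz = (mxcsr & SSE_FZ) != 0;
    return c;
}
```

The power-on MXCSR value 0x1f80 decodes to round-to-nearest with both
DAZ and FZ clear.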


* Re: [Qemu-devel] [PATCH 1/4] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions
  2012-01-07 20:09 ` [Qemu-devel] [PATCH 1/4] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions Aurelien Jarno
@ 2012-01-07 20:22   ` Peter Maydell
  2012-01-07 21:24     ` [Qemu-devel] [PATCH 1/4 v2] " Aurelien Jarno
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Maydell @ 2012-01-07 20:22 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, qemu-stable

On 7 January 2012 20:09, Aurelien Jarno <aurelien@aurel32.net> wrote:
> minpd, minps, minsd, minss and maxpd, maxps, maxsd, maxss SSE2
> instructions have been broken when switching target-i386 to softfloat.
> It's not possible to use comparison instructions on float types anymore
> to softfloat, so use the floatXX_min anf floatXX_max functions instead.

Nope, this gets the x86 special cases wrong. This has been discussed
here before:

http://www.mail-archive.com/qemu-devel@nongnu.org/msg85557.html
has the right implementation (from Jason Wessel) and a comment
(from me) about why it's right.

-- PMM


* [Qemu-devel] [PATCH 1/4 v2] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions
  2012-01-07 20:22   ` Peter Maydell
@ 2012-01-07 21:24     ` Aurelien Jarno
  0 siblings, 0 replies; 11+ messages in thread
From: Aurelien Jarno @ 2012-01-07 21:24 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-stable

On Sat, Jan 07, 2012 at 08:22:53PM +0000, Peter Maydell wrote:
> On 7 January 2012 20:09, Aurelien Jarno <aurelien@aurel32.net> wrote:
> > minpd, minps, minsd, minss and maxpd, maxps, maxsd, maxss SSE2
> > instructions have been broken when switching target-i386 to softfloat.
> > It's not possible to use comparison instructions on float types anymore
> > to softfloat, so use the floatXX_min anf floatXX_max functions instead.
> 
> Nope, this gets the x86 special cases wrong. This has been discussed
> here before:
> 
> http://www.mail-archive.com/qemu-devel@nongnu.org/msg85557.html
> has the right implementation (from Jason Wessell) and a comment
> (from me) about why it's right.
> 

Good catch, the patch below should implement the correct behaviour.

target-i386: fix {min,max}{pd,ps,sd,ss} SSE2 instructions

The minpd, minps, minsd, minss and maxpd, maxps, maxsd, maxss SSE2
instructions have been broken since target-i386 switched to softfloat.
It's no longer possible to use C comparison operators on float types
with softfloat, so use the floatXX_lt function instead, as the
floatXX_min and floatXX_max functions can't be used due to the
Intel-specific behaviour.

As it implements the correct NaN behaviour, let's remove the
corresponding entry from the TODO.

It fixes GDM screen display on Debian Lenny.

Thanks to Peter Maydell and Jason Wessel for their analysis of the
problem.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/TODO      |    1 -
 target-i386/ops_sse.h |    9 +++++++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/target-i386/TODO b/target-i386/TODO
index c8ada07..a8d69cf 100644
--- a/target-i386/TODO
+++ b/target-i386/TODO
@@ -15,7 +15,6 @@ Correctness issues:
 - DRx register support
 - CR0.AC emulation
 - SSE alignment checks
-- fix SSE min/max with nans
 
 Optimizations/Features:
 
diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index 47dde78..8ed231d 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -584,10 +584,15 @@ void helper_ ## name ## sd (Reg *d, Reg *s)\
 #define FPU_SUB(size, a, b) float ## size ## _sub(a, b, &env->sse_status)
 #define FPU_MUL(size, a, b) float ## size ## _mul(a, b, &env->sse_status)
 #define FPU_DIV(size, a, b) float ## size ## _div(a, b, &env->sse_status)
-#define FPU_MIN(size, a, b) (a) < (b) ? (a) : (b)
-#define FPU_MAX(size, a, b) (a) > (b) ? (a) : (b)
 #define FPU_SQRT(size, a, b) float ## size ## _sqrt(b, &env->sse_status)
 
+/* Note that the choice of comparison op here is important to get the
+ * special cases right: for min and max Intel specifies that (-0,0),
+ * (NaN, anything) and (anything, NaN) return the second argument.
+ */
+#define FPU_MIN(size, a, b) float ## size ## _lt(a, b, &env->sse_status) ? (a) : (b)
+#define FPU_MAX(size, a, b) float ## size ## _lt(b, a, &env->sse_status) ? (a) : (b)
+
 SSE_HELPER_S(add, FPU_ADD)
 SSE_HELPER_S(sub, FPU_SUB)
 SSE_HELPER_S(mul, FPU_MUL)
-- 
1.7.7.3


-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net
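
The special cases named in the code comment of the v2 patch can be
checked with native C floats, whose "<" operator is also false whenever
either operand is a NaN and when comparing -0 with +0. The sketch below
is an analogy only: it uses host float arithmetic instead of
floatXX_lt, and x86_min/x86_max are hypothetical names, not QEMU
helpers:

```c
#include <math.h>

/* Same shape as the FPU_MIN macro in the v2 patch: return the first
 * operand only if it is strictly less than the second, otherwise
 * return the second operand. */
static float x86_min(float a, float b)
{
    return a < b ? a : b;
}

/* Same shape as the FPU_MAX macro in the v2 patch. */
static float x86_max(float a, float b)
{
    return b < a ? a : b;
}
```

Because the comparison is false for (NaN, x), (x, NaN) and (-0, +0),
the second operand is returned in all three cases, which matches the
behaviour the comment attributes to Intel.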


* Re: [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero
  2012-01-07 20:09 ` [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero Aurelien Jarno
@ 2012-01-12  5:37   ` Dong Xu Wang
  2012-01-13  9:40     ` Markus Armbruster
  2012-01-13 16:07     ` [Qemu-devel] " Aurelien Jarno
  0 siblings, 2 replies; 11+ messages in thread
From: Dong Xu Wang @ 2012-01-12  5:37 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, qemu-stable

After applying this patch, compiling on my laptop fails with the
following error:

 ./configure --enable-kvm --target-list=x86_64-softmmu && make
CC    x86_64-softmmu/translate.o
/qemu/target-i386/translate.c: In function ‘disas_insn’:
/qemu/target-i386/translate.c:7547:17: error: incompatible type for
argument 1 of ‘gen_helper_ldmxcsr’
/qemu/target-i386/helper.h:200:1: note: expected ‘TCGv_i32’ but
argument is of type ‘TCGv_i64’
make[1]: *** [translate.o] Error 1
make: *** [subdir-x86_64-softmmu] Error 2


On Sun, Jan 8, 2012 at 04:09, Aurelien Jarno <aurelien@aurel32.net> wrote:
> SSE rounding and flush to zero control has never been implemented. However
> given that softfloat-native was using a single state for FPU and SSE and
> given that glibc is setting both FPU and SSE state in fesetround(), this
> was working correctly up to the switch to softfloat.
>
> Fix that by adding an update_sse_status() function similar to
> update_fpu_status(), and callin git on write to mxcsr.
>
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-i386/helper.h    |    1 +
>  target-i386/op_helper.c |   64 +++++++++++++++++++++++++++++++++++++++-------
>  target-i386/translate.c |    2 +-
>  3 files changed, 56 insertions(+), 11 deletions(-)
>
> diff --git a/target-i386/helper.h b/target-i386/helper.h
> index 6b518ad..761954e 100644
> --- a/target-i386/helper.h
> +++ b/target-i386/helper.h
> @@ -197,6 +197,7 @@ DEF_HELPER_2(lzcnt, tl, tl, int)
>
>  /* MMX/SSE */
>
> +DEF_HELPER_1(ldmxcsr, void, i32)
>  DEF_HELPER_0(enter_mmx, void)
>  DEF_HELPER_0(emms, void)
>  DEF_HELPER_2(movq, void, ptr, ptr)
> diff --git a/target-i386/op_helper.c b/target-i386/op_helper.c
> index c89e4a4..2aea71b 100644
> --- a/target-i386/op_helper.c
> +++ b/target-i386/op_helper.c
> @@ -52,11 +52,11 @@ static inline target_long lshift(target_long x, int n)
>     }
>  }
>
> -#define RC_MASK         0xc00
> -#define RC_NEAR         0x000
> -#define RC_DOWN         0x400
> -#define RC_UP           0x800
> -#define RC_CHOP         0xc00
> +#define FPU_RC_MASK         0xc00
> +#define FPU_RC_NEAR         0x000
> +#define FPU_RC_DOWN         0x400
> +#define FPU_RC_UP           0x800
> +#define FPU_RC_CHOP         0xc00
>
>  #define MAXTAN 9223372036854775808.0
>
> @@ -4024,18 +4024,18 @@ static void update_fp_status(void)
>     int rnd_type;
>
>     /* set rounding mode */
> -    switch(env->fpuc & RC_MASK) {
> +    switch(env->fpuc & FPU_RC_MASK) {
>     default:
> -    case RC_NEAR:
> +    case FPU_RC_NEAR:
>         rnd_type = float_round_nearest_even;
>         break;
> -    case RC_DOWN:
> +    case FPU_RC_DOWN:
>         rnd_type = float_round_down;
>         break;
> -    case RC_UP:
> +    case FPU_RC_UP:
>         rnd_type = float_round_up;
>         break;
> -    case RC_CHOP:
> +    case FPU_RC_CHOP:
>         rnd_type = float_round_to_zero;
>         break;
>     }
> @@ -5629,6 +5629,50 @@ void helper_vmexit(uint32_t exit_code, uint64_t exit_info_1)
>
>  /* MMX/SSE */
>  /* XXX: optimize by storing fptt and fptags in the static cpu state */
> +
> +#define SSE_DAZ             0x0040
> +#define SSE_RC_MASK         0x6000
> +#define SSE_RC_NEAR         0x0000
> +#define SSE_RC_DOWN         0x2000
> +#define SSE_RC_UP           0x4000
> +#define SSE_RC_CHOP         0x6000
> +#define SSE_FZ              0x8000
> +
> +static void update_sse_status(void)
> +{
> +    int rnd_type;
> +
> +    /* set rounding mode */
> +    switch(env->mxcsr & SSE_RC_MASK) {
> +    default:
> +    case SSE_RC_NEAR:
> +        rnd_type = float_round_nearest_even;
> +        break;
> +    case SSE_RC_DOWN:
> +        rnd_type = float_round_down;
> +        break;
> +    case SSE_RC_UP:
> +        rnd_type = float_round_up;
> +        break;
> +    case SSE_RC_CHOP:
> +        rnd_type = float_round_to_zero;
> +        break;
> +    }
> +    set_float_rounding_mode(rnd_type, &env->sse_status);
> +
> +    /* set denormals are zero */
> +    set_flush_inputs_to_zero((env->mxcsr & SSE_DAZ) ? 1 : 0, &env->sse_status);
> +
> +    /* set flush to zero */
> +    set_flush_to_zero((env->mxcsr & SSE_FZ) ? 1 : 0, &env->fp_status);
> +}
> +
> +void helper_ldmxcsr(uint32_t val)
> +{
> +    env->mxcsr = val;
> +    update_sse_status();
> +}
> +
>  void helper_enter_mmx(void)
>  {
>     env->fpstt = 0;
> diff --git a/target-i386/translate.c b/target-i386/translate.c
> index 8321bf3..b9839c5 100644
> --- a/target-i386/translate.c
> +++ b/target-i386/translate.c
> @@ -7544,7 +7544,7 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
>             gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
>             if (op == 2) {
>                 gen_op_ld_T0_A0(OT_LONG + s->mem_index);
> -                tcg_gen_st32_tl(cpu_T[0], cpu_env, offsetof(CPUX86State, mxcsr));
> +                gen_helper_ldmxcsr(cpu_T[0]);
>             } else {
>                 tcg_gen_ld32u_tl(cpu_T[0], cpu_env, offsetof(CPUX86State, mxcsr));
>                 gen_op_st_T0_A0(OT_LONG + s->mem_index);
> --
> 1.7.7.3
>
>


* Re: [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero
  2012-01-12  5:37   ` Dong Xu Wang
@ 2012-01-13  9:40     ` Markus Armbruster
  2012-01-13 15:14       ` [Qemu-devel] [Qemu-stable] " Justin M. Forbes
  2012-01-13 16:07     ` [Qemu-devel] " Aurelien Jarno
  1 sibling, 1 reply; 11+ messages in thread
From: Markus Armbruster @ 2012-01-13  9:40 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: qemu-devel, Aurelien Jarno, qemu-stable

Dong Xu Wang <wdongxu@linux.vnet.ibm.com> writes:

> After applied this patch, while I was compiling on my lap, there will
> be an error:
>
>  ./configure --enable-kvm --target-list=x86_64-softmmu && make
> CC    x86_64-softmmu/translate.o
> /qemu/target-i386/translate.c: In function ‘disas_insn’:
> /qemu/target-i386/translate.c:7547:17: error: incompatible type for
> argument 1 of ‘gen_helper_ldmxcsr’
> /qemu/target-i386/helper.h:200:1: note: expected ‘TCGv_i32’ but
> argument is of type ‘TCGv_i64’
> make[1]: *** [translate.o] Error 1
> make: *** [subdir-x86_64-softmmu] Error 2

I see this, too.


* Re: [Qemu-devel] [Qemu-stable] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero
  2012-01-13  9:40     ` Markus Armbruster
@ 2012-01-13 15:14       ` Justin M. Forbes
  0 siblings, 0 replies; 11+ messages in thread
From: Justin M. Forbes @ 2012-01-13 15:14 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Dong Xu Wang, qemu-devel, qemu-stable

On Fri, 2012-01-13 at 10:40 +0100, Markus Armbruster wrote:
> Dong Xu Wang <wdongxu@linux.vnet.ibm.com> writes:
> 
> > After applied this patch, while I was compiling on my lap, there will
> > be an error:
> >
> >  ./configure --enable-kvm --target-list=x86_64-softmmu && make
> > CC    x86_64-softmmu/translate.o
> > /qemu/target-i386/translate.c: In function ‘disas_insn’:
> > /qemu/target-i386/translate.c:7547:17: error: incompatible type for
> > argument 1 of ‘gen_helper_ldmxcsr’
> > /qemu/target-i386/helper.h:200:1: note: expected ‘TCGv_i32’ but
> > argument is of type ‘TCGv_i64’
> > make[1]: *** [translate.o] Error 1
> > make: *** [subdir-x86_64-softmmu] Error 2
> 
> I see this, too.


I will take a look. I am not seeing it right now with the full stable
tree, though I am building a much larger config.

Justin


* Re: [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero
  2012-01-12  5:37   ` Dong Xu Wang
  2012-01-13  9:40     ` Markus Armbruster
@ 2012-01-13 16:07     ` Aurelien Jarno
  1 sibling, 0 replies; 11+ messages in thread
From: Aurelien Jarno @ 2012-01-13 16:07 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: qemu-devel, qemu-stable

On Thu, Jan 12, 2012 at 01:37:59PM +0800, Dong Xu Wang wrote:
> After applied this patch, while I was compiling on my lap, there will
> be an error:
> 
>  ./configure --enable-kvm --target-list=x86_64-softmmu && make
> CC    x86_64-softmmu/translate.o
> /qemu/target-i386/translate.c: In function ‘disas_insn’:
> /qemu/target-i386/translate.c:7547:17: error: incompatible type for
> argument 1 of ‘gen_helper_ldmxcsr’
> /qemu/target-i386/helper.h:200:1: note: expected ‘TCGv_i32’ but
> argument is of type ‘TCGv_i64’
> make[1]: *** [translate.o] Error 1
> make: *** [subdir-x86_64-softmmu] Error 2

Sorry about that. I have pushed the following patch, which solves the
problem.


target-i386: fix compilation with --enable-debug-tcg

Commit 2355c16e74ffa4d14e7fc2b4a23b055565ac0221 introduced a new ldmxcsr
helper taking an i32 argument, but the helper is actually passed a long.
Fix that by truncating the long to i32.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/translate.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/target-i386/translate.c b/target-i386/translate.c
index b9839c5..860b4a3 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7544,7 +7544,8 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
             gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
             if (op == 2) {
                 gen_op_ld_T0_A0(OT_LONG + s->mem_index);
-                gen_helper_ldmxcsr(cpu_T[0]);
+                tcg_gen_trunc_tl_i32(cpu_tmp2_i32, cpu_T[0]);
+                gen_helper_ldmxcsr(cpu_tmp2_i32);
             } else {
                 tcg_gen_ld32u_tl(cpu_T[0], cpu_env, offsetof(CPUX86State, mxcsr));
                 gen_op_st_T0_A0(OT_LONG + s->mem_index);


-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net
