* [Qemu-devel] [PATCH 1/4] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions
2012-01-07 20:09 [Qemu-devel] [PATCH 0/4] target-i386: Fix regressions introduced by the switch to softfloat Aurelien Jarno
@ 2012-01-07 20:09 ` Aurelien Jarno
2012-01-07 20:22 ` Peter Maydell
2012-01-07 20:09 ` [Qemu-devel] [PATCH 2/4] target-i386: fix round{pd, " Aurelien Jarno
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Aurelien Jarno @ 2012-01-07 20:09 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-stable, Aurelien Jarno
minpd, minps, minsd, minss and maxpd, maxps, maxsd, maxss SSE2
instructions have been broken when switching target-i386 to softfloat.
It's no longer possible to use comparison operators on float types with
softfloat, so use the floatXX_min and floatXX_max functions instead.
As a bonus it implements the correct NaN behaviour, so let's remove
this from the TODO.
It fixes GDM screen display on Debian Lenny.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
target-i386/TODO | 1 -
target-i386/ops_sse.h | 4 ++--
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/target-i386/TODO b/target-i386/TODO
index c8ada07..a8d69cf 100644
--- a/target-i386/TODO
+++ b/target-i386/TODO
@@ -15,7 +15,6 @@ Correctness issues:
- DRx register support
- CR0.AC emulation
- SSE alignment checks
-- fix SSE min/max with nans
Optimizations/Features:
diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index 47dde78..a743c85 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -584,8 +584,8 @@ void helper_ ## name ## sd (Reg *d, Reg *s)\
#define FPU_SUB(size, a, b) float ## size ## _sub(a, b, &env->sse_status)
#define FPU_MUL(size, a, b) float ## size ## _mul(a, b, &env->sse_status)
#define FPU_DIV(size, a, b) float ## size ## _div(a, b, &env->sse_status)
-#define FPU_MIN(size, a, b) (a) < (b) ? (a) : (b)
-#define FPU_MAX(size, a, b) (a) > (b) ? (a) : (b)
+#define FPU_MIN(size, a, b) float ## size ## _min(a, b, &env->sse_status)
+#define FPU_MAX(size, a, b) float ## size ## _max(a, b, &env->sse_status)
#define FPU_SQRT(size, a, b) float ## size ## _sqrt(b, &env->sse_status)
SSE_HELPER_S(add, FPU_ADD)
--
1.7.7.3
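To illustrate why the old macros stopped working, here is a minimal sketch. It assumes that softfloat's float32 is a plain 32-bit integer holding the IEEE 754 bit pattern, so that `(a) < (b)` silently becomes an integer comparison. The names `f32_bits`, `min_on_bits` and `min_on_floats` are hypothetical and not part of QEMU.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a softfloat float32: raw IEEE 754 bits. */
typedef uint32_t f32_bits;

static f32_bits to_bits(float f)
{
    f32_bits u;
    assert(sizeof u == sizeof f);
    memcpy(&u, &f, sizeof u);
    return u;
}

static float from_bits(f32_bits u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* What "(a) < (b) ? (a) : (b)" computes once a and b are integers. */
static float min_on_bits(float a, float b)
{
    f32_bits ua = to_bits(a), ub = to_bits(b);
    return from_bits(ua < ub ? ua : ub);
}

/* The same expression on genuine floating-point operands. */
static float min_on_floats(float a, float b)
{
    return a < b ? a : b;
}
```

With a negative operand the two disagree: the sign bit makes the bit pattern of -1.0f compare greater than that of 1.0f as an unsigned integer, so `min_on_bits(-1.0f, 1.0f)` picks 1.0f.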
* Re: [Qemu-devel] [PATCH 1/4] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions
2012-01-07 20:09 ` [Qemu-devel] [PATCH 1/4] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions Aurelien Jarno
@ 2012-01-07 20:22 ` Peter Maydell
2012-01-07 21:24 ` [Qemu-devel] [PATCH 1/4 v2] " Aurelien Jarno
0 siblings, 1 reply; 11+ messages in thread
From: Peter Maydell @ 2012-01-07 20:22 UTC (permalink / raw)
To: Aurelien Jarno; +Cc: qemu-devel, qemu-stable
On 7 January 2012 20:09, Aurelien Jarno <aurelien@aurel32.net> wrote:
> minpd, minps, minsd, minss and maxpd, maxps, maxsd, maxss SSE2
> instructions have been broken when switching target-i386 to softfloat.
> It's no longer possible to use comparison operators on float types with
> softfloat, so use the floatXX_min and floatXX_max functions instead.
Nope, this gets the x86 special cases wrong. This has been discussed
here before:
http://www.mail-archive.com/qemu-devel@nongnu.org/msg85557.html
has the right implementation (from Jason Wessel) and a comment
(from me) about why it's right.
-- PMM
* [Qemu-devel] [PATCH 1/4 v2] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions
2012-01-07 20:22 ` Peter Maydell
@ 2012-01-07 21:24 ` Aurelien Jarno
0 siblings, 0 replies; 11+ messages in thread
From: Aurelien Jarno @ 2012-01-07 21:24 UTC (permalink / raw)
To: Peter Maydell; +Cc: qemu-devel, qemu-stable
On Sat, Jan 07, 2012 at 08:22:53PM +0000, Peter Maydell wrote:
> On 7 January 2012 20:09, Aurelien Jarno <aurelien@aurel32.net> wrote:
> > minpd, minps, minsd, minss and maxpd, maxps, maxsd, maxss SSE2
> > instructions have been broken when switching target-i386 to softfloat.
> > It's no longer possible to use comparison operators on float types with
> > softfloat, so use the floatXX_min and floatXX_max functions instead.
>
> Nope, this gets the x86 special cases wrong. This has been discussed
> here before:
>
> http://www.mail-archive.com/qemu-devel@nongnu.org/msg85557.html
> has the right implementation (from Jason Wessel) and a comment
> (from me) about why it's right.
>
Good catch, the patch below should implement the correct behaviour.
target-i386: fix {min,max}{pd,ps,sd,ss} SSE2 instructions
minpd, minps, minsd, minss and maxpd, maxps, maxsd, maxss SSE2
instructions have been broken when switching target-i386 to softfloat.
It's no longer possible to use comparison operators on float types with
softfloat, so use the floatXX_lt function instead; the floatXX_min and
floatXX_max functions can't be used because of the Intel-specific
behaviour.
As it implements the correct NaN behaviour, let's remove the
corresponding entry from the TODO.
It fixes GDM screen display on Debian Lenny.
Thanks to Peter Maydell and Jason Wessel for their analysis of the
problem.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
target-i386/TODO | 1 -
target-i386/ops_sse.h | 9 +++++++--
2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/target-i386/TODO b/target-i386/TODO
index c8ada07..a8d69cf 100644
--- a/target-i386/TODO
+++ b/target-i386/TODO
@@ -15,7 +15,6 @@ Correctness issues:
- DRx register support
- CR0.AC emulation
- SSE alignment checks
-- fix SSE min/max with nans
Optimizations/Features:
diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index 47dde78..8ed231d 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -584,10 +584,15 @@ void helper_ ## name ## sd (Reg *d, Reg *s)\
#define FPU_SUB(size, a, b) float ## size ## _sub(a, b, &env->sse_status)
#define FPU_MUL(size, a, b) float ## size ## _mul(a, b, &env->sse_status)
#define FPU_DIV(size, a, b) float ## size ## _div(a, b, &env->sse_status)
-#define FPU_MIN(size, a, b) (a) < (b) ? (a) : (b)
-#define FPU_MAX(size, a, b) (a) > (b) ? (a) : (b)
#define FPU_SQRT(size, a, b) float ## size ## _sqrt(b, &env->sse_status)
+/* Note that the choice of comparison op here is important to get the
+ * special cases right: for min and max Intel specifies that (-0,0),
+ * (NaN, anything) and (anything, NaN) return the second argument.
+ */
+#define FPU_MIN(size, a, b) float ## size ## _lt(a, b, &env->sse_status) ? (a) : (b)
+#define FPU_MAX(size, a, b) float ## size ## _lt(b, a, &env->sse_status) ? (a) : (b)
+
SSE_HELPER_S(add, FPU_ADD)
SSE_HELPER_S(sub, FPU_SUB)
SSE_HELPER_S(mul, FPU_MUL)
--
1.7.7.3
--
Aurelien Jarno GPG: 1024D/F1BCDB73
aurelien@aurel32.net http://www.aurel32.net
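The special cases named in the comment above can be checked with a small model. This is only a sketch on host `float` values, not the softfloat code from the patch: `x86_min`, `x86_max` and `is_negative_zero` are hypothetical names, and the host `<` operator stands in for floatXX_lt. It relies on a less-than comparison being false whenever either operand is a NaN or the operands are zeros of either sign.

```c
#include <assert.h>
#include <math.h>

/* Return a only when a < b holds; every other case yields b. */
static float x86_min(float a, float b)
{
    return a < b ? a : b;
}

/* Mirror image: return a only when b < a holds. */
static float x86_max(float a, float b)
{
    return b < a ? a : b;
}

/* True only for -0.0f (hypothetical helper for the checks below). */
static int is_negative_zero(float x)
{
    return x == 0.0f && signbit(x);
}
```

Because the comparison is false for (NaN, anything), (anything, NaN) and for zeros of either sign, the second argument comes back in all three cases, which matches the rule quoted in the comment.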
* [Qemu-devel] [PATCH 2/4] target-i386: fix round{pd, ps, sd, ss} SSE2 instructions
2012-01-07 20:09 [Qemu-devel] [PATCH 0/4] target-i386: Fix regressions introduced by the switch to softfloat Aurelien Jarno
2012-01-07 20:09 ` [Qemu-devel] [PATCH 1/4] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions Aurelien Jarno
@ 2012-01-07 20:09 ` Aurelien Jarno
2012-01-07 20:09 ` [Qemu-devel] [PATCH 3/4] target-i386: fix dpps and dppd " Aurelien Jarno
2012-01-07 20:09 ` [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero Aurelien Jarno
3 siblings, 0 replies; 11+ messages in thread
From: Aurelien Jarno @ 2012-01-07 20:09 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-stable, Aurelien Jarno
The roundps and roundss SSE4.1 instructions have been broken since
target-i386 switched to softfloat. They use float64_round_to_int to round a
float32, and while the implicit conversion from float32 to float64 was
correct for softfloat-native, it is not for pure softfloat. Fix that by
using the correct registers and the correct functions.
Also update the roundpd and roundsd implementations at the same time, even
though these helpers already behave correctly.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
target-i386/ops_sse.h | 16 ++++++++--------
1 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index a743c85..a185bfb 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -1648,10 +1648,10 @@ void glue(helper_roundps, SUFFIX) (Reg *d, Reg *s, uint32_t mode)
break;
}
- d->L(0) = float64_round_to_int(s->L(0), &env->sse_status);
- d->L(1) = float64_round_to_int(s->L(1), &env->sse_status);
- d->L(2) = float64_round_to_int(s->L(2), &env->sse_status);
- d->L(3) = float64_round_to_int(s->L(3), &env->sse_status);
+ d->XMM_S(0) = float32_round_to_int(s->XMM_S(0), &env->sse_status);
+ d->XMM_S(1) = float32_round_to_int(s->XMM_S(1), &env->sse_status);
+ d->XMM_S(2) = float32_round_to_int(s->XMM_S(2), &env->sse_status);
+ d->XMM_S(3) = float32_round_to_int(s->XMM_S(3), &env->sse_status);
#if 0 /* TODO */
if (mode & (1 << 3))
@@ -1684,8 +1684,8 @@ void glue(helper_roundpd, SUFFIX) (Reg *d, Reg *s, uint32_t mode)
break;
}
- d->Q(0) = float64_round_to_int(s->Q(0), &env->sse_status);
- d->Q(1) = float64_round_to_int(s->Q(1), &env->sse_status);
+ d->XMM_D(0) = float64_round_to_int(s->XMM_D(0), &env->sse_status);
+ d->XMM_D(1) = float64_round_to_int(s->XMM_D(1), &env->sse_status);
#if 0 /* TODO */
if (mode & (1 << 3))
@@ -1718,7 +1718,7 @@ void glue(helper_roundss, SUFFIX) (Reg *d, Reg *s, uint32_t mode)
break;
}
- d->L(0) = float64_round_to_int(s->L(0), &env->sse_status);
+ d->XMM_S(0) = float32_round_to_int(s->XMM_S(0), &env->sse_status);
#if 0 /* TODO */
if (mode & (1 << 3))
@@ -1751,7 +1751,7 @@ void glue(helper_roundsd, SUFFIX) (Reg *d, Reg *s, uint32_t mode)
break;
}
- d->Q(0) = float64_round_to_int(s->Q(0), &env->sse_status);
+ d->XMM_D(0) = float64_round_to_int(s->XMM_D(0), &env->sse_status);
#if 0 /* TODO */
if (mode & (1 << 3))
--
1.7.7.3
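As a rough illustration of the roundps/roundss breakage, the sketch below assumes that under pure softfloat the 32-bit pattern read through L(n) is widened as an integer into a 64-bit float64 value. The helper names are hypothetical. It shows that the widened bits of an ordinary single-precision number land in the subnormal range of double precision, so rounding them to an integer cannot give the intended result.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Treat the bits of a float, zero-extended to 64 bits, as a double. */
static double f32_bits_as_f64(float f)
{
    uint32_t lo;
    uint64_t wide;
    double d;

    assert(sizeof lo == sizeof f && sizeof wide == sizeof d);
    memcpy(&lo, &f, sizeof lo);
    wide = lo;                      /* integer widening, not a float conversion */
    memcpy(&d, &wide, sizeof d);
    return d;
}

/* Non-zero and smaller in magnitude than the smallest normal double. */
static int is_subnormal(double d)
{
    return d != 0.0 && d > -DBL_MIN && d < DBL_MIN;
}
```

For example, 2.5f has the bit pattern 0x40200000; zero-extended to 64 bits, the exponent field is all zeros, so the value is a tiny subnormal rather than 2.5.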
* [Qemu-devel] [PATCH 3/4] target-i386: fix dpps and dppd SSE2 instructions
2012-01-07 20:09 [Qemu-devel] [PATCH 0/4] target-i386: Fix regressions introduced by the switch to softfloat Aurelien Jarno
2012-01-07 20:09 ` [Qemu-devel] [PATCH 1/4] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions Aurelien Jarno
2012-01-07 20:09 ` [Qemu-devel] [PATCH 2/4] target-i386: fix round{pd, " Aurelien Jarno
@ 2012-01-07 20:09 ` Aurelien Jarno
2012-01-07 20:09 ` [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero Aurelien Jarno
3 siblings, 0 replies; 11+ messages in thread
From: Aurelien Jarno @ 2012-01-07 20:09 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-stable, Aurelien Jarno
The helpers implementing the dpps and dppd SSE instructions do not pass
the correct argument types to the softfloat functions. They happen to
produce correct results anyway, but this patch fixes the types.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
target-i386/ops_sse.h | 28 ++++++++++++++--------------
1 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index a185bfb..adfe822 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -1770,44 +1770,44 @@ SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP)
void glue(helper_dpps, SUFFIX) (Reg *d, Reg *s, uint32_t mask)
{
- float32 iresult = 0 /*float32_zero*/;
+ float32 iresult = float32_zero;
if (mask & (1 << 4))
iresult = float32_add(iresult,
- float32_mul(d->L(0), s->L(0), &env->sse_status),
+ float32_mul(d->XMM_S(0), s->XMM_S(0), &env->sse_status),
&env->sse_status);
if (mask & (1 << 5))
iresult = float32_add(iresult,
- float32_mul(d->L(1), s->L(1), &env->sse_status),
+ float32_mul(d->XMM_S(1), s->XMM_S(1), &env->sse_status),
&env->sse_status);
if (mask & (1 << 6))
iresult = float32_add(iresult,
- float32_mul(d->L(2), s->L(2), &env->sse_status),
+ float32_mul(d->XMM_S(2), s->XMM_S(2), &env->sse_status),
&env->sse_status);
if (mask & (1 << 7))
iresult = float32_add(iresult,
- float32_mul(d->L(3), s->L(3), &env->sse_status),
+ float32_mul(d->XMM_S(3), s->XMM_S(3), &env->sse_status),
&env->sse_status);
- d->L(0) = (mask & (1 << 0)) ? iresult : 0 /*float32_zero*/;
- d->L(1) = (mask & (1 << 1)) ? iresult : 0 /*float32_zero*/;
- d->L(2) = (mask & (1 << 2)) ? iresult : 0 /*float32_zero*/;
- d->L(3) = (mask & (1 << 3)) ? iresult : 0 /*float32_zero*/;
+ d->XMM_S(0) = (mask & (1 << 0)) ? iresult : float32_zero;
+ d->XMM_S(1) = (mask & (1 << 1)) ? iresult : float32_zero;
+ d->XMM_S(2) = (mask & (1 << 2)) ? iresult : float32_zero;
+ d->XMM_S(3) = (mask & (1 << 3)) ? iresult : float32_zero;
}
void glue(helper_dppd, SUFFIX) (Reg *d, Reg *s, uint32_t mask)
{
- float64 iresult = 0 /*float64_zero*/;
+ float64 iresult = float64_zero;
if (mask & (1 << 4))
iresult = float64_add(iresult,
- float64_mul(d->Q(0), s->Q(0), &env->sse_status),
+ float64_mul(d->XMM_D(0), s->XMM_D(0), &env->sse_status),
&env->sse_status);
if (mask & (1 << 5))
iresult = float64_add(iresult,
- float64_mul(d->Q(1), s->Q(1), &env->sse_status),
+ float64_mul(d->XMM_D(1), s->XMM_D(1), &env->sse_status),
&env->sse_status);
- d->Q(0) = (mask & (1 << 0)) ? iresult : 0 /*float64_zero*/;
- d->Q(1) = (mask & (1 << 1)) ? iresult : 0 /*float64_zero*/;
+ d->XMM_D(0) = (mask & (1 << 0)) ? iresult : float64_zero;
+ d->XMM_D(1) = (mask & (1 << 1)) ? iresult : float64_zero;
}
void glue(helper_mpsadbw, SUFFIX) (Reg *d, Reg *s, uint32_t offset)
--
1.7.7.3
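For readers unfamiliar with the mask layout used by helper_dpps above, here is a reference model on host floats. It is a sketch only: `dpps_model` is a hypothetical name, it sums the products sequentially as the helper does, and it ignores the softfloat status flags entirely.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bits 4..7 of mask select which lane products enter the sum;
 * bits 0..3 select which destination lanes receive the sum
 * (the remaining lanes are zeroed).
 */
static void dpps_model(float d[4], const float s[4], uint32_t mask)
{
    float sum = 0.0f;
    int i;

    for (i = 0; i < 4; i++) {
        if (mask & (1u << (4 + i))) {
            sum += d[i] * s[i];
        }
    }
    for (i = 0; i < 4; i++) {
        d[i] = (mask & (1u << i)) ? sum : 0.0f;
    }
}
```

The sum is computed in full before any destination lane is written, which matters because d is both an input and the output.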
* [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero
2012-01-07 20:09 [Qemu-devel] [PATCH 0/4] target-i386: Fix regressions introduced by the switch to softfloat Aurelien Jarno
` (2 preceding siblings ...)
2012-01-07 20:09 ` [Qemu-devel] [PATCH 3/4] target-i386: fix dpps and dppd " Aurelien Jarno
@ 2012-01-07 20:09 ` Aurelien Jarno
2012-01-12 5:37 ` Dong Xu Wang
3 siblings, 1 reply; 11+ messages in thread
From: Aurelien Jarno @ 2012-01-07 20:09 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-stable, Aurelien Jarno
SSE rounding and flush to zero control has never been implemented. However
given that softfloat-native was using a single state for FPU and SSE and
given that glibc is setting both FPU and SSE state in fesetround(), this
was working correctly up to the switch to softfloat.
Fix that by adding an update_sse_status() function similar to
update_fp_status(), and calling it on write to mxcsr.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
target-i386/helper.h | 1 +
target-i386/op_helper.c | 64 +++++++++++++++++++++++++++++++++++++++-------
target-i386/translate.c | 2 +-
3 files changed, 56 insertions(+), 11 deletions(-)
diff --git a/target-i386/helper.h b/target-i386/helper.h
index 6b518ad..761954e 100644
--- a/target-i386/helper.h
+++ b/target-i386/helper.h
@@ -197,6 +197,7 @@ DEF_HELPER_2(lzcnt, tl, tl, int)
/* MMX/SSE */
+DEF_HELPER_1(ldmxcsr, void, i32)
DEF_HELPER_0(enter_mmx, void)
DEF_HELPER_0(emms, void)
DEF_HELPER_2(movq, void, ptr, ptr)
diff --git a/target-i386/op_helper.c b/target-i386/op_helper.c
index c89e4a4..2aea71b 100644
--- a/target-i386/op_helper.c
+++ b/target-i386/op_helper.c
@@ -52,11 +52,11 @@ static inline target_long lshift(target_long x, int n)
}
}
-#define RC_MASK 0xc00
-#define RC_NEAR 0x000
-#define RC_DOWN 0x400
-#define RC_UP 0x800
-#define RC_CHOP 0xc00
+#define FPU_RC_MASK 0xc00
+#define FPU_RC_NEAR 0x000
+#define FPU_RC_DOWN 0x400
+#define FPU_RC_UP 0x800
+#define FPU_RC_CHOP 0xc00
#define MAXTAN 9223372036854775808.0
@@ -4024,18 +4024,18 @@ static void update_fp_status(void)
int rnd_type;
/* set rounding mode */
- switch(env->fpuc & RC_MASK) {
+ switch(env->fpuc & FPU_RC_MASK) {
default:
- case RC_NEAR:
+ case FPU_RC_NEAR:
rnd_type = float_round_nearest_even;
break;
- case RC_DOWN:
+ case FPU_RC_DOWN:
rnd_type = float_round_down;
break;
- case RC_UP:
+ case FPU_RC_UP:
rnd_type = float_round_up;
break;
- case RC_CHOP:
+ case FPU_RC_CHOP:
rnd_type = float_round_to_zero;
break;
}
@@ -5629,6 +5629,50 @@ void helper_vmexit(uint32_t exit_code, uint64_t exit_info_1)
/* MMX/SSE */
/* XXX: optimize by storing fptt and fptags in the static cpu state */
+
+#define SSE_DAZ 0x0040
+#define SSE_RC_MASK 0x6000
+#define SSE_RC_NEAR 0x0000
+#define SSE_RC_DOWN 0x2000
+#define SSE_RC_UP 0x4000
+#define SSE_RC_CHOP 0x6000
+#define SSE_FZ 0x8000
+
+static void update_sse_status(void)
+{
+ int rnd_type;
+
+ /* set rounding mode */
+ switch(env->mxcsr & SSE_RC_MASK) {
+ default:
+ case SSE_RC_NEAR:
+ rnd_type = float_round_nearest_even;
+ break;
+ case SSE_RC_DOWN:
+ rnd_type = float_round_down;
+ break;
+ case SSE_RC_UP:
+ rnd_type = float_round_up;
+ break;
+ case SSE_RC_CHOP:
+ rnd_type = float_round_to_zero;
+ break;
+ }
+ set_float_rounding_mode(rnd_type, &env->sse_status);
+
+ /* set denormals are zero */
+ set_flush_inputs_to_zero((env->mxcsr & SSE_DAZ) ? 1 : 0, &env->sse_status);
+
+ /* set flush to zero */
+ set_flush_to_zero((env->mxcsr & SSE_FZ) ? 1 : 0, &env->fp_status);
+}
+
+void helper_ldmxcsr(uint32_t val)
+{
+ env->mxcsr = val;
+ update_sse_status();
+}
+
void helper_enter_mmx(void)
{
env->fpstt = 0;
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 8321bf3..b9839c5 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7544,7 +7544,7 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
if (op == 2) {
gen_op_ld_T0_A0(OT_LONG + s->mem_index);
- tcg_gen_st32_tl(cpu_T[0], cpu_env, offsetof(CPUX86State, mxcsr));
+ gen_helper_ldmxcsr(cpu_T[0]);
} else {
tcg_gen_ld32u_tl(cpu_T[0], cpu_env, offsetof(CPUX86State, mxcsr));
gen_op_st_T0_A0(OT_LONG + s->mem_index);
--
1.7.7.3
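The MXCSR fields handled by update_sse_status() can be decoded in isolation, as sketched below. The bit masks are the ones defined in the patch (rounding control 0x6000, DAZ 0x0040, FZ 0x8000); the enum, the struct and the function name are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdint.h>

enum sse_round {
    ROUND_NEAREST_EVEN,
    ROUND_DOWN,
    ROUND_UP,
    ROUND_TO_ZERO
};

struct sse_control {
    enum sse_round rounding;
    int denormals_are_zero;   /* DAZ, bit 6 */
    int flush_to_zero;        /* FZ, bit 15 */
};

/* Split an MXCSR value into the three controls the patch acts on. */
static struct sse_control decode_mxcsr(uint32_t mxcsr)
{
    struct sse_control c;

    switch (mxcsr & 0x6000) {           /* RC, bits 13 and 14 */
    case 0x2000:
        c.rounding = ROUND_DOWN;
        break;
    case 0x4000:
        c.rounding = ROUND_UP;
        break;
    case 0x6000:
        c.rounding = ROUND_TO_ZERO;
        break;
    default:
        c.rounding = ROUND_NEAREST_EVEN;
        break;
    }
    c.denormals_are_zero = (mxcsr & 0x0040) != 0;
    c.flush_to_zero = (mxcsr & 0x8000) != 0;
    return c;
}
```

With the power-on default MXCSR value 0x1f80, this yields round-to-nearest-even with both DAZ and FZ clear.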
* Re: [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero
2012-01-07 20:09 ` [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero Aurelien Jarno
@ 2012-01-12 5:37 ` Dong Xu Wang
2012-01-13 9:40 ` Markus Armbruster
2012-01-13 16:07 ` [Qemu-devel] " Aurelien Jarno
0 siblings, 2 replies; 11+ messages in thread
From: Dong Xu Wang @ 2012-01-12 5:37 UTC (permalink / raw)
To: Aurelien Jarno; +Cc: qemu-devel, qemu-stable
After applying this patch, compiling on my laptop fails with an
error:
./configure --enable-kvm --target-list=x86_64-softmmu && make
CC x86_64-softmmu/translate.o
/qemu/target-i386/translate.c: In function ‘disas_insn’:
/qemu/target-i386/translate.c:7547:17: error: incompatible type for
argument 1 of ‘gen_helper_ldmxcsr’
/qemu/target-i386/helper.h:200:1: note: expected ‘TCGv_i32’ but
argument is of type ‘TCGv_i64’
make[1]: *** [translate.o] Error 1
make: *** [subdir-x86_64-softmmu] Error 2
On Sun, Jan 8, 2012 at 04:09, Aurelien Jarno <aurelien@aurel32.net> wrote:
> SSE rounding and flush to zero control has never been implemented. However
> given that softfloat-native was using a single state for FPU and SSE and
> given that glibc is setting both FPU and SSE state in fesetround(), this
> was working correctly up to the switch to softfloat.
>
> Fix that by adding an update_sse_status() function similar to
> update_fp_status(), and calling it on write to mxcsr.
>
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
> target-i386/helper.h | 1 +
> target-i386/op_helper.c | 64 +++++++++++++++++++++++++++++++++++++++-------
> target-i386/translate.c | 2 +-
> 3 files changed, 56 insertions(+), 11 deletions(-)
>
> diff --git a/target-i386/helper.h b/target-i386/helper.h
> index 6b518ad..761954e 100644
> --- a/target-i386/helper.h
> +++ b/target-i386/helper.h
> @@ -197,6 +197,7 @@ DEF_HELPER_2(lzcnt, tl, tl, int)
>
> /* MMX/SSE */
>
> +DEF_HELPER_1(ldmxcsr, void, i32)
> DEF_HELPER_0(enter_mmx, void)
> DEF_HELPER_0(emms, void)
> DEF_HELPER_2(movq, void, ptr, ptr)
> diff --git a/target-i386/op_helper.c b/target-i386/op_helper.c
> index c89e4a4..2aea71b 100644
> --- a/target-i386/op_helper.c
> +++ b/target-i386/op_helper.c
> @@ -52,11 +52,11 @@ static inline target_long lshift(target_long x, int n)
> }
> }
>
> -#define RC_MASK 0xc00
> -#define RC_NEAR 0x000
> -#define RC_DOWN 0x400
> -#define RC_UP 0x800
> -#define RC_CHOP 0xc00
> +#define FPU_RC_MASK 0xc00
> +#define FPU_RC_NEAR 0x000
> +#define FPU_RC_DOWN 0x400
> +#define FPU_RC_UP 0x800
> +#define FPU_RC_CHOP 0xc00
>
> #define MAXTAN 9223372036854775808.0
>
> @@ -4024,18 +4024,18 @@ static void update_fp_status(void)
> int rnd_type;
>
> /* set rounding mode */
> - switch(env->fpuc & RC_MASK) {
> + switch(env->fpuc & FPU_RC_MASK) {
> default:
> - case RC_NEAR:
> + case FPU_RC_NEAR:
> rnd_type = float_round_nearest_even;
> break;
> - case RC_DOWN:
> + case FPU_RC_DOWN:
> rnd_type = float_round_down;
> break;
> - case RC_UP:
> + case FPU_RC_UP:
> rnd_type = float_round_up;
> break;
> - case RC_CHOP:
> + case FPU_RC_CHOP:
> rnd_type = float_round_to_zero;
> break;
> }
> @@ -5629,6 +5629,50 @@ void helper_vmexit(uint32_t exit_code, uint64_t exit_info_1)
>
> /* MMX/SSE */
> /* XXX: optimize by storing fptt and fptags in the static cpu state */
> +
> +#define SSE_DAZ 0x0040
> +#define SSE_RC_MASK 0x6000
> +#define SSE_RC_NEAR 0x0000
> +#define SSE_RC_DOWN 0x2000
> +#define SSE_RC_UP 0x4000
> +#define SSE_RC_CHOP 0x6000
> +#define SSE_FZ 0x8000
> +
> +static void update_sse_status(void)
> +{
> + int rnd_type;
> +
> + /* set rounding mode */
> + switch(env->mxcsr & SSE_RC_MASK) {
> + default:
> + case SSE_RC_NEAR:
> + rnd_type = float_round_nearest_even;
> + break;
> + case SSE_RC_DOWN:
> + rnd_type = float_round_down;
> + break;
> + case SSE_RC_UP:
> + rnd_type = float_round_up;
> + break;
> + case SSE_RC_CHOP:
> + rnd_type = float_round_to_zero;
> + break;
> + }
> + set_float_rounding_mode(rnd_type, &env->sse_status);
> +
> + /* set denormals are zero */
> + set_flush_inputs_to_zero((env->mxcsr & SSE_DAZ) ? 1 : 0, &env->sse_status);
> +
> + /* set flush to zero */
> + set_flush_to_zero((env->mxcsr & SSE_FZ) ? 1 : 0, &env->fp_status);
> +}
> +
> +void helper_ldmxcsr(uint32_t val)
> +{
> + env->mxcsr = val;
> + update_sse_status();
> +}
> +
> void helper_enter_mmx(void)
> {
> env->fpstt = 0;
> diff --git a/target-i386/translate.c b/target-i386/translate.c
> index 8321bf3..b9839c5 100644
> --- a/target-i386/translate.c
> +++ b/target-i386/translate.c
> @@ -7544,7 +7544,7 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
> gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
> if (op == 2) {
> gen_op_ld_T0_A0(OT_LONG + s->mem_index);
> - tcg_gen_st32_tl(cpu_T[0], cpu_env, offsetof(CPUX86State, mxcsr));
> + gen_helper_ldmxcsr(cpu_T[0]);
> } else {
> tcg_gen_ld32u_tl(cpu_T[0], cpu_env, offsetof(CPUX86State, mxcsr));
> gen_op_st_T0_A0(OT_LONG + s->mem_index);
> --
> 1.7.7.3
>
>
* Re: [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero
2012-01-12 5:37 ` Dong Xu Wang
@ 2012-01-13 9:40 ` Markus Armbruster
2012-01-13 15:14 ` [Qemu-devel] [Qemu-stable] " Justin M. Forbes
2012-01-13 16:07 ` [Qemu-devel] " Aurelien Jarno
1 sibling, 1 reply; 11+ messages in thread
From: Markus Armbruster @ 2012-01-13 9:40 UTC (permalink / raw)
To: Dong Xu Wang; +Cc: qemu-devel, Aurelien Jarno, qemu-stable
Dong Xu Wang <wdongxu@linux.vnet.ibm.com> writes:
> After applying this patch, compiling on my laptop fails with an
> error:
>
> ./configure --enable-kvm --target-list=x86_64-softmmu && make
> CC x86_64-softmmu/translate.o
> /qemu/target-i386/translate.c: In function ‘disas_insn’:
> /qemu/target-i386/translate.c:7547:17: error: incompatible type for
> argument 1 of ‘gen_helper_ldmxcsr’
> /qemu/target-i386/helper.h:200:1: note: expected ‘TCGv_i32’ but
> argument is of type ‘TCGv_i64’
> make[1]: *** [translate.o] Error 1
> make: *** [subdir-x86_64-softmmu] Error 2
I see this, too.
* Re: [Qemu-devel] [Qemu-stable] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero
2012-01-13 9:40 ` Markus Armbruster
@ 2012-01-13 15:14 ` Justin M. Forbes
0 siblings, 0 replies; 11+ messages in thread
From: Justin M. Forbes @ 2012-01-13 15:14 UTC (permalink / raw)
To: Markus Armbruster; +Cc: Dong Xu Wang, qemu-devel, qemu-stable
On Fri, 2012-01-13 at 10:40 +0100, Markus Armbruster wrote:
> Dong Xu Wang <wdongxu@linux.vnet.ibm.com> writes:
>
> > After applying this patch, compiling on my laptop fails with an
> > error:
> >
> > ./configure --enable-kvm --target-list=x86_64-softmmu && make
> > CC x86_64-softmmu/translate.o
> > /qemu/target-i386/translate.c: In function ‘disas_insn’:
> > /qemu/target-i386/translate.c:7547:17: error: incompatible type for
> > argument 1 of ‘gen_helper_ldmxcsr’
> > /qemu/target-i386/helper.h:200:1: note: expected ‘TCGv_i32’ but
> > argument is of type ‘TCGv_i64’
> > make[1]: *** [translate.o] Error 1
> > make: *** [subdir-x86_64-softmmu] Error 2
>
> I see this, too.
I will take a look. I am not seeing it right now with the full stable
tree, though I am building a much larger config.
Justin
* Re: [Qemu-devel] [PATCH 4/4] target-i386: fix SSE rounding and flush to zero
2012-01-12 5:37 ` Dong Xu Wang
2012-01-13 9:40 ` Markus Armbruster
@ 2012-01-13 16:07 ` Aurelien Jarno
1 sibling, 0 replies; 11+ messages in thread
From: Aurelien Jarno @ 2012-01-13 16:07 UTC (permalink / raw)
To: Dong Xu Wang; +Cc: qemu-devel, qemu-stable
On Thu, Jan 12, 2012 at 01:37:59PM +0800, Dong Xu Wang wrote:
> After applying this patch, compiling on my laptop fails with an
> error:
>
> ./configure --enable-kvm --target-list=x86_64-softmmu && make
> CC x86_64-softmmu/translate.o
> /qemu/target-i386/translate.c: In function ‘disas_insn’:
> /qemu/target-i386/translate.c:7547:17: error: incompatible type for
> argument 1 of ‘gen_helper_ldmxcsr’
> /qemu/target-i386/helper.h:200:1: note: expected ‘TCGv_i32’ but
> argument is of type ‘TCGv_i64’
> make[1]: *** [translate.o] Error 1
> make: *** [subdir-x86_64-softmmu] Error 2
Sorry about that, I have pushed the following patch which solves the
problem.
target-i386: fix compilation with --enable-debug-tcg
Commit 2355c16e74ffa4d14e7fc2b4a23b055565ac0221 introduced a new ldmxcsr
helper taking an i32 argument, but the helper is actually passed a long.
Fix that by truncating the long to i32.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
target-i386/translate.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/target-i386/translate.c b/target-i386/translate.c
index b9839c5..860b4a3 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7544,7 +7544,8 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
if (op == 2) {
gen_op_ld_T0_A0(OT_LONG + s->mem_index);
- gen_helper_ldmxcsr(cpu_T[0]);
+ tcg_gen_trunc_tl_i32(cpu_tmp2_i32, cpu_T[0]);
+ gen_helper_ldmxcsr(cpu_tmp2_i32);
} else {
tcg_gen_ld32u_tl(cpu_T[0], cpu_env, offsetof(CPUX86State, mxcsr));
gen_op_st_T0_A0(OT_LONG + s->mem_index);
--
Aurelien Jarno GPG: 1024D/F1BCDB73
aurelien@aurel32.net http://www.aurel32.net