qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH 00/10] target-i386: fix and enable SSE4.1 and SSE4.2
@ 2013-03-26 19:01 Aurelien Jarno
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 01/10] target-i386: SSE4.1: fix pinsrb instruction Aurelien Jarno
                   ` (9 more replies)
  0 siblings, 10 replies; 21+ messages in thread
From: Aurelien Jarno @ 2013-03-26 19:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

SSE4.1 and SSE4.2 instruction sets are partly broken, at least enough to
render a recent glibc with ifunc enabled unusable.

This patch series fixes the issues. It has been tested with the valgrind
testsuite in user mode and by booting x86 and x86-64 guests with a
recent glibc in system mode.

Aurelien Jarno (10):
  target-i386: SSE4.1: fix pinsrb instruction
  target-i386: SSE4.2: fix pcmpgtq instruction
  target-i386: SSE4.2: fix pcmpXstri instructions
  target-i386: SSE4.2: fix pcmpXstrm instructions
  target-i386: SSE4.2: fix pcmpXstrX instructions in "Ranges" mode
  target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal each" mode
  target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal ordered" mode
  target-i386: SSE4.2: fix pcmpXstrX instructions with "Masked(-)" polarity
  target-i386: enable SSE4.1 and SSE4.2 in TCG mode
  target-i386: SSE4.2: use clz32/ctz32 instead of reinventing the wheel

 target-i386/cpu.c        |   13 +++++-----
 target-i386/fpu_helper.c |    1 +
 target-i386/ops_sse.h    |   64 +++++++++++++---------------------------------
 target-i386/translate.c  |    4 +--
 4 files changed, 28 insertions(+), 54 deletions(-)

-- 
1.7.10.4


* [Qemu-devel] [PATCH 01/10] target-i386: SSE4.1: fix pinsrb instruction
  2013-03-26 19:01 [Qemu-devel] [PATCH 00/10] target-i386: fix and enable SSE4.1 and SSE4.2 Aurelien Jarno
@ 2013-03-26 19:01 ` Aurelien Jarno
  2013-03-27 20:03   ` Richard Henderson
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 02/10] target-i386: SSE4.2: fix pcmpgtq instruction Aurelien Jarno
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Aurelien Jarno @ 2013-03-26 19:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

gen_op_mov_TN_reg() loads the value in cpu_T[0], so this temporary should
be used instead of cpu_tmp0.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/translate.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-i386/translate.c b/target-i386/translate.c
index 7239696..7596a90 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -4404,9 +4404,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     if (mod == 3)
                         gen_op_mov_TN_reg(OT_LONG, 0, rm);
                     else
-                        tcg_gen_qemu_ld8u(cpu_tmp0, cpu_A0,
+                        tcg_gen_qemu_ld8u(cpu_T[0], cpu_A0,
                                         (s->mem_index >> 2) - 1);
-                    tcg_gen_st8_tl(cpu_tmp0, cpu_env, offsetof(CPUX86State,
+                    tcg_gen_st8_tl(cpu_T[0], cpu_env, offsetof(CPUX86State,
                                             xmm_regs[reg].XMM_B(val & 15)));
                     break;
                 case 0x21: /* insertps */
-- 
1.7.10.4


* [Qemu-devel] [PATCH 02/10] target-i386: SSE4.2: fix pcmpgtq instruction
  2013-03-26 19:01 [Qemu-devel] [PATCH 00/10] target-i386: fix and enable SSE4.1 and SSE4.2 Aurelien Jarno
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 01/10] target-i386: SSE4.1: fix pinsrb instruction Aurelien Jarno
@ 2013-03-26 19:01 ` Aurelien Jarno
  2013-03-27 20:03   ` Richard Henderson
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 03/10] target-i386: SSE4.2: fix pcmpXstri instructions Aurelien Jarno
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Aurelien Jarno @ 2013-03-26 19:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

The "Intel 64 and IA-32 Architectures Software Developer's Manual" (at
least recent versions) clearly says that the comparison is signed.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/ops_sse.h |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index cad9d75..0136df9 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -1933,8 +1933,7 @@ void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 }
 
 /* SSE4.2 op helpers */
-/* it's unclear whether signed or unsigned */
-#define FCMPGTQ(d, s) (d > s ? -1 : 0)
+#define FCMPGTQ(d, s) ((int64_t)d > (int64_t)s ? -1 : 0)
 SSE_HELPER_Q(helper_pcmpgtq, FCMPGTQ)
 
 static inline int pcmp_elen(CPUX86State *env, int reg, uint32_t ctrl)
-- 
1.7.10.4
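
As a standalone illustration of why the signedness matters, here is a minimal sketch. It is not QEMU code: the function names are made up for this example and model a single 64-bit lane of the FCMPGTQ macro before and after the change.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the old macro body: plain unsigned comparison. */
static inline uint64_t cmpgtq_unsigned(uint64_t d, uint64_t s)
{
    return d > s ? UINT64_MAX : 0;
}

/* Hypothetical model of the new macro body: both lanes are
 * reinterpreted as signed 64-bit values before comparing. */
static inline uint64_t cmpgtq_signed(uint64_t d, uint64_t s)
{
    return (int64_t)d > (int64_t)s ? UINT64_MAX : 0;
}

static inline void check_cmpgtq(void)
{
    /* d = 1, s = -1 (all ones): only the signed compare sees 1 > -1. */
    assert(cmpgtq_signed(1, UINT64_MAX) == UINT64_MAX);
    assert(cmpgtq_unsigned(1, UINT64_MAX) == 0);
}
```

The two versions only disagree when exactly one of the lanes has its top bit set, which is why the bug is easy to miss with small positive test values.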


* [Qemu-devel] [PATCH 03/10] target-i386: SSE4.2: fix pcmpXstri instructions
  2013-03-26 19:01 [Qemu-devel] [PATCH 00/10] target-i386: fix and enable SSE4.1 and SSE4.2 Aurelien Jarno
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 01/10] target-i386: SSE4.1: fix pinsrb instruction Aurelien Jarno
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 02/10] target-i386: SSE4.2: fix pcmpgtq instruction Aurelien Jarno
@ 2013-03-26 19:01 ` Aurelien Jarno
  2013-03-27 20:10   ` Richard Henderson
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 04/10] target-i386: SSE4.2: fix pcmpXstrm instructions Aurelien Jarno
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Aurelien Jarno @ 2013-03-26 19:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

ffs1 returns the position of the least significant bit set to one,
counted from the most significant bit.

pcmpXstri needs the index of that bit counted from the least significant
bit, so the result has to be converted with 32 - ffs1(res).

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/ops_sse.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index 0136df9..0667c87 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -2099,7 +2099,7 @@ void glue(helper_pcmpestri, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                  pcmp_elen(env, R_EAX, ctrl));
 
     if (res) {
-        env->regs[R_ECX] = ((ctrl & (1 << 6)) ? rffs1 : ffs1)(res) - 1;
+        env->regs[R_ECX] = (ctrl & (1 << 6)) ? rffs1(res) - 1 : 32 - ffs1(res);
     } else {
         env->regs[R_ECX] = 16 >> (ctrl & (1 << 0));
     }
@@ -2137,7 +2137,7 @@ void glue(helper_pcmpistri, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                  pcmp_ilen(d, ctrl));
 
     if (res) {
-        env->regs[R_ECX] = ((ctrl & (1 << 6)) ? rffs1 : ffs1)(res) - 1;
+        env->regs[R_ECX] = (ctrl & (1 << 6)) ? rffs1(res) - 1 : 32 - ffs1(res);
     } else {
         env->regs[R_ECX] = 16 >> (ctrl & (1 << 0));
     }
-- 
1.7.10.4
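
The off-by-direction error is easier to see with concrete numbers. The following is a minimal sketch that copies ffs1 and rffs1 as they exist in ops_sse.h at this point in the series and wraps the old and new ECX expressions in small functions. The wrapper names are made up for this example, and a 32-bit unsigned int is assumed.

```c
#include <assert.h>

/* Copy of rffs1 from ops_sse.h: 1-based position of the highest set
 * bit, counted from the least significant bit. */
static inline int rffs1(unsigned int val)
{
    int ret = 1, hi;

    for (hi = sizeof(val) * 4; hi; hi /= 2) {
        if (val >> hi) {
            val >>= hi;
            ret += hi;
        }
    }
    return ret;
}

/* Copy of ffs1 from ops_sse.h: 1-based position of the lowest set
 * bit, counted from the most significant bit. */
static inline int ffs1(unsigned int val)
{
    int ret = 1, hi;

    for (hi = sizeof(val) * 4; hi; hi /= 2) {
        if (val << hi) {
            val <<= hi;
            ret += hi;
        }
    }
    return ret;
}

/* Hypothetical wrappers around the ECX expressions in the diff. */
static inline int lsb_index_old(unsigned int res) { return ffs1(res) - 1; }
static inline int lsb_index_new(unsigned int res) { return 32 - ffs1(res); }
static inline int msb_index(unsigned int res)     { return rffs1(res) - 1; }

static inline void check_index(void)
{
    /* res = 0b110: elements 1 and 2 matched. */
    assert(lsb_index_old(0x6) == 30);  /* not a valid element index */
    assert(lsb_index_new(0x6) == 1);
    assert(msb_index(0x6) == 2);
}
```

The most-significant-index path (bit 6 of the control byte set) was already correct, which is why only the other branch of the conditional changes.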


* [Qemu-devel] [PATCH 04/10] target-i386: SSE4.2: fix pcmpXstrm instructions
  2013-03-26 19:01 [Qemu-devel] [PATCH 00/10] target-i386: fix and enable SSE4.1 and SSE4.2 Aurelien Jarno
                   ` (2 preceding siblings ...)
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 03/10] target-i386: SSE4.2: fix pcmpXstri instructions Aurelien Jarno
@ 2013-03-26 19:01 ` Aurelien Jarno
  2013-03-27 20:10   ` Richard Henderson
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 05/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Ranges" mode Aurelien Jarno
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Aurelien Jarno @ 2013-03-26 19:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

pcmpXstrm instructions return their result in the XMM0 register and
not in the first operand.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/ops_sse.h |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index 0667c87..4a95f41 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -2116,16 +2116,16 @@ void glue(helper_pcmpestrm, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     if ((ctrl >> 6) & 1) {
         if (ctrl & 1) {
             for (i = 0; i < 8; i++, res >>= 1) {
-                d->W(i) = (res & 1) ? ~0 : 0;
+                env->xmm_regs[0].W(i) = (res & 1) ? ~0 : 0;
             }
         } else {
             for (i = 0; i < 16; i++, res >>= 1) {
-                d->B(i) = (res & 1) ? ~0 : 0;
+                env->xmm_regs[0].B(i) = (res & 1) ? ~0 : 0;
             }
         }
     } else {
-        d->Q(1) = 0;
-        d->Q(0) = res;
+        env->xmm_regs[0].Q(1) = 0;
+        env->xmm_regs[0].Q(0) = res;
     }
 }
 
@@ -2154,16 +2154,16 @@ void glue(helper_pcmpistrm, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     if ((ctrl >> 6) & 1) {
         if (ctrl & 1) {
             for (i = 0; i < 8; i++, res >>= 1) {
-                d->W(i) = (res & 1) ? ~0 : 0;
+                env->xmm_regs[0].W(i) = (res & 1) ? ~0 : 0;
             }
         } else {
             for (i = 0; i < 16; i++, res >>= 1) {
-                d->B(i) = (res & 1) ? ~0 : 0;
+                env->xmm_regs[0].B(i) = (res & 1) ? ~0 : 0;
             }
         }
     } else {
-        d->Q(1) = 0;
-        d->Q(0) = res;
+        env->xmm_regs[0].Q(1) = 0;
+        env->xmm_regs[0].Q(0) = res;
     }
 }
 
-- 
1.7.10.4
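
For readers unfamiliar with the instruction, the write-back can be modelled outside of QEMU as follows. This is a minimal sketch under stated assumptions: cpu_t, xmm_t and pcmpxstrm_writeback are hypothetical names, the register is modelled as 16 little-endian bytes, and res is assumed to hold at most 16 result bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of an XMM register file. */
typedef struct { uint8_t b[16]; } xmm_t;
typedef struct { xmm_t xmm[8]; } cpu_t;

/* Store the comparison result the way the fixed helpers do: always in
 * XMM0, whatever the explicit operands of the instruction are. */
static inline void pcmpxstrm_writeback(cpu_t *cpu, unsigned res, unsigned ctrl)
{
    xmm_t *out = &cpu->xmm[0];   /* implicit destination */
    int i;

    memset(out, 0, sizeof(*out));
    if ((ctrl >> 6) & 1) {
        if (ctrl & 1) {
            /* Word elements: expand each result bit to 16 bits. */
            for (i = 0; i < 8; i++, res >>= 1) {
                uint8_t v = (res & 1) ? 0xff : 0;
                out->b[2 * i] = v;
                out->b[2 * i + 1] = v;
            }
        } else {
            /* Byte elements: expand each result bit to 8 bits. */
            for (i = 0; i < 16; i++, res >>= 1) {
                out->b[i] = (res & 1) ? 0xff : 0;
            }
        }
    } else {
        /* Bit mask: result bits go to the low bits, the rest is zero. */
        out->b[0] = res & 0xff;
        out->b[1] = (res >> 8) & 0xff;
    }
}
```

The point of the patch is the first line of the function body: the destination is fixed, so the helper must not write through its d argument.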


* [Qemu-devel] [PATCH 05/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Ranges" mode
  2013-03-26 19:01 [Qemu-devel] [PATCH 00/10] target-i386: fix and enable SSE4.1 and SSE4.2 Aurelien Jarno
                   ` (3 preceding siblings ...)
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 04/10] target-i386: SSE4.2: fix pcmpXstrm instructions Aurelien Jarno
@ 2013-03-26 19:01 ` Aurelien Jarno
  2013-03-27 20:14   ` Richard Henderson
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 06/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal each" mode Aurelien Jarno
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Aurelien Jarno @ 2013-03-26 19:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Fix the order of the comparisons to match the "Intel 64 and
IA-32 Architectures Software Developer's Manual".

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/ops_sse.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index 4a95f41..51c5fc9 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -2019,8 +2019,8 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
             res <<= 1;
             v = pcmp_val(s, ctrl, j);
             for (i = ((validd - 1) | 1); i >= 0; i -= 2) {
-                res |= (pcmp_val(d, ctrl, i - 0) <= v &&
-                        pcmp_val(d, ctrl, i - 1) >= v);
+                res |= (pcmp_val(d, ctrl, i - 0) >= v &&
+                        pcmp_val(d, ctrl, i - 1) <= v);
             }
         }
         break;
-- 
1.7.10.4
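
The effect of the swapped comparisons can be reproduced with a small standalone model. This is a minimal sketch, not the QEMU helper: the function names are made up, only unsigned byte elements are handled, and the bounds array is assumed to hold (lower, upper) pairs at even and odd indexes, as in the loop above.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed test: v is in a range if lower <= v <= upper. */
static inline int in_ranges_fixed(const uint8_t *bounds, int len, uint8_t v)
{
    int i, hit = 0;

    for (i = 1; i < len; i += 2) {
        hit |= (bounds[i] >= v && bounds[i - 1] <= v);
    }
    return hit;
}

/* Old test: the two comparisons are the wrong way round, so a hit
 * requires upper <= v <= lower. */
static inline int in_ranges_old(const uint8_t *bounds, int len, uint8_t v)
{
    int i, hit = 0;

    for (i = 1; i < len; i += 2) {
        hit |= (bounds[i] <= v && bounds[i - 1] >= v);
    }
    return hit;
}

static inline void check_ranges(void)
{
    /* Two ranges: 'a'..'z' and 'A'..'Z'. */
    static const uint8_t bounds[4] = { 'a', 'z', 'A', 'Z' };

    assert(in_ranges_fixed(bounds, 4, 'm') == 1);
    assert(in_ranges_fixed(bounds, 4, 'M') == 1);
    assert(in_ranges_fixed(bounds, 4, '5') == 0);
    assert(in_ranges_old(bounds, 4, 'm') == 0);
}
```

With the old comparisons, a well-formed range such as 'a'..'z' can only match when both bounds are equal to the tested value, so typical character-class searches never hit.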


* [Qemu-devel] [PATCH 06/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal each" mode
  2013-03-26 19:01 [Qemu-devel] [PATCH 00/10] target-i386: fix and enable SSE4.1 and SSE4.2 Aurelien Jarno
                   ` (4 preceding siblings ...)
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 05/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Ranges" mode Aurelien Jarno
@ 2013-03-26 19:01 ` Aurelien Jarno
  2013-03-27 20:15   ` Richard Henderson
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 07/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal ordered" mode Aurelien Jarno
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Aurelien Jarno @ 2013-03-26 19:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

In "Equal each" mode, pcmpXstrX instructions force the result to true
for element pairs in which both elements are invalid. This means
(upper - MAX(valids, validd)) bits should be set to 1, not
(upper - MAX(valids, validd) + 1).

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/ops_sse.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index 51c5fc9..2fc5fdd 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -2025,7 +2025,7 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
         }
         break;
     case 2:
-        res = (2 << (upper - MAX(valids, validd))) - 1;
+        res = (1 << (upper - MAX(valids, validd))) - 1;
         res <<= MAX(valids, validd) - MIN(valids, validd);
         for (i = MIN(valids, validd); i >= 0; i--) {
             res <<= 1;
-- 
1.7.10.4
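
The bit count can be checked in isolation. Elements with an index above MAX(valids, validd) are invalid in both operands, so there are upper - MAX(valids, validd) of them. Below is a minimal sketch comparing the two expressions; the function names are made up for this example, and unsigned constants are used so the shifts are well defined in standalone C.

```c
#include <assert.h>

/* Old expression: sets (upper - max_valid + 1) low bits. */
static inline unsigned both_invalid_old(int upper, int max_valid)
{
    return (2u << (upper - max_valid)) - 1;
}

/* New expression: sets exactly (upper - max_valid) low bits. */
static inline unsigned both_invalid_new(int upper, int max_valid)
{
    return (1u << (upper - max_valid)) - 1;
}

static inline void check_both_invalid(void)
{
    /* 16 byte elements (upper = 15), last valid index 11:
     * indexes 12..15 are invalid in both strings, i.e. 4 bits. */
    assert(both_invalid_new(15, 11) == 0xf);
    assert(both_invalid_old(15, 11) == 0x1f);   /* one bit too many */
}
```

Since the following two steps in the helper shift in MAX - MIN zero bits and then MIN + 1 compared bits, the new expression makes the total come out at upper + 1 bits, one per element.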


* [Qemu-devel] [PATCH 07/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal ordered" mode
  2013-03-26 19:01 [Qemu-devel] [PATCH 00/10] target-i386: fix and enable SSE4.1 and SSE4.2 Aurelien Jarno
                   ` (5 preceding siblings ...)
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 06/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal each" mode Aurelien Jarno
@ 2013-03-26 19:01 ` Aurelien Jarno
  2013-03-27 20:19   ` Richard Henderson
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 08/10] target-i386: SSE4.2: fix pcmpXstrX instructions with "Masked(-)" polarity Aurelien Jarno
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Aurelien Jarno @ 2013-03-26 19:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

The inner loop should only change the current bit of the result, instead
of the whole result.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/ops_sse.h |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index 2fc5fdd..77ab410 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -2036,10 +2036,11 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
     case 3:
         for (j = valids - validd; j >= 0; j--) {
             res <<= 1;
-            res |= 1;
+            v = 1;
             for (i = MIN(upper - j, validd); i >= 0; i--) {
-                res &= (pcmp_val(s, ctrl, i + j) == pcmp_val(d, ctrl, i));
+                v &= (pcmp_val(s, ctrl, i + j) == pcmp_val(d, ctrl, i));
             }
+            res |= v;
         }
         break;
     }
-- 
1.7.10.4
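
To see how the old inner loop destroys earlier result bits, the two variants can be run side by side on plain C strings. This is a minimal, simplified sketch: equal_ordered and its parameters are hypothetical names, only byte elements are handled, and only start offsets where the whole needle fits in the haystack are considered, mirroring the loop bounds in the hunk above.

```c
#include <assert.h>

/* Bit j of the result is set if needle matches hay at offset j.
 * fixed != 0 selects the patched accumulation, fixed == 0 the old one. */
static inline unsigned equal_ordered(const char *needle, int nlen,
                                     const char *hay, int hlen, int fixed)
{
    unsigned res = 0;
    int i, j;

    for (j = hlen - nlen; j >= 0; j--) {
        res <<= 1;
        if (fixed) {
            unsigned v = 1;   /* per-offset accumulator */

            for (i = nlen - 1; i >= 0; i--) {
                v &= (hay[i + j] == needle[i]);
            }
            res |= v;
        } else {
            res |= 1;
            for (i = nlen - 1; i >= 0; i--) {
                /* ANDing with 0 or 1 also clears all higher bits. */
                res &= (hay[i + j] == needle[i]);
            }
        }
    }
    return res;
}

static inline void check_equal_ordered(void)
{
    /* "ab" occurs in "abcab" at offsets 0 and 3. */
    assert(equal_ordered("ab", 2, "abcab", 5, 1) == 0x9);
    /* The old loop only keeps the match at offset 0. */
    assert(equal_ordered("ab", 2, "abcab", 5, 0) == 0x1);
}
```

Because the old code ANDs the whole of res with a 0/1 value, every bit above bit 0 is lost on each comparison, so at most the lowest offset can ever be reported.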


* [Qemu-devel] [PATCH 08/10] target-i386: SSE4.2: fix pcmpXstrX instructions with "Masked(-)" polarity
  2013-03-26 19:01 [Qemu-devel] [PATCH 00/10] target-i386: fix and enable SSE4.1 and SSE4.2 Aurelien Jarno
                   ` (6 preceding siblings ...)
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 07/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal ordered" mode Aurelien Jarno
@ 2013-03-26 19:01 ` Aurelien Jarno
  2013-03-27 20:20   ` Richard Henderson
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 09/10] target-i386: enable SSE4.1 and SSE4.2 in TCG mode Aurelien Jarno
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 10/10] target-i386: SSE4.2: use clz32/ctz32 instead of reinventing the wheel Aurelien Jarno
  9 siblings, 1 reply; 21+ messages in thread
From: Aurelien Jarno @ 2013-03-26 19:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

valids can be equal to -1 if the reg/mem string is empty. Change the
expression to have an empty xor mask in that case.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/ops_sse.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index 77ab410..a0bac07 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -2050,7 +2050,7 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
         res ^= (2 << upper) - 1;
         break;
     case 3:
-        res ^= (2 << valids) - 1;
+        res ^= (1 << (valids + 1)) - 1;
         break;
     }
 
-- 
1.7.10.4
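
The new expression can be checked on its own. For valids >= 0 both forms give the same mask. For valids == -1 the old form shifts by a negative count, which is undefined behaviour in C, so no particular result can be relied upon. The sketch below uses made-up function names and unsigned constants, and the old form is only ever called with non-negative arguments.

```c
#include <assert.h>

/* New expression: valids + 1 low bits set, none when valids == -1. */
static inline unsigned valid_mask(int valids)
{
    return (1u << (valids + 1)) - 1;
}

/* Old expression, only meaningful for valids >= 0. */
static inline unsigned valid_mask_old(int valids)
{
    return (2u << valids) - 1;
}

/* "Masked(-)" polarity: negate only the bits of valid elements. */
static inline unsigned masked_negative(unsigned res, int valids)
{
    return res ^ valid_mask(valids);
}

static inline void check_valid_mask(void)
{
    assert(valid_mask(-1) == 0x0);          /* empty string: empty mask */
    assert(valid_mask(3) == valid_mask_old(3));
    assert(masked_negative(0x5, 3) == 0xa);
    assert(masked_negative(0x5, -1) == 0x5);
}
```

With an empty mask the intermediate result passes through unchanged, which is the behaviour the commit message asks for when the reg/mem string is empty.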


* [Qemu-devel] [PATCH 09/10] target-i386: enable SSE4.1 and SSE4.2 in TCG mode
  2013-03-26 19:01 [Qemu-devel] [PATCH 00/10] target-i386: fix and enable SSE4.1 and SSE4.2 Aurelien Jarno
                   ` (7 preceding siblings ...)
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 08/10] target-i386: SSE4.2: fix pcmpXstrX instructions with "Masked(-)" polarity Aurelien Jarno
@ 2013-03-26 19:01 ` Aurelien Jarno
  2013-03-27 20:22   ` Richard Henderson
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 10/10] target-i386: SSE4.2: use clz32/ctz32 instead of reinventing the wheel Aurelien Jarno
  9 siblings, 1 reply; 21+ messages in thread
From: Aurelien Jarno @ 2013-03-26 19:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/cpu.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index a0640db..4b43759 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -388,16 +388,17 @@ typedef struct x86_def_t {
           /* missing:
           CPUID_VME, CPUID_DTS, CPUID_SS, CPUID_HT, CPUID_TM, CPUID_PBE */
 #define TCG_EXT_FEATURES (CPUID_EXT_SSE3 | CPUID_EXT_MONITOR | \
-          CPUID_EXT_SSSE3 | CPUID_EXT_CX16 | CPUID_EXT_POPCNT | \
-          CPUID_EXT_MOVBE | CPUID_EXT_HYPERVISOR)
+          CPUID_EXT_SSSE3 | CPUID_EXT_CX16 | CPUID_EXT_SSE41 | \
+          CPUID_EXT_SSE42 | CPUID_EXT_POPCNT | CPUID_EXT_MOVBE | \
+          CPUID_EXT_HYPERVISOR)
           /* missing:
           CPUID_EXT_PCLMULQDQ, CPUID_EXT_DTES64, CPUID_EXT_DSCPL,
           CPUID_EXT_VMX, CPUID_EXT_SMX, CPUID_EXT_EST, CPUID_EXT_TM2,
           CPUID_EXT_CID, CPUID_EXT_FMA, CPUID_EXT_XTPR, CPUID_EXT_PDCM,
-          CPUID_EXT_PCID, CPUID_EXT_DCA, CPUID_EXT_SSE41, CPUID_EXT_SSE42,
-          CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER, CPUID_EXT_AES,
-          CPUID_EXT_XSAVE, CPUID_EXT_OSXSAVE, CPUID_EXT_AVX,
-          CPUID_EXT_F16C, CPUID_EXT_RDRAND */
+          CPUID_EXT_PCID, CPUID_EXT_DCA, CPUID_EXT_X2APIC,
+          CPUID_EXT_TSC_DEADLINE_TIMER, CPUID_EXT_AES, CPUID_EXT_XSAVE,
+          CPUID_EXT_OSXSAVE, CPUID_EXT_AVX, CPUID_EXT_F16C,
+          CPUID_EXT_RDRAND */
 #define TCG_EXT2_FEATURES ((TCG_FEATURES & CPUID_EXT2_AMD_ALIASES) | \
           CPUID_EXT2_NX | CPUID_EXT2_MMXEXT | CPUID_EXT2_RDTSCP | \
           CPUID_EXT2_3DNOW | CPUID_EXT2_3DNOWEXT)
-- 
1.7.10.4


* [Qemu-devel] [PATCH 10/10] target-i386: SSE4.2: use clz32/ctz32 instead of reinventing the wheel
  2013-03-26 19:01 [Qemu-devel] [PATCH 00/10] target-i386: fix and enable SSE4.1 and SSE4.2 Aurelien Jarno
                   ` (8 preceding siblings ...)
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 09/10] target-i386: enable SSE4.1 and SSE4.2 in TCG mode Aurelien Jarno
@ 2013-03-26 19:01 ` Aurelien Jarno
  2013-03-27 20:22   ` Richard Henderson
  9 siblings, 1 reply; 21+ messages in thread
From: Aurelien Jarno @ 2013-03-26 19:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/fpu_helper.c |    1 +
 target-i386/ops_sse.h    |   32 ++------------------------------
 2 files changed, 3 insertions(+), 30 deletions(-)

diff --git a/target-i386/fpu_helper.c b/target-i386/fpu_helper.c
index 44f3d27..29a8fb6 100644
--- a/target-i386/fpu_helper.c
+++ b/target-i386/fpu_helper.c
@@ -20,6 +20,7 @@
 #include <math.h>
 #include "cpu.h"
 #include "helper.h"
+#include "qemu/host-utils.h"
 
 #if !defined(CONFIG_USER_ONLY)
 #include "exec/softmmu_exec.h"
diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index a0bac07..a11dba1 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -2064,34 +2064,6 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
     return res;
 }
 
-static inline int rffs1(unsigned int val)
-{
-    int ret = 1, hi;
-
-    for (hi = sizeof(val) * 4; hi; hi /= 2) {
-        if (val >> hi) {
-            val >>= hi;
-            ret += hi;
-        }
-    }
-
-    return ret;
-}
-
-static inline int ffs1(unsigned int val)
-{
-    int ret = 1, hi;
-
-    for (hi = sizeof(val) * 4; hi; hi /= 2) {
-        if (val << hi) {
-            val <<= hi;
-            ret += hi;
-        }
-    }
-
-    return ret;
-}
-
 void glue(helper_pcmpestri, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                     uint32_t ctrl)
 {
@@ -2100,7 +2072,7 @@ void glue(helper_pcmpestri, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                  pcmp_elen(env, R_EAX, ctrl));
 
     if (res) {
-        env->regs[R_ECX] = (ctrl & (1 << 6)) ? rffs1(res) - 1 : 32 - ffs1(res);
+        env->regs[R_ECX] = (ctrl & (1 << 6)) ? 31 - clz32(res) : ctz32(res);
     } else {
         env->regs[R_ECX] = 16 >> (ctrl & (1 << 0));
     }
@@ -2138,7 +2110,7 @@ void glue(helper_pcmpistri, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                  pcmp_ilen(d, ctrl));
 
     if (res) {
-        env->regs[R_ECX] = (ctrl & (1 << 6)) ? rffs1(res) - 1 : 32 - ffs1(res);
+        env->regs[R_ECX] = (ctrl & (1 << 6)) ? 31 - clz32(res) : ctz32(res);
     } else {
         env->regs[R_ECX] = 16 >> (ctrl & (1 << 0));
     }
-- 
1.7.10.4
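
That the replacement is behaviour-preserving for nonzero res can be checked exhaustively over the 16 result bits. The following is a minimal sketch: my_clz32 and my_ctz32 are stand-ins built on the GCC/Clang builtins, which is an assumption of this example (QEMU's own clz32/ctz32 live in qemu/host-utils.h), and a 32-bit unsigned int is assumed.

```c
#include <assert.h>

/* Copies of the helpers removed by this patch. */
static inline int rffs1(unsigned int val)
{
    int ret = 1, hi;

    for (hi = sizeof(val) * 4; hi; hi /= 2) {
        if (val >> hi) {
            val >>= hi;
            ret += hi;
        }
    }
    return ret;
}

static inline int ffs1(unsigned int val)
{
    int ret = 1, hi;

    for (hi = sizeof(val) * 4; hi; hi /= 2) {
        if (val << hi) {
            val <<= hi;
            ret += hi;
        }
    }
    return ret;
}

/* Stand-ins for clz32()/ctz32(); only called with nonzero input here,
 * since the builtins are undefined for 0. */
static inline int my_clz32(unsigned int v) { return __builtin_clz(v); }
static inline int my_ctz32(unsigned int v) { return __builtin_ctz(v); }

/* Nonzero if the old and new ECX expressions agree for this res. */
static inline int helpers_agree(unsigned int res)
{
    return rffs1(res) - 1 == 31 - my_clz32(res) &&
           32 - ffs1(res) == my_ctz32(res);
}
```

The callers in the patch only reach these expressions inside the if (res) branch, so the zero case never arises.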


* Re: [Qemu-devel] [PATCH 01/10] target-i386: SSE4.1: fix pinsrb instruction
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 01/10] target-i386: SSE4.1: fix pinsrb instruction Aurelien Jarno
@ 2013-03-27 20:03   ` Richard Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2013-03-27 20:03 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 03/26/2013 12:01 PM, Aurelien Jarno wrote:
> gen_op_mov_TN_reg() loads the value in cpu_T[0], so this temporary should
> be used instead of cpu_tmp0.
> 
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-i386/translate.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 02/10] target-i386: SSE4.2: fix pcmpgtq instruction
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 02/10] target-i386: SSE4.2: fix pcmpgtq instruction Aurelien Jarno
@ 2013-03-27 20:03   ` Richard Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2013-03-27 20:03 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 03/26/2013 12:01 PM, Aurelien Jarno wrote:
> The "Intel 64 and IA-32 Architectures Software Developer's Manual" (at
> least recent versions) clearly says that the comparison is signed.
> 
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-i386/ops_sse.h |    3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 03/10] target-i386: SSE4.2: fix pcmpXstri instructions
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 03/10] target-i386: SSE4.2: fix pcmpXstri instructions Aurelien Jarno
@ 2013-03-27 20:10   ` Richard Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2013-03-27 20:10 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 03/26/2013 12:01 PM, Aurelien Jarno wrote:
> ffs1 returns the position of the least significant bit set to one,
> counted from the most significant bit.
> 
> pcmpXstri needs the index of that bit counted from the least significant
> bit, so the result has to be converted with 32 - ffs1(res).
> 
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-i386/ops_sse.h |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

I wonder if this ought not just be squashed with patch 10.
It would have made it easier to review, actually.  That said,

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 04/10] target-i386: SSE4.2: fix pcmpXstrm instructions
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 04/10] target-i386: SSE4.2: fix pcmpXstrm instructions Aurelien Jarno
@ 2013-03-27 20:10   ` Richard Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2013-03-27 20:10 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 03/26/2013 12:01 PM, Aurelien Jarno wrote:
> pcmpXstrm instructions return their result in the XMM0 register and
> not in the first operand.
> 
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-i386/ops_sse.h |   16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 05/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Ranges" mode
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 05/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Ranges" mode Aurelien Jarno
@ 2013-03-27 20:14   ` Richard Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2013-03-27 20:14 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 03/26/2013 12:01 PM, Aurelien Jarno wrote:
> Fix the order of the comparisons to match the "Intel 64 and
> IA-32 Architectures Software Developer's Manual".
> 
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-i386/ops_sse.h |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 06/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal each" mode
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 06/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal each" mode Aurelien Jarno
@ 2013-03-27 20:15   ` Richard Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2013-03-27 20:15 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 03/26/2013 12:01 PM, Aurelien Jarno wrote:
> In "Equal each" mode, pcmpXstrX instructions force the result to true
> for element pairs in which both elements are invalid. This means
> (upper - MAX(valids, validd)) bits should be set to 1, not
> (upper - MAX(valids, validd) + 1).
> 
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-i386/ops_sse.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 07/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal ordered" mode
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 07/10] target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal ordered" mode Aurelien Jarno
@ 2013-03-27 20:19   ` Richard Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2013-03-27 20:19 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 03/26/2013 12:01 PM, Aurelien Jarno wrote:
> The inner loop should only change the current bit of the result, instead
> of the whole result.
> 
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-i386/ops_sse.h |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 08/10] target-i386: SSE4.2: fix pcmpXstrX instructions with "Masked(-)" polarity
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 08/10] target-i386: SSE4.2: fix pcmpXstrX instructions with "Masked(-)" polarity Aurelien Jarno
@ 2013-03-27 20:20   ` Richard Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2013-03-27 20:20 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 03/26/2013 12:01 PM, Aurelien Jarno wrote:
> valids can be equal to -1 if the reg/mem string is empty. Change the
> expression to have an empty xor mask in that case.
> 
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-i386/ops_sse.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 09/10] target-i386: enable SSE4.1 and SSE4.2 in TCG mode
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 09/10] target-i386: enable SSE4.1 and SSE4.2 in TCG mode Aurelien Jarno
@ 2013-03-27 20:22   ` Richard Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2013-03-27 20:22 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 03/26/2013 12:01 PM, Aurelien Jarno wrote:
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-i386/cpu.c |   13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 10/10] target-i386: SSE4.2: use clz32/ctz32 instead of reinventing the wheel
  2013-03-26 19:01 ` [Qemu-devel] [PATCH 10/10] target-i386: SSE4.2: use clz32/ctz32 instead of reinventing the wheel Aurelien Jarno
@ 2013-03-27 20:22   ` Richard Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2013-03-27 20:22 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 03/26/2013 12:01 PM, Aurelien Jarno wrote:
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-i386/fpu_helper.c |    1 +
>  target-i386/ops_sse.h    |   32 ++------------------------------
>  2 files changed, 3 insertions(+), 30 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


