* [Qemu-devel] [PATCH 0/2] Improve alpha cmpbge
@ 2015-08-17 22:03 Richard Henderson
  2015-08-17 22:03 ` [Qemu-devel] [PATCH 1/2] target-alpha: Rewrite helper_cmpbge using bit tests Richard Henderson
  2015-08-17 22:03 ` [Qemu-devel] [PATCH 2/2] target-alpha: Special case cmpbge with zero Richard Henderson
  0 siblings, 2 replies; 4+ messages in thread
From: Richard Henderson @ 2015-08-17 22:03 UTC
  To: qemu-devel

This is practically as fast as the SSE version I posted a
while ago, without actually using vector data types.


r~


Richard Henderson (2):
  target-alpha: Rewrite helper_cmpbge using bit tests
  target-alpha: Special case cmpbge with zero

 target-alpha/helper.h     |  1 +
 target-alpha/int_helper.c | 50 +++++++++++++++++++++++++++++++++++------------
 target-alpha/translate.c  |  7 ++++++-
 3 files changed, 45 insertions(+), 13 deletions(-)

-- 
2.4.3

* [Qemu-devel] [PATCH 1/2] target-alpha: Rewrite helper_cmpbge using bit tests
  2015-08-17 22:03 [Qemu-devel] [PATCH 0/2] Improve alpha cmpbge Richard Henderson
@ 2015-08-17 22:03 ` Richard Henderson
  2015-08-17 22:03 ` [Qemu-devel] [PATCH 2/2] target-alpha: Special case cmpbge with zero Richard Henderson
  1 sibling, 0 replies; 4+ messages in thread
From: Richard Henderson @ 2015-08-17 22:03 UTC
  To: qemu-devel

Not quite as good as using a proper host vector compare,
but certainly better than a loop.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-alpha/int_helper.c | 39 ++++++++++++++++++++++++++-------------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/target-alpha/int_helper.c b/target-alpha/int_helper.c
index 74f38cb..4a6e955 100644
--- a/target-alpha/int_helper.c
+++ b/target-alpha/int_helper.c
@@ -58,20 +58,33 @@ uint64_t helper_zap(uint64_t val, uint64_t mask)
     return helper_zapnot(val, ~mask);
 }
 
-uint64_t helper_cmpbge(uint64_t op1, uint64_t op2)
+uint64_t helper_cmpbge(uint64_t a, uint64_t b)
 {
-    uint8_t opa, opb, res;
-    int i;
-
-    res = 0;
-    for (i = 0; i < 8; i++) {
-        opa = op1 >> (i * 8);
-        opb = op2 >> (i * 8);
-        if (opa >= opb) {
-            res |= 1 << i;
-        }
-    }
-    return res;
+    uint64_t mask = 0x00ff00ff00ff00ffULL;
+    uint64_t test = 0x0100010001000100ULL;
+    uint64_t al, ah, bl, bh, cl, ch;
+
+    /* Separate the bytes to avoid false positives.  */
+    al = a & mask;
+    bl = b & mask;
+    ah = (a >> 8) & mask;
+    bh = (b >> 8) & mask;
+
+    /* "Compare".  If a byte in B is greater than a byte in A,
+       it will clear the test bit.  */
+    cl = ((al | test) - bl) & test;
+    ch = ((ah | test) - bh) & test;
+
+    /* Fold all of the test bits into a contiguous set.  */
+    /* ch=.......a...............c...............e...............g........ */
+    /* cl=.......b...............d...............f...............h........ */
+    cl += ch << 1;
+    /* cl=......ab..............cd..............ef..............gh........ */
+    cl |= cl << 14;
+    /* cl=......abcd............cdef............efgh............gh........ */
+    cl |= cl << 28;
+    /* cl=......abcdefgh........cdefgh..........efgh............gh........ */
+    return cl >> 50;
 }
 
 uint64_t helper_minub8(uint64_t op1, uint64_t op2)
-- 
2.4.3
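
As a sanity check on the rewrite above, here is a minimal standalone sketch
that compares the bit-test version against the original loop.  The names
cmpbge_loop, cmpbge_bits and check_cmpbge are made up for this illustration
and do not appear in the patch; the xorshift sweep is only a convenient way
to generate operands, not a thorough test.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the byte-at-a-time loop that the patch removes.  */
static uint64_t cmpbge_loop(uint64_t op1, uint64_t op2)
{
    uint64_t res = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t opa = op1 >> (i * 8);
        uint8_t opb = op2 >> (i * 8);
        if (opa >= opb) {
            res |= 1ULL << i;
        }
    }
    return res;
}

/* The bit-test version, transcribed from the patch.  */
static uint64_t cmpbge_bits(uint64_t a, uint64_t b)
{
    uint64_t mask = 0x00ff00ff00ff00ffULL;
    uint64_t test = 0x0100010001000100ULL;
    uint64_t al, ah, bl, bh, cl, ch;

    /* Even bytes and odd bytes each get their own 16-bit lane.  */
    al = a & mask;
    bl = b & mask;
    ah = (a >> 8) & mask;
    bh = (b >> 8) & mask;

    /* Bit 8 of a lane survives the subtraction only if a >= b there.  */
    cl = ((al | test) - bl) & test;
    ch = ((ah | test) - bh) & test;

    /* Gather the eight test bits into bits 50..57, then shift down.  */
    cl += ch << 1;
    cl |= cl << 14;
    cl |= cl << 28;
    return cl >> 50;
}

/* Hypothetical self-check, not part of the patch.  */
static void check_cmpbge(void)
{
    uint64_t x = 0x0123456789abcdefULL;
    uint64_t y = 0xfedcba9876543210ULL;

    /* Hand-picked edge cases: all equal, all less, alternating.  */
    assert(cmpbge_bits(0, 0) == cmpbge_loop(0, 0));
    assert(cmpbge_bits(0, ~0ULL) == cmpbge_loop(0, ~0ULL));
    assert(cmpbge_bits(0x00ff00ff00ff00ffULL, 0xff00ff00ff00ff00ULL)
           == cmpbge_loop(0x00ff00ff00ff00ffULL, 0xff00ff00ff00ff00ULL));

    for (int i = 0; i < 100000; i++) {
        assert(cmpbge_bits(x, y) == cmpbge_loop(x, y));
        /* xorshift64 steps; any mixing function would do here.  */
        x ^= x << 13; x ^= x >> 7; x ^= x << 17;
        y ^= y << 13; y ^= y >> 7; y ^= y << 17;
    }
}
```

Calling check_cmpbge() from a small main() is enough to exercise it.  The
16-bit lanes are what keep a borrow in one byte from leaking into the test
bit of its neighbour.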

* [Qemu-devel] [PATCH 2/2] target-alpha: Special case cmpbge with zero
  2015-08-17 22:03 [Qemu-devel] [PATCH 0/2] Improve alpha cmpbge Richard Henderson
  2015-08-17 22:03 ` [Qemu-devel] [PATCH 1/2] target-alpha: Rewrite helper_cmpbge using bit tests Richard Henderson
@ 2015-08-17 22:03 ` Richard Henderson
  2015-08-18 16:14   ` Richard Henderson
  1 sibling, 1 reply; 4+ messages in thread
From: Richard Henderson @ 2015-08-17 22:03 UTC
  To: qemu-devel

Knowing the comparator is zero leads to a simpler operation.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-alpha/helper.h     |  1 +
 target-alpha/int_helper.c | 13 +++++++++++++
 target-alpha/translate.c  |  7 ++++++-
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/target-alpha/helper.h b/target-alpha/helper.h
index d221f0d..83cbe2a 100644
--- a/target-alpha/helper.h
+++ b/target-alpha/helper.h
@@ -10,6 +10,7 @@ DEF_HELPER_FLAGS_1(cttz, TCG_CALL_NO_RWG_SE, i64, i64)
 DEF_HELPER_FLAGS_2(zap, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(zapnot, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 
+DEF_HELPER_FLAGS_1(cmpbe0, TCG_CALL_NO_RWG_SE, i64, i64)
 DEF_HELPER_FLAGS_2(cmpbge, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 
 DEF_HELPER_FLAGS_2(minub8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
diff --git a/target-alpha/int_helper.c b/target-alpha/int_helper.c
index 4a6e955..900a7c6 100644
--- a/target-alpha/int_helper.c
+++ b/target-alpha/int_helper.c
@@ -58,6 +58,19 @@ uint64_t helper_zap(uint64_t val, uint64_t mask)
     return helper_zapnot(val, ~mask);
 }
 
+uint64_t helper_cmpbe0(uint64_t a)
+{
+    uint64_t c = (a - 0x0101010101010101ULL) & ~a & 0x8080808080808080ULL;
+    /* a.......b.......c.......d.......e.......f.......g.......h....... */
+    c |= c << 7;
+    /* ab......bc......cd......de......ef......fg......gh......h....... */
+    c |= c << 14;
+    /* abcd....bcde....cdef....defg....efgh....fgh.....gh......h....... */
+    c |= c << 28;
+    /* abcdefghbcdefgh.cdefgh..defgh...efgh....fgh.....gh......h....... */
+    return c >> 56;
+}
+
 uint64_t helper_cmpbge(uint64_t a, uint64_t b)
 {
     uint64_t mask = 0x00ff00ff00ff00ffULL;
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 81d4ff8..b766ae3 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -1507,7 +1507,12 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
             break;
         case 0x0F:
             /* CMPBGE */
-            gen_helper_cmpbge(vc, va, vb);
+            if (ra == 31) {
+                /* Special case 0 >= X as X == 0.  */
+                gen_helper_cmpbe0(vc, vb);
+            } else {
+                gen_helper_cmpbge(vc, va, vb);
+            }
             break;
         case 0x12:
             /* S8ADDL */
-- 
2.4.3

* Re: [Qemu-devel] [PATCH 2/2] target-alpha: Special case cmpbge with zero
  2015-08-17 22:03 ` [Qemu-devel] [PATCH 2/2] target-alpha: Special case cmpbge with zero Richard Henderson
@ 2015-08-18 16:14   ` Richard Henderson
  0 siblings, 0 replies; 4+ messages in thread
From: Richard Henderson @ 2015-08-18 16:14 UTC
  To: qemu-devel

On 08/17/2015 03:03 PM, Richard Henderson wrote:
> +    uint64_t c = (a - 0x0101010101010101ULL) & ~a & 0x8080808080808080ULL;

Ho hum.  I was misled.  This formulation is good for noticing that *some* byte
in a word is zero, but not which particular bytes are zero.  The difference is
hard to spot in how alpha tends to use this instruction, but it did lead to
bizarre filesystem behaviour.

Failure occurs when a zero byte precedes a byte with value one: the borrow out
of the zero byte turns the following 0x01 into 0xff, so that byte is reported
as zero too, e.g.

  000000010022656d
  0100016462687773

Avoiding the problem requires one extra operation.  E.g.

    uint64_t m = 0x7f7f7f7f7f7f7f7fULL;
    uint64_t c = ~(((a & m) + m) | a | m);

This equates to: (1) clear the high bit, (2) carry non-zero into the high bit,
(3) remerge the high bit from the source, (4) set the low bits and invert, so
that the high bit is set for each zero byte found.
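
The failure and the fix can be reproduced outside of QEMU.  The sketch below
is an illustration only: cmpbe0_loop, cmpbe0_borrow, cmpbe0_fixed and
fold_high_bits are hypothetical names, and pairing the corrected expression
with the shift-and-or fold from the original patch is an assumption about how
the two would be combined, not something stated in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: bit i of the result is set iff byte i of A is zero.  */
static uint64_t cmpbe0_loop(uint64_t a)
{
    uint64_t res = 0;
    for (int i = 0; i < 8; i++) {
        if ((uint8_t)(a >> (i * 8)) == 0) {
            res |= 1ULL << i;
        }
    }
    return res;
}

/* Fold the per-byte high bits (7, 15, ..., 63) into bits 56..63,
   using the same shifts as the patch, and shift them down.  */
static uint64_t fold_high_bits(uint64_t c)
{
    c |= c << 7;
    c |= c << 14;
    c |= c << 28;
    return c >> 56;
}

/* Formulation from the patch: a borrow can cross byte boundaries.  */
static uint64_t cmpbe0_borrow(uint64_t a)
{
    uint64_t c = (a - 0x0101010101010101ULL) & ~a & 0x8080808080808080ULL;
    return fold_high_bits(c);
}

/* Corrected formulation: the add never carries out of a byte.  */
static uint64_t cmpbe0_fixed(uint64_t a)
{
    uint64_t m = 0x7f7f7f7f7f7f7f7fULL;
    uint64_t c = ~(((a & m) + m) | a | m);
    return fold_high_bits(c);
}

/* Hypothetical self-check on the two failing words quoted above.  */
static void check_cmpbe0(void)
{
    uint64_t w1 = 0x000000010022656dULL;
    uint64_t w2 = 0x0100016462687773ULL;

    assert(cmpbe0_borrow(w1) != cmpbe0_loop(w1));  /* false positive */
    assert(cmpbe0_borrow(w2) != cmpbe0_loop(w2));  /* false positive */
    assert(cmpbe0_fixed(w1) == cmpbe0_loop(w1));
    assert(cmpbe0_fixed(w2) == cmpbe0_loop(w2));
}
```

With the two words quoted above, the borrow-based form reports 0xf8 and 0xc0
where the loop gives 0xe8 and 0x40; the corrected form agrees with the loop.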


r~
