qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH 0/8] Improve TCG optimizer
@ 2012-09-06 15:00 Aurelien Jarno
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 1/8] tcg: improve profiler Aurelien Jarno
                   ` (9 more replies)
  0 siblings, 10 replies; 15+ messages in thread
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

This patch series improves the TCG optimizer, based on patterns found
while executing various guests. The brcond and setcond constant folding
is especially useful when these ops are used to avoid some argument
values (e.g. division by 0), and can thus be optimized away when this
argument is a constant.

This brings around a 0.5% improvement on openssl-like benchmarks.

Aurelien Jarno (8):
  tcg: improve profiler
  tcg/optimize: split expression simplification
  tcg/optimize: simplify or/xor r, a, 0 cases
  tcg/optimize: simplify and r, a, 0 cases
  tcg/optimize: simplify shift/rot r, 0, a => movi r, 0 cases
  tcg/optimize: swap brcond/setcond arguments when possible
  tcg/optimize: add constant folding for setcond
  tcg/optimize: add constant folding for brcond

 tcg/optimize.c |  161 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 tcg/tcg.c      |   18 ++++---
 tcg/tcg.h      |    2 +-
 3 files changed, 170 insertions(+), 11 deletions(-)

-- 
1.7.10.4


* [Qemu-devel] [PATCH 1/8] tcg: improve profiler
  2012-09-06 15:00 [Qemu-devel] [PATCH 0/8] Improve TCG optimizer Aurelien Jarno
@ 2012-09-06 15:00 ` Aurelien Jarno
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 2/8] tcg/optimize: split expression simplification Aurelien Jarno
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Now that there are two passes of optimization (optimize.c, liveness)
there is no point in outputting the statistics of the liveness part
only. Update the code to take both optimizations into account.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 tcg/tcg.c |   18 ++++++++++--------
 tcg/tcg.h |    2 +-
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 8386b70..8907b9f 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2059,22 +2059,23 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
     }
 #endif
 
+#ifdef CONFIG_PROFILER
+    s->opt_time -= profile_getclock();
+#endif
+
 #ifdef USE_TCG_OPTIMIZATIONS
     gen_opparam_ptr =
         tcg_optimize(s, gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
 #endif
-
-#ifdef CONFIG_PROFILER
-    s->la_time -= profile_getclock();
-#endif
     tcg_liveness_analysis(s);
+
 #ifdef CONFIG_PROFILER
-    s->la_time += profile_getclock();
+    s->opt_time += profile_getclock();
 #endif
 
 #ifdef DEBUG_DISAS
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP_OPT))) {
-        qemu_log("OP after liveness analysis:\n");
+        qemu_log("OP after optimization:\n");
         tcg_dump_ops(s);
         qemu_log("\n");
     }
@@ -2241,8 +2242,9 @@ void tcg_dump_info(FILE *f, fprintf_function cpu_fprintf)
                 (double)s->interm_time / tot * 100.0);
     cpu_fprintf(f, "  gen_code time     %0.1f%%\n", 
                 (double)s->code_time / tot * 100.0);
-    cpu_fprintf(f, "liveness/code time  %0.1f%%\n", 
-                (double)s->la_time / (s->code_time ? s->code_time : 1) * 100.0);
+    cpu_fprintf(f, "optim./code time    %0.1f%%\n",
+                (double)s->opt_time / (s->code_time ? s->code_time : 1)
+                * 100.0);
     cpu_fprintf(f, "cpu_restore count   %" PRId64 "\n",
                 s->restore_count);
     cpu_fprintf(f, "  avg cycles        %0.1f\n",
diff --git a/tcg/tcg.h b/tcg/tcg.h
index d710694..7d63db5 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -381,7 +381,7 @@ struct TCGContext {
     int64_t code_out_len;
     int64_t interm_time;
     int64_t code_time;
-    int64_t la_time;
+    int64_t opt_time;
     int64_t restore_count;
     int64_t restore_time;
 #endif
-- 
1.7.10.4
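
The accumulate-by-difference idiom used in the hunk above (subtract the
clock before the region, add it back afterwards) can be sketched in
isolation. This is only an illustration: toy_profile, toy_getclock and
the other toy_* names are hypothetical, and clock() is an assumed
stand-in for the profiler's clock source, not what QEMU uses.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical counters, loosely modelled on opt_time and code_time. */
struct toy_profile {
    int64_t opt_time;
    int64_t code_time;
};

/* Assumption: clock() stands in for the profiler's clock source. */
static int64_t toy_getclock(void)
{
    return (int64_t)clock();
}

/* Start of the timed region: subtract the current clock value. */
static void toy_profile_begin(int64_t *acc)
{
    *acc -= toy_getclock();
}

/* End of the timed region: add the clock back, leaving the elapsed delta. */
static void toy_profile_end(int64_t *acc)
{
    *acc += toy_getclock();
}

/* Same ratio as the "optim./code time" line, guarding against division by 0. */
static double toy_opt_percent(const struct toy_profile *p)
{
    return (double)p->opt_time / (p->code_time ? p->code_time : 1) * 100.0;
}
```

Because the accumulator only ever receives deltas, a single begin/end
pair can span several passes, which is what moving the first clock read
above the optimizer call achieves.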


* [Qemu-devel] [PATCH 2/8] tcg/optimize: split expression simplification
  2012-09-06 15:00 [Qemu-devel] [PATCH 0/8] Improve TCG optimizer Aurelien Jarno
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 1/8] tcg: improve profiler Aurelien Jarno
@ 2012-09-06 15:00 ` Aurelien Jarno
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 3/8] tcg/optimize: simplify or/xor r, a, 0 cases Aurelien Jarno
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Split expression simplification into multiple parts so that a given op
can appear multiple times. This patch should not change anything.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 tcg/optimize.c |   14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 9c65474..63f970d 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -322,7 +322,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             break;
         }
 
-        /* Simplify expression if possible. */
+        /* Simplify expression for "op r, a, 0 => mov r, a" cases */
         switch (op) {
         CASE_OP_32_64(add):
         CASE_OP_32_64(sub):
@@ -352,6 +352,12 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                 continue;
             }
             break;
+        default:
+            break;
+        }
+
+        /* Simplify expression for "op r, a, 0 => movi r, 0" cases */
+        switch (op) {
         CASE_OP_32_64(mul):
             if ((temps[args[2]].state == TCG_TEMP_CONST
                 && temps[args[2]].val == 0)) {
@@ -362,6 +368,12 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                 continue;
             }
             break;
+        default:
+            break;
+        }
+
+        /* Simplify expression for "op r, a, a => mov r, a" cases */
+        switch (op) {
         CASE_OP_32_64(or):
         CASE_OP_32_64(and):
             if (args[1] == args[2]) {
-- 
1.7.10.4


* [Qemu-devel] [PATCH 3/8] tcg/optimize: simplify or/xor r, a, 0 cases
  2012-09-06 15:00 [Qemu-devel] [PATCH 0/8] Improve TCG optimizer Aurelien Jarno
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 1/8] tcg: improve profiler Aurelien Jarno
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 2/8] tcg/optimize: split expression simplification Aurelien Jarno
@ 2012-09-06 15:00 ` Aurelien Jarno
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 4/8] tcg/optimize: simplify and " Aurelien Jarno
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

or/xor r, a, 0 is equivalent to a mov r, a.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 tcg/optimize.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 63f970d..0db849e 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -331,6 +331,8 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
         CASE_OP_32_64(sar):
         CASE_OP_32_64(rotl):
         CASE_OP_32_64(rotr):
+        CASE_OP_32_64(or):
+        CASE_OP_32_64(xor):
             if (temps[args[1]].state == TCG_TEMP_CONST) {
                 /* Proceed with possible constant folding. */
                 break;
-- 
1.7.10.4


* [Qemu-devel] [PATCH 4/8] tcg/optimize: simplify and r, a, 0 cases
  2012-09-06 15:00 [Qemu-devel] [PATCH 0/8] Improve TCG optimizer Aurelien Jarno
                   ` (2 preceding siblings ...)
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 3/8] tcg/optimize: simplify or/xor r, a, 0 cases Aurelien Jarno
@ 2012-09-06 15:00 ` Aurelien Jarno
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 5/8] tcg/optimize: simplify shift/rot r, 0, a => movi r, " Aurelien Jarno
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

and r, a, 0 is equivalent to a movi r, 0.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 tcg/optimize.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 0db849e..c12cb2b 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -360,6 +360,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
 
         /* Simplify expression for "op r, a, 0 => movi r, 0" cases */
         switch (op) {
+        CASE_OP_32_64(and):
         CASE_OP_32_64(mul):
             if ((temps[args[2]].state == TCG_TEMP_CONST
                 && temps[args[2]].val == 0)) {
-- 
1.7.10.4
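
The two "op r, a, 0" classes handled so far can be summarised with a toy
classifier. This is a sketch under stated assumptions, not the optimizer
code: the toy_* enums and toy_simplify_zero() are hypothetical, and the
sketch ignores the extra check in the real code that skips the mov case
when the first source operand is itself a constant.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy opcodes; these are not QEMU's TCGOpcode values. */
enum toy_op { TOY_ADD, TOY_SUB, TOY_OR, TOY_XOR, TOY_AND, TOY_MUL };

enum toy_result { TOY_KEEP_OP, TOY_BECOMES_MOV, TOY_BECOMES_MOVI_0 };

/* Classify "op r, a, b" given what is known about the second operand b. */
static enum toy_result toy_simplify_zero(enum toy_op op, bool b_is_const,
                                         uint64_t b_val)
{
    if (!b_is_const || b_val != 0) {
        return TOY_KEEP_OP;             /* nothing known, or b != 0 */
    }
    switch (op) {
    case TOY_ADD:
    case TOY_SUB:
    case TOY_OR:
    case TOY_XOR:
        return TOY_BECOMES_MOV;         /* a op 0 == a */
    case TOY_AND:
    case TOY_MUL:
        return TOY_BECOMES_MOVI_0;      /* a op 0 == 0 */
    }
    return TOY_KEEP_OP;
}
```

For example, toy_simplify_zero(TOY_AND, true, 0) yields
TOY_BECOMES_MOVI_0, while the same op with a non-zero constant is kept.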


* [Qemu-devel] [PATCH 5/8] tcg/optimize: simplify shift/rot r, 0, a => movi r, 0 cases
  2012-09-06 15:00 [Qemu-devel] [PATCH 0/8] Improve TCG optimizer Aurelien Jarno
                   ` (3 preceding siblings ...)
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 4/8] tcg/optimize: simplify and " Aurelien Jarno
@ 2012-09-06 15:00 ` Aurelien Jarno
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 6/8] tcg/optimize: swap brcond/setcond arguments when possible Aurelien Jarno
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

shift/rot r, 0, a is equivalent to movi r, 0.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 tcg/optimize.c |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index c12cb2b..1698ba3 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -322,6 +322,26 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             break;
         }
 
+        /* Simplify expressions for "shift/rot r, 0, a => movi r, 0" */
+        switch (op) {
+        CASE_OP_32_64(shl):
+        CASE_OP_32_64(shr):
+        CASE_OP_32_64(sar):
+        CASE_OP_32_64(rotl):
+        CASE_OP_32_64(rotr):
+            if (temps[args[1]].state == TCG_TEMP_CONST
+                && temps[args[1]].val == 0) {
+                gen_opc_buf[op_index] = op_to_movi(op);
+                tcg_opt_gen_movi(gen_args, args[0], 0, nb_temps, nb_globals);
+                args += 3;
+                gen_args += 2;
+                continue;
+            }
+            break;
+        default:
+            break;
+        }
+
         /* Simplify expression for "op r, a, 0 => mov r, a" cases */
         switch (op) {
         CASE_OP_32_64(add):
-- 
1.7.10.4
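
The identity behind this patch, that shifting or rotating the value 0 by
any amount gives 0, can be checked with a small standalone sketch. The
toy_* helpers are hypothetical and only cover 32-bit values with shift
counts in the range 0..31.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate left by n bits (n is reduced modulo 32). */
static uint32_t toy_rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Rotate right by n bits (n is reduced modulo 32). */
static uint32_t toy_rotr32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Check that shl, shr, sar, rotl and rotr all map 0 to 0 for n in 0..31. */
static bool toy_zero_stays_zero(void)
{
    const uint32_t zero = 0;
    for (unsigned n = 0; n < 32; n++) {
        if ((zero << n) != 0 || (zero >> n) != 0 ||
            ((int32_t)zero >> n) != 0 ||
            toy_rotl32(zero, n) != 0 || toy_rotr32(zero, n) != 0) {
            return false;
        }
    }
    return true;
}
```

Since the result does not depend on the shift amount, the second source
operand does not need to be a known constant for the op to become a
movi.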


* [Qemu-devel] [PATCH 6/8] tcg/optimize: swap brcond/setcond arguments when possible
  2012-09-06 15:00 [Qemu-devel] [PATCH 0/8] Improve TCG optimizer Aurelien Jarno
                   ` (4 preceding siblings ...)
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 5/8] tcg/optimize: simplify shift/rot r, 0, a => movi r, " Aurelien Jarno
@ 2012-09-06 15:00 ` Aurelien Jarno
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 7/8] tcg/optimize: add constant folding for setcond Aurelien Jarno
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

brcond and setcond ops are not commutative, but it's easy to compute the
new condition after swapping the arguments. Try to always put the constant
argument in second position like for commutative ops, to help backends to
generate better code.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 tcg/optimize.c |   18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 1698ba3..7debc8a 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -318,6 +318,24 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                 args[2] = tmp;
             }
             break;
+        CASE_OP_32_64(brcond):
+            if (temps[args[0]].state == TCG_TEMP_CONST
+                && temps[args[1]].state != TCG_TEMP_CONST) {
+                tmp = args[0];
+                args[0] = args[1];
+                args[1] = tmp;
+                args[2] = tcg_swap_cond(args[2]);
+            }
+            break;
+        CASE_OP_32_64(setcond):
+            if (temps[args[1]].state == TCG_TEMP_CONST
+                && temps[args[2]].state != TCG_TEMP_CONST) {
+                tmp = args[1];
+                args[1] = args[2];
+                args[2] = tmp;
+                args[3] = tcg_swap_cond(args[3]);
+            }
+            break;
         default:
             break;
         }
-- 
1.7.10.4
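
The claim that the new condition is easy to compute after swapping the
operands can be illustrated as follows. This is a sketch only: toy_cond,
toy_swap_cond() and toy_eval() are hypothetical names, and nothing here
is meant to describe how tcg_swap_cond() is actually implemented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical condition codes for 32-bit comparisons. */
enum toy_cond {
    TOY_EQ, TOY_NE, TOY_LT, TOY_GE, TOY_LE, TOY_GT,
    TOY_LTU, TOY_GEU, TOY_LEU, TOY_GTU, TOY_COND_COUNT
};

/* Condition c' such that "x c y" is equivalent to "y c' x". */
static enum toy_cond toy_swap_cond(enum toy_cond c)
{
    switch (c) {
    case TOY_LT:  return TOY_GT;
    case TOY_GT:  return TOY_LT;
    case TOY_LE:  return TOY_GE;
    case TOY_GE:  return TOY_LE;
    case TOY_LTU: return TOY_GTU;
    case TOY_GTU: return TOY_LTU;
    case TOY_LEU: return TOY_GEU;
    case TOY_GEU: return TOY_LEU;
    default:      return c;     /* EQ and NE are symmetric */
    }
}

/* Evaluate "x c y" on 32-bit operands. */
static bool toy_eval(enum toy_cond c, uint32_t x, uint32_t y)
{
    switch (c) {
    case TOY_EQ:  return x == y;
    case TOY_NE:  return x != y;
    case TOY_LT:  return (int32_t)x < (int32_t)y;
    case TOY_GE:  return (int32_t)x >= (int32_t)y;
    case TOY_LE:  return (int32_t)x <= (int32_t)y;
    case TOY_GT:  return (int32_t)x > (int32_t)y;
    case TOY_LTU: return x < y;
    case TOY_GEU: return x >= y;
    case TOY_LEU: return x <= y;
    case TOY_GTU: return x > y;
    default:      return false;
    }
}
```

Note that swapping is not the same as inverting: "5 < x" becomes
"x > 5", not "x >= 5".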


* [Qemu-devel] [PATCH 7/8] tcg/optimize: add constant folding for setcond
  2012-09-06 15:00 [Qemu-devel] [PATCH 0/8] Improve TCG optimizer Aurelien Jarno
                   ` (5 preceding siblings ...)
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 6/8] tcg/optimize: swap brcond/setcond arguments when possible Aurelien Jarno
@ 2012-09-06 15:00 ` Aurelien Jarno
  2012-09-06 16:40   ` Richard Henderson
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 8/8] tcg/optimize: add constant folding for brcond Aurelien Jarno
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 15+ messages in thread
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 tcg/optimize.c |   79 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 7debc8a..c4af1e8 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -267,6 +267,65 @@ static TCGArg do_constant_folding(TCGOpcode op, TCGArg x, TCGArg y)
     return res;
 }
 
+static TCGArg do_constant_folding_cond(TCGOpcode op, TCGArg x,
+                                       TCGArg y, TCGCond c)
+{
+    switch (op_bits(op)) {
+    case 32:
+        switch (c) {
+        case TCG_COND_EQ:
+            return (uint32_t)x == (uint32_t)y;
+        case TCG_COND_NE:
+            return (uint32_t)x != (uint32_t)y;
+        case TCG_COND_LT:
+            return (int32_t)x < (int32_t)y;
+        case TCG_COND_GE:
+            return (int32_t)x >= (int32_t)y;
+        case TCG_COND_LE:
+            return (int32_t)x <= (int32_t)y;
+        case TCG_COND_GT:
+            return (int32_t)x > (int32_t)y;
+        case TCG_COND_LTU:
+            return (uint32_t)x < (uint32_t)y;
+        case TCG_COND_GEU:
+            return (uint32_t)x >= (uint32_t)y;
+        case TCG_COND_LEU:
+            return (uint32_t)x <= (uint32_t)y;
+        case TCG_COND_GTU:
+            return (uint32_t)x > (uint32_t)y;
+    }
+    case 64:
+        switch (c) {
+        case TCG_COND_EQ:
+            return (uint64_t)x == (uint64_t)y;
+        case TCG_COND_NE:
+            return (uint64_t)x != (uint64_t)y;
+        case TCG_COND_LT:
+            return (int64_t)x < (int64_t)y;
+        case TCG_COND_GE:
+            return (int64_t)x >= (int64_t)y;
+        case TCG_COND_LE:
+            return (int64_t)x <= (int64_t)y;
+        case TCG_COND_GT:
+            return (int64_t)x > (int64_t)y;
+        case TCG_COND_LTU:
+            return (uint64_t)x < (uint64_t)y;
+        case TCG_COND_GEU:
+            return (uint64_t)x >= (uint64_t)y;
+        case TCG_COND_LEU:
+            return (uint64_t)x <= (uint64_t)y;
+        case TCG_COND_GTU:
+            return (uint64_t)x > (uint64_t)y;
+    }
+    default:
+        fprintf(stderr,
+                "Unrecognized bitness %d or condition %d in "
+                "do_constant_folding_cond.\n", op_bits(op), c);
+        tcg_abort();
+    }
+}
+
+
 /* Propagate constants and copies, fold constant expressions. */
 static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                                     TCGArg *args, TCGOpDef *tcg_op_defs)
@@ -522,6 +581,26 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                 args += 3;
                 break;
             }
+        CASE_OP_32_64(setcond):
+            if (temps[args[1]].state == TCG_TEMP_CONST
+                && temps[args[2]].state == TCG_TEMP_CONST) {
+                gen_opc_buf[op_index] = op_to_movi(op);
+                tmp = do_constant_folding_cond(op, temps[args[1]].val,
+                                               temps[args[2]].val, args[3]);
+                tcg_opt_gen_movi(gen_args, args[0], tmp, nb_temps, nb_globals);
+                gen_args += 2;
+                args += 4;
+                break;
+            } else {
+                reset_temp(args[0], nb_temps, nb_globals);
+                gen_args[0] = args[0];
+                gen_args[1] = args[1];
+                gen_args[2] = args[2];
+                gen_args[3] = args[3];
+                gen_args += 4;
+                args += 4;
+                break;
+            }
         case INDEX_op_call:
             nb_call_args = (args[0] >> 16) + (args[0] & 0xffff);
             if (!(args[nb_call_args + 1] & (TCG_CALL_CONST | TCG_CALL_PURE))) {
-- 
1.7.10.4


* [Qemu-devel] [PATCH 8/8] tcg/optimize: add constant folding for brcond
  2012-09-06 15:00 [Qemu-devel] [PATCH 0/8] Improve TCG optimizer Aurelien Jarno
                   ` (6 preceding siblings ...)
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 7/8] tcg/optimize: add constant folding for setcond Aurelien Jarno
@ 2012-09-06 15:00 ` Aurelien Jarno
  2012-09-06 16:43 ` [Qemu-devel] [PATCH 0/8] Improve TCG optimizer Richard Henderson
  2012-09-07 12:34 ` Peter Maydell
  9 siblings, 0 replies; 15+ messages in thread
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 tcg/optimize.c |   27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index c4af1e8..1221b8b 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -601,6 +601,32 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                 args += 4;
                 break;
             }
+        CASE_OP_32_64(brcond):
+            if (temps[args[0]].state == TCG_TEMP_CONST
+                && temps[args[1]].state == TCG_TEMP_CONST) {
+                if (do_constant_folding_cond(op, temps[args[0]].val,
+                                             temps[args[1]].val, args[2])) {
+                    memset(temps, 0, nb_temps * sizeof(struct tcg_temp_info));
+                    gen_opc_buf[op_index] = INDEX_op_br;
+                    gen_args[0] = args[3];
+                    gen_args += 1;
+                    args += 4;
+                } else {
+                    gen_opc_buf[op_index] = INDEX_op_nop;
+                    args += 4;
+                }
+                break;
+            } else {
+                memset(temps, 0, nb_temps * sizeof(struct tcg_temp_info));
+                reset_temp(args[0], nb_temps, nb_globals);
+                gen_args[0] = args[0];
+                gen_args[1] = args[1];
+                gen_args[2] = args[2];
+                gen_args[3] = args[3];
+                gen_args += 4;
+                args += 4;
+                break;
+            }
         case INDEX_op_call:
             nb_call_args = (args[0] >> 16) + (args[0] & 0xffff);
             if (!(args[nb_call_args + 1] & (TCG_CALL_CONST | TCG_CALL_PURE))) {
@@ -622,7 +648,6 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
         case INDEX_op_set_label:
         case INDEX_op_jmp:
         case INDEX_op_br:
-        CASE_OP_32_64(brcond):
             memset(temps, 0, nb_temps * sizeof(struct tcg_temp_info));
             for (i = 0; i < def->nb_args; i++) {
                 *gen_args = *args;
-- 
1.7.10.4
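
The decision made by the new brcond case can be sketched with a toy
model of the division-by-zero guard mentioned in the cover letter. All
names (toy_temp, toy_fold_brcond, ...) are hypothetical, and only EQ and
NE are modelled to keep the example short.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of what the optimizer knows about a temporary. */
struct toy_temp {
    bool is_const;
    int64_t val;
};

enum toy_cond { TOY_EQ, TOY_NE };

enum toy_branch {
    TOY_KEEP_BRCOND,    /* outcome unknown: keep the conditional branch */
    TOY_TO_BR,          /* always taken: becomes an unconditional branch */
    TOY_TO_NOP          /* never taken: the branch disappears */
};

/* Decide what "brcond a, b, c, label" turns into. */
static enum toy_branch toy_fold_brcond(struct toy_temp a, struct toy_temp b,
                                       enum toy_cond c)
{
    bool taken;

    if (!a.is_const || !b.is_const) {
        return TOY_KEEP_BRCOND;
    }
    taken = (c == TOY_EQ) ? (a.val == b.val) : (a.val != b.val);
    return taken ? TOY_TO_BR : TOY_TO_NOP;
}
```

With a constant divisor of 4, a guard of the form "brcond divisor, 0,
eq, label" folds to TOY_TO_NOP, so the check costs nothing at run time.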


* Re: [Qemu-devel] [PATCH 7/8] tcg/optimize: add constant folding for setcond
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 7/8] tcg/optimize: add constant folding for setcond Aurelien Jarno
@ 2012-09-06 16:40   ` Richard Henderson
  2012-09-07 10:05     ` Aurelien Jarno
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Henderson @ 2012-09-06 16:40 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 09/06/2012 08:00 AM, Aurelien Jarno wrote:
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  tcg/optimize.c |   79 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 79 insertions(+)
> 
> diff --git a/tcg/optimize.c b/tcg/optimize.c
> index 7debc8a..c4af1e8 100644
> --- a/tcg/optimize.c
> +++ b/tcg/optimize.c
> @@ -267,6 +267,65 @@ static TCGArg do_constant_folding(TCGOpcode op, TCGArg x, TCGArg y)
>      return res;
>  }
>  
> +static TCGArg do_constant_folding_cond(TCGOpcode op, TCGArg x,
> +                                       TCGArg y, TCGCond c)
> +{
> +    switch (op_bits(op)) {
> +    case 32:
> +        switch (c) {
> +        case TCG_COND_EQ:
> +            return (uint32_t)x == (uint32_t)y;
> +        case TCG_COND_NE:
> +            return (uint32_t)x != (uint32_t)y;
> +        case TCG_COND_LT:
> +            return (int32_t)x < (int32_t)y;
> +        case TCG_COND_GE:
> +            return (int32_t)x >= (int32_t)y;
> +        case TCG_COND_LE:
> +            return (int32_t)x <= (int32_t)y;
> +        case TCG_COND_GT:
> +            return (int32_t)x > (int32_t)y;
> +        case TCG_COND_LTU:
> +            return (uint32_t)x < (uint32_t)y;
> +        case TCG_COND_GEU:
> +            return (uint32_t)x >= (uint32_t)y;
> +        case TCG_COND_LEU:
> +            return (uint32_t)x <= (uint32_t)y;
> +        case TCG_COND_GTU:
> +            return (uint32_t)x > (uint32_t)y;
> +    }
> +    case 64:
> +        switch (c) {
> +        case TCG_COND_EQ:
> +            return (uint64_t)x == (uint64_t)y;
> +        case TCG_COND_NE:
> +            return (uint64_t)x != (uint64_t)y;
> +        case TCG_COND_LT:
> +            return (int64_t)x < (int64_t)y;
> +        case TCG_COND_GE:
> +            return (int64_t)x >= (int64_t)y;
> +        case TCG_COND_LE:
> +            return (int64_t)x <= (int64_t)y;
> +        case TCG_COND_GT:
> +            return (int64_t)x > (int64_t)y;
> +        case TCG_COND_LTU:
> +            return (uint64_t)x < (uint64_t)y;
> +        case TCG_COND_GEU:
> +            return (uint64_t)x >= (uint64_t)y;
> +        case TCG_COND_LEU:
> +            return (uint64_t)x <= (uint64_t)y;
> +        case TCG_COND_GTU:
> +            return (uint64_t)x > (uint64_t)y;
> +    }
> +    default:
> +        fprintf(stderr,
> +                "Unrecognized bitness %d or condition %d in "
> +                "do_constant_folding_cond.\n", op_bits(op), c);
> +        tcg_abort();
> +    }

You probably don't want the default here, but the statements after
the outer switch, and with proper breaks between the two cases.
Otherwise the error doesn't do what you wanted it to do.



r~
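
The structure suggested in the review above (a break after each inner
switch, and the error path after the outer switch) might look roughly
like the following. This is a standalone sketch, not the version 2
patch: toy_cond and toy_fold_cond() are hypothetical, the bit width is
passed directly instead of being derived from the opcode, and abort()
stands in for tcg_abort().

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical condition codes; not QEMU's TCGCond values. */
enum toy_cond {
    TOY_COND_EQ, TOY_COND_NE, TOY_COND_LT, TOY_COND_GE, TOY_COND_LE,
    TOY_COND_GT, TOY_COND_LTU, TOY_COND_GEU, TOY_COND_LEU, TOY_COND_GTU
};

static uint64_t toy_fold_cond(int bits, uint64_t x, uint64_t y,
                              enum toy_cond c)
{
    switch (bits) {
    case 32:
        switch (c) {
        case TOY_COND_EQ:  return (uint32_t)x == (uint32_t)y;
        case TOY_COND_NE:  return (uint32_t)x != (uint32_t)y;
        case TOY_COND_LT:  return (int32_t)x < (int32_t)y;
        case TOY_COND_GE:  return (int32_t)x >= (int32_t)y;
        case TOY_COND_LE:  return (int32_t)x <= (int32_t)y;
        case TOY_COND_GT:  return (int32_t)x > (int32_t)y;
        case TOY_COND_LTU: return (uint32_t)x < (uint32_t)y;
        case TOY_COND_GEU: return (uint32_t)x >= (uint32_t)y;
        case TOY_COND_LEU: return (uint32_t)x <= (uint32_t)y;
        case TOY_COND_GTU: return (uint32_t)x > (uint32_t)y;
        }
        break;  /* unknown condition: do not fall into the 64-bit cases */
    case 64:
        switch (c) {
        case TOY_COND_EQ:  return x == y;
        case TOY_COND_NE:  return x != y;
        case TOY_COND_LT:  return (int64_t)x < (int64_t)y;
        case TOY_COND_GE:  return (int64_t)x >= (int64_t)y;
        case TOY_COND_LE:  return (int64_t)x <= (int64_t)y;
        case TOY_COND_GT:  return (int64_t)x > (int64_t)y;
        case TOY_COND_LTU: return x < y;
        case TOY_COND_GEU: return x >= y;
        case TOY_COND_LEU: return x <= y;
        case TOY_COND_GTU: return x > y;
        }
        break;
    }

    /* Reached for an unknown width or an unknown condition. */
    fprintf(stderr, "Unrecognized bitness %d or condition %d in "
            "toy_fold_cond.\n", bits, (int)c);
    abort();
}
```

In the patch as posted, an unrecognized condition in the 32-bit switch
would fall through into the 64-bit comparisons instead of reaching the
error message, which is the problem the review points out.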


* Re: [Qemu-devel] [PATCH 0/8] Improve TCG optimizer
  2012-09-06 15:00 [Qemu-devel] [PATCH 0/8] Improve TCG optimizer Aurelien Jarno
                   ` (7 preceding siblings ...)
  2012-09-06 15:00 ` [Qemu-devel] [PATCH 8/8] tcg/optimize: add constant folding for brcond Aurelien Jarno
@ 2012-09-06 16:43 ` Richard Henderson
  2012-09-07 10:06   ` Aurelien Jarno
  2012-09-07 12:34 ` Peter Maydell
  9 siblings, 1 reply; 15+ messages in thread
From: Richard Henderson @ 2012-09-06 16:43 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 09/06/2012 08:00 AM, Aurelien Jarno wrote:
> This patch series improves the TCG optimizer, based on patterns found
> while executing various guests. The brcond and setcond constant folding
> is especially useful when these ops are used to avoid some argument
> values (e.g. division by 0), and can thus be optimized away when this
> argument is a constant.
> 
> This brings around a 0.5% improvement on openssl-like benchmarks.
> 
> Aurelien Jarno (8):
>   tcg: improve profiler
>   tcg/optimize: split expression simplification
>   tcg/optimize: simplify or/xor r, a, 0 cases
>   tcg/optimize: simplify and r, a, 0 cases
>   tcg/optimize: simplify shift/rot r, 0, a => movi r, 0 cases
>   tcg/optimize: swap brcond/setcond arguments when possible
>   tcg/optimize: add constant folding for setcond
>   tcg/optimize: add constant folding for brcond

Patches 1-6,8:
Reviewed-by: Richard Henderson <rth@twiddle.net>

Patch 7 contains a trivial error.  With that fixed it could
also bear my Reviewed-by mark.


r~


* Re: [Qemu-devel] [PATCH 7/8] tcg/optimize: add constant folding for setcond
  2012-09-06 16:40   ` Richard Henderson
@ 2012-09-07 10:05     ` Aurelien Jarno
  0 siblings, 0 replies; 15+ messages in thread
From: Aurelien Jarno @ 2012-09-07 10:05 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Thu, Sep 06, 2012 at 09:40:57AM -0700, Richard Henderson wrote:
> On 09/06/2012 08:00 AM, Aurelien Jarno wrote:
> > Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> > ---
> >  tcg/optimize.c |   79 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 79 insertions(+)
> > 
> > diff --git a/tcg/optimize.c b/tcg/optimize.c
> > index 7debc8a..c4af1e8 100644
> > --- a/tcg/optimize.c
> > +++ b/tcg/optimize.c
> > @@ -267,6 +267,65 @@ static TCGArg do_constant_folding(TCGOpcode op, TCGArg x, TCGArg y)
> >      return res;
> >  }
> >  
> > +static TCGArg do_constant_folding_cond(TCGOpcode op, TCGArg x,
> > +                                       TCGArg y, TCGCond c)
> > +{
> > +    switch (op_bits(op)) {
> > +    case 32:
> > +        switch (c) {
> > +        case TCG_COND_EQ:
> > +            return (uint32_t)x == (uint32_t)y;
> > +        case TCG_COND_NE:
> > +            return (uint32_t)x != (uint32_t)y;
> > +        case TCG_COND_LT:
> > +            return (int32_t)x < (int32_t)y;
> > +        case TCG_COND_GE:
> > +            return (int32_t)x >= (int32_t)y;
> > +        case TCG_COND_LE:
> > +            return (int32_t)x <= (int32_t)y;
> > +        case TCG_COND_GT:
> > +            return (int32_t)x > (int32_t)y;
> > +        case TCG_COND_LTU:
> > +            return (uint32_t)x < (uint32_t)y;
> > +        case TCG_COND_GEU:
> > +            return (uint32_t)x >= (uint32_t)y;
> > +        case TCG_COND_LEU:
> > +            return (uint32_t)x <= (uint32_t)y;
> > +        case TCG_COND_GTU:
> > +            return (uint32_t)x > (uint32_t)y;
> > +    }
> > +    case 64:
> > +        switch (c) {
> > +        case TCG_COND_EQ:
> > +            return (uint64_t)x == (uint64_t)y;
> > +        case TCG_COND_NE:
> > +            return (uint64_t)x != (uint64_t)y;
> > +        case TCG_COND_LT:
> > +            return (int64_t)x < (int64_t)y;
> > +        case TCG_COND_GE:
> > +            return (int64_t)x >= (int64_t)y;
> > +        case TCG_COND_LE:
> > +            return (int64_t)x <= (int64_t)y;
> > +        case TCG_COND_GT:
> > +            return (int64_t)x > (int64_t)y;
> > +        case TCG_COND_LTU:
> > +            return (uint64_t)x < (uint64_t)y;
> > +        case TCG_COND_GEU:
> > +            return (uint64_t)x >= (uint64_t)y;
> > +        case TCG_COND_LEU:
> > +            return (uint64_t)x <= (uint64_t)y;
> > +        case TCG_COND_GTU:
> > +            return (uint64_t)x > (uint64_t)y;
> > +    }
> > +    default:
> > +        fprintf(stderr,
> > +                "Unrecognized bitness %d or condition %d in "
> > +                "do_constant_folding_cond.\n", op_bits(op), c);
> > +        tcg_abort();
> > +    }
> 
> You probably don't want the default here, but the statements after
> the outer switch, and with proper breaks between the two cases.
> Otherwise the error doesn't do what you wanted it to do.
> 

Good catch, I'll fix that in version 2.


-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [Qemu-devel] [PATCH 0/8] Improve TCG optimizer
  2012-09-06 16:43 ` [Qemu-devel] [PATCH 0/8] Improve TCG optimizer Richard Henderson
@ 2012-09-07 10:06   ` Aurelien Jarno
  0 siblings, 0 replies; 15+ messages in thread
From: Aurelien Jarno @ 2012-09-07 10:06 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Thu, Sep 06, 2012 at 09:43:08AM -0700, Richard Henderson wrote:
> On 09/06/2012 08:00 AM, Aurelien Jarno wrote:
> > This patch series improves the TCG optimizer, based on patterns found
> > while executing various guests. The brcond and setcond constant folding
> > is especially useful when these ops are used to avoid some argument
> > values (e.g. division by 0), and can thus be optimized away when this
> > argument is a constant.
> > 
> > This brings around a 0.5% improvement on openssl-like benchmarks.
> > 
> > Aurelien Jarno (8):
> >   tcg: improve profiler
> >   tcg/optimize: split expression simplification
> >   tcg/optimize: simplify or/xor r, a, 0 cases
> >   tcg/optimize: simplify and r, a, 0 cases
> >   tcg/optimize: simplify shift/rot r, 0, a => movi r, 0 cases
> >   tcg/optimize: swap brcond/setcond arguments when possible
> >   tcg/optimize: add constant folding for setcond
> >   tcg/optimize: add constant folding for brcond
> 
> Patches 1-6,8:
> Reviewed-by: Richard Henderson <rth@twiddle.net>
> 
> Patch 7 contains a trivial error.  With that fixed it could
> also bear my Reviewed-by mark.
> 

Thanks for the review.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [Qemu-devel] [PATCH 0/8] Improve TCG optimizer
  2012-09-06 15:00 [Qemu-devel] [PATCH 0/8] Improve TCG optimizer Aurelien Jarno
                   ` (8 preceding siblings ...)
  2012-09-06 16:43 ` [Qemu-devel] [PATCH 0/8] Improve TCG optimizer Richard Henderson
@ 2012-09-07 12:34 ` Peter Maydell
  2012-09-07 13:00   ` Aurelien Jarno
  9 siblings, 1 reply; 15+ messages in thread
From: Peter Maydell @ 2012-09-07 12:34 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 6 September 2012 16:00, Aurelien Jarno <aurelien@aurel32.net> wrote:
> This patch series improves the TCG optimizer, based on patterns found
> while executing various guests. The brcond and setcond constant folding
> is especially useful when these ops are used to avoid some argument
> values (e.g. division by 0), and can thus be optimized away when this
> argument is a constant.
>
> This brings around a 0.5% improvement on openssl-like benchmarks.

This didn't overall seem to make much difference on my popular
embedded benchmark setup. However I am rapidly losing confidence
in the benchmark since from run to run individual tests can have
results which vary by a factor of two, which is such high
variation it's almost impossible to say whether a change has
had an overall +1% or -1% effect. Hohum.

-- PMM


* Re: [Qemu-devel] [PATCH 0/8] Improve TCG optimizer
  2012-09-07 12:34 ` Peter Maydell
@ 2012-09-07 13:00   ` Aurelien Jarno
  0 siblings, 0 replies; 15+ messages in thread
From: Aurelien Jarno @ 2012-09-07 13:00 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

On Fri, Sep 07, 2012 at 01:34:10PM +0100, Peter Maydell wrote:
> On 6 September 2012 16:00, Aurelien Jarno <aurelien@aurel32.net> wrote:
> > This patch series improves the TCG optimizer, based on patterns found
> > while executing various guests. The brcond and setcond constant folding
> > is especially useful when these ops are used to avoid some argument
> > values (e.g. division by 0), and can thus be optimized away when this
> > argument is a constant.
> >
> > This brings around a 0.5% improvement on openssl-like benchmarks.
> 
> This didn't overall seem to make much difference on my popular
> embedded benchmark setup. However I am rapidly losing confidence
> in the benchmark since from run to run individual tests can have
> results which vary by a factor of two, which is such high
> variation it's almost impossible to say whether a change has
> had an overall +1% or -1% effect. Hohum.
> 

I usually run the tests with the CPU frequency governor set to
"performance" and with QEMU pinned to a given CPU, on a machine with no
or very few other tasks. This improves the stability of the results.

Unfortunately it's not easy to do that on a laptop, especially when
running on battery.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

