* [Qemu-devel] [PATCH 1/8] tcg: improve profiler
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
To: qemu-devel; +Cc: Aurelien Jarno
Now that there are two optimization passes (optimize.c and liveness
analysis), there is no point in outputting the statistics of the liveness
part only. Update the code to take both optimizations into account.
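The accumulation idiom used here (subtract the clock before the timed region, add it back afterwards) can be sketched as below. This is only an illustration under stated assumptions, not QEMU code: `ProfilerSketch`, `fake_getclock` and the two fake passes are hypothetical names, and the clock is a plain counter so that the result is deterministic.

```c
#include <stdint.h>

/* Hypothetical profiler state; opt_time mirrors the field name in the patch. */
typedef struct {
    int64_t opt_time;
} ProfilerSketch;

/* Hypothetical clock: a counter that the fake passes advance. */
static int64_t fake_now;

static int64_t fake_getclock(void) { return fake_now; }
static void fake_optimize_pass(void) { fake_now += 30; } /* pretend 30 ticks */
static void fake_liveness_pass(void) { fake_now += 12; } /* pretend 12 ticks */

/* One bracket around both passes: opt_time grows by their combined cost. */
static void run_passes(ProfilerSketch *p)
{
    p->opt_time -= fake_getclock();
    fake_optimize_pass();
    fake_liveness_pass();
    p->opt_time += fake_getclock();
}
```

Because the field is only ever adjusted by differences of clock readings, repeated calls accumulate the total time spent in both passes.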
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
tcg/tcg.c | 18 ++++++++++--------
tcg/tcg.h | 2 +-
2 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 8386b70..8907b9f 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2059,22 +2059,23 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
}
#endif
+#ifdef CONFIG_PROFILER
+ s->opt_time -= profile_getclock();
+#endif
+
#ifdef USE_TCG_OPTIMIZATIONS
gen_opparam_ptr =
tcg_optimize(s, gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
#endif
-
-#ifdef CONFIG_PROFILER
- s->la_time -= profile_getclock();
-#endif
tcg_liveness_analysis(s);
+
#ifdef CONFIG_PROFILER
- s->la_time += profile_getclock();
+ s->opt_time += profile_getclock();
#endif
#ifdef DEBUG_DISAS
if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP_OPT))) {
- qemu_log("OP after liveness analysis:\n");
+ qemu_log("OP after optimization:\n");
tcg_dump_ops(s);
qemu_log("\n");
}
@@ -2241,8 +2242,9 @@ void tcg_dump_info(FILE *f, fprintf_function cpu_fprintf)
(double)s->interm_time / tot * 100.0);
cpu_fprintf(f, " gen_code time %0.1f%%\n",
(double)s->code_time / tot * 100.0);
- cpu_fprintf(f, "liveness/code time %0.1f%%\n",
- (double)s->la_time / (s->code_time ? s->code_time : 1) * 100.0);
+ cpu_fprintf(f, "optim./code time %0.1f%%\n",
+ (double)s->opt_time / (s->code_time ? s->code_time : 1)
+ * 100.0);
cpu_fprintf(f, "cpu_restore count %" PRId64 "\n",
s->restore_count);
cpu_fprintf(f, " avg cycles %0.1f\n",
diff --git a/tcg/tcg.h b/tcg/tcg.h
index d710694..7d63db5 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -381,7 +381,7 @@ struct TCGContext {
int64_t code_out_len;
int64_t interm_time;
int64_t code_time;
- int64_t la_time;
+ int64_t opt_time;
int64_t restore_count;
int64_t restore_time;
#endif
--
1.7.10.4
* [Qemu-devel] [PATCH 2/8] tcg/optimize: split expression simplification
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
To: qemu-devel; +Cc: Aurelien Jarno
Split expression simplification into multiple parts so that a given op
can appear multiple times. This patch should not change behavior.
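The reason for the split is that a C switch cannot carry the same case label twice, so a single switch can attach only one rule to each op. A minimal sketch of the idea follows, assuming a hypothetical opcode enum (`SketchOp`) and result codes (`SketchResult`) that are not part of QEMU. It also anticipates the later patches in this series, which add `or` and `and` to the first two rules.

```c
#include <stdbool.h>

/* Hypothetical opcode set and result codes, for illustration only. */
typedef enum { OP_ADD, OP_MUL, OP_AND, OP_OR } SketchOp;
typedef enum { KEEP, TO_MOV_A, TO_MOVI_0 } SketchResult;

/*
 * b_is_zero: the second operand is a known constant 0.
 * same_args: both operands are the same temporary.
 */
static SketchResult simplify(SketchOp op, bool b_is_zero, bool same_args)
{
    /* Rule 1: "op r, a, 0 => mov r, a" */
    switch (op) {
    case OP_ADD:
    case OP_OR:
        if (b_is_zero) {
            return TO_MOV_A;
        }
        break;
    default:
        break;
    }

    /* Rule 2: "op r, a, 0 => movi r, 0" */
    switch (op) {
    case OP_MUL:
    case OP_AND:
        if (b_is_zero) {
            return TO_MOVI_0;
        }
        break;
    default:
        break;
    }

    /* Rule 3: "op r, a, a => mov r, a"; OP_AND and OP_OR appear again. */
    switch (op) {
    case OP_AND:
    case OP_OR:
        if (same_args) {
            return TO_MOV_A;
        }
        break;
    default:
        break;
    }

    return KEEP;
}
```

Each switch falls through to the next one when its rule does not apply, which is what lets the same op be tried against several rules in turn.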
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
tcg/optimize.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 9c65474..63f970d 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -322,7 +322,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
break;
}
- /* Simplify expression if possible. */
+ /* Simplify expression for "op r, a, 0 => mov r, a" cases */
switch (op) {
CASE_OP_32_64(add):
CASE_OP_32_64(sub):
@@ -352,6 +352,12 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
continue;
}
break;
+ default:
+ break;
+ }
+
+ /* Simplify expression for "op r, a, 0 => movi r, 0" cases */
+ switch (op) {
CASE_OP_32_64(mul):
if ((temps[args[2]].state == TCG_TEMP_CONST
&& temps[args[2]].val == 0)) {
@@ -362,6 +368,12 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
continue;
}
break;
+ default:
+ break;
+ }
+
+ /* Simplify expression for "op r, a, a => mov r, a" cases */
+ switch (op) {
CASE_OP_32_64(or):
CASE_OP_32_64(and):
if (args[1] == args[2]) {
--
1.7.10.4
* [Qemu-devel] [PATCH 3/8] tcg/optimize: simplify or/xor r, a, 0 cases
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
To: qemu-devel; +Cc: Aurelien Jarno
or/xor r, a, 0 is equivalent to a mov r, a.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
tcg/optimize.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 63f970d..0db849e 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -331,6 +331,8 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
CASE_OP_32_64(sar):
CASE_OP_32_64(rotl):
CASE_OP_32_64(rotr):
+ CASE_OP_32_64(or):
+ CASE_OP_32_64(xor):
if (temps[args[1]].state == TCG_TEMP_CONST) {
/* Proceed with possible constant folding. */
break;
--
1.7.10.4
* [Qemu-devel] [PATCH 4/8] tcg/optimize: simplify and r, a, 0 cases
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
To: qemu-devel; +Cc: Aurelien Jarno
and r, a, 0 is equivalent to a movi r, 0.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
tcg/optimize.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 0db849e..c12cb2b 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -360,6 +360,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
/* Simplify expression for "op r, a, 0 => movi r, 0" cases */
switch (op) {
+ CASE_OP_32_64(and):
CASE_OP_32_64(mul):
if ((temps[args[2]].state == TCG_TEMP_CONST
&& temps[args[2]].val == 0)) {
--
1.7.10.4
* [Qemu-devel] [PATCH 5/8] tcg/optimize: simplify shift/rot r, 0, a => movi r, 0 cases
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
To: qemu-devel; +Cc: Aurelien Jarno
shift/rot r, 0, a is equivalent to movi r, 0.
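The identity can be checked with a small standalone sketch. It assumes 32-bit operands and an in-range count; `rotl32`, `rotr32` and `zero_stays_zero` are hypothetical helpers written for this illustration, not QEMU functions.

```c
#include <stdint.h>

/* Rotate left on 32 bits; the masking keeps both shift counts in range. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Rotate right on 32 bits, same masking as above. */
static uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Returns 1 if every 32-bit shift and rotate of 0 by n yields 0. */
static int zero_stays_zero(unsigned n)
{
    uint32_t zero = 0;

    n &= 31;
    return (zero << n) == 0
        && (zero >> n) == 0
        && ((int32_t)zero >> n) == 0
        && rotl32(zero, n) == 0
        && rotr32(zero, n) == 0;
}
```

Since the result does not depend on the count, the optimizer can emit the constant without knowing anything about the second operand.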
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
tcg/optimize.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index c12cb2b..1698ba3 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -322,6 +322,26 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
break;
}
+ /* Simplify expressions for "shift/rot r, 0, a => movi r, 0" */
+ switch (op) {
+ CASE_OP_32_64(shl):
+ CASE_OP_32_64(shr):
+ CASE_OP_32_64(sar):
+ CASE_OP_32_64(rotl):
+ CASE_OP_32_64(rotr):
+ if (temps[args[1]].state == TCG_TEMP_CONST
+ && temps[args[1]].val == 0) {
+ gen_opc_buf[op_index] = op_to_movi(op);
+ tcg_opt_gen_movi(gen_args, args[0], 0, nb_temps, nb_globals);
+ args += 3;
+ gen_args += 2;
+ continue;
+ }
+ break;
+ default:
+ break;
+ }
+
/* Simplify expression for "op r, a, 0 => mov r, a" cases */
switch (op) {
CASE_OP_32_64(add):
--
1.7.10.4
* [Qemu-devel] [PATCH 6/8] tcg/optimize: swap brcond/setcond arguments when possible
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
To: qemu-devel; +Cc: Aurelien Jarno
brcond and setcond ops are not commutative, but it's easy to compute the
new condition after swapping the arguments. Try to always put the constant
argument in second position, as for commutative ops, to help backends
generate better code.
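What "computing the new condition" means can be sketched as follows. The enum `SketchCond` and the helpers `swap_cond` and `eval_cond` are hypothetical stand-ins written for this illustration; QEMU's own TCGCond values and tcg_swap_cond() are not reproduced here.

```c
#include <stdint.h>

/* Hypothetical condition codes; QEMU's TCGCond uses its own encoding. */
typedef enum {
    C_EQ, C_NE, C_LT, C_GE, C_LE, C_GT, C_LTU, C_GEU, C_LEU, C_GTU
} SketchCond;

/* Returns the condition c' such that (a c b) == (b c' a). */
static SketchCond swap_cond(SketchCond c)
{
    switch (c) {
    case C_LT:  return C_GT;
    case C_GT:  return C_LT;
    case C_LE:  return C_GE;
    case C_GE:  return C_LE;
    case C_LTU: return C_GTU;
    case C_GTU: return C_LTU;
    case C_LEU: return C_GEU;
    case C_GEU: return C_LEU;
    default:    return c;   /* EQ and NE are symmetric */
    }
}

/* Evaluates a 32-bit comparison under the given condition. */
static int eval_cond(SketchCond c, uint32_t a, uint32_t b)
{
    switch (c) {
    case C_EQ:  return a == b;
    case C_NE:  return a != b;
    case C_LT:  return (int32_t)a < (int32_t)b;
    case C_GE:  return (int32_t)a >= (int32_t)b;
    case C_LE:  return (int32_t)a <= (int32_t)b;
    case C_GT:  return (int32_t)a > (int32_t)b;
    case C_LTU: return a < b;
    case C_GEU: return a >= b;
    case C_LEU: return a <= b;
    case C_GTU: return a > b;
    }
    return 0;
}
```

Note that swapping is not the same as inverting: LT becomes GT (not GE), because the operands change places while the truth value must stay the same.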
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
tcg/optimize.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 1698ba3..7debc8a 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -318,6 +318,24 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
args[2] = tmp;
}
break;
+ CASE_OP_32_64(brcond):
+ if (temps[args[0]].state == TCG_TEMP_CONST
+ && temps[args[1]].state != TCG_TEMP_CONST) {
+ tmp = args[0];
+ args[0] = args[1];
+ args[1] = tmp;
+ args[2] = tcg_swap_cond(args[2]);
+ }
+ break;
+ CASE_OP_32_64(setcond):
+ if (temps[args[1]].state == TCG_TEMP_CONST
+ && temps[args[2]].state != TCG_TEMP_CONST) {
+ tmp = args[1];
+ args[1] = args[2];
+ args[2] = tmp;
+ args[3] = tcg_swap_cond(args[3]);
+ }
+ break;
default:
break;
}
--
1.7.10.4
* [Qemu-devel] [PATCH 7/8] tcg/optimize: add constant folding for setcond
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
To: qemu-devel; +Cc: Aurelien Jarno
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
tcg/optimize.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 79 insertions(+)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 7debc8a..c4af1e8 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -267,6 +267,65 @@ static TCGArg do_constant_folding(TCGOpcode op, TCGArg x, TCGArg y)
return res;
}
+static TCGArg do_constant_folding_cond(TCGOpcode op, TCGArg x,
+ TCGArg y, TCGCond c)
+{
+ switch (op_bits(op)) {
+ case 32:
+ switch (c) {
+ case TCG_COND_EQ:
+ return (uint32_t)x == (uint32_t)y;
+ case TCG_COND_NE:
+ return (uint32_t)x != (uint32_t)y;
+ case TCG_COND_LT:
+ return (int32_t)x < (int32_t)y;
+ case TCG_COND_GE:
+ return (int32_t)x >= (int32_t)y;
+ case TCG_COND_LE:
+ return (int32_t)x <= (int32_t)y;
+ case TCG_COND_GT:
+ return (int32_t)x > (int32_t)y;
+ case TCG_COND_LTU:
+ return (uint32_t)x < (uint32_t)y;
+ case TCG_COND_GEU:
+ return (uint32_t)x >= (uint32_t)y;
+ case TCG_COND_LEU:
+ return (uint32_t)x <= (uint32_t)y;
+ case TCG_COND_GTU:
+ return (uint32_t)x > (uint32_t)y;
+ }
+ case 64:
+ switch (c) {
+ case TCG_COND_EQ:
+ return (uint64_t)x == (uint64_t)y;
+ case TCG_COND_NE:
+ return (uint64_t)x != (uint64_t)y;
+ case TCG_COND_LT:
+ return (int64_t)x < (int64_t)y;
+ case TCG_COND_GE:
+ return (int64_t)x >= (int64_t)y;
+ case TCG_COND_LE:
+ return (int64_t)x <= (int64_t)y;
+ case TCG_COND_GT:
+ return (int64_t)x > (int64_t)y;
+ case TCG_COND_LTU:
+ return (uint64_t)x < (uint64_t)y;
+ case TCG_COND_GEU:
+ return (uint64_t)x >= (uint64_t)y;
+ case TCG_COND_LEU:
+ return (uint64_t)x <= (uint64_t)y;
+ case TCG_COND_GTU:
+ return (uint64_t)x > (uint64_t)y;
+ }
+ default:
+ fprintf(stderr,
+ "Unrecognized bitness %d or condition %d in "
+ "do_constant_folding_cond.\n", op_bits(op), c);
+ tcg_abort();
+ }
+}
+
+
/* Propagate constants and copies, fold constant expressions. */
static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
TCGArg *args, TCGOpDef *tcg_op_defs)
@@ -522,6 +581,26 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
args += 3;
break;
}
+ CASE_OP_32_64(setcond):
+ if (temps[args[1]].state == TCG_TEMP_CONST
+ && temps[args[2]].state == TCG_TEMP_CONST) {
+ gen_opc_buf[op_index] = op_to_movi(op);
+ tmp = do_constant_folding_cond(op, temps[args[1]].val,
+ temps[args[2]].val, args[3]);
+ tcg_opt_gen_movi(gen_args, args[0], tmp, nb_temps, nb_globals);
+ gen_args += 2;
+ args += 4;
+ break;
+ } else {
+ reset_temp(args[0], nb_temps, nb_globals);
+ gen_args[0] = args[0];
+ gen_args[1] = args[1];
+ gen_args[2] = args[2];
+ gen_args[3] = args[3];
+ gen_args += 4;
+ args += 4;
+ break;
+ }
case INDEX_op_call:
nb_call_args = (args[0] >> 16) + (args[0] & 0xffff);
if (!(args[nb_call_args + 1] & (TCG_CALL_CONST | TCG_CALL_PURE))) {
--
1.7.10.4
* Re: [Qemu-devel] [PATCH 7/8] tcg/optimize: add constant folding for setcond
From: Richard Henderson @ 2012-09-06 16:40 UTC (permalink / raw)
To: Aurelien Jarno; +Cc: qemu-devel
On 09/06/2012 08:00 AM, Aurelien Jarno wrote:
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
> tcg/optimize.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 79 insertions(+)
>
> diff --git a/tcg/optimize.c b/tcg/optimize.c
> index 7debc8a..c4af1e8 100644
> --- a/tcg/optimize.c
> +++ b/tcg/optimize.c
> @@ -267,6 +267,65 @@ static TCGArg do_constant_folding(TCGOpcode op, TCGArg x, TCGArg y)
> return res;
> }
>
> +static TCGArg do_constant_folding_cond(TCGOpcode op, TCGArg x,
> + TCGArg y, TCGCond c)
> +{
> + switch (op_bits(op)) {
> + case 32:
> + switch (c) {
> + case TCG_COND_EQ:
> + return (uint32_t)x == (uint32_t)y;
> + case TCG_COND_NE:
> + return (uint32_t)x != (uint32_t)y;
> + case TCG_COND_LT:
> + return (int32_t)x < (int32_t)y;
> + case TCG_COND_GE:
> + return (int32_t)x >= (int32_t)y;
> + case TCG_COND_LE:
> + return (int32_t)x <= (int32_t)y;
> + case TCG_COND_GT:
> + return (int32_t)x > (int32_t)y;
> + case TCG_COND_LTU:
> + return (uint32_t)x < (uint32_t)y;
> + case TCG_COND_GEU:
> + return (uint32_t)x >= (uint32_t)y;
> + case TCG_COND_LEU:
> + return (uint32_t)x <= (uint32_t)y;
> + case TCG_COND_GTU:
> + return (uint32_t)x > (uint32_t)y;
> + }
> + case 64:
> + switch (c) {
> + case TCG_COND_EQ:
> + return (uint64_t)x == (uint64_t)y;
> + case TCG_COND_NE:
> + return (uint64_t)x != (uint64_t)y;
> + case TCG_COND_LT:
> + return (int64_t)x < (int64_t)y;
> + case TCG_COND_GE:
> + return (int64_t)x >= (int64_t)y;
> + case TCG_COND_LE:
> + return (int64_t)x <= (int64_t)y;
> + case TCG_COND_GT:
> + return (int64_t)x > (int64_t)y;
> + case TCG_COND_LTU:
> + return (uint64_t)x < (uint64_t)y;
> + case TCG_COND_GEU:
> + return (uint64_t)x >= (uint64_t)y;
> + case TCG_COND_LEU:
> + return (uint64_t)x <= (uint64_t)y;
> + case TCG_COND_GTU:
> + return (uint64_t)x > (uint64_t)y;
> + }
> + default:
> + fprintf(stderr,
> + "Unrecognized bitness %d or condition %d in "
> + "do_constant_folding_cond.\n", op_bits(op), c);
> + tcg_abort();
> + }
You probably don't want the default here, but the statements after
the outer switch, and with proper breaks between the two cases.
Otherwise the error doesn't do what you wanted it to do.
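The restructuring suggested above can be sketched as follows. In the posted version, an unmatched condition leaves the inner switch and falls through the following case labels; here each width ends with an explicit break and the error path sits after the outer switch. This is a standalone illustration under stated assumptions: `SketchCond` and `fold_cond` are hypothetical names, the width is passed as a plain int instead of being derived from the opcode, and abort() stands in for tcg_abort().

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical condition codes; QEMU's TCGCond uses its own encoding. */
typedef enum {
    C_EQ, C_NE, C_LT, C_GE, C_LE, C_GT, C_LTU, C_GEU, C_LEU, C_GTU
} SketchCond;

/* Folds "x c y" for constant operands of the given width (32 or 64). */
static uint64_t fold_cond(int bits, uint64_t x, uint64_t y, SketchCond c)
{
    switch (bits) {
    case 32:
        switch (c) {
        case C_EQ:  return (uint32_t)x == (uint32_t)y;
        case C_NE:  return (uint32_t)x != (uint32_t)y;
        case C_LT:  return (int32_t)x < (int32_t)y;
        case C_GE:  return (int32_t)x >= (int32_t)y;
        case C_LE:  return (int32_t)x <= (int32_t)y;
        case C_GT:  return (int32_t)x > (int32_t)y;
        case C_LTU: return (uint32_t)x < (uint32_t)y;
        case C_GEU: return (uint32_t)x >= (uint32_t)y;
        case C_LEU: return (uint32_t)x <= (uint32_t)y;
        case C_GTU: return (uint32_t)x > (uint32_t)y;
        }
        break;  /* unknown condition: do not fall into the 64-bit cases */
    case 64:
        switch (c) {
        case C_EQ:  return x == y;
        case C_NE:  return x != y;
        case C_LT:  return (int64_t)x < (int64_t)y;
        case C_GE:  return (int64_t)x >= (int64_t)y;
        case C_LE:  return (int64_t)x <= (int64_t)y;
        case C_GT:  return (int64_t)x > (int64_t)y;
        case C_LTU: return x < y;
        case C_GEU: return x >= y;
        case C_LEU: return x <= y;
        case C_GTU: return x > y;
        }
        break;
    }

    /* Reached only for an unrecognized width or condition. */
    fprintf(stderr, "Unrecognized bitness %d or condition %d\n",
            bits, (int)c);
    abort();
}
```

With this layout the error message is reached by one explicit path for both widths, and a bad condition in the 32-bit case can never be evaluated against the 64-bit comparisons.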
r~
* Re: [Qemu-devel] [PATCH 7/8] tcg/optimize: add constant folding for setcond
From: Aurelien Jarno @ 2012-09-07 10:05 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel
On Thu, Sep 06, 2012 at 09:40:57AM -0700, Richard Henderson wrote:
> On 09/06/2012 08:00 AM, Aurelien Jarno wrote:
> > Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> > ---
> > tcg/optimize.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 79 insertions(+)
> >
> > diff --git a/tcg/optimize.c b/tcg/optimize.c
> > index 7debc8a..c4af1e8 100644
> > --- a/tcg/optimize.c
> > +++ b/tcg/optimize.c
> > @@ -267,6 +267,65 @@ static TCGArg do_constant_folding(TCGOpcode op, TCGArg x, TCGArg y)
> > return res;
> > }
> >
> > +static TCGArg do_constant_folding_cond(TCGOpcode op, TCGArg x,
> > + TCGArg y, TCGCond c)
> > +{
> > + switch (op_bits(op)) {
> > + case 32:
> > + switch (c) {
> > + case TCG_COND_EQ:
> > + return (uint32_t)x == (uint32_t)y;
> > + case TCG_COND_NE:
> > + return (uint32_t)x != (uint32_t)y;
> > + case TCG_COND_LT:
> > + return (int32_t)x < (int32_t)y;
> > + case TCG_COND_GE:
> > + return (int32_t)x >= (int32_t)y;
> > + case TCG_COND_LE:
> > + return (int32_t)x <= (int32_t)y;
> > + case TCG_COND_GT:
> > + return (int32_t)x > (int32_t)y;
> > + case TCG_COND_LTU:
> > + return (uint32_t)x < (uint32_t)y;
> > + case TCG_COND_GEU:
> > + return (uint32_t)x >= (uint32_t)y;
> > + case TCG_COND_LEU:
> > + return (uint32_t)x <= (uint32_t)y;
> > + case TCG_COND_GTU:
> > + return (uint32_t)x > (uint32_t)y;
> > + }
> > + case 64:
> > + switch (c) {
> > + case TCG_COND_EQ:
> > + return (uint64_t)x == (uint64_t)y;
> > + case TCG_COND_NE:
> > + return (uint64_t)x != (uint64_t)y;
> > + case TCG_COND_LT:
> > + return (int64_t)x < (int64_t)y;
> > + case TCG_COND_GE:
> > + return (int64_t)x >= (int64_t)y;
> > + case TCG_COND_LE:
> > + return (int64_t)x <= (int64_t)y;
> > + case TCG_COND_GT:
> > + return (int64_t)x > (int64_t)y;
> > + case TCG_COND_LTU:
> > + return (uint64_t)x < (uint64_t)y;
> > + case TCG_COND_GEU:
> > + return (uint64_t)x >= (uint64_t)y;
> > + case TCG_COND_LEU:
> > + return (uint64_t)x <= (uint64_t)y;
> > + case TCG_COND_GTU:
> > + return (uint64_t)x > (uint64_t)y;
> > + }
> > + default:
> > + fprintf(stderr,
> > + "Unrecognized bitness %d or condition %d in "
> > + "do_constant_folding_cond.\n", op_bits(op), c);
> > + tcg_abort();
> > + }
>
> You probably don't want the default here, but the statements after
> the outer switch, and with proper breaks between the two cases.
> Otherwise the error doesn't do what you wanted it to do.
>
Good catch, I'll fix that in version 2.
--
Aurelien Jarno GPG: 1024D/F1BCDB73
aurelien@aurel32.net http://www.aurel32.net
* [Qemu-devel] [PATCH 8/8] tcg/optimize: add constant folding for brcond
From: Aurelien Jarno @ 2012-09-06 15:00 UTC (permalink / raw)
To: qemu-devel; +Cc: Aurelien Jarno
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
tcg/optimize.c | 27 ++++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index c4af1e8..1221b8b 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -601,6 +601,32 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
args += 4;
break;
}
+ CASE_OP_32_64(brcond):
+ if (temps[args[0]].state == TCG_TEMP_CONST
+ && temps[args[1]].state == TCG_TEMP_CONST) {
+ if (do_constant_folding_cond(op, temps[args[0]].val,
+ temps[args[1]].val, args[2])) {
+ memset(temps, 0, nb_temps * sizeof(struct tcg_temp_info));
+ gen_opc_buf[op_index] = INDEX_op_br;
+ gen_args[0] = args[3];
+ gen_args += 1;
+ args += 4;
+ } else {
+ gen_opc_buf[op_index] = INDEX_op_nop;
+ args += 4;
+ }
+ break;
+ } else {
+ memset(temps, 0, nb_temps * sizeof(struct tcg_temp_info));
+ reset_temp(args[0], nb_temps, nb_globals);
+ gen_args[0] = args[0];
+ gen_args[1] = args[1];
+ gen_args[2] = args[2];
+ gen_args[3] = args[3];
+ gen_args += 4;
+ args += 4;
+ break;
+ }
case INDEX_op_call:
nb_call_args = (args[0] >> 16) + (args[0] & 0xffff);
if (!(args[nb_call_args + 1] & (TCG_CALL_CONST | TCG_CALL_PURE))) {
@@ -622,7 +648,6 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
case INDEX_op_set_label:
case INDEX_op_jmp:
case INDEX_op_br:
- CASE_OP_32_64(brcond):
memset(temps, 0, nb_temps * sizeof(struct tcg_temp_info));
for (i = 0; i < def->nb_args; i++) {
*gen_args = *args;
--
1.7.10.4
* Re: [Qemu-devel] [PATCH 0/8] Improve TCG optimizer
From: Richard Henderson @ 2012-09-06 16:43 UTC (permalink / raw)
To: Aurelien Jarno; +Cc: qemu-devel
On 09/06/2012 08:00 AM, Aurelien Jarno wrote:
> This patch series improves the TCG optimizer, based on patterns found
> while executing various guests. The brcond and setcond constant folding
> is especially useful when these ops are used to avoid some argument
> values (e.g. division by 0), and thus can be optimized when this argument
> is a constant.
>
> This brings around a 0.5% improvement on openssl-like benchmarks.
>
> Aurelien Jarno (8):
> tcg: improve profiler
> tcg/optimize: split expression simplification
> tcg/optimize: simplify or/xor r, a, 0 cases
> tcg/optimize: simplify and r, a, 0 cases
> tcg/optimize: simplify shift/rot r, 0, a => movi r, 0 cases
> tcg/optimize: swap brcond/setcond arguments when possible
> tcg/optimize: add constant folding for setcond
> tcg/optimize: add constant folding for brcond
Patches 1-6,8:
Reviewed-by: Richard Henderson <rth@twiddle.net>
Patch 7 contains a trivial error. With that fixed it could
also bear my Reviewed-by mark.
r~
* Re: [Qemu-devel] [PATCH 0/8] Improve TCG optimizer
From: Aurelien Jarno @ 2012-09-07 10:06 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel
On Thu, Sep 06, 2012 at 09:43:08AM -0700, Richard Henderson wrote:
> On 09/06/2012 08:00 AM, Aurelien Jarno wrote:
> > This patch series improves the TCG optimizer, based on patterns found
> > while executing various guests. The brcond and setcond constant folding
> > is especially useful when these ops are used to avoid some argument
> > values (e.g. division by 0), and thus can be optimized when this argument
> > is a constant.
> >
> > This brings around a 0.5% improvement on openssl-like benchmarks.
> >
> > Aurelien Jarno (8):
> > tcg: improve profiler
> > tcg/optimize: split expression simplification
> > tcg/optimize: simplify or/xor r, a, 0 cases
> > tcg/optimize: simplify and r, a, 0 cases
> > tcg/optimize: simplify shift/rot r, 0, a => movi r, 0 cases
> > tcg/optimize: swap brcond/setcond arguments when possible
> > tcg/optimize: add constant folding for setcond
> > tcg/optimize: add constant folding for brcond
>
> Patches 1-6,8:
> Reviewed-by: Richard Henderson <rth@twiddle.net>
>
> Patch 7 contains a trivial error. With that fixed it could
> also bear my Reviewed-by mark.
>
Thanks for the review.
--
Aurelien Jarno GPG: 1024D/F1BCDB73
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH 0/8] Improve TCG optimizer
From: Peter Maydell @ 2012-09-07 12:34 UTC (permalink / raw)
To: Aurelien Jarno; +Cc: qemu-devel
On 6 September 2012 16:00, Aurelien Jarno <aurelien@aurel32.net> wrote:
> This patch series improves the TCG optimizer, based on patterns found
> while executing various guests. The brcond and setcond constant folding
> is especially useful when these ops are used to avoid some argument
> values (e.g. division by 0), and thus can be optimized when this argument
> is a constant.
>
> This brings around a 0.5% improvement on openssl-like benchmarks.
This didn't overall seem to make much difference on my popular
embedded benchmark setup. However I am rapidly losing confidence
in the benchmark since from run to run individual tests can have
results which vary by a factor of two, which is such high
variation it's almost impossible to say whether a change has
had an overall +1% or -1% effect. Hohum.
-- PMM
* Re: [Qemu-devel] [PATCH 0/8] Improve TCG optimizer
From: Aurelien Jarno @ 2012-09-07 13:00 UTC (permalink / raw)
To: Peter Maydell; +Cc: qemu-devel
On Fri, Sep 07, 2012 at 01:34:10PM +0100, Peter Maydell wrote:
> On 6 September 2012 16:00, Aurelien Jarno <aurelien@aurel32.net> wrote:
> > This patch series improves the TCG optimizer, based on patterns found
> > while executing various guests. The brcond and setcond constant folding
> > is especially useful when these ops are used to avoid some argument
> > values (e.g. division by 0), and thus can be optimized when this argument
> > is a constant.
> >
> > This brings around a 0.5% improvement on openssl-like benchmarks.
>
> This didn't overall seem to make much difference on my popular
> embedded benchmark setup. However I am rapidly losing confidence
> in the benchmark since from run to run individual tests can have
> results which vary by a factor of two, which is such high
> variation it's almost impossible to say whether a change has
> had an overall +1% or -1% effect. Hohum.
>
I usually run the tests with the CPU frequency governor set to
performance and with QEMU pinned to a given CPU, on a machine with no or
very few other tasks. This improves the stability of the results.
Unfortunately it's not easy to do that on a laptop, especially when
running on battery.
--
Aurelien Jarno GPG: 1024D/F1BCDB73
aurelien@aurel32.net http://www.aurel32.net