qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH 0/3] tcg-optimize with known-zero bits
@ 2013-01-11 23:42 Richard Henderson
  2013-01-11 23:42 ` [Qemu-devel] [PATCH 1/3] optimize: only write to state when clearing optimizer data Richard Henderson
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Richard Henderson @ 2013-01-11 23:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Aurelien Jarno

This patch set is extracted from Paolo's target-i386 eflags2
optimization branch:

  git://github.com/bonzini/qemu.git eflags2

I've cherry-picked 3 patches and rebased them vs master.  I've
made a few trivial changes to the patches:

  * Extracting reset_all_temps as a function,
  * Fixing a few type errors wrt target_ulong vs tcg_target_ulong.

While these were written in support of the other changes that
Paolo made wrt target-i386 eflags computation, they're not
dependent on them, and I think they should be considered for
inclusion regardless.


r~


Paolo Bonzini (3):
  optimize: only write to state when clearing optimizer data
  optimize: track nonzero bits of registers
  optimize: optimize using nonzero bits

 tcg/optimize.c | 177 ++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 150 insertions(+), 27 deletions(-)

-- 
1.7.11.7

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Qemu-devel] [PATCH 1/3] optimize: only write to state when clearing optimizer data
  2013-01-11 23:42 [Qemu-devel] [PATCH 0/3] tcg-optimize with known-zero bits Richard Henderson
@ 2013-01-11 23:42 ` Richard Henderson
  2013-01-11 23:42 ` [Qemu-devel] [PATCH 2/3] optimize: track nonzero bits of registers Richard Henderson
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Richard Henderson @ 2013-01-11 23:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Aurelien Jarno

From: Paolo Bonzini <pbonzini@redhat.com>

The next patch will add to the TCG optimizer a field that should be
non-zero in the default case.  Thus, replace the memset of the
temps array with a loop.  Only the state field has to be up to date,
because the other fields are only used when the state is TCG_TEMP_COPY
or TCG_TEMP_CONST.

[rth: Extracted the loop to a function.]

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/optimize.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 9109b81..9d05a72 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -65,6 +65,15 @@ static void reset_temp(TCGArg temp)
     temps[temp].state = TCG_TEMP_UNDEF;
 }
 
+/* Reset all temporaries, given that there are NB_TEMPS of them.  */
+static void reset_all_temps(int nb_temps)
+{
+    int i;
+    for (i = 0; i < nb_temps; i++) {
+        temps[i].state = TCG_TEMP_UNDEF;
+    }
+}
+
 static int op_bits(TCGOpcode op)
 {
     const TCGOpDef *def = &tcg_op_defs[op];
@@ -482,7 +491,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
 
     nb_temps = s->nb_temps;
     nb_globals = s->nb_globals;
-    memset(temps, 0, nb_temps * sizeof(struct tcg_temp_info));
+    reset_all_temps(nb_temps);
 
     nb_ops = tcg_opc_ptr - s->gen_opc_buf;
     gen_args = args;
@@ -768,7 +777,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             tmp = do_constant_folding_cond(op, args[0], args[1], args[2]);
             if (tmp != 2) {
                 if (tmp) {
-                    memset(temps, 0, nb_temps * sizeof(struct tcg_temp_info));
+                    reset_all_temps(nb_temps);
                     s->gen_opc_buf[op_index] = INDEX_op_br;
                     gen_args[0] = args[3];
                     gen_args += 1;
@@ -861,7 +870,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             tmp = do_constant_folding_cond2(&args[0], &args[2], args[4]);
             if (tmp != 2) {
                 if (tmp) {
-                    memset(temps, 0, nb_temps * sizeof(struct tcg_temp_info));
+                    reset_all_temps(nb_temps);
                     s->gen_opc_buf[op_index] = INDEX_op_br;
                     gen_args[0] = args[5];
                     gen_args += 1;
@@ -875,7 +884,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                        && temps[args[3]].val == 0) {
                 /* Simplify LT/GE comparisons vs zero to a single compare
                    vs the high word of the input.  */
-                memset(temps, 0, nb_temps * sizeof(struct tcg_temp_info));
+                reset_all_temps(nb_temps);
                 s->gen_opc_buf[op_index] = INDEX_op_brcond_i32;
                 gen_args[0] = args[1];
                 gen_args[1] = args[3];
@@ -940,7 +949,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                We trash everything if the operation is the end of a basic
                block, otherwise we only trash the output args.  */
             if (def->flags & TCG_OPF_BB_END) {
-                memset(temps, 0, nb_temps * sizeof(struct tcg_temp_info));
+                reset_all_temps(nb_temps);
             } else {
                 for (i = 0; i < def->nb_oargs; i++) {
                     reset_temp(args[i]);
-- 
1.7.11.7


* [Qemu-devel] [PATCH 2/3] optimize: track nonzero bits of registers
  2013-01-11 23:42 [Qemu-devel] [PATCH 0/3] tcg-optimize with known-zero bits Richard Henderson
  2013-01-11 23:42 ` [Qemu-devel] [PATCH 1/3] optimize: only write to state when clearing optimizer data Richard Henderson
@ 2013-01-11 23:42 ` Richard Henderson
  2013-01-11 23:42 ` [Qemu-devel] [PATCH 3/3] optimize: optimize using nonzero bits Richard Henderson
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Richard Henderson @ 2013-01-11 23:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Aurelien Jarno

From: Paolo Bonzini <pbonzini@redhat.com>

Add a "mask" field to the tcg_temp_info struct.  A bit that is zero
in "mask" will always be zero in the corresponding temporary.
Zero bits in the mask can be produced from moves of immediates,
zero-extensions, ANDs with constants, shifts; they can then be
propagated by logical operations, shifts, sign-extensions,
negations, deposit operations, and conditional moves.  Other
operations will just reset the mask to all-ones, i.e. unknown.

[rth: s/target_ulong/tcg_target_ulong/]

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/optimize.c | 132 +++++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 110 insertions(+), 22 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 9d05a72..090efbc 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -46,6 +46,7 @@ struct tcg_temp_info {
     uint16_t prev_copy;
     uint16_t next_copy;
     tcg_target_ulong val;
+    tcg_target_ulong mask;
 };
 
 static struct tcg_temp_info temps[TCG_MAX_TEMPS];
@@ -63,6 +64,7 @@ static void reset_temp(TCGArg temp)
         }
     }
     temps[temp].state = TCG_TEMP_UNDEF;
+    temps[temp].mask = -1;
 }
 
 /* Reset all temporaries, given that there are NB_TEMPS of them.  */
@@ -71,6 +73,7 @@ static void reset_all_temps(int nb_temps)
     int i;
     for (i = 0; i < nb_temps; i++) {
         temps[i].state = TCG_TEMP_UNDEF;
+        temps[i].mask = -1;
     }
 }
 
@@ -148,33 +151,35 @@ static bool temps_are_copies(TCGArg arg1, TCGArg arg2)
 static void tcg_opt_gen_mov(TCGContext *s, TCGArg *gen_args,
                             TCGArg dst, TCGArg src)
 {
-        reset_temp(dst);
-        assert(temps[src].state != TCG_TEMP_CONST);
-
-        if (s->temps[src].type == s->temps[dst].type) {
-            if (temps[src].state != TCG_TEMP_COPY) {
-                temps[src].state = TCG_TEMP_COPY;
-                temps[src].next_copy = src;
-                temps[src].prev_copy = src;
-            }
-            temps[dst].state = TCG_TEMP_COPY;
-            temps[dst].next_copy = temps[src].next_copy;
-            temps[dst].prev_copy = src;
-            temps[temps[dst].next_copy].prev_copy = dst;
-            temps[src].next_copy = dst;
+    reset_temp(dst);
+    temps[dst].mask = temps[src].mask;
+    assert(temps[src].state != TCG_TEMP_CONST);
+
+    if (s->temps[src].type == s->temps[dst].type) {
+        if (temps[src].state != TCG_TEMP_COPY) {
+            temps[src].state = TCG_TEMP_COPY;
+            temps[src].next_copy = src;
+            temps[src].prev_copy = src;
         }
+        temps[dst].state = TCG_TEMP_COPY;
+        temps[dst].next_copy = temps[src].next_copy;
+        temps[dst].prev_copy = src;
+        temps[temps[dst].next_copy].prev_copy = dst;
+        temps[src].next_copy = dst;
+    }
 
-        gen_args[0] = dst;
-        gen_args[1] = src;
+    gen_args[0] = dst;
+    gen_args[1] = src;
 }
 
 static void tcg_opt_gen_movi(TCGArg *gen_args, TCGArg dst, TCGArg val)
 {
-        reset_temp(dst);
-        temps[dst].state = TCG_TEMP_CONST;
-        temps[dst].val = val;
-        gen_args[0] = dst;
-        gen_args[1] = val;
+    reset_temp(dst);
+    temps[dst].state = TCG_TEMP_CONST;
+    temps[dst].val = val;
+    temps[dst].mask = val;
+    gen_args[0] = dst;
+    gen_args[1] = val;
 }
 
 static TCGOpcode op_to_mov(TCGOpcode op)
@@ -479,6 +484,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                                     TCGArg *args, TCGOpDef *tcg_op_defs)
 {
     int i, nb_ops, op_index, nb_temps, nb_globals, nb_call_args;
+    tcg_target_ulong mask;
     TCGOpcode op;
     const TCGOpDef *def;
     TCGArg *gen_args;
@@ -621,6 +627,87 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             break;
         }
 
+        /* Simplify using known-zero bits */
+        mask = -1;
+        switch (op) {
+        CASE_OP_32_64(ext8s):
+            if ((temps[args[1]].mask & 0x80) != 0) {
+                break;
+            }
+        CASE_OP_32_64(ext8u):
+            mask = 0xff;
+            goto and_const;
+        CASE_OP_32_64(ext16s):
+            if ((temps[args[1]].mask & 0x8000) != 0) {
+                break;
+            }
+        CASE_OP_32_64(ext16u):
+            mask = 0xffff;
+            goto and_const;
+        case INDEX_op_ext32s_i64:
+            if ((temps[args[1]].mask & 0x80000000) != 0) {
+                break;
+            }
+        case INDEX_op_ext32u_i64:
+            mask = 0xffffffffU;
+            goto and_const;
+
+        CASE_OP_32_64(and):
+            mask = temps[args[2]].mask;
+            if (temps[args[2]].state == TCG_TEMP_CONST) {
+        and_const:
+                ;
+            }
+            mask = temps[args[1]].mask & mask;
+            break;
+
+        CASE_OP_32_64(sar):
+            if (temps[args[2]].state == TCG_TEMP_CONST) {
+                mask = ((tcg_target_long)temps[args[1]].mask
+                        >> temps[args[2]].val);
+            }
+            break;
+
+        CASE_OP_32_64(shr):
+            if (temps[args[2]].state == TCG_TEMP_CONST) {
+                mask = temps[args[1]].mask >> temps[args[2]].val;
+            }
+            break;
+
+        CASE_OP_32_64(shl):
+            if (temps[args[2]].state == TCG_TEMP_CONST) {
+                mask = temps[args[1]].mask << temps[args[2]].val;
+            }
+            break;
+
+        CASE_OP_32_64(neg):
+            /* Set to 1 all bits to the left of the rightmost.  */
+            mask = -(temps[args[1]].mask & -temps[args[1]].mask);
+            break;
+
+        CASE_OP_32_64(deposit):
+            tmp = ((1ull << args[4]) - 1);
+            mask = ((temps[args[1]].mask & ~(tmp << args[3]))
+                    | ((temps[args[2]].mask & tmp) << args[3]));
+            break;
+
+        CASE_OP_32_64(or):
+        CASE_OP_32_64(xor):
+            mask = temps[args[1]].mask | temps[args[2]].mask;
+            break;
+
+        CASE_OP_32_64(setcond):
+            mask = 1;
+            break;
+
+        CASE_OP_32_64(movcond):
+            mask = temps[args[3]].mask | temps[args[4]].mask;
+            break;
+
+        default:
+            break;
+        }
+
         /* Simplify expression for "op r, a, 0 => movi r, 0" cases */
         switch (op) {
         CASE_OP_32_64(and):
@@ -947,7 +1034,8 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             /* Default case: we know nothing about operation (or were unable
                to compute the operation result) so no propagation is done.
                We trash everything if the operation is the end of a basic
-               block, otherwise we only trash the output args.  */
+               block, otherwise we only trash the output args.  "mask" is
+               the non-zero bits mask for the first output arg.  */
             if (def->flags & TCG_OPF_BB_END) {
                 reset_all_temps(nb_temps);
             } else {
-- 
1.7.11.7


* [Qemu-devel] [PATCH 3/3] optimize: optimize using nonzero bits
  2013-01-11 23:42 [Qemu-devel] [PATCH 0/3] tcg-optimize with known-zero bits Richard Henderson
  2013-01-11 23:42 ` [Qemu-devel] [PATCH 1/3] optimize: only write to state when clearing optimizer data Richard Henderson
  2013-01-11 23:42 ` [Qemu-devel] [PATCH 2/3] optimize: track nonzero bits of registers Richard Henderson
@ 2013-01-11 23:42 ` Richard Henderson
  2013-01-12  7:55 ` [Qemu-devel] [PATCH 0/3] tcg-optimize with known-zero bits Paolo Bonzini
  2013-01-19 13:58 ` Blue Swirl
  4 siblings, 0 replies; 6+ messages in thread
From: Richard Henderson @ 2013-01-11 23:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Aurelien Jarno

From: Paolo Bonzini <pbonzini@redhat.com>

This adds two optimizations using the non-zero bit mask.  First, in
some cases involving shifts or ANDs the value can be proven to be
zero, and the operation can thus be optimized to a move of zero.
Second, a zero-extension or an AND with a constant is useless when it
would only clear bits that are already known to be zero, and can be
detected and removed.

The main advantage of this optimization is that it turns zero-extensions
into moves, thus enabling much better copy propagation (around 1% code
reduction).  Here is, for example, a "test $0xff0000,%ecx + je" before
optimization:

 mov_i64 tmp0,rcx
 movi_i64 tmp1,$0xff0000
 discard cc_src
 and_i64 cc_dst,tmp0,tmp1
 movi_i32 cc_op,$0x1c
 ext32u_i64 tmp0,cc_dst
 movi_i64 tmp12,$0x0
 brcond_i64 tmp0,tmp12,eq,$0x0

and after (without patch on the left, with on the right):

 movi_i64 tmp1,$0xff0000                 movi_i64 tmp1,$0xff0000
 discard cc_src                          discard cc_src
 and_i64 cc_dst,rcx,tmp1                 and_i64 cc_dst,rcx,tmp1
 movi_i32 cc_op,$0x1c                    movi_i32 cc_op,$0x1c
 ext32u_i64 tmp0,cc_dst
 movi_i64 tmp12,$0x0                     movi_i64 tmp12,$0x0
 brcond_i64 tmp0,tmp12,eq,$0x0           brcond_i64 cc_dst,tmp12,eq,$0x0

Other similar cases: "test %eax, %eax + jne" where eax is already 32-bit
(after optimization, without patch on the left, with on the right):

 discard cc_src                          discard cc_src
 mov_i64 cc_dst,rax                      mov_i64 cc_dst,rax
 movi_i32 cc_op,$0x1c                    movi_i32 cc_op,$0x1c
 ext32u_i64 tmp0,cc_dst
 movi_i64 tmp12,$0x0                     movi_i64 tmp12,$0x0
 brcond_i64 tmp0,tmp12,ne,$0x0           brcond_i64 rax,tmp12,ne,$0x0

"test $0x1, %dl + je":

 movi_i64 tmp1,$0x1                      movi_i64 tmp1,$0x1
 discard cc_src                          discard cc_src
 and_i64 cc_dst,rdx,tmp1                 and_i64 cc_dst,rdx,tmp1
 movi_i32 cc_op,$0x1a                    movi_i32 cc_op,$0x1a
 ext8u_i64 tmp0,cc_dst
 movi_i64 tmp12,$0x0                     movi_i64 tmp12,$0x0
 brcond_i64 tmp0,tmp12,eq,$0x0           brcond_i64 cc_dst,tmp12,eq,$0x0

In some cases TCG even outsmarts GCC. :)  Here the input code has
"and $0x2,%eax + movslq %eax,%rbx + test %rbx, %rbx" and the optimizer,
thanks to copy propagation, does the following:

 movi_i64 tmp12,$0x2                     movi_i64 tmp12,$0x2
 and_i64 rax,rax,tmp12                   and_i64 rax,rax,tmp12
 mov_i64 cc_dst,rax                      mov_i64 cc_dst,rax
 ext32s_i64 tmp0,rax                  -> nop
 mov_i64 rbx,tmp0                     -> mov_i64 rbx,cc_dst
 and_i64 cc_dst,rbx,rbx               -> nop

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/optimize.c | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 090efbc..973d2d6 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -484,7 +484,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                                     TCGArg *args, TCGOpDef *tcg_op_defs)
 {
     int i, nb_ops, op_index, nb_temps, nb_globals, nb_call_args;
-    tcg_target_ulong mask;
+    tcg_target_ulong mask, affected;
     TCGOpcode op;
     const TCGOpDef *def;
     TCGArg *gen_args;
@@ -629,6 +629,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
 
         /* Simplify using known-zero bits */
         mask = -1;
+        affected = -1;
         switch (op) {
         CASE_OP_32_64(ext8s):
             if ((temps[args[1]].mask & 0x80) != 0) {
@@ -656,7 +657,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             mask = temps[args[2]].mask;
             if (temps[args[2]].state == TCG_TEMP_CONST) {
         and_const:
-                ;
+                affected = temps[args[1]].mask & ~mask;
             }
             mask = temps[args[1]].mask & mask;
             break;
@@ -708,6 +709,31 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             break;
         }
 
+        if (mask == 0) {
+            assert(def->nb_oargs == 1);
+            s->gen_opc_buf[op_index] = op_to_movi(op);
+            tcg_opt_gen_movi(gen_args, args[0], 0);
+            args += def->nb_oargs + def->nb_iargs + def->nb_cargs;
+            gen_args += 2;
+            continue;
+        }
+        if (affected == 0) {
+            assert(def->nb_oargs == 1);
+            if (temps_are_copies(args[0], args[1])) {
+                s->gen_opc_buf[op_index] = INDEX_op_nop;
+            } else if (temps[args[1]].state != TCG_TEMP_CONST) {
+                s->gen_opc_buf[op_index] = op_to_mov(op);
+                tcg_opt_gen_mov(s, gen_args, args[0], args[1]);
+                gen_args += 2;
+            } else {
+                s->gen_opc_buf[op_index] = op_to_movi(op);
+                tcg_opt_gen_movi(gen_args, args[0], temps[args[1]].val);
+                gen_args += 2;
+            }
+            args += def->nb_iargs + 1;
+            continue;
+        }
+
         /* Simplify expression for "op r, a, 0 => movi r, 0" cases */
         switch (op) {
         CASE_OP_32_64(and):
-- 
1.7.11.7


* Re: [Qemu-devel] [PATCH 0/3] tcg-optimize with known-zero bits
  2013-01-11 23:42 [Qemu-devel] [PATCH 0/3] tcg-optimize with known-zero bits Richard Henderson
                   ` (2 preceding siblings ...)
  2013-01-11 23:42 ` [Qemu-devel] [PATCH 3/3] optimize: optimize using nonzero bits Richard Henderson
@ 2013-01-12  7:55 ` Paolo Bonzini
  2013-01-19 13:58 ` Blue Swirl
  4 siblings, 0 replies; 6+ messages in thread
From: Paolo Bonzini @ 2013-01-12  7:55 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, Aurelien Jarno

Il 12/01/2013 00:42, Richard Henderson ha scritto:
> This patch set is extracted from Paolo's target-i386 eflags2
> optimization branch:
> 
>   git://github.com/bonzini/qemu.git eflags2
> 
> I've cherry-picked 3 patches and rebased them vs master.  I've
> made a few trivial changes to the patches:
> 
>   * Extracting reset_all_temps as a function,
>   * Fixing a few type errors wrt target_ulong vs tcg_target_ulong.
> 
> While these were written in support of the other changes that
> Paolo made wrt target-i386 eflags computation, they're not
> dependent on them, and I think they should be considered for
> inclusion regardless.
> 
> 
> r~

Sure, thanks! :)

Paolo


* Re: [Qemu-devel] [PATCH 0/3] tcg-optimize with known-zero bits
  2013-01-11 23:42 [Qemu-devel] [PATCH 0/3] tcg-optimize with known-zero bits Richard Henderson
                   ` (3 preceding siblings ...)
  2013-01-12  7:55 ` [Qemu-devel] [PATCH 0/3] tcg-optimize with known-zero bits Paolo Bonzini
@ 2013-01-19 13:58 ` Blue Swirl
  4 siblings, 0 replies; 6+ messages in thread
From: Blue Swirl @ 2013-01-19 13:58 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Paolo Bonzini, qemu-devel, Aurelien Jarno

Thanks, applied all.

On Fri, Jan 11, 2013 at 11:42 PM, Richard Henderson <rth@twiddle.net> wrote:
> This patch set is extracted from Paolo's target-i386 eflags2
> optimization branch:
>
>   git://github.com/bonzini/qemu.git eflags2
>
> I've cherry-picked 3 patches and rebased them vs master.  I've
> made a few trivial changes to the patches:
>
>   * Extracting reset_all_temps as a function,
>   * Fixing a few type errors wrt target_ulong vs tcg_target_ulong.
>
> While these were written in support of the other changes that
> Paolo made wrt target-i386 eflags computation, they're not
> dependent on them, and I think they should be considered for
> inclusion regardless.
>
>
> r~
>
>
> Paolo Bonzini (3):
>   optimize: only write to state when clearing optimizer data
>   optimize: track nonzero bits of registers
>   optimize: optimize using nonzero bits
>
>  tcg/optimize.c | 177 ++++++++++++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 150 insertions(+), 27 deletions(-)
>
> --
> 1.7.11.7
>
>

