All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pablo Virolainen <Pablo.Virolainen@nomovok.com>
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] RFC: TCG constant propagation.
Date: Wed, 05 Aug 2009 11:13:38 +0300	[thread overview]
Message-ID: <4A793F32.4090207@nomovok.com> (raw)
In-Reply-To: <E1MTGSg-0007wd-QX@lists.gnu.org>

[-- Attachment #1: Type: text/plain, Size: 743 bytes --]

Filip Navara kirjoitti:
> Add support for constant propagation to TCG. This has to be paired with the liveness
> analysis to remove the dead code. Not all possible operations are covered, but the
> most common ones are. This improves the code generation for several ARM instructions,
> like MVN (immediate), and it may help other targets as well.

On my small benchmark, qemu-system-sh4 was about 3% slower on Intel Xeon 
E5405@2.00GHz. I'm running 64-bit mode. My mini benchmark is to build 
zlib 1.2.3, so it's 'real' world work load. Ran the benchmark several 
times and results seems to be pretty constant.

ps. I added INDEX_op_*_i64 cases to the evaluation part. I'm not 
completly sure if those &mask should be there.

Pablo Virolainen

[-- Attachment #2: const.patch --]
[-- Type: text/x-patch, Size: 6493 bytes --]

--- qemu-0.11.0-rc1_orig/tcg/tcg.c	2009-07-30 03:38:26.000000000 +0300
+++ qemu-0.11.0-rc1/tcg/tcg.c	2009-08-05 10:43:48.000000000 +0300
@@ -1021,7 +1021,194 @@
 #endif
         tdefs++;
     }
+}
 
+static void tcg_const_analysis(TCGContext *s)
+{
+    int nb_cargs, nb_iargs, nb_oargs, dest, src, src2, del_args, i;
+    TCGArg *args;
+    uint16_t op;
+    uint16_t *opc_ptr;
+    const TCGOpDef *def;
+    uint8_t *const_temps;
+    tcg_target_ulong *temp_values;
+    tcg_target_ulong val, mask;
+    tcg_target_ulong dest_val, src_val, src2_val;
+
+    const_temps = tcg_malloc(s->nb_temps);
+    memset(const_temps, 0, s->nb_temps);
+    temp_values = tcg_malloc(s->nb_temps * sizeof(uint32_t));
+
+    opc_ptr = gen_opc_buf;
+    args = gen_opparam_buf;
+    while (opc_ptr < gen_opc_ptr) {
+        op = *opc_ptr;
+        def = &tcg_op_defs[op];
+        nb_oargs = def->nb_oargs;
+        nb_iargs = def->nb_iargs;
+        nb_cargs = def->nb_cargs;
+        del_args = 0;
+        mask = ~((tcg_target_ulong)0);
+
+        switch(op) {
+        case INDEX_op_movi_i32:
+#if TCG_TARGET_REG_BITS == 64
+        case INDEX_op_movi_i64:
+#endif
+            dest = args[0];
+            val = args[1];
+            const_temps[dest] = 1;
+            temp_values[dest] = val;
+            break;
+        case INDEX_op_mov_i32:
+#if TCG_TARGET_REG_BITS == 64
+        case INDEX_op_mov_i64:
+#endif
+            dest = args[0];
+            src = args[1];
+            const_temps[dest] = const_temps[src];
+            temp_values[dest] = temp_values[src];
+            break;
+        case INDEX_op_not_i32:
+#if TCG_TARGET_REG_BITS == 64
+            mask = 0xffffffff;
+        case INDEX_op_not_i64:
+#endif
+            dest = args[0];
+            src = args[1];
+            if (const_temps[src]) {
+                const_temps[dest] = 1;
+                dest_val = ~temp_values[src];
+                *opc_ptr = INDEX_op_movi_i32;
+                args[1] = temp_values[dest] = dest_val & mask;
+            } else {
+                const_temps[dest] = 0;
+            }
+            break;
+        case INDEX_op_add_i32:
+        case INDEX_op_sub_i32:
+        case INDEX_op_mul_i32:
+        case INDEX_op_and_i32:
+        case INDEX_op_or_i32:
+        case INDEX_op_xor_i32:
+        case INDEX_op_shl_i32:
+        case INDEX_op_shr_i32:
+#if TCG_TARGET_REG_BITS == 64
+            mask = 0xffffffff;
+        case INDEX_op_add_i64:
+        case INDEX_op_sub_i64:
+        case INDEX_op_mul_i64:
+        case INDEX_op_and_i64:
+        case INDEX_op_or_i64:
+        case INDEX_op_xor_i64:
+        case INDEX_op_shl_i64:
+        case INDEX_op_shr_i64:
+#endif
+
+            dest = args[0];
+            src = args[1];
+            src2 = args[2];
+            if (const_temps[src] && const_temps[src2]) {
+                src_val = temp_values[src];
+                src2_val = temp_values[src2];
+                const_temps[dest] = 1;
+                switch (op) {
+                case INDEX_op_add_i32:
+                    dest_val = src_val + src2_val;
+                    break;
+                case INDEX_op_add_i64:
+		    dest_val = (src_val + src2_val) & mask;
+		    break;
+                case INDEX_op_sub_i32:
+                    dest_val = src_val - src2_val;
+                    break;
+                case INDEX_op_sub_i64:
+		    dest_val = (src_val - src2_val) & mask;
+		    break;
+                case INDEX_op_mul_i32:
+                    dest_val = src_val * src2_val;
+                    break;
+                case INDEX_op_mul_i64:
+		    dest_val = (src_val * src2_val) & mask;
+		    break;
+                case INDEX_op_and_i32:
+                    dest_val = src_val & src2_val;
+                    break;
+                case INDEX_op_and_i64:
+		    dest_val = src_val & src2_val & mask;
+		    break;
+                case INDEX_op_or_i32:
+                    dest_val = src_val | src2_val;
+                    break;
+                case INDEX_op_or_i64:
+		    dest_val = (src_val | src2_val) & mask;
+		    break;
+                case INDEX_op_xor_i32:
+                    dest_val = src_val ^ src2_val;
+                    break;
+                case INDEX_op_xor_i64:
+		    dest_val = (src_val ^ src2_val) & mask;
+		    break;
+                case INDEX_op_shl_i32:
+                    dest_val = src_val << src2_val;
+                    break;
+                case INDEX_op_shl_i64:
+		    dest_val = (src_val << src2_val) & mask;
+		    break;
+                case INDEX_op_shr_i32:
+                    dest_val = src_val >> src2_val;
+                    break;
+                case INDEX_op_shr_i64:
+		    dest_val = (src_val >> src2_val) & mask;
+		    break;
+                default:
+		  fprintf(stderr,"index op %i\n",op);
+                    tcg_abort();
+                    return;
+                }
+                *opc_ptr = INDEX_op_movi_i32;                
+                args[1] = temp_values[dest] = dest_val & mask;
+                del_args = 1;
+            } else {
+                const_temps[dest] = 0;
+            }
+            break;
+        case INDEX_op_call:
+            nb_oargs = args[0] >> 16;
+            nb_iargs = args[0] & 0xffff;
+            nb_cargs = def->nb_cargs;
+            args++;
+            for (i = 0; i < nb_oargs; i++) {
+                const_temps[args[i]] = 0;
+            }
+            break;
+        case INDEX_op_nopn:
+            /* variable number of arguments */
+            nb_cargs = args[0];
+            break;
+        case INDEX_op_set_label:
+            memset(const_temps, 0, s->nb_temps);
+            break;
+        default:
+            if (def->flags & TCG_OPF_BB_END) {
+                memset(const_temps, 0, s->nb_temps);
+            } else {
+                for (i = 0; i < nb_oargs; i++) {
+                    const_temps[args[i]] = 0;
+                }
+            }
+            break;
+        }
+        opc_ptr++;
+        args += nb_iargs + nb_oargs + nb_cargs - del_args;
+        if (del_args > 0) {
+            gen_opparam_ptr -= del_args;
+            memmove(args, args + del_args, (gen_opparam_ptr - args) * sizeof(*args));
+        }
+    }
+
+    if (args != gen_opparam_ptr)
+        tcg_abort();
 }
 
 #ifdef USE_LIVENESS_ANALYSIS
@@ -1891,6 +2078,8 @@
     }
 #endif
 
+    tcg_const_analysis(s);
+
 #ifdef CONFIG_PROFILER
     s->la_time -= profile_getclock();
 #endif

  parent reply	other threads:[~2009-08-05  8:16 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-07-21 14:37 [Qemu-devel] [PATCH] RFC: TCG constant propagation Filip Navara
2009-07-23  9:08 ` Filip Navara
2009-07-23  9:25   ` Laurent Desnogues
2009-07-23  9:30     ` Paul Brook
2009-07-23  9:49     ` Filip Navara
2009-07-23 20:10   ` Stuart Brady
2009-07-23 20:28     ` Filip Navara
2009-07-23 22:02   ` Daniel Jacobowitz
2009-08-05  8:13 ` Pablo Virolainen [this message]
2009-08-05  8:48   ` Filip Navara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A793F32.4090207@nomovok.com \
    --to=pablo.virolainen@nomovok.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.