* [Qemu-devel] [PULL] Queued tcg patch
@ 2015-05-05 23:24 Richard Henderson
  2015-05-05 23:24 ` [Qemu-devel] [PULL] tcg: optimise memory layout of TCGTemp Richard Henderson
  2015-05-06 11:47 ` [Qemu-devel] [PULL] Queued tcg patch Peter Maydell
  0 siblings, 2 replies; 3+ messages in thread
From: Richard Henderson @ 2015-05-05 23:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Only one tcg related patch since the 2.3 freeze.


r~



The following changes since commit 874e9aeeeb74c5459639a93439a502d262847e68:

  Merge remote-tracking branch 'remotes/kraxel/tags/pull-sdl-20150505-1' into staging (2015-05-05 14:06:12 +0100)

are available in the git repository at:

  git://github.com/rth7680/qemu.git tags/tcg-next-20150505

for you to fetch changes up to 00c8fa9ffeee7458e5ed62c962faf638156c18da:

  tcg: optimise memory layout of TCGTemp (2015-05-05 08:44:46 -0700)

----------------------------------------------------------------
size reduction merge

----------------------------------------------------------------
Emilio G. Cota (1):
      tcg: optimise memory layout of TCGTemp

 tcg/tcg.h | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

* [Qemu-devel] [PULL] tcg: optimise memory layout of TCGTemp
  2015-05-05 23:24 [Qemu-devel] [PULL] Queued tcg patch Richard Henderson
@ 2015-05-05 23:24 ` Richard Henderson
  2015-05-06 11:47 ` [Qemu-devel] [PULL] Queued tcg patch Peter Maydell
  1 sibling, 0 replies; 3+ messages in thread
From: Richard Henderson @ 2015-05-05 23:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Emilio G. Cota

From: "Emilio G. Cota" <cota@braap.org>

This brings down the size of the struct from 56 to 32 bytes on 64-bit,
and to 20 bytes on 32-bit. This leads to memory savings:

Before:
$ find . -name 'tcg.o' | xargs size
   text    data     bss     dec     hex filename
  41131   29800      88   71019   1156b ./aarch64-softmmu/tcg/tcg.o
  37969   29416      96   67481   10799 ./x86_64-linux-user/tcg/tcg.o
  39354   28816      96   68266   10aaa ./arm-linux-user/tcg/tcg.o
  40802   29096      88   69986   11162 ./arm-softmmu/tcg/tcg.o
  39417   29672      88   69177   10e39 ./x86_64-softmmu/tcg/tcg.o

After:
$ find . -name 'tcg.o' | xargs size
   text    data     bss     dec     hex filename
  40883   29800      88   70771   11473 ./aarch64-softmmu/tcg/tcg.o
  37473   29416      96   66985   105a9 ./x86_64-linux-user/tcg/tcg.o
  38858   28816      96   67770   108ba ./arm-linux-user/tcg/tcg.o
  40554   29096      88   69738   1106a ./arm-softmmu/tcg/tcg.o
  39169   29672      88   68929   10d41 ./x86_64-softmmu/tcg/tcg.o

Note that using an entire byte for enums that need fewer bits wastes
a few bits (noticeable on 32-bit hosts, where the struct takes 20 bytes
instead of 16) but avoids bit-extraction code, which is a win overall.
I've tested several variations of the patch, and the appended one is
the best performer for OpenSSL's bntest by a very small margin:

Before:
$ taskset -c 0 perf stat -r 15 -- x86_64-linux-user/qemu-x86_64 img/bntest-x86_64 >/dev/null
[...]
 Performance counter stats for 'x86_64-linux-user/qemu-x86_64 img/bntest-x86_64' (15 runs):

      10538.479833 task-clock (msec)  # 0.999 CPUs utilized  ( +-  0.38% )
               772 context-switches   # 0.073 K/sec          ( +-  2.03% )
                 0 cpu-migrations     # 0.000 K/sec          ( +-100.00% )
             2,207 page-faults        # 0.209 K/sec          ( +-  0.08% )
      10.552871687 seconds time elapsed                      ( +-  0.39% )

After:
$ taskset -c 0 perf stat -r 15 -- x86_64-linux-user/qemu-x86_64 img/bntest-x86_64 >/dev/null
 Performance counter stats for 'x86_64-linux-user/qemu-x86_64 img/bntest-x86_64' (15 runs):

      10459.968847 task-clock (msec)  # 0.999 CPUs utilized  ( +-  0.30% )
               739 context-switches   # 0.071 K/sec          ( +-  1.71% )
                 0 cpu-migrations     # 0.000 K/sec          ( +- 68.14% )
             2,204 page-faults        # 0.211 K/sec          ( +-  0.10% )
      10.473900411 seconds time elapsed                      ( +-  0.30% )

Suggested-by: Stefan Weil <sw@weilnetz.de>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/tcg.h | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index 3d004ba..fbb3daf 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -417,20 +417,19 @@ static inline TCGCond tcg_high_cond(TCGCond c)
     }
 }
 
-#define TEMP_VAL_DEAD  0
-#define TEMP_VAL_REG   1
-#define TEMP_VAL_MEM   2
-#define TEMP_VAL_CONST 3
+typedef enum TCGTempVal {
+    TEMP_VAL_DEAD,
+    TEMP_VAL_REG,
+    TEMP_VAL_MEM,
+    TEMP_VAL_CONST,
+} TCGTempVal;
 
-/* XXX: optimize memory layout */
 typedef struct TCGTemp {
-    TCGType base_type;
-    TCGType type;
-    int val_type;
-    int reg;
-    tcg_target_long val;
-    int mem_reg;
-    intptr_t mem_offset;
+    unsigned int reg:8;
+    unsigned int mem_reg:8;
+    TCGTempVal val_type:8;
+    TCGType base_type:8;
+    TCGType type:8;
     unsigned int fixed_reg:1;
     unsigned int mem_coherent:1;
     unsigned int mem_allocated:1;
@@ -438,6 +437,9 @@ typedef struct TCGTemp {
                                   basic blocks. Otherwise, it is not
                                   preserved across basic blocks. */
     unsigned int temp_allocated:1; /* never used for code gen */
+
+    tcg_target_long val;
+    intptr_t mem_offset;
     const char *name;
 } TCGTemp;
 
-- 
2.1.0

* Re: [Qemu-devel] [PULL] Queued tcg patch
  2015-05-05 23:24 [Qemu-devel] [PULL] Queued tcg patch Richard Henderson
  2015-05-05 23:24 ` [Qemu-devel] [PULL] tcg: optimise memory layout of TCGTemp Richard Henderson
@ 2015-05-06 11:47 ` Peter Maydell
  1 sibling, 0 replies; 3+ messages in thread
From: Peter Maydell @ 2015-05-06 11:47 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 6 May 2015 at 00:24, Richard Henderson <rth@twiddle.net> wrote:
> Only one tcg related patch since the 2.3 freeze.
>
>
> r~
>
>
>
> The following changes since commit 874e9aeeeb74c5459639a93439a502d262847e68:
>
>   Merge remote-tracking branch 'remotes/kraxel/tags/pull-sdl-20150505-1' into staging (2015-05-05 14:06:12 +0100)
>
> are available in the git repository at:
>
>   git://github.com/rth7680/qemu.git tags/tcg-next-20150505
>
> for you to fetch changes up to 00c8fa9ffeee7458e5ed62c962faf638156c18da:
>
>   tcg: optimise memory layout of TCGTemp (2015-05-05 08:44:46 -0700)
>
> ----------------------------------------------------------------
> size reduction merge

Applied, thanks.

-- PMM
