All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ulrich Hecht <uli@suse.de>
To: Aurelien Jarno <aurelien@aurel32.net>
Cc: riku.voipio@iki.fi, qemu-devel@nongnu.org, agraf@suse.de
Subject: Re: [Qemu-devel] [PATCH 1/9] TCG "sync" op
Date: Fri, 16 Oct 2009 18:37:31 +0200	[thread overview]
Message-ID: <200910161837.31850.uli@suse.de> (raw)
In-Reply-To: <20091016155221.GF4127@hall.aurel32.net>

On Friday 16 October 2009, Aurelien Jarno wrote:
> IMHO, the correct way to do it is to use the following code, assuming
> you want to use 64-bit TCG regs to hold 32-bit values (that's
> something that is not really clear in your next patch):
>
> - for register load:
> | static TCGv load_reg(int reg)
> | {
> |    TCGv r = tcg_temp_new_i64();
> |    tcg_gen_ext32u_i64(r, tcgregs[reg]);
> |    return r;
> | }
> |
> | static void store_reg32(int reg, TCGv v)
> | {
> |    tcg_gen_ext32u_i64(v, v); /* may be optional */
> |    tcg_gen_andi_i64(tcgregs[reg], tcgregs[reg],
> | 0xffffffff00000000ULL); tcg_gen_or_i64(tcgregs[reg], tcgregs[reg],
> | v);
> | }

This is _extremely_ detrimental to performance. The point of the sync op 
is that in most cases it's a nop because registers are usually used with 
the same bitness again and again. The sign extension/masking stuff is 
done every time a register is accessed as 32 bits, which is the most 
common case. Compare the translation of the following sequence of 
instructions:

IN: _dl_aux_init
0x0000000080044ff6:  lhi        %r4,0
0x0000000080044ffa:  lhi        %r5,0
0x0000000080044ffe:  lhi        %r0,0

with sync:

OP:
 mov_i32 loc0,global_cc
 movi_i32 tmp1,$0x0
 sync_i64 R4
 mov_i32 r4,tmp1
 movi_i32 tmp1,$0x0
 sync_i64 R5
 mov_i32 r5,tmp1
 movi_i32 tmp1,$0x0
 sync_i64 R0
 mov_i32 r0,tmp1
 mov_i32 global_cc,loc0
 movi_i64 tmp2,$0x80045002
 st_i64 tmp2,env,$0x158
 exit_tb $0x0

OUT: [size=61]
0x6019a030:  mov    0x160(%r14),%ebp
0x6019a037:  mov    %rbp,%rbx
0x6019a03a:  mov    $0x80045002,%r12d
0x6019a040:  mov    %r12,0x158(%r14)
0x6019a047:  mov    %ebp,0xd1a0(%r14)
0x6019a04e:  mov    %ebx,0x160(%r14)
0x6019a055:  xor    %ebp,%ebp
0x6019a057:  mov    %ebp,(%r14)
0x6019a05a:  xor    %ebp,%ebp
0x6019a05c:  mov    %ebp,0x20(%r14)
0x6019a060:  xor    %ebp,%ebp
0x6019a062:  mov    %ebp,0x28(%r14)
0x6019a066:  xor    %eax,%eax
0x6019a068:  jmpq   0x621dc8ce


with sign extension:

OP:
 mov_i32 loc0,global_cc
 movi_i32 tmp1,$0x0
 ext32u_i64 tmp1,tmp1
 movi_i64 tmp2,$0xffffffff00000000
 and_i64 R4,R4,tmp2
 or_i64 R4,R4,tmp1
 movi_i32 tmp1,$0x0
 ext32u_i64 tmp1,tmp1
 movi_i64 tmp2,$0xffffffff00000000
 and_i64 R5,R5,tmp2
 or_i64 R5,R5,tmp1
 movi_i32 tmp1,$0x0
 ext32u_i64 tmp1,tmp1
 movi_i64 tmp2,$0xffffffff00000000
 and_i64 R0,R0,tmp2
 or_i64 R0,R0,tmp1
 mov_i32 global_cc,loc0
 movi_i64 tmp2,$0x80045002
 st_i64 tmp2,env,$0x158
 exit_tb $0x0

OUT: [size=126]
0x6019af10:  mov    0x160(%r14),%ebp
0x6019af17:  xor    %ebx,%ebx
0x6019af19:  mov    %ebx,%ebx
0x6019af1b:  mov    0x20(%r14),%r12
0x6019af1f:  mov    $0xffffffff00000000,%r13
0x6019af29:  and    %r13,%r12
0x6019af2c:  or     %rbx,%r12
0x6019af2f:  xor    %ebx,%ebx
0x6019af31:  mov    %ebx,%ebx
0x6019af33:  mov    0x28(%r14),%r13
0x6019af37:  mov    $0xffffffff00000000,%r15
0x6019af41:  and    %r15,%r13
0x6019af44:  or     %rbx,%r13
0x6019af47:  xor    %ebx,%ebx
0x6019af49:  mov    %ebx,%ebx
0x6019af4b:  mov    (%r14),%r15
0x6019af4e:  mov    $0xffffffff00000000,%r10
0x6019af58:  and    %r10,%r15
0x6019af5b:  or     %rbx,%r15
0x6019af5e:  mov    %rbp,%rbx
0x6019af61:  mov    $0x80045002,%r10d
0x6019af67:  mov    %r10,0x158(%r14)
0x6019af6e:  mov    %ebp,0xd1a0(%r14)
0x6019af75:  mov    %ebx,0x160(%r14)
0x6019af7c:  mov    %r15,(%r14)
0x6019af7f:  mov    %r12,0x20(%r14)
0x6019af83:  mov    %r13,0x28(%r14)
0x6019af87:  xor    %eax,%eax
0x6019af89:  jmpq   0x621dd78e

Its more than twice the size and has ten memory accesses instead of 
seven.

CU
Uli

-- 
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)

  reply	other threads:[~2009-10-16 16:37 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-10-16 12:38 [Qemu-devel] [PATCH 0/9] S/390 support updated Ulrich Hecht
2009-10-16 12:38 ` [Qemu-devel] [PATCH 1/9] TCG "sync" op Ulrich Hecht
2009-10-16 12:38   ` [Qemu-devel] [PATCH 2/9] S/390 CPU emulation Ulrich Hecht
2009-10-16 12:38     ` [Qemu-devel] [PATCH 3/9] S/390 host/target build system support Ulrich Hecht
2009-10-16 12:38       ` [Qemu-devel] [PATCH 4/9] S/390 host support for TCG Ulrich Hecht
2009-10-16 12:38         ` [Qemu-devel] [PATCH 5/9] linux-user: S/390 64-bit (s390x) support Ulrich Hecht
2009-10-16 12:38           ` [Qemu-devel] [PATCH 6/9] linux-user: don't do locking in single-threaded processes Ulrich Hecht
2009-10-16 12:38             ` [Qemu-devel] [PATCH 7/9] linux-user: dup3, fallocate syscalls Ulrich Hecht
2009-10-16 12:38               ` [Qemu-devel] [PATCH 8/9] linux-user: define a couple of syscalls for non-uid16 targets Ulrich Hecht
2009-10-16 12:38                 ` [Qemu-devel] [PATCH 9/9] linux-user: getpriority errno fix Ulrich Hecht
2009-10-17 10:44       ` [Qemu-devel] [PATCH 3/9] S/390 host/target build system support Aurelien Jarno
2009-10-17 10:42     ` [Qemu-devel] [PATCH 2/9] S/390 CPU emulation Aurelien Jarno
2009-10-19 17:17       ` Ulrich Hecht
2009-10-22 21:28         ` Aurelien Jarno
2009-11-02 15:16           ` Ulrich Hecht
2009-11-02 18:42             ` Aurelien Jarno
2009-11-02 19:03               ` Laurent Desnogues
2009-11-09 16:55                 ` Ulrich Hecht
2009-11-10 16:02                   ` Aurelien Jarno
2009-11-11 10:46                     ` Ulrich Hecht
2009-10-16 15:52   ` [Qemu-devel] [PATCH 1/9] TCG "sync" op Aurelien Jarno
2009-10-16 16:37     ` Ulrich Hecht [this message]
2009-10-16 17:29       ` Aurelien Jarno
2009-10-19 17:17         ` Ulrich Hecht
2009-10-17  8:59     ` Edgar E. Iglesias
2009-10-19 17:17       ` Ulrich Hecht

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200910161837.31850.uli@suse.de \
    --to=uli@suse.de \
    --cc=agraf@suse.de \
    --cc=aurelien@aurel32.net \
    --cc=qemu-devel@nongnu.org \
    --cc=riku.voipio@iki.fi \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.