Re: [Qemu-devel] [PATCH v3 3/8] target-sh4: optimize addc using add2

From: Aurelien Jarno <aurelien@aurel32.net>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org, Richard Henderson <rth@twiddle.net>
Subject: Re: [Qemu-devel] [PATCH v3 3/8] target-sh4: optimize addc using add2
Date: Thu, 4 Jun 2015 18:08:51 +0200
Message-ID: <20150604160851.GC10311@aurel32.net>
In-Reply-To: <55702E68.3070908@redhat.com>

On 2015-06-04 12:54, Paolo Bonzini wrote:
> 
> 
> On 04/06/2015 07:03, Richard Henderson wrote:
> >> +            tcg_gen_add2_i32(t1, t2, REG(B11_8), t0, REG(B7_4), t0);
> >> +            tcg_gen_add2_i32(REG(B11_8), cpu_sr_t, t1, t2, cpu_sr_t, t0);
> > 
> > Swap these two adds and you don't need t2.  You can consume sr_t
> > immediately and start producing it in the same go.
> 
> Could TCG do some kind of intra-basic-block live range splitting?  In
> this case, the new sr_t could be allocated to a different register than
> the old one, saving one instruction on 2-address targets.

TCG doesn't assign a fixed register to a temp, so it's hard to tell, but
let's say it more or less does that (see below). On the other hand, it is
really bad at handling the constant in that case.

> The pseudocode below uses "dest, src" operand order:
> 
>    // add2(t1, cpu_sr_t, cpu_sr_t, t0, REG(B7_4), t0)
>    add sr_t_in, B7_4    // instead of mov t1, sr_t; add t1, B7_4
>    mov sr_t_out, 0
>    adc sr_t_out, 0      // cout(B7_4 + sr_t_in)

The registers are allocated from left to right, starting with the
inputs.

- cpu_sr_t is either already in a register, or in memory and gets loaded
  into a register.
- t0 is a constant, and the add2 op on x86_64 does not accept a constant
  there, so it is loaded into a register. However, it is aliased to the
  output and is not dead, as it is used again in the second add2
  instruction. It is therefore copied into another register.
- REG(B7_4) is either already in a register, or in memory and gets loaded
  into a register.
- t0 appears again; it has already been loaded into a register and is
  therefore no longer a constant.

We therefore end up with (Intel operand order, i.e. destination first):

     xor %ebx, %ebx       // this is t0
     mov %r12d, %ebx      // a copy of t0
     add %r13d, %ebp      // %r13d contains B7_4 and %ebp contains sr_t
     adc %r12d, %ebx      // %r12d is the new sr_t  

>    // add2(REG(B11_8), cpu_sr_t, t1, cpu_sr_t, REG(B11_8), t0)
>    add B11_8, sr_t_in   // B11_8 + B7_4 + sr_t_in
>    adc sr_t_out, 0      // cout(B11_8 + B7_4 + sr_t_in)

     add %ebp, %r13d      // %ebp is now B11_8
     adc %ebx, %r12d      // %ebx is now cpu_sr_t
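
For reference, here is a minimal sketch in plain C of what the two add2 ops
quoted above compute, assuming the usual double-word semantics of add2,
i.e. (rh:rl) = (ah:al) + (bh:bl). The names add2_i32() and
sh4_addc_model() are hypothetical helpers made up for this illustration,
not QEMU functions, and this models only the arithmetic, not the register
allocation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit add2 op: (*rh:*rl) = (ah:al) + (bh:bl). */
static void add2_i32(uint32_t *rl, uint32_t *rh,
                     uint32_t al, uint32_t ah,
                     uint32_t bl, uint32_t bh)
{
    uint64_t a = ((uint64_t)ah << 32) | al;
    uint64_t b = ((uint64_t)bh << 32) | bl;
    uint64_t r = a + b;          /* 64-bit wraparound matches the op */

    *rl = (uint32_t)r;
    *rh = (uint32_t)(r >> 32);
}

/*
 * Hypothetical model of SH4 "addc Rm,Rn": Rn = Rn + Rm + T, T = carry out.
 * rn and t are updated in place; t must hold 0 or 1 on entry.
 */
static void sh4_addc_model(uint32_t *rn, uint32_t rm, uint32_t *t)
{
    const uint32_t t0 = 0;       /* the constant zero temp */
    uint32_t t1;

    assert(*t <= 1);

    /* add2(t1, cpu_sr_t, cpu_sr_t, t0, REG(B7_4), t0) */
    add2_i32(&t1, t, *t, t0, rm, t0);

    /* add2(REG(B11_8), cpu_sr_t, t1, cpu_sr_t, REG(B11_8), t0) */
    add2_i32(rn, t, t1, *t, *rn, t0);
}
```

The old sr_t is consumed as the low word of the first add2 and the new one
is produced as its high word, so no second temp is needed. The two partial
carries can never both be set: if sr_t + Rm carries, then t1 is 0 and the
second addition cannot overflow. The final sr_t therefore stays 0 or 1.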

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

Thread overview: 16+ messages
2015-05-24 23:37 [Qemu-devel] [PATCH v3 0/8] target-sh4: optimizations and cleanups Aurelien Jarno
2015-05-24 23:37 ` [Qemu-devel] [PATCH v3 1/8] target-sh4: use bit number for SR constants Aurelien Jarno
2015-05-24 23:37 ` [Qemu-devel] [PATCH v3 2/8] target-sh4: Split out T from SR Aurelien Jarno
2015-06-04  5:01   ` Richard Henderson
2015-05-24 23:37 ` [Qemu-devel] [PATCH v3 3/8] target-sh4: optimize addc using add2 Aurelien Jarno
2015-06-04  5:03   ` Richard Henderson
2015-06-04 10:54     ` Paolo Bonzini
2015-06-04 16:08       ` Aurelien Jarno [this message]
2015-05-24 23:37 ` [Qemu-devel] [PATCH v3 4/8] target-sh4: optimize subc using sub2 Aurelien Jarno
2015-06-04  5:05   ` Richard Henderson
2015-05-24 23:37 ` [Qemu-devel] [PATCH v3 5/8] target-sh4: optimize negc using add2 and sub2 Aurelien Jarno
2015-06-04  5:07   ` Richard Henderson
2015-05-24 23:37 ` [Qemu-devel] [PATCH v3 6/8] target-sh4: split out Q and M from of SR and optimize div1 Aurelien Jarno
2015-06-04  5:31   ` Richard Henderson
2015-05-24 23:37 ` [Qemu-devel] [PATCH v3 7/8] target-sh4: factorize fmov implementation Aurelien Jarno
2015-05-24 23:37 ` [Qemu-devel] [PATCH v3 8/8] target-sh4: remove dead code Aurelien Jarno
