All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Torbjorn Granlund <tg@gmplib.org>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Possible ppc comparision optimisation
Date: Wed, 08 May 2013 18:16:43 +0200	[thread overview]
Message-ID: <518A7A6B.6050404@redhat.com> (raw)
In-Reply-To: <86haidv5lz.fsf@shell.gmplib.org>

Il 08/05/2013 17:44, Torbjorn Granlund ha scritto:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>   I think that would be faster on 32-bit hosts, truncs are cheap.
>   
> And slower perhaps on 64-bit hosts, at least for operations where
> additional explicit trunctation will be needed (such as before
> comparisions and after right shifts).
> 
>   > There could be a disadvantage of this compared to the old code, since
>   > this has a chained algebraic dependency, while the old code's many
>   > instructions might have been more independent.
>   
>   What about these alternatives:
>   
>   setcond LT, t0, arg0, arg1
>   setcond EQ, t1, arg0, arg1
>   trunc  s0, t0
>   trunc  s1, t1
>   shli   s0, s0, 1                ; s0 = (arg0 < arg1) ? 2 : 0
>   subi   s1, s1, 2                ; s1 = (arg0 != arg1) ? -2 : -1
>   sub    s0, s0, s1               ; < 4       == 1      > 2
>   shli   s0, s0, 1                ; < 8       == 2      > 4
>   
>   =======
>   
>   setcond LT, t0, arg0, arg1
>   setcond NE, t1, arg0, arg1
>   trunc   s0, t0
>   trunc   s1, t1
>   add     s0, s0, s1              ; < 2       == 0      > 1
>   movi    s1, 1
>   add     s0, s0, s1              ; < 3       == 1      > 2
>   shl     s1, s1, s0              ; < 8       == 2      > 4
>   
> Surely there are many alternative forms.
> Is your aim to add micro-parallelism?

Yes, I think in this respect I think the first one is better.  The
second could be three instructions on machines that have a set-nth-bit
instruction _and_ a zero register, but I'm not sure they exist...

> (Your sequences look a bit curious.  Did you use a super-optimiser?)

No, but I am attracted to these curious sequences from my previous life
working on compilers. :)  I know your superoptimizer and, in fact, we
both worked on some parts of GCC (optimization of conditional
branches/stores), just 20 years apart.

The second is actually not too curious after you look at it for a while,
it is a variant of the usual (x > y) + (x >= y) trick used to generate a
0/1/2 result.  The first I found by trial and error based on yours; it
is basically (x < y) * 2 - (x == y) + 2, with some reordering to get
parallelism and avoid the need for subfi-like instructions.

Paolo

      reply	other threads:[~2013-05-08 16:17 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-07 22:56 [Qemu-devel] Possible ppc comparision optimisation Torbjorn Granlund
2013-05-08  8:05 ` Paolo Bonzini
2013-05-08 15:44   ` Torbjorn Granlund
2013-05-08 16:16     ` Paolo Bonzini [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=518A7A6B.6050404@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=tg@gmplib.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.