From: Paolo Bonzini <pbonzini@redhat.com>
To: Torbjorn Granlund <tg@gmplib.org>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Possible ppc comparision optimisation
Date: Wed, 08 May 2013 18:16:43 +0200
Message-ID: <518A7A6B.6050404@redhat.com>
In-Reply-To: <86haidv5lz.fsf@shell.gmplib.org>
On 08/05/2013 17:44, Torbjorn Granlund wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
>
> I think that would be faster on 32-bit hosts, truncs are cheap.
>
> And slower perhaps on 64-bit hosts, at least for operations where
> additional explicit truncation will be needed (such as before
> comparisons and after right shifts).
>
> > There could be a disadvantage of this compared to the old code, since
> > this has a chained algebraic dependency, while the old code's many
> > instructions might have been more independent.
>
> What about these alternatives:
>
> setcond LT, t0, arg0, arg1
> setcond EQ, t1, arg0, arg1
> trunc s0, t0
> trunc s1, t1
> shli s0, s0, 1 ; s0 = (arg0 < arg1) ? 2 : 0
> subi s1, s1, 2 ; s1 = (arg0 != arg1) ? -2 : -1
> sub s0, s0, s1 ; < 4 == 1 > 2
> shli s0, s0, 1 ; < 8 == 2 > 4
>
> =======
>
> setcond LT, t0, arg0, arg1
> setcond NE, t1, arg0, arg1
> trunc s0, t0
> trunc s1, t1
> add s0, s0, s1 ; < 2 == 0 > 1
> movi s1, 1
> add s0, s0, s1 ; < 3 == 1 > 2
> shl s1, s1, s0 ; < 8 == 2 > 4
>
> Surely there are many alternative forms.
> Is your aim to add micro-parallelism?
Yes, and in this respect I think the first one is better. The
second could be three instructions on machines that have a set-nth-bit
instruction _and_ a zero register, but I'm not sure they exist...
> (Your sequences look a bit curious. Did you use a super-optimiser?)
No, but I am attracted to these curious sequences from my previous life
working on compilers. :) I know your superoptimizer and, in fact, we
both worked on some parts of GCC (optimization of conditional
branches/stores), just 20 years apart.
The second is actually not too curious after you look at it for a while;
it is a variant of the usual (x > y) + (x >= y) trick used to generate a
0/1/2 result. The first I found by trial and error based on yours; it
is basically (x < y) * 2 - (x == y) + 2, with some reordering to get
parallelism and avoid the need for subfi-like instructions.
Paolo
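The arithmetic in the comments on the two quoted sequences, and the closed form (x < y) * 2 - (x == y) + 2, can be checked by mirroring the ops in a few lines of Python. This is only a sketch under stated assumptions: the function names are invented for illustration, the trunc steps are left out because setcond only produces 0 or 1, and the target values 8/2/4 for less/equal/greater are taken from the comments in the sequences themselves.

```python
# Target values, taken from the sequence comments: "< 8 == 2 > 4".
CR_LT, CR_EQ, CR_GT = 8, 2, 4


def expected_cr(x, y):
    """Reference result for comparing x with y (hypothetical helper)."""
    if x < y:
        return CR_LT
    if x > y:
        return CR_GT
    return CR_EQ


def first_sequence(x, y):
    """Mirror of the first sequence; trunc omitted (values are 0 or 1)."""
    s0 = int(x < y)       # setcond LT
    s1 = int(x == y)      # setcond EQ
    s0 = s0 << 1          # shli: 2 if x < y else 0
    s1 = s1 - 2           # subi: -1 if x == y else -2
    s0 = s0 - s1          # sub:  < 4, == 1, > 2
    return s0 << 1        # shli: < 8, == 2, > 4


def second_sequence(x, y):
    """Mirror of the second sequence; trunc omitted (values are 0 or 1)."""
    s0 = int(x < y)       # setcond LT
    s1 = int(x != y)      # setcond NE
    s0 = s0 + s1          # add:  < 2, == 0, > 1
    s1 = 1                # movi
    s0 = s0 + s1          # add:  < 3, == 1, > 2
    return s1 << s0       # shl:  < 8, == 2, > 4


def first_closed_form(x, y):
    """The closed form from the text, followed by the final shift by one."""
    return ((x < y) * 2 - (x == y) + 2) << 1


if __name__ == "__main__":
    for x, y in [(-5, 3), (3, -5), (7, 7), (0, 1), (1, 0)]:
        want = expected_cr(x, y)
        assert first_sequence(x, y) == want
        assert second_sequence(x, y) == want
        assert first_closed_form(x, y) == want
    print("all sequences agree")
```

Only the three outcomes less/equal/greater matter here, so a handful of pairs covering each case is enough to exercise every path.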