linuxppc-dev.lists.ozlabs.org archive mirror
From: Gabriel Paubert <paubert@iram.es>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>,
	linux-kernel@vger.kernel.org, Scott Wood <oss@buserror.net>,
	Paul Mackerras <paulus@samba.org>,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] powerpc/32: Remove one insn in __bswapdi2
Date: Sat, 13 Aug 2016 00:49:50 +0200
Message-ID: <20160812224950.GA21040@visitor2.iram.es>
In-Reply-To: <20160811221119.GA26763@gate.crashing.org>

On Thu, Aug 11, 2016 at 05:11:19PM -0500, Segher Boessenkool wrote:
> On Thu, Aug 11, 2016 at 11:34:37PM +0200, Gabriel Paubert wrote:
> > On the other hand gcc did at the time a very poor job (quite an
> > understatement) at bswapdi when compiling for 64 bit processors 
> > (see the example).
> > 
> > But what do modern compilers generate for bswapdi these days? Do they
> > still call the library or not?
> 
> Nope.

Great. Could these functions then be removed from misc_32.S, or are
compilers that use libcalls still supported for kernel builds?

> 
> > After all, bswapdi on 32 bit processors only takes 6 instructions if the
> > input and output registers don't overlap.
> 
> For this testcase:
> ===
> typedef unsigned long long u64;
> u64 bs(u64 x) { return __builtin_bswap64(x); }
> ===
> 
> we get with -m32:
> ===
> bs:
> 	mr 9,3
> 	rotlwi 3,4,24
> 	rlwimi 3,4,8,8,15
> 	rlwimi 3,4,8,24,31
> 	rotlwi 4,9,24
> 	rlwimi 4,9,8,8,15
> 	rlwimi 4,9,8,24,31
> 	blr

In this case the compiler is constrained by the fact that the input and
output registers are the same, hence the extra register copy (mr). When
inlined into other code, it can probably schedule and interleave the
operations better.
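
For anyone who wants to check the per-word sequence, here is a rough C
model of rotlwi/rlwimi as used above. It is only a sketch: the helper
names (rotl32, mask32, rlwimi, bswap32_model, bswap64_model) are made up
for illustration, and it assumes the usual 32-bit convention of the high
word in r3 and the low word in r4.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits; only valid for 0 < n < 32. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
	return (x << n) | (x >> (32 - n));
}

/* Mask with bits mb..me set, IBM numbering (bit 0 is the MSB), mb <= me. */
static uint32_t mask32(unsigned mb, unsigned me)
{
	return (0xffffffffu >> mb) & (0xffffffffu << (31 - me));
}

/* Model of "rlwimi ra,rs,sh,mb,me": rotate rs left by sh, then insert
 * the bits selected by the mask into ra, leaving the other bits alone. */
static uint32_t rlwimi(uint32_t ra, uint32_t rs, unsigned sh,
		       unsigned mb, unsigned me)
{
	uint32_t m = mask32(mb, me);

	return (ra & ~m) | (rotl32(rs, sh) & m);
}

/* The three-instruction byte swap of one word. */
static uint32_t bswap32_model(uint32_t x)
{
	uint32_t r = rotl32(x, 24);	/* rotlwi r,x,24      */

	r = rlwimi(r, x, 8, 8, 15);	/* rlwimi r,x,8,8,15  */
	r = rlwimi(r, x, 8, 24, 31);	/* rlwimi r,x,8,24,31 */
	return r;
}

/* 64-bit swap as two word swaps: the swapped low word becomes the new
 * high word (r3) and the swapped high word the new low word (r4). */
static uint64_t bswap64_model(uint64_t x)
{
	uint32_t hi = (uint32_t)(x >> 32);	/* r3 on entry */
	uint32_t lo = (uint32_t)x;		/* r4 on entry */

	return ((uint64_t)bswap32_model(lo) << 32) | bswap32_model(hi);
}
```

Two word swaps of three instructions each give the six instructions
mentioned earlier; the seventh (mr 9,3) is the copy needed because r3 is
both an input and an output.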


> ===
> 
> and with -m64:
> ===
> .L.bs:
> 	srdi 10,3,32
> 	mr 9,3
> 	rotlwi 3,3,24
> 	rotlwi 8,10,24
> 	rlwimi 3,9,8,8,15
> 	rlwimi 8,10,8,8,15
> 	rlwimi 3,9,8,24,31
> 	rlwimi 8,10,8,24,31
> 	sldi 3,3,32
> 	or 3,3,8
> 	blr
> ===
> 

As demonstrated here, where the two halves of the 64-bit quantity
are byte-swapped in an interleaved fashion. Not perfect (I think
that with proper ordering the last two instructions, sldi and or,
could be replaced by a single rldimi), but reasonable.
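
To make the rldimi remark concrete, here is a rough C model of the final
merge step. It is a sketch only: the helper names (rotl64, rldimi,
merge_sldi_or, merge_rldimi) are invented for illustration, and the
register allocation a compiler would actually pick is not modelled.

```c
#include <stdint.h>

/* Rotate a 64-bit value left by n bits; only valid for 0 < n < 64. */
static uint64_t rotl64(uint64_t x, unsigned n)
{
	return (x << n) | (x >> (64 - n));
}

/* Model of "rldimi ra,rs,sh,mb": rotate rs left by sh and insert bits
 * mb..(63 - sh), IBM numbering (bit 0 is the MSB), into ra.
 * Assumes 0 < sh < 64 and mb <= 63 - sh. */
static uint64_t rldimi(uint64_t ra, uint64_t rs, unsigned sh, unsigned mb)
{
	uint64_t m = (~0ull >> mb) & (~0ull << sh);

	return (ra & ~m) | (rotl64(rs, sh) & m);
}

/* What the generated code does: sldi by 32, then or.
 * Both inputs are assumed to have their upper 32 bits clear. */
static uint64_t merge_sldi_or(uint64_t new_hi, uint64_t new_lo)
{
	return (new_hi << 32) | new_lo;
}

/* The single-instruction alternative: rldimi new_lo,new_hi,32,0.
 * The upper 32 bits of new_lo are overwritten, so they need not be clear. */
static uint64_t merge_rldimi(uint64_t new_hi, uint64_t new_lo)
{
	return rldimi(new_lo, new_hi, 32, 0);
}
```

The catch is that rldimi writes its result into the register holding the
low half, so that half has to be computed in r3 for the function result
to land in the right place. That is what "proper ordering" refers to.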

> Neither as tight as possible, but neither horrible either.
> 

Indeed.

    Gabriel

