From: "Alexander van Heukelum" <heukelum@fastmail.fm>
To: "Gabriel Paubert" <paubert@iram.es>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>,
	Ingo Molnar <mingo@elte.hu>,
	linux-next@vger.kernel.org, Paul Mackerras <paulus@samba.org>,
	linuxppc-dev@ozlabs.org
Subject: Re: linux-next: x86-latest/powerpc-next merge conflict
Date: Mon, 21 Apr 2008 16:19:34 +0200
Message-ID: <1208787574.7995.1249020641@webmail.messagingengine.com>
In-Reply-To: <20080421133606.GA27304@iram.es>

On Mon, 21 Apr 2008 15:36:06 +0200, "Gabriel Paubert" <paubert@iram.es>
said:
> On Mon, Apr 21, 2008 at 03:07:13PM +0200, Alexander van Heukelum wrote:
> > On Mon, 21 Apr 2008 22:13:06 +1000, "Paul Mackerras" <paulus@samba.org>
> > said:
> > > Alexander van Heukelum writes:
> > > > Powerpc would pick up an optimized version via this chain:
> > > > generic fls64 --> powerpc __fls --> __ilog2 -->
> > > > asm (PPC_CNTLZL "%0,%1" : "=r" (lz) : "r" (x)).
> > >
> > > Why wouldn't powerpc continue to use the fls64 that I have in there
> > > now?
> >
> > In Linus' tree that would be the generic one that uses (the 32-bit)
> > fls():
> >
> > static inline int fls64(__u64 x)
> > {
> >         __u32 h = x >> 32;
> >         if (h)
> >                 return fls(h) + 32;
> >         return fls(x);
> > }
> >
> > > > However, the generic version of fls64 first tests the argument for zero.
> > > > From your code I derive that the count-leading-zeroes instruction for
> > > > argument zero is defined as cntlzl(0) == BITS_PER_LONG.
> > >
> > > That is correct.  If the argument is 0 then all of the zero bits are
> > > leading zeroes. :)
> >
> > So... for 64-bit powerpc it makes sense to have its own implementation
> > and ignore the (improved) generic one and for 32-bit powerpc the generic
> > implementation of fls64 is fine. The current situation in linux-next
> > seems optimal to me.
>
>
> Not so sure, the optimal version of fls64 for 32 bit PPC seems to be:
>
> 	cntlzw	ch,h ; ch = fls32(h) where h = x>>32
> 	cntlzw	cl,l ; cl = fls32(l) where l = (__u32)x
> 	srwi	t1,ch,5
> 	neg	t1,t1	; t1 = (h==0) ? -1 : 0
> 	and	cl,t1,cl ; cl = (h==0) ? cl : 0
> 	add	result,ch,cl
>
> That's only 6 instructions without any branch, although the dependency
> chain is 5 instructions long. Good luck getting the compiler to
> generate something as compact as this.

I should not have said the magic word "optimal", I guess ;). The code
you show would fit nicely as an arch-specific optimized version of
fls64 for 32-bit powerpc in include/asm-powerpc/bitops.h.
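
As a rough cross-check of that sequence, here is a plain C model of it.
This is only a sketch: cntlzw_model(), clz64_model() and fls64_model()
are names made up for this mail, not kernel functions, and the loop
merely stands in for the cntlzw instruction (which returns 32 for a
zero argument). As far as I can tell, the six instructions as written
add up to the number of leading zeroes of the 64-bit value, so an fls64
built on them would still need a final subtraction from 64.

```c
#include <stdint.h>

/* Hypothetical stand-in for cntlzw: leading zeroes, defined as 32 for 0. */
static inline uint32_t cntlzw_model(uint32_t v)
{
        uint32_t n = 0;

        if (v == 0)
                return 32;
        while (!(v & 0x80000000u)) {
                v <<= 1;
                n++;
        }
        return n;
}

/* C model of the six-instruction sequence: leading zeroes of x (0..64). */
static inline uint32_t clz64_model(uint64_t x)
{
        uint32_t h = (uint32_t)(x >> 32);
        uint32_t l = (uint32_t)x;
        uint32_t ch = cntlzw_model(h);  /* cntlzw ch,h */
        uint32_t cl = cntlzw_model(l);  /* cntlzw cl,l */
        uint32_t t1 = ch >> 5;          /* srwi: 1 only if ch == 32, i.e. h == 0 */

        t1 = 0u - t1;                   /* neg: all ones if h == 0, else 0 */
        cl &= t1;                       /* and: keep cl only if h == 0 */
        return ch + cl;                 /* add */
}

/* fls64 convention: 1-based index of the highest set bit, 0 for x == 0. */
static inline int fls64_model(uint64_t x)
{
        return 64 - (int)clz64_model(x);
}
```

With this model, fls64_model(0) is 0 and fls64_model(1ULL << 32) is 33,
which matches what the generic fls64 quoted above returns.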

Greetings,
    Alexander

(who is not going to write and test a patch with
powerpc inline assembly soon. srwi?)

> Don't worry about the number of cntlzw, it's one clock on all 32 bit
> PPC processors I know, some may even be able to perform 2 or 3 cntlzw
> per clock.
>
> 	Regards,
> 	Gabriel
>
-- 
  Alexander van Heukelum
  heukelum@fastmail.fm

-- 
http://www.fastmail.fm - Same, same, but different…
