From: Aurelien Jarno <aurelien@aurel32.net>
To: Stuart Brady <sdbrady@ntlworld.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] target-arm: use clz32() instead of a for loop
Date: Fri, 23 Oct 2009 16:38:52 +0200
Message-ID: <4AE1BFFC.8030509@aurel32.net>
In-Reply-To: <20091023124745.GA32401@miranda.arrow>

Stuart Brady wrote:
> On Fri, Oct 23, 2009 at 09:04:53AM +0200, Aurelien Jarno wrote:
>> Stuart Brady wrote:
>>> Just a quick note that the implementation of clz, ctz and popcnt is
>>> still listed in the TCG TODO list.  The last time I looked, I noticed
>>> that quite a few architectures have clz/ctz instructions:
>>>
>>>    http://lkml.indiana.edu/hypermail/linux/kernel/0601.3/1683.html
>> OTOH, a dump shows that those instructions are not used that often, so
>> I am not sure it is worth implementing them.
> 
> Really?  I'm surprised, as I gather that optimised ffs/fls/hweight
> functions in the kernel do give a modest gain...  I suppose I'll have
> to try it on several different targets and see! :-)

I took a quick look at MIPS, and at least here, they are used often.

>>> For those that don't, I think a combination of the following two hacks at
>>> http://graphics.stanford.edu/~seander/bithacks.html could be used:
>> The best is probably to use a helper in that case, calling clz32(x).
> 
> Yes, you're right.
> 
> There are several other places that should also call clz32()/ctz32().
> The ones that I can see are helper_neon_cls_s32() for ARM, helper_bsf()
> and helper_bsr() for X86, helper_ff1() for M68K.  (I'm not sure about
> 'do_clz8' and 'do_clz16', though.)
> 
> At some point, possibly next weekend, I'll submit patches to add clz
> and ctz helpers to tcg-runtime.c, and to convert Alpha, ARM, CRIS, M68K,
> MIPS, PowerPC and x86 (any others I've missed?) to use those helpers.

The main problem I see for a TCG implementation is the definition of
clz/ctz. Some targets define clz(0) or ctz(0) as returning 32, while
others leave the result undefined.

If we go for the lowest common denominator for the TCG op, that is
clz(0) = undefined, a test with brcond has to be added in the targets
that define clz(0) = 32, and this is likely to cost more in slowdown
than the op gains in speed.

If we go for clz(0) = 32, the test has to be implemented in TCG itself,
which might be complicated for some hosts.
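
To illustrate the two definitions, here is a minimal sketch in plain C.
It is not code from QEMU: the function names clz32_undef0() and
clz32_zero_is_32() are hypothetical, and it assumes a GCC-compatible
compiler (for __builtin_clz, whose result is undefined for 0) and a
32-bit unsigned int.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical primitive with "clz(0) = undefined" semantics, modelled
 * on GCC's __builtin_clz(), whose result is undefined for 0.
 * Assumes unsigned int is 32 bits wide.
 */
static inline int clz32_undef0(uint32_t x)
{
    assert(x != 0);             /* caller must rule out the zero case */
    return __builtin_clz(x);
}

/*
 * Wrapper for targets that define clz(0) = 32. The extra comparison
 * here is the test that would otherwise be emitted with brcond.
 */
static inline int clz32_zero_is_32(uint32_t x)
{
    return x ? clz32_undef0(x) : 32;
}
```

A target with clz(0) = 32 would call clz32_zero_is_32() and pay for the
comparison on every use, whereas a target with an undefined zero case
could call clz32_undef0() directly.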

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

Thread overview: 6+ messages
2009-10-15 21:14 [Qemu-devel] [PATCH] target-arm: use clz32() instead of a for loop Aurelien Jarno
2009-10-18 14:21 ` Laurent Desnogues
2009-10-23  0:34 ` Stuart Brady
2009-10-23  7:04   ` Aurelien Jarno
2009-10-23 12:47     ` Stuart Brady
2009-10-23 14:38       ` Aurelien Jarno [this message]