From: Aurelien Jarno <aurelien@aurel32.net>
To: Stuart Brady <sdbrady@ntlworld.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] target-arm: use clz32() instead of a for loop
Date: Fri, 23 Oct 2009 16:38:52 +0200
Message-ID: <4AE1BFFC.8030509@aurel32.net>
In-Reply-To: <20091023124745.GA32401@miranda.arrow>
Stuart Brady wrote:
> On Fri, Oct 23, 2009 at 09:04:53AM +0200, Aurelien Jarno wrote:
>> Stuart Brady wrote:
>>> Just a quick note that the implementation of clz, ctz and popcnt is
>>> still listed in the TCG TODO list. The last time I looked, I noticed
>>> that quite a few architectures have clz/ctz instructions:
>>>
>>> http://lkml.indiana.edu/hypermail/linux/kernel/0601.3/1683.html
>> OTOH, a dump shows that those instructions are not used that often, so I
>> am not sure it is worth implementing them.
>
> Really? I'm surprised, as I gather that optimised ffs/fls/hweight
> functions in the kernel do give a modest gain... I suppose I'll have
> to try it on several different targets and see! :-)
I had a quick look at MIPS, and at least there, it is used often.
>>> For those that don't, I think a combination of the following two hacks at
>>> http://graphics.stanford.edu/~seander/bithacks.html could be used:
>> The best is probably to use a helper in that case, calling clz32(x).
>
> Yes, you're right.
>
> There are several other places that should also call clz32()/ctz32().
> The ones that I can see are helper_neon_cls_s32() for ARM, helper_bsf()
> and helper_bsr() for X86, helper_ff1() for M68K. (I'm not sure about
> 'do_clz8' and 'do_clz16', though.)
>
> At some point, possibly next weekend, I'll submit patches to add clz
> and ctz helpers to tcg-runtime.c, and to convert Alpha, ARM, CRIS, M68K,
> MIPS, PowerPC and x86 (any others I've missed?) to use those helpers.
The main problem I see for a TCG implementation is the definition of
clz/ctz. Some targets define clz(0) or ctz(0) as returning 32, while
others leave the result undefined.
If we go for the lowest common denominator for the TCG op, that is
clz(0) = undefined, a test with brcond has to be added in the targets
using clz(0) = 32, and this is likely to cost more in slowdown than it
gains in speed.
If we go for clz(0) = 32, it means the test has to be implemented in
TCG, which might be complicated for some hosts.
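
To make the two options concrete, here is a minimal sketch in plain C
rather than in TCG ops. The names host_clz32_undef0(), guest_clz32() and
portable_clz32() are hypothetical and are not existing QEMU functions.
__builtin_clz() is the GCC/Clang builtin, whose result is undefined for
0, so it stands in for a host instruction with clz(0) = undefined. The
conditional in guest_clz32() plays the role of the extra brcond test.

```c
#include <stdint.h>

/* Hypothetical stand-in for a host primitive with clz(0) = undefined.
 * __builtin_clz() is a GCC/Clang builtin; its result for 0 is undefined. */
static inline int host_clz32_undef0(uint32_t x)
{
    return __builtin_clz(x);
}

/* Guest semantics clz(0) = 32 layered on top of the undefined-at-zero
 * primitive. The test on x is the extra branch discussed above. */
static inline int guest_clz32(uint32_t x)
{
    return x ? host_clz32_undef0(x) : 32;
}

/* Portable fallback for a host with no clz instruction at all:
 * a binary search over the leading bits, returning 32 for 0. */
static inline int portable_clz32(uint32_t x)
{
    int n = 0;

    if (x == 0) {
        return 32;
    }
    if (!(x & 0xFFFF0000u)) { n += 16; x <<= 16; }
    if (!(x & 0xFF000000u)) { n += 8;  x <<= 8;  }
    if (!(x & 0xF0000000u)) { n += 4;  x <<= 4;  }
    if (!(x & 0xC0000000u)) { n += 2;  x <<= 2;  }
    if (!(x & 0x80000000u)) { n += 1; }
    return n;
}
```

With clz(0) = undefined as the op definition, every target that needs 32
pays for the test in guest_clz32(). With clz(0) = 32 as the definition,
the same test (or something like portable_clz32()) moves into each host
backend that lacks a suitable instruction.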
--
Aurelien Jarno GPG: 1024D/F1BCDB73
aurelien@aurel32.net http://www.aurel32.net