Date: Fri, 23 Oct 2009 13:47:45 +0100
From: Stuart Brady
Subject: Re: [Qemu-devel] [PATCH] target-arm: use clz32() instead of a for loop
To: Aurelien Jarno
Cc: qemu-devel@nongnu.org

On Fri, Oct 23, 2009 at 09:04:53AM +0200, Aurelien Jarno wrote:
> Stuart Brady wrote:
> > Just a quick note that the implementation of clz, ctz and popcnt is
> > still listed in the TCG TODO list. The last time I looked, I noticed
> > that quite a few architectures have clz/ctz instructions:
> >
> > http://lkml.indiana.edu/hypermail/linux/kernel/0601.3/1683.html
>
> OTOH, a dump shows that those instruction are not used than often, so I
> am not sure it worth implementing it.

Really? I'm surprised, as I gather that optimised ffs/fls/hweight
functions in the kernel do give a modest gain...
I suppose I'll have to try it on several different targets and see! :-)

> > For those that don't, I think a combination the following two hacks at
> > http://graphics.stanford.edu/~seander/bithacks.html could be used:
>
> The best is probably to use an helper in that case, calling clz32(x).

Yes, you're right.

There are several other places that should also call clz32()/ctz32().
The ones that I can see are helper_neon_cls_s32() for ARM, helper_bsf()
and helper_bsr() for X86, helper_ff1() for M68K. (I'm not sure about
'do_clz8' and 'do_clz16', though.)

At some point, possibly next weekend, I'll submit patches to add clz and
ctz helpers to tcg-runtime.c, and to convert Alpha, ARM, CRIS, M68K,
MIPS, PowerPC and x86 (any others I've missed?) to use those helpers.

Cheers,
-- 
Stuart Brady
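
For illustration only, here is a minimal sketch of the kind of portable
fallback discussed above for hosts without clz/ctz instructions. The
thread does not say which two bit hacks were meant, so the pairing used
here (smearing the highest set bit downwards, then a parallel population
count) is an assumption. The names sketch_popcount32(), sketch_clz32()
and sketch_ctz32() are hypothetical and are not QEMU's actual helpers.
Both counting functions return 32 for a zero input, which is a choice
made for this sketch and not necessarily what any given target expects.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: count set bits using the parallel (SWAR) method. */
static uint32_t sketch_popcount32(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit sums */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
    x = (x + (x >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
    return (x * 0x01010101u) >> 24;                   /* add the four bytes */
}

/*
 * Hypothetical helper: count leading zeros.
 * Copy the highest set bit into every lower position, then subtract the
 * number of set bits from 32.  Returns 32 when x is 0.
 */
static uint32_t sketch_clz32(uint32_t x)
{
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return 32 - sketch_popcount32(x);
}

/*
 * Hypothetical helper: count trailing zeros.
 * (~x & (x - 1)) keeps exactly the bits below the lowest set bit, so
 * counting them gives the answer.  Returns 32 when x is 0.
 */
static uint32_t sketch_ctz32(uint32_t x)
{
    return sketch_popcount32(~x & (x - 1));
}

/* Small self-check covering the edge cases. */
static void sketch_selfcheck(void)
{
    assert(sketch_clz32(0) == 32);
    assert(sketch_clz32(1) == 31);
    assert(sketch_clz32(0x80000000u) == 0);
    assert(sketch_ctz32(0) == 32);
    assert(sketch_ctz32(8) == 3);
}
```

On hosts that do have the instructions, a compiler builtin would
normally be preferable; the zero-input case then needs separate
handling, since GCC documents __builtin_clz() and __builtin_ctz() as
undefined for 0.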