From: "Alexander van Heukelum"
To: "Joe Perches"
Cc: "dean gaudet", "Harvey Harrison", "Alexander van Heukelum", "Ingo Molnar", "Andi Kleen", "LKML"
Subject: Re: Alternative implementation of the generic __ffs
Date: Sat, 19 Apr 2008 22:26:34 +0200
Message-Id: <1208636794.26656.1248793071@webmail.messagingengine.com>
In-Reply-To: <1208629021.12388.25.camel@localhost>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, 19 Apr 2008 11:17:01 -0700, "Joe Perches" said:
> On Sat, 2008-04-19 at 14:10 +0200, Alexander van Heukelum wrote:
> > I've added that to the benchmark, which you can now find here:
> > http://heukelum.fastmail.fm/ffs/.

Thanks! Added the version you sent to the program and added the
results of the ARM processor to the page. More ideas welcome ;).

> retested on arm:
>
> $ gcc -Os -fomit-frame-pointer ffs.c
> $ ./a.out
> Original:    3170 tics, 8326 tics
> New:         4214 tics, 8793 tics
> Smallest:    4023 tics, 7733 tics
> Small const: 3442 tics, 6188 tics
> Empty loop:  1517 tics, 2243 tics
>
> $ gcc -O2 -fomit-frame-pointer ffs.c
> $ ./a.out
> Original:    3172 tics, 7832 tics
> New:         4805 tics, 8790 tics
> Smallest:    4405 tics, 7154 tics
> Small const: 3442 tics, 5612 tics
> Empty loop:  1516 tics, 2145 tics
>
> $ gcc -O3 -fomit-frame-pointer ffs.c
> $ ./a.out
> Original:    3080 tics, 7709 tics
> New:         4723 tics, 8656 tics
> Smallest:    4333 tics, 7121 tics
> Small const: 3379 tics, 5483 tics
> Empty loop:  1447 tics, 2016 tics
>
> > Testing the same with
> > "return x4 + x3 + x2 + x1 + x0;" as the last line would be
> > interesting too.
>
> Adding is slower:
>
> $ gcc -Os -fomit-frame-pointer ffs.c
> $ ./a.out
> Original:    3152 tics, 8310 tics
> New:         4214 tics, 8789 tics
> Smallest:    4024 tics, 7737 tics
> Small const: 3538 tics, 6295 tics
> Empty loop:  1517 tics, 2243 tics
>
> $ gcc -O2 -fomit-frame-pointer ffs.c
> $ ./a.out
> Original:    3184 tics, 7849 tics
> New:         4790 tics, 8814 tics
> Smallest:    4406 tics, 7161 tics
> Small const: 3538 tics, 5806 tics
> Empty loop:  1521 tics, 2153 tics
>
> $ gcc -O3 -fomit-frame-pointer ffs.c
> $ ./a.out
> Original:    3091 tics, 7694 tics
> New:         4718 tics, 8656 tics
> Smallest:    4333 tics, 7124 tics
> Small const: 3467 tics, 5687 tics
> Empty loop:  1445 tics, 2066 tics

-- 
  Alexander van Heukelum
  heukelum@fastmail.fm
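
For readers without ffs.c at hand: the source of the variant being timed
is not quoted in this message, so what follows is only a sketch of what a
last line of "return x4 + x3 + x2 + x1 + x0;" could belong to. It assumes
a 32-bit word, that the lowest set bit is isolated first, and that each
of x4..x0 is either zero or one distinct power of two. The function names
ffs_or and ffs_add and the mask constants are illustrative assumptions,
not taken from the benchmark. Under those assumptions the five partial
results never share a bit, so combining them with "|" or with "+" gives
the same value, and any timing difference comes from the generated code
rather than from the result.

```c
#include <stdint.h>

/* Hypothetical sketch: index of the lowest set bit of a 32-bit word,
 * partial results combined with OR. The result for x == 0 is not
 * meaningful, as with the kernel's __ffs. */
static inline unsigned ffs_or(uint32_t x)
{
	unsigned x4, x3, x2, x1, x0;

	x &= 0u - x;				/* keep only the lowest set bit */
	x4 = (x & 0xffff0000u) ? 16 : 0;	/* bit lies in the upper half */
	x3 = (x & 0xff00ff00u) ? 8 : 0;		/* upper byte of its halfword */
	x2 = (x & 0xf0f0f0f0u) ? 4 : 0;		/* upper nibble of its byte */
	x1 = (x & 0xccccccccu) ? 2 : 0;		/* upper pair of its nibble */
	x0 = (x & 0xaaaaaaaau) ? 1 : 0;		/* odd bit position */
	return x4 | x3 | x2 | x1 | x0;
}

/* Same computation, but the partial results are added. Each of x4..x0
 * is zero or a distinct power of two, so no carries can occur and the
 * sum equals the OR above. */
static inline unsigned ffs_add(uint32_t x)
{
	unsigned x4, x3, x2, x1, x0;

	x &= 0u - x;
	x4 = (x & 0xffff0000u) ? 16 : 0;
	x3 = (x & 0xff00ff00u) ? 8 : 0;
	x2 = (x & 0xf0f0f0f0u) ? 4 : 0;
	x1 = (x & 0xccccccccu) ? 2 : 0;
	x0 = (x & 0xaaaaaaaau) ? 1 : 0;
	return x4 + x3 + x2 + x1 + x0;
}
```

For example, with x = 8 only the 0xcccccccc and 0xaaaaaaaa tests hit,
giving 2 and 1, so both functions return 3.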