From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S261283AbTD3IZK (ORCPT ); Wed, 30 Apr 2003 04:25:10 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S261706AbTD3IZJ (ORCPT ); Wed, 30 Apr 2003 04:25:09 -0400 Received: from vitelus.com ([64.81.243.207]:63239 "EHLO vitelus.com") by vger.kernel.org with ESMTP id S261283AbTD3IZJ (ORCPT ); Wed, 30 Apr 2003 04:25:09 -0400 Date: Wed, 30 Apr 2003 01:36:07 -0700 From: Aaron Lehmann To: Willy Tarreau Cc: Daniel Phillips , Linus Torvalds , linux-kernel@vger.kernel.org Subject: Re: [RFC][PATCH] Faster generic_fls Message-ID: <20030430083607.GA6035@vitelus.com> References: <200304300446.24330.dphillips@sistina.com> <20030430071907.GA30999@alpha.home.local> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030430071907.GA30999@alpha.home.local> User-Agent: Mutt/1.5.4i Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Apr 30, 2003 at 09:19:07AM +0200, Willy Tarreau wrote: > But gcc's optimizer seems to be even worse with each new version. The NIL > checksum takes 3 times more to complete on gcc3/athlon than gcc2/athlon !!! > And on some test, the P4 goes faster when the code is optimized for athlon > than for P4 ! It would be interesting to compare the assembly output of the compilers to see what's happening. The gcc folks appreciate test cases and this function seems simple and sensitive enough to make a good one. > ===== gcc-2.95.3 -march=i686 -O2 / athlon MP/2200 (1.8 GHz) ===== -march=athlon would give gcc a chance to make better scheduling decisions, which could make the results more sensible. > ==== gcc 3.2.3 -march=i686 -O3 / Pentium 4 / 2.0 GHz ==== This version of gcc accepts -march=pentium4, which again might make gcc's behavior less bizarre.