Date: Fri, 20 Jun 2014 11:43:55 +0200
From: Aurelien Jarno
To: Paolo Bonzini
Cc: qemu-devel@nongnu.org, Corentin Chary, Richard Henderson
Subject: Re: [Qemu-devel] [PATCH] bitops: provide an inline implementation of find_first_bit
Message-ID: <20140620094355.GA12011@ohm.rr44.fr>
In-Reply-To: <53A3F7B7.2030206@redhat.com>
References: <1387711957-703-1-git-send-email-aurelien@aurel32.net> <53A2A102.4060807@redhat.com> <20140620084806.GB13901@ohm.rr44.fr> <53A3F7B7.2030206@redhat.com>

On Fri, Jun 20, 2014 at 10:58:31AM +0200, Paolo Bonzini wrote:
> On 20/06/2014 10:48, Aurelien Jarno wrote:
> >In practice on x86_64, this function takes 27 instructions in the
> >general case, and 18 instructions in the fixed case, even for big
> >sizes. I therefore think that checking if the size is constant is a good
> >idea, but we should not make any test on the size itself and trust the
> >compiler to correctly decide if the loop should be unrolled or not.
>
> But if the size is large enough that the compiler will (likely) not
> unroll the function, then it should pay off to use the more
> optimized code in find_next_bit.
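For readers following the thread, here is a minimal sketch of the kind of simple inline loop being discussed, together with the "give me a free temp" usage pattern. It is not the code from the patch: the sketch_* names, SKETCH_MAX_TEMPS and the pool structure are hypothetical, and __builtin_ctzl assumes GCC or Clang. Because the function is inline, a call with a compile-time constant size leaves the unrolling decision to the compiler.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical sketch, not the patch itself.
 * Returns the index of the first set bit, or `size` if no bit is set. */
static inline unsigned long sketch_find_first_bit(const unsigned long *addr,
                                                  unsigned long size)
{
    unsigned long i;

    /* Plain word-by-word scan; with a constant `size` the compiler is
     * free to unroll this loop or not. */
    for (i = 0; i * BITS_PER_LONG < size; i++) {
        if (addr[i] != 0) {
            /* __builtin_ctzl (GCC/Clang) counts trailing zero bits. */
            unsigned long bit = i * BITS_PER_LONG
                                + (unsigned long)__builtin_ctzl(addr[i]);
            return bit < size ? bit : size;
        }
    }
    return size;
}

/* Hypothetical pool: a set bit means "this temp is free". */
#define SKETCH_MAX_TEMPS 512

struct sketch_temp_pool {
    unsigned long free_bits[BITS_TO_LONGS(SKETCH_MAX_TEMPS)];
};

/* "Give me a free temp": returns its index, or -1 if none is free. */
static inline int sketch_alloc_temp(struct sketch_temp_pool *p)
{
    unsigned long idx = sketch_find_first_bit(p->free_bits, SKETCH_MAX_TEMPS);
    unsigned long mask;

    if (idx == SKETCH_MAX_TEMPS) {
        return -1;
    }
    mask = 1UL << (idx % BITS_PER_LONG);
    assert(p->free_bits[idx / BITS_PER_LONG] & mask);
    p->free_bits[idx / BITS_PER_LONG] &= ~mask;   /* mark as in use */
    return (int)idx;
}
```

In this usage the first set bit is expected at a small index and the call is not made in a loop, which is the case where a short specialised scan should do well.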

The point is that find_next_bit is a generalized version of
find_first_bit, and is therefore actually slower. I originally noticed
this while profiling: the function appeared relatively high in the
profile for what it is supposed to do.

> This of course is unless you expect find_first_bit to return a small
> value and not be used in a loop; and dually expect find_next_bit's
> usage to be more like walking sparser bitmaps in a loop.

I think that's the point. In the TCG case, this is used to map the temp
allocation, to answer the question "give me a free temp". That said,
people might invent new usages.

> This actually makes sense, and then there's no need to change anything.
>
> Paolo
>

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net