From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 35D06C43214 for ; Thu, 26 Aug 2021 13:57:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1F59F60F35 for ; Thu, 26 Aug 2021 13:57:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242755AbhHZN6R (ORCPT ); Thu, 26 Aug 2021 09:58:17 -0400 Received: from smtp-out1.suse.de ([195.135.220.28]:38630 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242768AbhHZN6K (ORCPT ); Thu, 26 Aug 2021 09:58:10 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id C66CF2230B; Thu, 26 Aug 2021 13:57:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1629986237; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=EnhMvs9n1nhsU1o7+xe9+Bo6hjRyCva0kdWrwKPRW08=; b=GR//V0SiIEZKLC6w/96Ri/cMSee15eFRF7D1Vw13XhIHov2iDzsBhkCDnzKVLSJpFK3HsG 8kGlTGpPk7GQg4S3H0Q3HtwEcMPUwRcy6z3BgXTc4to++5I4qh2Z3tVt7aaH90iUtXrGB+ XxO2hZXNgRy0t1NpHNzYylTUjcl8whY= Received: from suse.cz (unknown [10.100.224.162]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 7E2C5A3B99; Thu, 26 Aug 2021 13:57:14 +0000 (UTC) Date: Thu, 26 Aug 2021 15:57:13 +0200 From: Petr Mladek To: Yury Norov Cc: Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mmc@vger.kernel.org, linux-perf-users@vger.kernel.org, kvm@vger.kernel.org, "James E.J. Bottomley" , Alexander Lobakin , Alexander Shishkin , Alexey Klimov , Andrea Merello , Andy Shevchenko , Arnaldo Carvalho de Melo , Arnd Bergmann , Ben Gardon , Benjamin Herrenschmidt , Brian Cain , Catalin Marinas , Christoph Lameter , Daniel Bristot de Oliveira , David Hildenbrand , Dennis Zhou , Geert Uytterhoeven , Heiko Carstens , Ian Rogers , Ingo Molnar , Jaegeuk Kim , Jakub Kicinski , Jiri Olsa , Joe Perches , Jonas Bonn , Leo Yan , Mark Rutland , Namhyung Kim , Palmer Dabbelt , Paolo Bonzini , Peter Xu , Peter Zijlstra , Rasmus Villemoes , Rich Felker , Samuel Mendoza-Jonas , Sean Christopherson , Sergey Senozhatsky , Shuah Khan , Stefan Kristiansson , Steven Rostedt , Tejun Heo , Thomas Bogendoerfer , Ulf Hansson , Will Deacon , Wolfram Sang , Yoshinori Sato Subject: Re: [PATCH 11/17] find: micro-optimize for_each_{set,clear}_bit() Message-ID: References: <20210814211713.180533-1-yury.norov@gmail.com> <20210814211713.180533-12-yury.norov@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210814211713.180533-12-yury.norov@gmail.com> Precedence: bulk List-ID: X-Mailing-List: linux-arch@vger.kernel.org On Sat 2021-08-14 14:17:07, Yury Norov wrote: > The macros iterate thru all set/clear bits in a bitmap. They search a > first bit using find_first_bit(), and the rest bits using find_next_bit(). > > Since find_next_bit() is called shortly after find_first_bit(), we can > save few lines of I-cache by not using find_first_bit(). Is this only a speculation or does it fix a real performance problem? The macro is used like: for_each_set_bit(bit, addr, size) { fn(bit); } IMHO, the micro-opimization does not help when fn() is non-trivial. > --- a/include/linux/find.h > +++ b/include/linux/find.h > @@ -280,7 +280,7 @@ unsigned long find_next_bit_le(const void *addr, unsigned > #endif > > #define for_each_set_bit(bit, addr, size) \ > - for ((bit) = find_first_bit((addr), (size)); \ > + for ((bit) = find_next_bit((addr), (size), 0); \ > (bit) < (size); \ > (bit) = find_next_bit((addr), (size), (bit) + 1)) > It is not a big deal. I just think that the original code is slightly more self-explaining. Best Regards, Petr