Date: Sun, 28 Nov 2021 22:38:39 -0800
From: Yury Norov
To: mirq-test@rere.qmqm.pl
Subject: Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
Message-ID: <20211129063839.GA338729@lapt>
References: <20211128035704.270739-1-yury.norov@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
List-Id: Linux on PowerPC Developers Mail List
Cc: Juri Lelli, Andrew Lunn, "Rafael J. Wysocki", Catalin Marinas,
 Guo Ren, Christoph Lameter, Christoph Hellwig, Andi Kleen,
 Vincent Guittot, Ingo Molnar, Geert Uytterhoeven, Mel Gorman,
 Viresh Kumar, Petr Mladek, Arnaldo Carvalho de Melo, Jens Axboe,
 Andy Lutomirski, Lee Jones, Greg Kroah-Hartman, Randy Dunlap,
 linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 Sergey Senozhatsky, Thomas Gleixner, linux-crypto@vger.kernel.org,
 Tejun Heo, Andrew Morton, Mark Rutland, Anup Patel,
 linux-ia64@vger.kernel.org, David Airlie, Roy Pledge, Dave Hansen,
 Solomon Peachy, Stephen Rothwell, Krzysztof Kozlowski, Dennis Zhou,
 Matti Vaittinen, linux-alpha@vger.kernel.org, Kalle Valo, Stephen Boyd,
 Tariq Toukan, Dinh Nguyen, Jonathan Cameron, Ulf Hansson,
 Alexander Shishkin, Mike Marciniszyn, Rasmus Villemoes,
 Subbaraya Sundeep, Will Deacon, Sagi Grimberg,
 linux-csky@vger.kernel.org, bcm-kernel-feedback-list@broadcom.com,
 linux-arm-kernel@lists.infradead.org,
 linux-snps-arc@lists.infradead.org, Kees Cook, Yury Norov,
 "James E.J. Bottomley", Vineet Gupta, Steven Rostedt, Mark Gross,
 Borislav Petkov, Mauro Carvalho Chehab, Thomas Bogendoerfer,
 "Martin K. Petersen", David Laight, Sudeep Holla, Geetha sowjanya,
 Ian Rogers, kvm@vger.kernel.org, Peter Zijlstra, Amitkumar Karwar,
 linux-mm@kvack.org, linux-riscv@lists.infradead.org, Jiri Olsa,
 Ard Biesheuvel, Arnd Bergmann, Marc Zyngier, Russell King, Andy Gross,
 Jakub Kicinski, Vivien Didelot, Sunil Goutham, "Paul E. McKenney",
 linux-s390@vger.kernel.org, Alexey Klimov, Heiko Carstens,
 Hans de Goede, Nicholas Piggin, Marcin Wojtas, Vlastimil Babka,
 linuxppc-dev@lists.ozlabs.org, linux-mips@vger.kernel.org,
 Palmer Dabbelt, Daniel Vetter, Jason Wessel, Saeed Mahameed,
 Andy Shevchenko

On Sun, Nov 28, 2021 at 07:03:41PM +0100, mirq-test@rere.qmqm.pl wrote:
> On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote:
> > In many cases people use bitmap_weight()-based functions like this:
> >
> >         if (num_present_cpus() > 1)
> >                 do_something();
> >
> > This may take considerable amount of time on many-cpus machines because
> > num_present_cpus() will traverse every word of underlying cpumask
> > unconditionally.
> >
> > We can significantly improve on it for many real cases if stop traversing
> > the mask as soon as we count present cpus to any number greater than 1:
> >
> >         if (num_present_cpus_gt(1))
> >                 do_something();
> >
> > To implement this idea, the series adds bitmap_weight_{eq,gt,le}
> > functions together with corresponding wrappers in cpumask and nodemask.
>
> Having slept on it I have more structured thoughts:
>
> First, I like substituting bitmap_empty/full where possible - I think
> the change stands on its own, so could be split and sent as is.

Ok, I can do it.

> I don't like the proposed API very much. One problem is that it hides
> the comparison operator and makes call sites less readable:
>
>         bitmap_weight(...) > N
>
> becomes:
>
>         bitmap_weight_gt(..., N)
>
> and:
>         bitmap_weight(...) <= N
>
> becomes:
>
>         bitmap_weight_lt(..., N+1)
> or:
>         !bitmap_weight_gt(..., N)
>
> I'd rather see something resembling memcmp() API that's known enough
> to be easier to grasp. For above examples:
>
>         bitmap_weight_cmp(..., N) > 0
>         bitmap_weight_cmp(..., N) <= 0
>         ...

bitmap_weight_cmp() cannot be efficient.
Consider this example:

bitmap_weight_lt(1000 0000 0000 0000, 1) == false
                 ^
                 stop here

bitmap_weight_cmp(1000 0000 0000 0000, 1) == 0
                                    ^
                                    stop here

bitmap_weight_lt() can return as soon as it sees the first set bit,
while bitmap_weight_cmp() has to scan the whole bitmap before it can
report "equal".

I agree that '_gt' is less readable than '>', but the advantage of
'_gt' over '>' grows in proportion to the length of the bitmap, which
is why this API should exist.

> This would also make the implementation easier in not having to
> copy and paste the code three times. Could also use a simple
> optimization reducing code size:

In the next version I'll reduce code duplication like this:

bool bitmap_weight_eq(..., N);
bool bitmap_weight_ge(..., N);

#define bitmap_weight_gt(..., N)  bitmap_weight_ge(..., N + 1)
#define bitmap_weight_lt(..., N) !bitmap_weight_ge(..., N)
#define bitmap_weight_le(..., N) !bitmap_weight_gt(..., N)

Thanks,
Yury
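
For illustration only, here is a minimal userspace sketch of the scheme
above: two early-exit primitives plus three derived macros. It is not
the code from the series. Every sk_* name, the portable popcount helper
and the (map, nbits, num) argument order are assumptions made up for
this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the kernel API. */
#define SK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Portable population count: clears the lowest set bit per iteration. */
static unsigned int sk_popcount(unsigned long w)
{
	unsigned int n = 0;

	for (; w; w &= w - 1)
		n++;
	return n;
}

/* Weight of the partial last word, or 0 if nbits is word-aligned. */
static size_t sk_tail_weight(const unsigned long *map, size_t nbits)
{
	size_t rem = nbits % SK_BITS_PER_LONG;

	if (!rem)
		return 0;
	return sk_popcount(map[nbits / SK_BITS_PER_LONG] & ((1UL << rem) - 1));
}

/* True if at least num of the first nbits are set; stops once that is known. */
static bool sk_weight_ge(const unsigned long *map, size_t nbits, size_t num)
{
	size_t full = nbits / SK_BITS_PER_LONG, w = 0, i;

	if (num == 0)
		return true;
	for (i = 0; i < full; i++) {
		w += sk_popcount(map[i]);
		if (w >= num)
			return true;	/* early exit: answer is already known */
	}
	return w + sk_tail_weight(map, nbits) >= num;
}

/* True if exactly num bits are set; stops once the count exceeds num. */
static bool sk_weight_eq(const unsigned long *map, size_t nbits, size_t num)
{
	size_t full = nbits / SK_BITS_PER_LONG, w = 0, i;

	for (i = 0; i < full; i++) {
		w += sk_popcount(map[i]);
		if (w > num)
			return false;
	}
	return w + sk_tail_weight(map, nbits) == num;
}

/* Derived comparisons; num + 1 is assumed not to overflow size_t. */
#define sk_weight_gt(map, nbits, num) sk_weight_ge((map), (nbits), (num) + 1)
#define sk_weight_lt(map, nbits, num) (!sk_weight_ge((map), (nbits), (num)))
#define sk_weight_le(map, nbits, num) (!sk_weight_gt((map), (nbits), (num)))

/* Usage example: a four-word bitmap with a single bit set. */
static void sk_selftest(void)
{
	unsigned long map[4] = { 1UL, 0, 0, 0 };
	size_t nbits = 4 * SK_BITS_PER_LONG;

	assert(!sk_weight_lt(map, nbits, 1));	/* decided after word 0 */
	assert(sk_weight_eq(map, nbits, 1));	/* needs the full scan */
	assert(!sk_weight_gt(map, nbits, 1));
	assert(sk_weight_le(map, nbits, 1));
}
```

In this sketch only the _ge and _eq loops exist as real code. The other
three comparisons cost one macro each, and each one inherits the early
exit of sk_weight_ge().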