Date: Wed, 1 Dec 2021 16:31:40 -0800
From: Yury Norov
To: Michał Mirosław
Subject: Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
Message-ID: <20211202003140.GA430494@lapt>
References: <20211128035704.270739-1-yury.norov@gmail.com>
 <20211129063839.GA338729@lapt>
 <3CD9ECD8-901E-497B-9AE1-0DDB02346892@rere.qmqm.pl>
In-Reply-To: <3CD9ECD8-901E-497B-9AE1-0DDB02346892@rere.qmqm.pl>
List-Id: Linux on PowerPC Developers Mail List
Cc: Juri Lelli, Andrew Lunn, "Rafael J. Wysocki", Catalin Marinas,
 Guo Ren, Christoph Lameter, Christoph Hellwig, Andi Kleen,
 Vincent Guittot, Ingo Molnar, Geert Uytterhoeven, Mel Gorman,
 Viresh Kumar, Petr Mladek, Arnaldo Carvalho de Melo, Jens Axboe,
 Andy Lutomirski, Thomas Gleixner, Greg Kroah-Hartman, Randy Dunlap,
 linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 Sergey Senozhatsky, linux-crypto@vger.kernel.org, Tejun Heo,
 Andrew Morton, Mark Rutland, Anup Patel, linux-ia64@vger.kernel.org,
 David Airlie, Roy Pledge, Dave Hansen, Solomon Peachy,
 Stephen Rothwell, Krzysztof Kozlowski, Dennis Zhou, Matti Vaittinen,
 linux-alpha@vger.kernel.org, Kalle Valo, Stephen Boyd, Tariq Toukan,
 Dinh Nguyen, Jonathan Cameron, Ulf Hansson, Alexander Shishkin,
 Mike Marciniszyn, Rasmus Villemoes, Subbaraya Sundeep, Will Deacon,
 Sagi Grimberg, linux-csky@vger.kernel.org,
 bcm-kernel-feedback-list@broadcom.com,
 linux-arm-kernel@lists.infradead.org,
 linux-snps-arc@lists.infradead.org, Kees Cook, Arnd Bergmann,
 "James E.J. Bottomley", Vineet Gupta, Steven Rostedt, Mark Gross,
 Borislav Petkov, Mauro Carvalho Chehab, Thomas Bogendoerfer,
 "Martin K. Petersen", David Laight, Sudeep Holla, Geetha sowjanya,
 Ian Rogers, kvm@vger.kernel.org, Peter Zijlstra, Amitkumar Karwar,
 linux-mm@kvack.org, linux-riscv@lists.infradead.org, Lee Jones,
 Ard Biesheuvel, Marc Zyngier, Jiri Olsa, Russell King, Andy Gross,
 Jakub Kicinski, Vivien Didelot, Sunil Goutham, "Paul E.
 McKenney", linux-s390@vger.kernel.org, Alexey Klimov, Heiko Carstens,
 Hans de Goede, Nicholas Piggin, Marcin Wojtas, Vlastimil Babka,
 linuxppc-dev@lists.ozlabs.org, linux-mips@vger.kernel.org,
 Palmer Dabbelt, Daniel Vetter, Jason Wessel, Saeed Mahameed,
 Andy Shevchenko

On Mon, Nov 29, 2021 at 04:34:07PM +0000, Michał Mirosław wrote:
> On 29 November 2021 06:38:39 UTC, Yury Norov wrote:
> >On Sun, Nov 28, 2021 at 07:03:41PM +0100, mirq-test@rere.qmqm.pl wrote:
> >> On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote:
> >> > In many cases people use bitmap_weight()-based functions like this:
> >> >
> >> >         if (num_present_cpus() > 1)
> >> >                 do_something();
> >> >
> >> > This may take considerable amount of time on many-cpus machines because
> >> > num_present_cpus() will traverse every word of underlying cpumask
> >> > unconditionally.
> >> >
> >> > We can significantly improve on it for many real cases if stop traversing
> >> > the mask as soon as we count present cpus to any number greater than 1:
> >> >
> >> >         if (num_present_cpus_gt(1))
> >> >                 do_something();
> >> >
> >> > To implement this idea, the series adds bitmap_weight_{eq,gt,le}
> >> > functions together with corresponding wrappers in cpumask and nodemask.
> >>
> >> Having slept on it I have more structured thoughts:
> >>
> >> First, I like substituting bitmap_empty/full where possible - I think
> >> the change stands on its own, so could be split and sent as is.
> >
> >Ok, I can do it.
> >
> >> I don't like the proposed API very much. One problem is that it hides
> >> the comparison operator and makes call sites less readable:
> >>
> >>         bitmap_weight(...) > N
> >>
> >> becomes:
> >>
> >>         bitmap_weight_gt(..., N)
> >>
> >> and:
> >>         bitmap_weight(...) <= N
> >>
> >> becomes:
> >>
> >>         bitmap_weight_lt(..., N+1)
> >> or:
> >>         !bitmap_weight_gt(..., N)
> >>
> >> I'd rather see something resembling memcmp() API that's known enough
> >> to be easier to grasp. For above examples:
> >>
> >>         bitmap_weight_cmp(..., N) > 0
> >>         bitmap_weight_cmp(..., N) <= 0
> >>         ...
> >
> >bitmap_weight_cmp() cannot be efficient. Consider this example:
> >
> >bitmap_weight_lt(1000 0000 0000 0000, 1) == false
> >                 ^
> >                 stop here
> >
> >bitmap_weight_cmp(1000 0000 0000 0000, 1) == 0
> >                                    ^
> >                                    stop here
> >
> >I agree that '_gt' is less verbose than '>', but the advantage of
> >'_gt' over '>' is proportional to length of bitmap, and it means
> >that this API should exist.
>
> Thank you for the example. Indeed, for less-than to be efficient here you would need to replace
>         bitmap_weight_cmp(..., N) < 0
> with
>         bitmap_weight_cmp(..., N-1) <= 0

Indeed, thanks for pointing that out.

> It would still be more readable, I think.

To be honest, I'm not sure that
        bitmap_weight_cmp(..., N-1) <= 0
would be an obvious replacement for the original
        bitmap_weight(...) < N
compared to
        bitmap_weight_lt(..., N)

I think the best thing I can do is to add bitmap_weight_cmp() as
you suggested, and turn lt and the others into wrappers around it.
This will let people choose the better function in each case.

I also think that for v2 it would be better to drop the conversion
for short bitmaps, except for switching to bitmap_empty(), because
in that case readability wins over performance, if there are no
objections.

Thanks,
Yury
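
For illustration only, the direction discussed above (a memcmp()-style
bitmap_weight_cmp() with lt/gt as thin wrappers) could look roughly like
the userspace sketch below. It is not the code from the series: the
signatures, the popcount_long() helper and the local BITS_PER_LONG
definition are assumptions made so that the example builds outside the
kernel. The point it shows is that the walk can stop as soon as the
running count exceeds the threshold, and that "less than N" maps onto
"cmp(N-1) <= 0" so it keeps that early exit.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for the kernel macro (assumption for userspace). */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Portable popcount; a hypothetical helper, the kernel has hweight_long(). */
static unsigned int popcount_long(unsigned long w)
{
	unsigned int n = 0;

	for (; w; w &= w - 1)	/* clear the lowest set bit each round */
		n++;
	return n;
}

/*
 * Hypothetical bitmap_weight_cmp(): compare the number of set bits in the
 * first nbits of bitmap against num, memcmp()-style.
 * Returns >0 if weight > num, 0 if equal, <0 if weight < num.
 * The loop returns early once the running count exceeds num.
 */
static int bitmap_weight_cmp(const unsigned long *bitmap,
			     unsigned int nbits, unsigned int num)
{
	size_t full = nbits / BITS_PER_LONG;	/* whole words */
	size_t rem = nbits % BITS_PER_LONG;	/* bits in the last word */
	unsigned int w = 0;
	size_t i;

	for (i = 0; i < full; i++) {
		w += popcount_long(bitmap[i]);
		if (w > num)
			return 1;
	}

	if (rem) {
		/* Mask off bits beyond nbits in the partial word. */
		w += popcount_long(bitmap[full] & ((1UL << rem) - 1));
		if (w > num)
			return 1;
	}

	return w == num ? 0 : -1;
}

/* weight > num */
static bool bitmap_weight_gt(const unsigned long *bitmap,
			     unsigned int nbits, unsigned int num)
{
	return bitmap_weight_cmp(bitmap, nbits, num) > 0;
}

/* weight < num, expressed as weight <= num - 1 to keep the early exit */
static bool bitmap_weight_lt(const unsigned long *bitmap,
			     unsigned int nbits, unsigned int num)
{
	if (num == 0)
		return false;	/* a weight is never negative */
	return bitmap_weight_cmp(bitmap, nbits, num - 1) <= 0;
}
```

With wrappers like these, a call site can keep the short
bitmap_weight_lt(..., N) spelling while the N-1 adjustment stays in one
place.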