Date: Sat, 18 Nov 2023 18:40:43 +0000
Message-ID: <87cyw6ykes.wl-maz@kernel.org>
From: Marc Zyngier
To: Yury Norov
Cc: linux-kernel@vger.kernel.org, Will Deacon, Mark Rutland, linux-arm-kernel@lists.infradead.org, Jan Kara, Mirsad Todorovac, Matthew Wilcox, Rasmus Villemoes, Andy Shevchenko, Maxim Kuvyrkov, Alexey Klimov
Subject: Re: [PATCH 31/34] drivers/perf: optimize m1_pmu_get_event_idx() by using find_bit() API
In-Reply-To: <20231118155105.25678-32-yury.norov@gmail.com>
References: <20231118155105.25678-1-yury.norov@gmail.com> <20231118155105.25678-32-yury.norov@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, 18 Nov 2023 15:51:02 +0000,
Yury Norov wrote:
>
> The function searches used_mask for a bit in a for-loop bit by bit.
> We can do it faster by using atomic find_and_set_bit().

Sure, let's do things fast. Correctness is overrated anyway.

>
> The comment to the function says that it searches for the first free
> counter, but obviously for_each_set_bit() searches for the first set
> counter.

No it doesn't. It iterates over the counters the event can count on.

> The following test_and_set_bit() tries to enable already set
> bit, which is weird.

Maybe you could try to actually read the code?

>
> This patch, by using find_and_set_bit(), fixes this automatically.

This doesn't fix anything, but instead actively breaks the driver.

>
> Fixes: a639027a1be1 ("drivers/perf: Add Apple icestorm/firestorm CPU PMU driver")
> Signed-off-by: Yury Norov
> ---
>  drivers/perf/apple_m1_cpu_pmu.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/perf/apple_m1_cpu_pmu.c b/drivers/perf/apple_m1_cpu_pmu.c
> index cd2de44b61b9..2d50670ffb01 100644
> --- a/drivers/perf/apple_m1_cpu_pmu.c
> +++ b/drivers/perf/apple_m1_cpu_pmu.c
> @@ -447,12 +447,8 @@ static int m1_pmu_get_event_idx(struct pmu_hw_events *cpuc,
>  	 * counting on the PMU at any given time, and by placing the
>  	 * most constraining events first.
>  	 */
> -	for_each_set_bit(idx, &affinity, M1_PMU_NR_COUNTERS) {
> -		if (!test_and_set_bit(idx, cpuc->used_mask))
> -			return idx;
> -	}
> -
> -	return -EAGAIN;
> +	idx = find_and_set_bit(cpuc->used_mask, M1_PMU_NR_COUNTERS);
> +	return idx < M1_PMU_NR_COUNTERS ? idx : -EAGAIN;

So now you're picking any possible counter, irrespective of the
possible affinity of the event. This is great.

	M.

-- 
Without deviation from the norm, progress is not possible.
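
To make the affinity point concrete, here is a minimal userspace sketch
of the two allocation strategies. It is not the driver code: the names
get_idx_with_affinity(), get_idx_any_free() and NR_COUNTERS are
hypothetical, the bit operations are plain (non-atomic) C rather than
the kernel's bitops, and the counter count is only a stand-in for
M1_PMU_NR_COUNTERS.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for M1_PMU_NR_COUNTERS. */
#define NR_COUNTERS 10

/*
 * Pre-patch behaviour: walk only the counters named in the event's
 * affinity mask and claim the first one that is still free.
 */
static int get_idx_with_affinity(unsigned long *used_mask,
				 unsigned long affinity)
{
	assert(used_mask);

	for (int idx = 0; idx < NR_COUNTERS; idx++) {
		if (!(affinity & (1UL << idx)))
			continue;	/* event cannot count on this counter */
		if (!(*used_mask & (1UL << idx))) {
			*used_mask |= 1UL << idx;	/* claim it */
			return idx;
		}
	}

	return -EAGAIN;
}

/*
 * Post-patch behaviour: claim the first free counter in used_mask,
 * without looking at the affinity mask at all.
 */
static int get_idx_any_free(unsigned long *used_mask)
{
	assert(used_mask);

	for (int idx = 0; idx < NR_COUNTERS; idx++) {
		if (!(*used_mask & (1UL << idx))) {
			*used_mask |= 1UL << idx;
			return idx;
		}
	}

	return -EAGAIN;
}
```

With an empty used_mask and an event restricted to counters 5 and 7 (an
arbitrary mask chosen for illustration), the first function returns 5,
while the second returns 0, a counter the event cannot count on.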