Date: Tue, 12 May 2026 12:34:20 -0400
From: Yury Norov
To: Yi Sun
Cc: yury.norov@gmail.com, akpm@linux-foundation.org, mina86@mina86.com, akinobu.mita@gmail.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] Improve the performance of bitmap_find_next_zero_area_off()
References: <20260512040659.2992142-1-yi.sun@unisoc.com>
In-Reply-To: <20260512040659.2992142-1-yi.sun@unisoc.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 12, 2026 at 12:06:57PM +0800, Yi Sun wrote:
> Replacing find_next_bit() with find_last_bit_range()
> can improve performance by an average of 50%.
>
> ===========
>
> Test result:
>        cnt   old_a_cnt  new_a_cnt  cnt_ratio  old_time(ns)  new_time(ns)  time_ratio
> test1  8     71         34         52.1%      51357         25019         51.3%
> test2  8     1          1          0%         1150          1153          around 0%
>
> test1  32    81925      10402      87.3%      23103730      2910315       87.4%
> test2  32    1          1          0%         434           434           around 0%
>
> test1  128   82166      2572       96.9%      23054634      731453        96.8%
> test2  128   1          1          0%         434           438           around 0%
>
> test1  1024  81620      321        99.6%      23035192      234330        99%
> test2  1024  14         7          50%        4257          2257          47%
>
> test1  4096  80923      81         99.9%      22700265      57861         99.7%
> test2  4096  648        92         85.8%      192854        27177         85.9%
>
> ============
>
> Test result explanation:
> @test1: The bitmap is filled with random numbers,
> so the bitmap is very messy.
> @test2: Sparse bitmap.
>
> @cnt: The expected number of consecutive clear bits.
>
> @old_a_cnt: Total number of "goto again" when
> using find_next_bit().
> @new_a_cnt: Total number of "goto again" when
> using find_last_bit_range().
> Finding @cnt consecutive clear bits in the bitmap
> may require multiple attempts.
> The number of repetitions should be recorded.
> @cnt_ratio = (old_a_cnt - new_a_cnt) / old_a_cnt.
>
> @old_time(ns): The total time consumed by
> bitmap_find_next_zero_area_off() when
> using find_next_bit().
> @new_time(ns): The total time consumed by
> bitmap_find_next_zero_area_off() when
> using find_last_bit_range().
> @time_ratio = (old_time - new_time) / old_time.
>
> ==============
>
> Test case(refer to lib/find_bit_benchmark.c):
>
> define BITMAP_LEN (4096UL * 8 * 10)
> define SPARSE 500
> static DECLARE_BITMAP(bitmap, BITMAP_LEN);
>
> static void test_main()
> {
> 	unsigned long nbits = BITMAP_LEN / SPARSE;
>
> 	//test1
> 	get_random_bytes(bitmap, sizeof(bitmap));
> 	__test_all();
>
> 	//test2
> 	bitmap_zero(bitmap, BITMAP_LEN);
> 	while (nbits--)
> 		__set_bit(get_random_u32_below(BITMAP_LEN), bitmap);
> 	__test_all();
> }
>
> static void __test_all()
> {
> 	//Expected number of consecutive clear bits.
> 	u32 cnt = 8;
>
> 	//Ignore the results of this test.
> 	__test_new(cnt);
>
> 	//To mitigate the impact of caching,
> 	//we will use the results of this test.
> 	__test_new(cnt);
>
> 	//Ignore the results of this test.
> 	__test_old(cnt);
>
> 	//To mitigate the impact of caching,
> 	//we will use the results of this test.
> 	__test_old(cnt);
> }
>
> //Add time-consuming statistics to bitmap_find_next_zero_area_off().
> static ktime_t __test_old/__test_new(u32 nr)
> {
> 	unsigned long *map = bitmap;
> 	unsigned long size = BITMAP_LEN;
> 	unsigned long start = 0;
> 	unsigned long align_mask = 0;
> 	unsigned long align_offset = 0;
>
> 	unsigned long index, end, i, again_cnt = 0;
> 	//Here add time-consuming statistics.
> 	ktime_t time = ktime_get();
>
> again:
> 	again_cnt++;
> 	index = find_next_zero_bit(map, size, start);
> 	/* Align allocation */
> 	index = __ALIGN_MASK(index +
> 		align_offset, align_mask) - align_offset;
> 	end = index + nr;
> 	if (end > size) {
> 		//Here add time-consuming statistics.
> 		time = ktime_get() - time;
> 		return time;
> 	}
>
> 	//__test_old() use this.
> 	i = find_next_bit(map, end, index);
>
> 	//__test_new() use this.
> 	i = find_last_bit_range(map, end, index);
>
> 	if (i < end) {
> 		start = i + 1;
> 		goto again;
> 	}
>
> 	//Here add time-consuming statistics.
> 	time = ktime_get() - time;
> 	return time;
> }

Please check lib/find_bit_benchmark.c and extend it with your scenario.
Please make sure the results are printed and that everything is aligned
with the existing format.

> Yi Sun (2):
>   lib: bitmap: add find_last_bit_range() and _find_last_bit_range()
>   lib: bitmap: reduce the number of goto again in
>     bitmap_find_next_zero_area_off()
>
>  include/linux/find.h | 35 +++++++++++++++++++++++++++++++++++
>  lib/bitmap.c         |  2 +-
>  lib/find_bit.c       | 30 ++++++++++++++++++++++++++++++
>  3 files changed, 66 insertions(+), 1 deletion(-)
>
> --
> 2.34.1
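
For readers following the thread, here is a rough userspace model of the
retry loop from the quoted test case. It is only a sketch, not kernel code,
and it is no substitute for extending lib/find_bit_benchmark.c. All sketch_*
names are hypothetical. The behavior of find_last_bit_range() is assumed from
how the quoted test uses its return value: the last set bit in [start, size),
or size if there is none. Alignment is left out, since align_mask is 0 in the
quoted test, and the helpers scan one bit at a time rather than word by word.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Naive userspace stand-ins for the kernel bit helpers. */
static int sketch_test_bit(const unsigned long *map, unsigned long nr)
{
	return (map[nr / SKETCH_BITS_PER_LONG] >> (nr % SKETCH_BITS_PER_LONG)) & 1UL;
}

static void sketch_set_bit(unsigned long *map, unsigned long nr)
{
	map[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* First clear bit in [start, size), or size if there is none. */
static unsigned long sketch_next_zero_bit(const unsigned long *map,
					  unsigned long size, unsigned long start)
{
	while (start < size && sketch_test_bit(map, start))
		start++;
	return start;
}

/* First set bit in [start, size), or size if there is none. */
static unsigned long sketch_next_bit(const unsigned long *map,
				     unsigned long size, unsigned long start)
{
	while (start < size && !sketch_test_bit(map, start))
		start++;
	return start;
}

/* ASSUMED semantics: last set bit in [start, size), or size if there is none. */
static unsigned long sketch_last_bit_range(const unsigned long *map,
					   unsigned long size, unsigned long start)
{
	unsigned long i = size;

	while (i > start) {
		i--;
		if (sketch_test_bit(map, i))
			return i;
	}
	return size;
}

/*
 * Find nr consecutive clear bits in [0, size). Returns the start index, or
 * size if no such run exists. *again_cnt receives the number of passes,
 * matching the "again_cnt" counter in the quoted test.
 */
static unsigned long sketch_find_zero_area(const unsigned long *map,
					   unsigned long size, unsigned long nr,
					   int use_last, unsigned long *again_cnt)
{
	unsigned long start = 0, index, end, i;

	*again_cnt = 0;
	for (;;) {
		(*again_cnt)++;
		index = sketch_next_zero_bit(map, size, start);
		end = index + nr;
		if (end > size)
			return size;

		/* Old: stop at the first set bit. New: skip past the last one. */
		i = use_last ? sketch_last_bit_range(map, end, index)
			     : sketch_next_bit(map, end, index);
		if (i >= end)
			return index;
		start = i + 1;
	}
}
```

With bits 1, 3, 5 and 7 set and nr = 4, both variants find the run starting
at bit 8, but restarting after the last set bit in the window takes fewer
passes than restarting after the first one.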