* [PATCH v7 0/2] Improve the performance of bitmap_find_next_zero_area_off() @ 2026-06-22 3:00 Yi Sun 2026-06-22 3:00 ` [PATCH v7 1/2] lib: bitmap: add tests for bitmap_find_next_zero_area_off() Yi Sun 2026-06-22 3:00 ` [PATCH v7 2/2] lib: bitmap: optimize bitmap_find_next_zero_area_off() Yi Sun 0 siblings, 2 replies; 8+ messages in thread From: Yi Sun @ 2026-06-22 3:00 UTC (permalink / raw) To: yury.norov Cc: yi.sun, 279644543, mina86, mnazarewicz, akpm, akinobu.mita, linux-kernel, tjmercier, fvdl, tglx, song, hch, minchan, jstultz Use lib/find_bit_benchmark.c for testing. Test results showed a 40% improvement in dense cases, and almost no change in sparse cases. Performance test results on my hardware: before(ns) after(ns) change p-value dense 1211 688 -43.2% 8.3e-11 sparse 13.3 13.4 0.8% 0.27 Version 7 only changed the alignment format of the test code. --- v6: https://lore.kernel.org/all/20260618133518.4079981-1-yi.sun@unisoc.com - Improve test code. - Change infinite for loop to for_each_clear_bit_from. v5: https://lore.kernel.org/all/20260618015252.3601554-1-yi.sun@unisoc.com - Improve test code. - Change the "goto again" code structure to a for loop. v4: https://lore.kernel.org/all/20260601094234.103863-1-yi.sun@unisoc.com - Test code has been added to PATCH v2. v3: https://lore.kernel.org/all/20260514090607.231387-1-yi.sun@unisoc.com - Code optimization was performed on PATCH v1. v2: https://lore.kernel.org/all/20260514035644.4118050-1-yi.sun@unisoc.com - Do not introduce find_last_bit_from(). v1: https://lore.kernel.org/all/20260512040659.2992142-1-yi.sun@unisoc.com Yi Sun (2): lib: bitmap: add tests for bitmap_find_next_zero_area_off() lib: bitmap: optimize bitmap_find_next_zero_area_off() lib/bitmap.c | 31 ++++++++++++++++--------------- lib/find_bit_benchmark.c | 17 +++++++++++++++++ lib/test_bitmap.c | 38 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 71 insertions(+), 15 deletions(-) -- 2.34.1 ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v7 1/2] lib: bitmap: add tests for bitmap_find_next_zero_area_off() 2026-06-22 3:00 [PATCH v7 0/2] Improve the performance of bitmap_find_next_zero_area_off() Yi Sun @ 2026-06-22 3:00 ` Yi Sun 2026-06-22 10:05 ` Michał Nazarewicz 2026-06-22 3:00 ` [PATCH v7 2/2] lib: bitmap: optimize bitmap_find_next_zero_area_off() Yi Sun 1 sibling, 1 reply; 8+ messages in thread From: Yi Sun @ 2026-06-22 3:00 UTC (permalink / raw) To: yury.norov Cc: yi.sun, 279644543, mina86, mnazarewicz, akpm, akinobu.mita, linux-kernel, tjmercier, fvdl, tglx, song, hch, minchan, jstultz Add functional and performance tests for bitmap_find_next_zero_area_off(). performance tests partial output: Start testing find_bit() with random-filled bitmap [ 0.310073] bitmap_find_next_zero_area_off: 852731 ns, 1154 iterations [ 0.311435] find_next_bit: 1356654 ns, 163975 iterations Start testing find_bit() with sparse bitmap [ 0.316267] bitmap_find_next_zero_area_off:4426808 ns, 322479 iterations [ 0.316292] find_next_bit: 15154 ns, 656 iterations Signed-off-by: Yi Sun <yi.sun@unisoc.com> --- lib/find_bit_benchmark.c | 17 +++++++++++++++++ lib/test_bitmap.c | 38 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 55 insertions(+) diff --git a/lib/find_bit_benchmark.c b/lib/find_bit_benchmark.c index 00d9dc61cd46..05305e655f99 100644 --- a/lib/find_bit_benchmark.c +++ b/lib/find_bit_benchmark.c @@ -149,6 +149,21 @@ static int __init test_find_next_and_bit(const void *bitmap, return 0; } +static int __init +test_bitmap_find_next_zero_area_off(unsigned long *bitmap, unsigned long len) +{ + unsigned long i, cnt; + ktime_t time; + + time = ktime_get(); + for (cnt = i = 0; i < BITMAP_LEN; cnt++) + i = bitmap_find_next_zero_area_off(bitmap, BITMAP_LEN, i, 8, 0, 0) + 1; + time = ktime_get() - time; + pr_err("bitmap_find_next_zero_area_off:%7llu ns, %6ld iterations\n", time, cnt); + + return 0; +} + static int __init find_bit_test(void) { unsigned long nbits = BITMAP_LEN / SPARSE; @@ -158,6 +173,7 @@ static int __init find_bit_test(void) get_random_bytes(bitmap, sizeof(bitmap)); get_random_bytes(bitmap2, sizeof(bitmap2)); + test_bitmap_find_next_zero_area_off(bitmap, BITMAP_LEN); test_find_next_bit(bitmap, BITMAP_LEN); test_find_next_zero_bit(bitmap, BITMAP_LEN); test_find_last_bit(bitmap, BITMAP_LEN); @@ -181,6 +197,7 @@ static int __init find_bit_test(void) __set_bit(get_random_u32_below(BITMAP_LEN), bitmap2); } + test_bitmap_find_next_zero_area_off(bitmap, BITMAP_LEN); test_find_next_bit(bitmap, BITMAP_LEN); test_find_next_zero_bit(bitmap, BITMAP_LEN); test_find_last_bit(bitmap, BITMAP_LEN); diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c index 69813c10e6c0..8665df77c960 100644 --- a/lib/test_bitmap.c +++ b/lib/test_bitmap.c @@ -234,6 +234,43 @@ static void __init test_find_nth_bit(void) } } +static void __init +test_bitmap_find_next_zero_area_off(void) +{ + DECLARE_BITMAP(bmap, 192); + + bitmap_set(bmap, 0, 192); + + bitmap_clear(bmap, 0, 8); + __clear_bit(50, bmap); + bitmap_clear(bmap, 60, 18); + __set_bit(69, bmap); + __clear_bit(80, bmap); + bitmap_clear(bmap, 100, 10); + __clear_bit(120, bmap); + bitmap_clear(bmap, 145, 8); + bitmap_clear(bmap, 160, 32); + + expect_eq_uint(0, + bitmap_find_next_zero_area_off(bmap, 192, 0, 8, 0, 0)); + expect_eq_uint(0, + bitmap_find_next_zero_area_off(bmap, 192, 0, 8, 3, 0)); + expect_eq_uint(163, + bitmap_find_next_zero_area_off(bmap, 192, 0, 8, 3, 1)); + expect_eq_uint(60, + bitmap_find_next_zero_area_off(bmap, 192, 1, 8, 0, 0)); + expect_eq_uint(160, + bitmap_find_next_zero_area_off(bmap, 192, 1, 8, 7, 0)); + expect_eq_uint(60, + bitmap_find_next_zero_area_off(bmap, 192, 1, 8, 7, 4)); + expect_eq_uint(100, + bitmap_find_next_zero_area_off(bmap, 192, 0, 10, 0, 0)); + expect_eq_uint(160, + bitmap_find_next_zero_area_off(bmap, 192, 0, 32, 0, 0)); + expect_eq_uint(1, + !!(bitmap_find_next_zero_area_off(bmap, 192, 0, 33, 0, 0) >= 192)); +} + static void __init test_fill_set(void) { DECLARE_BITMAP(bmap, 1024); @@ -1559,6 +1596,7 @@ static void __init selftest(void) test_for_each_clear_bitrange_from(); test_for_each_set_clump8(); test_for_each_set_bit_wrap(); + test_bitmap_find_next_zero_area_off(); } KSTM_MODULE_LOADERS(test_bitmap); -- 2.34.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v7 1/2] lib: bitmap: add tests for bitmap_find_next_zero_area_off() 2026-06-22 3:00 ` [PATCH v7 1/2] lib: bitmap: add tests for bitmap_find_next_zero_area_off() Yi Sun @ 2026-06-22 10:05 ` Michał Nazarewicz 2026-06-22 12:39 ` 答复: " 孙毅 (Yi Sun) 0 siblings, 1 reply; 8+ messages in thread From: Michał Nazarewicz @ 2026-06-22 10:05 UTC (permalink / raw) To: Yi Sun, yury.norov Cc: yi.sun, 279644543, akpm, akinobu.mita, linux-kernel, tjmercier, fvdl, tglx, song, hch, minchan, jstultz On Mon, Jun 22 2026, Yi Sun wrote: > Add functional and performance tests > for bitmap_find_next_zero_area_off(). > > performance tests partial output: > Start testing find_bit() with random-filled bitmap > [ 0.310073] bitmap_find_next_zero_area_off: 852731 ns, 1154 iterations > [ 0.311435] find_next_bit: 1356654 ns, 163975 iterations > Start testing find_bit() with sparse bitmap > [ 0.316267] bitmap_find_next_zero_area_off:4426808 ns, 322479 iterations > [ 0.316292] find_next_bit: 15154 ns, 656 iterations > > Signed-off-by: Yi Sun <yi.sun@unisoc.com> Acked-by: Michał Nazarewicz <mina86@mina86.com> > --- > lib/find_bit_benchmark.c | 17 +++++++++++++++++ > lib/test_bitmap.c | 38 ++++++++++++++++++++++++++++++++++++++ > 2 files changed, 55 insertions(+) > > diff --git a/lib/find_bit_benchmark.c b/lib/find_bit_benchmark.c > index 00d9dc61cd46..05305e655f99 100644 > --- a/lib/find_bit_benchmark.c > +++ b/lib/find_bit_benchmark.c > @@ -149,6 +149,21 @@ static int __init test_find_next_and_bit(const void *bitmap, > return 0; > } > > +static int __init > +test_bitmap_find_next_zero_area_off(unsigned long *bitmap, unsigned long len) > +{ > + unsigned long i, cnt; > + ktime_t time; > + > + time = ktime_get(); > + for (cnt = i = 0; i < BITMAP_LEN; cnt++) > + i = bitmap_find_next_zero_area_off(bitmap, BITMAP_LEN, i, 8, 0, 0) + 1; > + time = ktime_get() - time; > + pr_err("bitmap_find_next_zero_area_off:%7llu ns, %6ld iterations\n", time, cnt); > + > + return 0; > +} > + > static int __init find_bit_test(void) > { > unsigned long nbits = BITMAP_LEN / SPARSE; > @@ -158,6 +173,7 @@ static int __init find_bit_test(void) > get_random_bytes(bitmap, sizeof(bitmap)); > get_random_bytes(bitmap2, sizeof(bitmap2)); > > + test_bitmap_find_next_zero_area_off(bitmap, BITMAP_LEN); > test_find_next_bit(bitmap, BITMAP_LEN); > test_find_next_zero_bit(bitmap, BITMAP_LEN); > test_find_last_bit(bitmap, BITMAP_LEN); > @@ -181,6 +197,7 @@ static int __init find_bit_test(void) > __set_bit(get_random_u32_below(BITMAP_LEN), bitmap2); > } > > + test_bitmap_find_next_zero_area_off(bitmap, BITMAP_LEN); > test_find_next_bit(bitmap, BITMAP_LEN); > test_find_next_zero_bit(bitmap, BITMAP_LEN); > test_find_last_bit(bitmap, BITMAP_LEN); > diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c > index 69813c10e6c0..8665df77c960 100644 > --- a/lib/test_bitmap.c > +++ b/lib/test_bitmap.c > @@ -234,6 +234,43 @@ static void __init test_find_nth_bit(void) > } > } > > +static void __init > +test_bitmap_find_next_zero_area_off(void) > +{ > + DECLARE_BITMAP(bmap, 192); > + > + bitmap_set(bmap, 0, 192); > + > + bitmap_clear(bmap, 0, 8); > + __clear_bit(50, bmap); > + bitmap_clear(bmap, 60, 18); > + __set_bit(69, bmap); > + __clear_bit(80, bmap); > + bitmap_clear(bmap, 100, 10); > + __clear_bit(120, bmap); > + bitmap_clear(bmap, 145, 8); > + bitmap_clear(bmap, 160, 32); > + > + expect_eq_uint(0, > + bitmap_find_next_zero_area_off(bmap, 192, 0, 8, 0, 0)); > + expect_eq_uint(0, > + bitmap_find_next_zero_area_off(bmap, 192, 0, 8, 3, 0)); > + expect_eq_uint(163, > + bitmap_find_next_zero_area_off(bmap, 192, 0, 8, 3, 1)); > + expect_eq_uint(60, > + bitmap_find_next_zero_area_off(bmap, 192, 1, 8, 0, 0)); > + expect_eq_uint(160, > + bitmap_find_next_zero_area_off(bmap, 192, 1, 8, 7, 0)); > + expect_eq_uint(60, > + bitmap_find_next_zero_area_off(bmap, 192, 1, 8, 7, 4)); > + expect_eq_uint(100, > + bitmap_find_next_zero_area_off(bmap, 192, 0, 10, 0, 0)); > + expect_eq_uint(160, > + bitmap_find_next_zero_area_off(bmap, 192, 0, 32, 0, 0)); > + expect_eq_uint(1, > + !!(bitmap_find_next_zero_area_off(bmap, 192, 0, 33, 0, 0) >= 192)); In C `x >= y` yields 0 or 1. There’s no need for `!!`. About `>` vs `>=`, the original code returns `> 192` (because it returns `end` when it exceeds `size`). The new code returns `192` (because it returns `size` if loop fails to find an area). If that’s the case, this patch, PATCH 1/2, should have `>` and PATCH 2/2 should change it to `expect_eq_uint(192, …)` to reflect change in the API. > +} > + > static void __init test_fill_set(void) > { > DECLARE_BITMAP(bmap, 1024); > @@ -1559,6 +1596,7 @@ static void __init selftest(void) > test_for_each_clear_bitrange_from(); > test_for_each_set_clump8(); > test_for_each_set_bit_wrap(); > + test_bitmap_find_next_zero_area_off(); > } > > KSTM_MODULE_LOADERS(test_bitmap); -- Best regards ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴィツ «If at first you don’t succeed, give up skydiving» ^ permalink raw reply [flat|nested] 8+ messages in thread
* 答复: [PATCH v7 1/2] lib: bitmap: add tests for bitmap_find_next_zero_area_off() 2026-06-22 10:05 ` Michał Nazarewicz @ 2026-06-22 12:39 ` 孙毅 (Yi Sun) 0 siblings, 0 replies; 8+ messages in thread From: 孙毅 (Yi Sun) @ 2026-06-22 12:39 UTC (permalink / raw) To: Michał Nazarewicz, yury.norov@gmail.com Cc: 279644543@qq.com, akpm@linux-foundation.org, akinobu.mita@gmail.com, linux-kernel@vger.kernel.org, tjmercier@google.com, fvdl@google.com, tglx@kernel.org, song@kernel.org, hch@lst.de, minchan@kernel.org, jstultz@google.com, 王科 (Ke Wang) > -----邮件原件----- > 发件人: Michał Nazarewicz <mnazarewicz@gmail.com> 代表 Micha? > Nazarewicz > 发送时间: 2026年6月22日 18:06 > 收件人: 孙毅 (Yi Sun) <Yi.Sun@unisoc.com>; yury.norov@gmail.com > 抄送: 孙毅 (Yi Sun) <Yi.Sun@unisoc.com>; 279644543@qq.com; akpm@linux- > foundation.org; akinobu.mita@gmail.com; linux-kernel@vger.kernel.org; > tjmercier@google.com; fvdl@google.com; tglx@kernel.org; song@kernel.org; > hch@lst.de; minchan@kernel.org; jstultz@google.com > 主题: Re: [PATCH v7 1/2] lib: bitmap: add tests for > bitmap_find_next_zero_area_off() > > > 注意: 这封邮件来自于外部。除非你确定邮件内容安全,否则不要点击任何链 > 接和附件。 > CAUTION: This email originated from outside of the organization. Do not click links > or open attachments unless you recognize the sender and know the content is > safe. > > > > On Mon, Jun 22 2026, Yi Sun wrote: > > Add functional and performance tests > > for bitmap_find_next_zero_area_off(). > > > > performance tests partial output: > > Start testing find_bit() with random-filled bitmap > > [ 0.310073] bitmap_find_next_zero_area_off: 852731 ns, 1154 iterations > > [ 0.311435] find_next_bit: 1356654 ns, 163975 iterations > > Start testing find_bit() with sparse bitmap > > [ 0.316267] bitmap_find_next_zero_area_off:4426808 ns, 322479 iterations > > [ 0.316292] find_next_bit: 15154 ns, 656 iterations > > > > Signed-off-by: Yi Sun <yi.sun@unisoc.com> > > Acked-by: Michał Nazarewicz <mina86@mina86.com> > > > --- > > lib/find_bit_benchmark.c | 17 +++++++++++++++++ > > lib/test_bitmap.c | 38 ++++++++++++++++++++++++++++++++++++++ > > 2 files changed, 55 insertions(+) > > > > diff --git a/lib/find_bit_benchmark.c b/lib/find_bit_benchmark.c > > index 00d9dc61cd46..05305e655f99 100644 > > --- a/lib/find_bit_benchmark.c > > +++ b/lib/find_bit_benchmark.c > > @@ -149,6 +149,21 @@ static int __init test_find_next_and_bit(const void > *bitmap, > > return 0; > > } > > > > +static int __init > > +test_bitmap_find_next_zero_area_off(unsigned long *bitmap, unsigned long > len) > > +{ > > + unsigned long i, cnt; > > + ktime_t time; > > + > > + time = ktime_get(); > > + for (cnt = i = 0; i < BITMAP_LEN; cnt++) > > + i = bitmap_find_next_zero_area_off(bitmap, BITMAP_LEN, i, 8, 0, 0) + 1; > > + time = ktime_get() - time; > > + pr_err("bitmap_find_next_zero_area_off:%7llu ns, %6ld iterations\n", time, > cnt); > > + > > + return 0; > > +} > > + > > static int __init find_bit_test(void) > > { > > unsigned long nbits = BITMAP_LEN / SPARSE; > > @@ -158,6 +173,7 @@ static int __init find_bit_test(void) > > get_random_bytes(bitmap, sizeof(bitmap)); > > get_random_bytes(bitmap2, sizeof(bitmap2)); > > > > + test_bitmap_find_next_zero_area_off(bitmap, BITMAP_LEN); > > test_find_next_bit(bitmap, BITMAP_LEN); > > test_find_next_zero_bit(bitmap, BITMAP_LEN); > > test_find_last_bit(bitmap, BITMAP_LEN); > > @@ -181,6 +197,7 @@ static int __init find_bit_test(void) > > __set_bit(get_random_u32_below(BITMAP_LEN), bitmap2); > > } > > > > + test_bitmap_find_next_zero_area_off(bitmap, BITMAP_LEN); > > test_find_next_bit(bitmap, BITMAP_LEN); > > test_find_next_zero_bit(bitmap, BITMAP_LEN); > > test_find_last_bit(bitmap, BITMAP_LEN); > > diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c > > index 69813c10e6c0..8665df77c960 100644 > > --- a/lib/test_bitmap.c > > +++ b/lib/test_bitmap.c > > @@ -234,6 +234,43 @@ static void __init test_find_nth_bit(void) > > } > > } > > > > +static void __init > > +test_bitmap_find_next_zero_area_off(void) > > +{ > > + DECLARE_BITMAP(bmap, 192); > > + > > + bitmap_set(bmap, 0, 192); > > + > > + bitmap_clear(bmap, 0, 8); > > + __clear_bit(50, bmap); > > + bitmap_clear(bmap, 60, 18); > > + __set_bit(69, bmap); > > + __clear_bit(80, bmap); > > + bitmap_clear(bmap, 100, 10); > > + __clear_bit(120, bmap); > > + bitmap_clear(bmap, 145, 8); > > + bitmap_clear(bmap, 160, 32); > > + > > + expect_eq_uint(0, > > + bitmap_find_next_zero_area_off(bmap, 192, 0, 8, 0, 0)); > > + expect_eq_uint(0, > > + bitmap_find_next_zero_area_off(bmap, 192, 0, 8, 3, 0)); > > + expect_eq_uint(163, > > + bitmap_find_next_zero_area_off(bmap, 192, 0, 8, 3, 1)); > > + expect_eq_uint(60, > > + bitmap_find_next_zero_area_off(bmap, 192, 1, 8, 0, 0)); > > + expect_eq_uint(160, > > + bitmap_find_next_zero_area_off(bmap, 192, 1, 8, 7, 0)); > > + expect_eq_uint(60, > > + bitmap_find_next_zero_area_off(bmap, 192, 1, 8, 7, 4)); > > + expect_eq_uint(100, > > + bitmap_find_next_zero_area_off(bmap, 192, 0, 10, 0, 0)); > > + expect_eq_uint(160, > > + bitmap_find_next_zero_area_off(bmap, 192, 0, 32, 0, 0)); > > + expect_eq_uint(1, > > + !!(bitmap_find_next_zero_area_off(bmap, 192, 0, 33, 0, 0) >= 192)); > > In C `x >= y` yields 0 or 1. There’s no need for `!!`. > > About `>` vs `>=`, the original code returns `> 192` (because it returns > `end` when it exceeds `size`). The new code returns `192` (because it > returns `size` if loop fails to find an area). If that’s the case, this > patch, PATCH 1/2, should have `>` and PATCH 2/2 should change it to > `expect_eq_uint(192, …)` to reflect change in the API. > Very helpful feedback. I will make changes in v8. > > +} > > + > > static void __init test_fill_set(void) > > { > > DECLARE_BITMAP(bmap, 1024); > > @@ -1559,6 +1596,7 @@ static void __init selftest(void) > > test_for_each_clear_bitrange_from(); > > test_for_each_set_clump8(); > > test_for_each_set_bit_wrap(); > > + test_bitmap_find_next_zero_area_off(); > > } > > > > KSTM_MODULE_LOADERS(test_bitmap); > > -- > Best regards > ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴィツ > «If at first you don’t succeed, give up skydiving» ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v7 2/2] lib: bitmap: optimize bitmap_find_next_zero_area_off() 2026-06-22 3:00 [PATCH v7 0/2] Improve the performance of bitmap_find_next_zero_area_off() Yi Sun 2026-06-22 3:00 ` [PATCH v7 1/2] lib: bitmap: add tests for bitmap_find_next_zero_area_off() Yi Sun @ 2026-06-22 3:00 ` Yi Sun 2026-06-22 9:56 ` Michał Nazarewicz 2026-06-30 20:52 ` Nathan Chancellor 1 sibling, 2 replies; 8+ messages in thread From: Yi Sun @ 2026-06-22 3:00 UTC (permalink / raw) To: yury.norov Cc: yi.sun, 279644543, mina86, mnazarewicz, akpm, akinobu.mita, linux-kernel, tjmercier, fvdl, tglx, song, hch, minchan, jstultz Finding a contiguous free region in a highly fragmented bitmap is not easy and may require many repeated attempts. Therefore, find_next_bit(map, end, index) is not the optimal choice. This is because there may be multiple scattered free regions within the range [index, end) and none of them will meet the length requirement of @nr. Instead, it's sufficient to directly find the last bit within the range [index, end), thus reducing unnecessary repeated calls. An example of a bitmap: Bits 0-3: cleared(4 bits) Bits 4-5: set (2 bits) Bits 6-8: cleared(4 bits) Bits 9-10: set (2 bits) Bits 11-20: cleared(10 bits) The goal is to find a 10-bit free region. The old code logic is as follows: find_next_zero_bit(start = 0, find bit 0) -> find_next_bit(find bit 4) -> next loop -> find_next_zero_bit(start = 5, find bit 6) -> find_next_bit(find bit 9) -> next loop -> find_next_zero_bit(start = 10, find bit 11) -> success The new code logic is as follows: find_next_zero_bit(start = 0, find bit 0) -> find_last_bit(find bit 9) -> next loop -> find_next_zero_bit(start = 10, find bit 11) -> success Performance test results on my hardware(use lib/find_bit_benchmark.c): before after change p-value dense 1211 688 -43.2% 8.3e-11 sparse 13.3 13.4 0.8% 0.27 Co-developed-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Yi Sun <yi.sun@unisoc.com> --- lib/bitmap.c | 31 ++++++++++++++++--------------- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/lib/bitmap.c b/lib/bitmap.c index b9bfa157e095..a00a71f2b7bb 100644 --- a/lib/bitmap.c +++ b/lib/bitmap.c @@ -432,22 +432,23 @@ unsigned long bitmap_find_next_zero_area_off(unsigned long *map, unsigned long align_mask, unsigned long align_offset) { - unsigned long index, end, i; -again: - index = find_next_zero_bit(map, size, start); - - /* Align allocation */ - index = __ALIGN_MASK(index + align_offset, align_mask) - align_offset; - - end = index + nr; - if (end > size) - return end; - i = find_next_bit(map, end, index); - if (i < end) { - start = i + 1; - goto again; + unsigned long end, i, off; + + for_each_clear_bit_from(start, map, size) { + start = __ALIGN_MASK(start + align_offset, align_mask) - align_offset; + end = start + nr; + if (end > size) + break; + + off = round_down(start, BITS_PER_LONG); + i = find_last_bit(map + start / BITS_PER_LONG, end - off) + off; + if (i >= end || i < start) + return start; + + start = i; } - return index; + + return size; } EXPORT_SYMBOL(bitmap_find_next_zero_area_off); -- 2.34.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v7 2/2] lib: bitmap: optimize bitmap_find_next_zero_area_off() 2026-06-22 3:00 ` [PATCH v7 2/2] lib: bitmap: optimize bitmap_find_next_zero_area_off() Yi Sun @ 2026-06-22 9:56 ` Michał Nazarewicz 2026-06-30 20:52 ` Nathan Chancellor 1 sibling, 0 replies; 8+ messages in thread From: Michał Nazarewicz @ 2026-06-22 9:56 UTC (permalink / raw) To: Yi Sun, yury.norov Cc: yi.sun, 279644543, akpm, akinobu.mita, linux-kernel, tjmercier, fvdl, tglx, song, hch, minchan, jstultz On Mon, Jun 22 2026, Yi Sun wrote: > Finding a contiguous free region in a highly fragmented > bitmap is not easy and may require many repeated attempts. > Therefore, find_next_bit(map, end, index) is not the optimal choice. > This is because there may be multiple scattered free regions > within the range [index, end) and none of them will meet the length > requirement of @nr. > Instead, it's sufficient to directly find the last bit within > the range [index, end), thus reducing unnecessary repeated calls. […] > Co-developed-by: Yury Norov <yury.norov@gmail.com> > Signed-off-by: Yury Norov <yury.norov@gmail.com> > Signed-off-by: Yi Sun <yi.sun@unisoc.com> Acked-by: Michał Nazarewicz <mina86@mina86.com> > --- > lib/bitmap.c | 31 ++++++++++++++++--------------- > 1 file changed, 16 insertions(+), 15 deletions(-) > > diff --git a/lib/bitmap.c b/lib/bitmap.c > index b9bfa157e095..a00a71f2b7bb 100644 > --- a/lib/bitmap.c > +++ b/lib/bitmap.c > @@ -432,22 +432,23 @@ unsigned long bitmap_find_next_zero_area_off(unsigned long *map, > unsigned long align_mask, > unsigned long align_offset) > { […] > + unsigned long end, i, off; > + > + for_each_clear_bit_from(start, map, size) { > + start = __ALIGN_MASK(start + align_offset, align_mask) - align_offset; > + end = start + nr; > + if (end > size) > + break; > + > + off = round_down(start, BITS_PER_LONG); > + i = find_last_bit(map + start / BITS_PER_LONG, end - off) + off; > + if (i >= end || i < start) > + return start; > + > + start = i; Gotta be honest, I’m not a fan of `for_each_clear_bit_from` the way it’s used here. It hides the `start++` that happens at end of each loop. The code works either way, but I find the current form harder to read than an infinite loop. Having said that, Yury is the maintainer here so if he prefers that, I’m obviously not gonna fight it. > } > - return index; > + > + return size; > } > EXPORT_SYMBOL(bitmap_find_next_zero_area_off); -- Best regards ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴィツ «If at first you don’t succeed, give up skydiving» ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v7 2/2] lib: bitmap: optimize bitmap_find_next_zero_area_off() 2026-06-22 3:00 ` [PATCH v7 2/2] lib: bitmap: optimize bitmap_find_next_zero_area_off() Yi Sun 2026-06-22 9:56 ` Michał Nazarewicz @ 2026-06-30 20:52 ` Nathan Chancellor 2026-06-30 23:13 ` Yury Norov 1 sibling, 1 reply; 8+ messages in thread From: Nathan Chancellor @ 2026-06-30 20:52 UTC (permalink / raw) To: Yi Sun Cc: yury.norov, 279644543, mina86, mnazarewicz, akpm, akinobu.mita, linux-kernel, tjmercier, fvdl, tglx, song, hch, minchan, jstultz Hi, On Mon, Jun 22, 2026 at 11:00:36AM +0800, Yi Sun wrote: > Finding a contiguous free region in a highly fragmented > bitmap is not easy and may require many repeated attempts. > Therefore, find_next_bit(map, end, index) is not the optimal choice. > This is because there may be multiple scattered free regions > within the range [index, end) and none of them will meet the length > requirement of @nr. > Instead, it's sufficient to directly find the last bit within > the range [index, end), thus reducing unnecessary repeated calls. > > An example of a bitmap: > Bits 0-3: cleared(4 bits) > Bits 4-5: set (2 bits) > Bits 6-8: cleared(4 bits) > Bits 9-10: set (2 bits) > Bits 11-20: cleared(10 bits) > > The goal is to find a 10-bit free region. > > The old code logic is as follows: > find_next_zero_bit(start = 0, find bit 0) -> find_next_bit(find bit 4) -> > next loop -> > find_next_zero_bit(start = 5, find bit 6) -> find_next_bit(find bit 9) -> > next loop -> > find_next_zero_bit(start = 10, find bit 11) -> success > > The new code logic is as follows: > find_next_zero_bit(start = 0, find bit 0) -> find_last_bit(find bit 9) -> > next loop -> > find_next_zero_bit(start = 10, find bit 11) -> success > > Performance test results on my hardware(use lib/find_bit_benchmark.c): > > before after change p-value > dense 1211 688 -43.2% 8.3e-11 > sparse 13.3 13.4 0.8% 0.27 > > Co-developed-by: Yury Norov <yury.norov@gmail.com> > Signed-off-by: Yury Norov <yury.norov@gmail.com> > Signed-off-by: Yi Sun <yi.sun@unisoc.com> I bisected a warning I see when virtually boot testing powerpc to this change in -next as commit 6eecaf5cf2dd ("lib: bitmap: optimize bitmap_find_next_zero_area_off()"). $ make -skj"$(nproc)" ARCH=powerpc CROSS_COMPILE=powerpc64-linux- mrproper ppc64le_guest_defconfig zImage.epapr $ curl -LSs https://github.com/ClangBuiltLinux/boot-utils/releases/download/20241120-044434/ppc64le-rootfs.cpio.zst | zstd -d >rootfs.cpio $ qemu-system-ppc64 \ -display none \ -nodefaults \ -device ipmi-bmc-sim,id=bmc0 \ -device isa-ipmi-bt,bmc=bmc0,irq=10 \ -machine powernv \ -kernel arch/powerpc/boot/zImage.epapr \ -initrd rootfs.cpio \ -m 2G \ -serial mon:stdio ... [ 0.660986][ T1] Running code patching self-tests ... [ 0.667347][ T1] ------------[ cut here ]------------ [ 0.667519][ T1] WARNING: arch/powerpc/sysdev/msi_bitmap.c:181 at msi_bitmap_selftest+0x1c8/0x3f8, CPU#0: swapper/0/1 [ 0.667964][ T1] Modules linked in: [ 0.668237][ T1] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 7.2.0-rc1-next-20260630 #1 PREEMPTLAZY [ 0.668462][ T1] Hardware name: IBM PowerNV (emulated by qemu) POWER10 0x801200 opal:v7.1-106-g785a5e307 PowerNV [ 0.668681][ T1] NIP: c00000000201eb70 LR: c00000000201eb68 CTR: 0000000000000000 [ 0.668809][ T1] REGS: c0000000037d78b0 TRAP: 0700 Not tainted (7.2.0-rc1-next-20260630) [ 0.668956][ T1] MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24000224 XER: 20040000 [ 0.669190][ T1] CFAR: c000000000181880 IRQMASK: 0 [ 0.669190][ T1] GPR00: c00000000201eb68 c0000000037d7b70 c000000001c78100 0000000000000001 [ 0.669190][ T1] GPR04: 0000000000000000 0000000000000001 0000000000000001 0000000000000000 [ 0.669190][ T1] GPR08: 0000000000000000 0000000000000000 c000000003764f80 c000000000181d08 [ 0.669190][ T1] GPR12: 0000000000000000 c000000002c10000 c0000000000114d8 0000000000000000 [ 0.669190][ T1] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 0.669190][ T1] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 0.669190][ T1] GPR24: 0000000000000000 c000000002003408 c0000000020b1070 c0000000021ecee0 [ 0.669190][ T1] GPR28: 0000000000000000 0000000000000008 0000000000000000 c000000005836480 [ 0.670393][ T1] NIP [c00000000201eb70] msi_bitmap_selftest+0x1c8/0x3f8 [ 0.670499][ T1] LR [c00000000201eb68] msi_bitmap_selftest+0x1c0/0x3f8 [ 0.670657][ T1] Call Trace: [ 0.670747][ T1] [c0000000037d7b70] [c00000000201eb68] msi_bitmap_selftest+0x1c0/0x3f8 (unreliable) [ 0.670941][ T1] [c0000000037d7c20] [c000000000010d9c] do_one_initcall+0x7c/0x37c [ 0.671085][ T1] [c0000000037d7d00] [c000000002004ba8] kernel_init_freeable+0x2fc/0x3cc [ 0.671218][ T1] [c0000000037d7dd0] [c0000000000114fc] kernel_init+0x2c/0x1b0 [ 0.671336][ T1] [c0000000037d7e30] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c [ 0.671479][ T1] ---- interrupt: 0 at 0x0 [ 0.671777][ T1] Code: 38800001 38610068 4a162c61 54630ffe 0b030000 37deffff 4082ffe8 38800001 38610068 4a162c45 7c6318f8 54630ffe <0b030000> eba10070 7fdff378 3bde0001 [ 0.672127][ T1] ---[ end trace 0000000000000000 ]--- ... -- Cheers, Nathan ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v7 2/2] lib: bitmap: optimize bitmap_find_next_zero_area_off() 2026-06-30 20:52 ` Nathan Chancellor @ 2026-06-30 23:13 ` Yury Norov 0 siblings, 0 replies; 8+ messages in thread From: Yury Norov @ 2026-06-30 23:13 UTC (permalink / raw) To: Nathan Chancellor Cc: Yi Sun, yury.norov, 279644543, mina86, mnazarewicz, akpm, akinobu.mita, linux-kernel, tjmercier, fvdl, tglx, song, hch, minchan, jstultz On Tue, Jun 30, 2026 at 01:52:40PM -0700, Nathan Chancellor wrote: > Hi, > > On Mon, Jun 22, 2026 at 11:00:36AM +0800, Yi Sun wrote: > > Finding a contiguous free region in a highly fragmented > > bitmap is not easy and may require many repeated attempts. > > Therefore, find_next_bit(map, end, index) is not the optimal choice. > > This is because there may be multiple scattered free regions > > within the range [index, end) and none of them will meet the length > > requirement of @nr. > > Instead, it's sufficient to directly find the last bit within > > the range [index, end), thus reducing unnecessary repeated calls. > > > > An example of a bitmap: > > Bits 0-3: cleared(4 bits) > > Bits 4-5: set (2 bits) > > Bits 6-8: cleared(4 bits) > > Bits 9-10: set (2 bits) > > Bits 11-20: cleared(10 bits) > > > > The goal is to find a 10-bit free region. > > > > The old code logic is as follows: > > find_next_zero_bit(start = 0, find bit 0) -> find_next_bit(find bit 4) -> > > next loop -> > > find_next_zero_bit(start = 5, find bit 6) -> find_next_bit(find bit 9) -> > > next loop -> > > find_next_zero_bit(start = 10, find bit 11) -> success > > > > The new code logic is as follows: > > find_next_zero_bit(start = 0, find bit 0) -> find_last_bit(find bit 9) -> > > next loop -> > > find_next_zero_bit(start = 10, find bit 11) -> success > > > > Performance test results on my hardware(use lib/find_bit_benchmark.c): > > > > before after change p-value > > dense 1211 688 -43.2% 8.3e-11 > > sparse 13.3 13.4 0.8% 0.27 > > > > Co-developed-by: Yury Norov <yury.norov@gmail.com> > > Signed-off-by: Yury Norov <yury.norov@gmail.com> > > Signed-off-by: Yi Sun <yi.sun@unisoc.com> > > I bisected a warning I see when virtually boot testing powerpc to this > change in -next as commit 6eecaf5cf2dd ("lib: bitmap: optimize > bitmap_find_next_zero_area_off()"). Thanks Nathan, I'll drop it from -next for now. > $ make -skj"$(nproc)" ARCH=powerpc CROSS_COMPILE=powerpc64-linux- mrproper ppc64le_guest_defconfig zImage.epapr > > $ curl -LSs https://github.com/ClangBuiltLinux/boot-utils/releases/download/20241120-044434/ppc64le-rootfs.cpio.zst | zstd -d >rootfs.cpio > > $ qemu-system-ppc64 \ > -display none \ > -nodefaults \ > -device ipmi-bmc-sim,id=bmc0 \ > -device isa-ipmi-bt,bmc=bmc0,irq=10 \ > -machine powernv \ > -kernel arch/powerpc/boot/zImage.epapr \ > -initrd rootfs.cpio \ > -m 2G \ > -serial mon:stdio > ... > [ 0.660986][ T1] Running code patching self-tests ... > [ 0.667347][ T1] ------------[ cut here ]------------ > [ 0.667519][ T1] WARNING: arch/powerpc/sysdev/msi_bitmap.c:181 at msi_bitmap_selftest+0x1c8/0x3f8, CPU#0: swapper/0/1 > [ 0.667964][ T1] Modules linked in: > [ 0.668237][ T1] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 7.2.0-rc1-next-20260630 #1 PREEMPTLAZY > [ 0.668462][ T1] Hardware name: IBM PowerNV (emulated by qemu) POWER10 0x801200 opal:v7.1-106-g785a5e307 PowerNV > [ 0.668681][ T1] NIP: c00000000201eb70 LR: c00000000201eb68 CTR: 0000000000000000 > [ 0.668809][ T1] REGS: c0000000037d78b0 TRAP: 0700 Not tainted (7.2.0-rc1-next-20260630) > [ 0.668956][ T1] MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24000224 XER: 20040000 > [ 0.669190][ T1] CFAR: c000000000181880 IRQMASK: 0 > [ 0.669190][ T1] GPR00: c00000000201eb68 c0000000037d7b70 c000000001c78100 0000000000000001 > [ 0.669190][ T1] GPR04: 0000000000000000 0000000000000001 0000000000000001 0000000000000000 > [ 0.669190][ T1] GPR08: 0000000000000000 0000000000000000 c000000003764f80 c000000000181d08 > [ 0.669190][ T1] GPR12: 0000000000000000 c000000002c10000 c0000000000114d8 0000000000000000 > [ 0.669190][ T1] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [ 0.669190][ T1] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [ 0.669190][ T1] GPR24: 0000000000000000 c000000002003408 c0000000020b1070 c0000000021ecee0 > [ 0.669190][ T1] GPR28: 0000000000000000 0000000000000008 0000000000000000 c000000005836480 > [ 0.670393][ T1] NIP [c00000000201eb70] msi_bitmap_selftest+0x1c8/0x3f8 > [ 0.670499][ T1] LR [c00000000201eb68] msi_bitmap_selftest+0x1c0/0x3f8 > [ 0.670657][ T1] Call Trace: > [ 0.670747][ T1] [c0000000037d7b70] [c00000000201eb68] msi_bitmap_selftest+0x1c0/0x3f8 (unreliable) > [ 0.670941][ T1] [c0000000037d7c20] [c000000000010d9c] do_one_initcall+0x7c/0x37c > [ 0.671085][ T1] [c0000000037d7d00] [c000000002004ba8] kernel_init_freeable+0x2fc/0x3cc > [ 0.671218][ T1] [c0000000037d7dd0] [c0000000000114fc] kernel_init+0x2c/0x1b0 > [ 0.671336][ T1] [c0000000037d7e30] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c > [ 0.671479][ T1] ---- interrupt: 0 at 0x0 > [ 0.671777][ T1] Code: 38800001 38610068 4a162c61 54630ffe 0b030000 37deffff 4082ffe8 38800001 38610068 4a162c45 7c6318f8 54630ffe <0b030000> eba10070 7fdff378 3bde0001 > [ 0.672127][ T1] ---[ end trace 0000000000000000 ]--- > ... > > -- > Cheers, > Nathan ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2026-06-30 23:13 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-22 3:00 [PATCH v7 0/2] Improve the performance of bitmap_find_next_zero_area_off() Yi Sun 2026-06-22 3:00 ` [PATCH v7 1/2] lib: bitmap: add tests for bitmap_find_next_zero_area_off() Yi Sun 2026-06-22 10:05 ` Michał Nazarewicz 2026-06-22 12:39 ` 答复: " 孙毅 (Yi Sun) 2026-06-22 3:00 ` [PATCH v7 2/2] lib: bitmap: optimize bitmap_find_next_zero_area_off() Yi Sun 2026-06-22 9:56 ` Michał Nazarewicz 2026-06-30 20:52 ` Nathan Chancellor 2026-06-30 23:13 ` Yury Norov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox