All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/12] bitmap: rework bitmap_{bit,}remap()
@ 2023-08-28 18:43 Yury Norov
  2023-08-28 18:43 ` [PATCH 01/12] bitmap: add find_nth_bit_from() Yury Norov
                   ` (12 more replies)
  0 siblings, 13 replies; 18+ messages in thread
From: Yury Norov @ 2023-08-28 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes

This series adds a test, const-time optimizaton and fixes O(N^2)
complexity problem for the functions. It's based on discussion in
bitmap: optimize bitmap_remap() series [1], but there's much more work
here, so I decided to give it a separete run, and don't name it as v2.

bitmap_remap() API has just one user in generic code, and few more in
drivers, so this may look like an overkill. But the work gives ~10x
better performance for a 1000-bit bitmaps, which is typical for nodemasks
in major distros like Ubuntu.

Anyways, the work is done, so guys please review.

[1] https://lore.kernel.org/lkml/20230815235934.47782-1-yury.norov@gmail.com/T/
Yury Norov (12):
  bitmap: add find_nth_bit_from()
  bitmap: add bitmap_weight_from()
  bitmap: add test for bitmap_remap()
  bitmap: add test for bitmap_bitremap()
  bitmap: update comment for bitmap_{bit,}remap()
  bitmap: add small_cont_nbits() optimization for bitmap_remap()
  bitmap: add small_const_nbits() optimization for bitmap_bitremap()
  bitmap: optiimze bitmap_bitremap()
  bitmap: optimize bitmap_remap() when 'new' is empty map
  bitmap: separate handling of identity and remapping parts in
    bitmap_remap()
  bitmap: defer calculating weight of 'new' in bitmap_remap()
  bitmap: don't count bits from the beginning in bitmap_remap()

 include/linux/bitmap.h | 116 +++++++++++++++++++++++++++++++--
 include/linux/find.h   |  37 +++++++++++
 lib/bitmap.c           | 145 ++++++++++++++++++++---------------------
 lib/find_bit.c         |  29 +++++++++
 lib/test_bitmap.c      | 142 +++++++++++++++++++++++++++++++++++++++-
 5 files changed, 388 insertions(+), 81 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 01/12] bitmap: add find_nth_bit_from()
  2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
@ 2023-08-28 18:43 ` Yury Norov
  2023-08-28 18:43 ` [PATCH 02/12] bitmap: add bitmap_weight_from() Yury Norov
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Yury Norov @ 2023-08-28 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes

Similarly to find_next_bit(), add a _from flavor for find_nth_bit(). It's
useful when traversing bitmaps in a loop because it allows to not count
bits from the beginning every time.

The test is added as well.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/find.h | 37 +++++++++++++++++++++++++++++++++++++
 lib/find_bit.c       | 29 +++++++++++++++++++++++++++++
 lib/test_bitmap.c    | 10 +++++++++-
 3 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/include/linux/find.h b/include/linux/find.h
index 5e4f39ef2e72..f7fb88405201 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -27,6 +27,8 @@ unsigned long __find_nth_andnot_bit(const unsigned long *addr1, const unsigned l
 unsigned long __find_nth_and_andnot_bit(const unsigned long *addr1, const unsigned long *addr2,
 					const unsigned long *addr3, unsigned long size,
 					unsigned long n);
+unsigned long __find_nth_bit_from(const unsigned long *addr, unsigned long size,
+					unsigned long start, unsigned long nth);
 extern unsigned long _find_first_and_bit(const unsigned long *addr1,
 					 const unsigned long *addr2, unsigned long size);
 extern unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size);
@@ -237,6 +239,41 @@ unsigned long find_nth_bit(const unsigned long *addr, unsigned long size, unsign
 	return __find_nth_bit(addr, size, n);
 }
 
+/**
+ * find_nth_bit_from - find N'th set bit in a memory region starting at @off
+ * @addr: The address to start the search at
+ * @size: The maximum number of bits to search
+ * @off: The offset to start search
+ * @n: The number of set bit, which position is needed, counting from 0
+ *
+ * The following is semantically equivalent:
+ *	 idx = find_nth_bit_from(addr, size, off, 0);
+ *	 idx = find_next_bit(addr, size, off);
+ *
+ * Return: the bit number of the N'th set bit.
+ * If no such, returns @size.
+ */
+static __always_inline
+unsigned long find_nth_bit_from(const unsigned long *addr, unsigned long size,
+				unsigned long start, unsigned long n)
+{
+	if (n >= size - start)
+		return size;
+
+	if (small_const_nbits(size - start) && size / BITS_PER_LONG == start / BITS_PER_LONG) {
+		unsigned long val, idx = start / BITS_PER_LONG;
+
+		val =  addr[idx] & GENMASK(size - 1, start);
+		if (val == 0)
+			return size;
+
+		val = idx * BITS_PER_LONG + fns(val, n);
+		return val < size ? val : size;
+	}
+
+	return __find_nth_bit_from(addr, size, start, n);
+}
+
 /**
  * find_nth_and_bit - find N'th set bit in 2 memory regions
  * @addr1: The 1st address to start the search at
diff --git a/lib/find_bit.c b/lib/find_bit.c
index 32f99e9a670e..49959b51df8c 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -164,6 +164,35 @@ unsigned long __find_nth_and_andnot_bit(const unsigned long *addr1,
 }
 EXPORT_SYMBOL(__find_nth_and_andnot_bit);
 
+#define FIND_NTH_BIT_FROM(FETCH, size, start, nth)				\
+({										\
+	unsigned long mask, idx, tmp, w, sz = (size), __start = (start), n = (nth);\
+										\
+	if (unlikely(__start >= sz || n > sz - __start))			\
+		goto out;							\
+										\
+	mask = BITMAP_FIRST_WORD_MASK(__start);					\
+	idx = __start / BITS_PER_LONG;						\
+										\
+	for (tmp = (FETCH) & mask; (w = hweight_long(tmp)) <= n; tmp = (FETCH)) {\
+		if ((idx + 1) * BITS_PER_LONG >= sz)				\
+			goto out;						\
+		idx++;								\
+		n -= w;								\
+	}									\
+										\
+	sz = min(idx * BITS_PER_LONG + fns(tmp, n), sz);			\
+out:										\
+	sz;									\
+})
+
+unsigned long __find_nth_bit_from(const unsigned long *addr, unsigned long size,
+					unsigned long start, unsigned long nth)
+{
+	return FIND_NTH_BIT_FROM(addr[idx], size, start, nth);
+}
+EXPORT_SYMBOL(__find_nth_bit_from);
+
 #ifndef find_next_and_bit
 unsigned long _find_next_and_bit(const unsigned long *addr1, const unsigned long *addr2,
 					unsigned long nbits, unsigned long start)
diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index def7d2f9bd14..3248ed58a817 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -223,7 +223,7 @@ static void __init test_zero_clear(void)
 
 static void __init test_find_nth_bit(void)
 {
-	unsigned long b, bit, cnt = 0;
+	unsigned long i, b, bit, cnt = 0;
 	DECLARE_BITMAP(bmap, 64 * 3);
 
 	bitmap_zero(bmap, 64 * 3);
@@ -260,6 +260,14 @@ static void __init test_find_nth_bit(void)
 		b = find_nth_bit(exp1, EXP1_IN_BITS, cnt++);
 		expect_eq_uint(b, bit);
 	}
+
+	for (i = 0; i < EXP1_IN_BITS; i++) {
+		bit = i; cnt = 0;
+		for_each_set_bit_from(bit, exp1, EXP1_IN_BITS) {
+			b = find_nth_bit_from(exp1, EXP1_IN_BITS, i, cnt++);
+			expect_eq_uint(b, bit);
+		}
+	}
 }
 
 static void __init test_fill_set(void)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 02/12] bitmap: add bitmap_weight_from()
  2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
  2023-08-28 18:43 ` [PATCH 01/12] bitmap: add find_nth_bit_from() Yury Norov
@ 2023-08-28 18:43 ` Yury Norov
  2023-08-29  7:22   ` Rasmus Villemoes
  2023-08-28 18:43 ` [PATCH 03/12] bitmap: add test for bitmap_remap() Yury Norov
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 18+ messages in thread
From: Yury Norov @ 2023-08-28 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes

Add a _from flavor for bitmap_weight(). It's useful when traversing
bitmaps in a loop to not count bits from the beginning every time.

The test for bitmap_weight{,_from}() is added as well.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/bitmap.h | 14 ++++++++++++++
 lib/bitmap.c           | 25 +++++++++++++++++++++++++
 lib/test_bitmap.c      | 18 ++++++++++++++++++
 3 files changed, 57 insertions(+)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 692d0a1dbe92..6acbdd2abd0c 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -168,6 +168,8 @@ bool __bitmap_subset(const unsigned long *bitmap1,
 unsigned int __bitmap_weight(const unsigned long *bitmap, unsigned int nbits);
 unsigned int __bitmap_weight_and(const unsigned long *bitmap1,
 				 const unsigned long *bitmap2, unsigned int nbits);
+unsigned int __bitmap_weight_from(const unsigned long *bitmap,
+					unsigned int bits, unsigned int start);
 void __bitmap_set(unsigned long *map, unsigned int start, int len);
 void __bitmap_clear(unsigned long *map, unsigned int start, int len);
 
@@ -446,6 +448,18 @@ unsigned long bitmap_weight_and(const unsigned long *src1,
 	return __bitmap_weight_and(src1, src2, nbits);
 }
 
+static __always_inline
+unsigned int bitmap_weight_from(const unsigned long *src, unsigned int nbits, unsigned int start)
+{
+	if (unlikely(start >= nbits))
+		return 0;
+
+	if (small_const_nbits(nbits - start) && nbits / BITS_PER_LONG == start / BITS_PER_LONG)
+		return hweight_long(*src & GENMASK(nbits-1, start));
+
+	return __bitmap_weight_from(src, nbits, start);
+}
+
 static __always_inline void bitmap_set(unsigned long *map, unsigned int start,
 		unsigned int nbits)
 {
diff --git a/lib/bitmap.c b/lib/bitmap.c
index cf77d56cf223..65c64911c92f 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -345,6 +345,24 @@ EXPORT_SYMBOL(__bitmap_subset);
 	w;									\
 })
 
+#define BITMAP_WEIGHT_FROM(FETCH, bits, start)					\
+({										\
+	unsigned long __bits = (bits), val, idx, w = 0, __start = (start), mask;\
+										\
+	mask = BITMAP_FIRST_WORD_MASK(__start);					\
+	idx = __start / BITS_PER_LONG;						\
+										\
+	for (val = (FETCH) & mask; idx < __bits / BITS_PER_LONG; val = (FETCH))	{\
+		w += hweight_long(val);						\
+		idx++;								\
+	}									\
+										\
+	if (__bits % BITS_PER_LONG)						\
+		w += hweight_long((val) & BITMAP_LAST_WORD_MASK(__bits));	\
+										\
+	w;									\
+})
+
 unsigned int __bitmap_weight(const unsigned long *bitmap, unsigned int bits)
 {
 	return BITMAP_WEIGHT(bitmap[idx], bits);
@@ -358,6 +376,13 @@ unsigned int __bitmap_weight_and(const unsigned long *bitmap1,
 }
 EXPORT_SYMBOL(__bitmap_weight_and);
 
+unsigned int __bitmap_weight_from(const unsigned long *bitmap,
+					unsigned int bits, unsigned int start)
+{
+	return BITMAP_WEIGHT_FROM(bitmap[idx], bits, start);
+}
+EXPORT_SYMBOL(__bitmap_weight_from);
+
 void __bitmap_set(unsigned long *map, unsigned int start, int len)
 {
 	unsigned long *p = map + BIT_WORD(start);
diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 3248ed58a817..a5d823f7589d 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -361,6 +361,23 @@ static void __init test_bitmap_region(void)
 	expect_eq_uint(bitmap_weight(bmap, 1000), 0);
 }
 
+static void __init test_weight(void)
+{
+	DECLARE_BITMAP(bmap, 1024);
+	unsigned int idx, w;
+
+	for (idx = 0; idx < 1024; idx++)
+		__assign_bit(idx, bmap, idx);
+
+	w = bitmap_weight(bmap, 1024);
+	for (idx = 0; idx < 1024; idx++) {
+		unsigned int w1 = bitmap_weight(bmap, idx);
+		unsigned int w2 = bitmap_weight_from(bmap, 1024, idx);
+
+		expect_eq_uint(w1 + w2, w);
+	}
+}
+
 #define EXP2_IN_BITS	(sizeof(exp2) * 8)
 
 static void __init test_replace(void)
@@ -1260,6 +1277,7 @@ static void __init selftest(void)
 	test_copy();
 	test_bitmap_region();
 	test_replace();
+	test_weight();
 	test_bitmap_arr32();
 	test_bitmap_arr64();
 	test_bitmap_parse();
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 03/12] bitmap: add test for bitmap_remap()
  2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
  2023-08-28 18:43 ` [PATCH 01/12] bitmap: add find_nth_bit_from() Yury Norov
  2023-08-28 18:43 ` [PATCH 02/12] bitmap: add bitmap_weight_from() Yury Norov
@ 2023-08-28 18:43 ` Yury Norov
  2023-08-28 18:43 ` [PATCH 04/12] bitmap: add test for bitmap_bitremap() Yury Norov
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Yury Norov @ 2023-08-28 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes

Basic functional and performance tests for bitmap_remap(). 1000 bits
length is chosen for performance test because it's of the same order
as default value for MAX_NUMNODES in major distros like Ubuntu (1024).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/test_bitmap.c | 80 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index a5d823f7589d..e1c22d399f24 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -378,6 +378,85 @@ static void __init test_weight(void)
 	}
 }
 
+static void __init test_remap(void)
+{
+	DECLARE_BITMAP(dst, 8);
+
+	DECLARE_BITMAP(empty, 8) = { 0 };
+	DECLARE_BITMAP(src, 8) = { 0b00101010 };
+	DECLARE_BITMAP(old, 8) = { 0b00011100 };
+	DECLARE_BITMAP(new, 8) = { 0b00111000 };
+	DECLARE_BITMAP(exp0, 8) = { 0b00110010 };
+	DECLARE_BITMAP(exp1, 8) = { 0b00011010 };
+	DECLARE_BITMAP(exp2, 8) = { 0b10000010 };
+
+	DECLARE_BITMAP(perf_exp, 1000);
+	DECLARE_BITMAP(perf_dst, 1000);
+	DECLARE_BITMAP(perf_src, 1000);
+	DECLARE_BITMAP(perf_old, 1000);
+	DECLARE_BITMAP(perf_new, 1000);
+
+	unsigned int i;
+	ktime_t time;
+
+	bitmap_remap(dst, src, old, new, 8);
+	expect_eq_bitmap(exp0, dst, 8);
+
+	/*
+	 * When old mapping is the same as new, source bits are copied to dst.
+	 * Real code must use bitmap_copy() if it's known in advance.
+	 */
+	bitmap_remap(dst, src, old, old, 8);
+	expect_eq_bitmap(src, dst, 8);
+
+	bitmap_remap(dst, src, new, new, 8);
+	expect_eq_bitmap(src, dst, 8);
+
+	/*
+	 * When either old or new mappings are empty, source bits are copied to
+	 * dst. Real code must use bitmap_copy() if it's known in advance.
+	 */
+	bitmap_remap(dst, src, empty, new, 8);
+	expect_eq_bitmap(src, dst, 8);
+
+	bitmap_remap(dst, src, old, empty, 8);
+	expect_eq_bitmap(src, dst, 8);
+
+	bitmap_remap(dst, src, empty, empty, 8);
+	expect_eq_bitmap(src, dst, 8);
+
+	/* Set extra bit in old map to test carry logic */
+	set_bit(5, old);
+	bitmap_remap(dst, src, old, new, 8);
+	expect_eq_bitmap(exp1, dst, 8);
+
+	/* Map old bits to #7 */
+	bitmap_zero(new, 8);
+	set_bit(7, new);
+	bitmap_remap(dst, src, old, new, 8);
+	expect_eq_bitmap(exp2, dst, 8);
+
+	bitmap_fill(perf_src, 1000);
+	bitmap_set(perf_old, 0, 500);
+	bitmap_clear(perf_old, 500, 500);
+
+	for (i = 0; i < 1000; i += 20) {
+		bitmap_set(perf_new, i, 10);
+		bitmap_clear(perf_new, i + 10, 10);
+	}
+
+	bitmap_copy(perf_exp, perf_new, 500);
+	bitmap_set(perf_exp, 500, 500);
+
+	time = ktime_get();
+	bitmap_remap(perf_dst, perf_src, perf_old, perf_new, 1000);
+	time = ktime_get() - time;
+
+	expect_eq_bitmap(perf_exp, perf_dst, 1000);
+	pr_err("bitmap_remap:  %llu ns\n", time);
+
+}
+
 #define EXP2_IN_BITS	(sizeof(exp2) * 8)
 
 static void __init test_replace(void)
@@ -1278,6 +1357,7 @@ static void __init selftest(void)
 	test_bitmap_region();
 	test_replace();
 	test_weight();
+	test_remap();
 	test_bitmap_arr32();
 	test_bitmap_arr64();
 	test_bitmap_parse();
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 04/12] bitmap: add test for bitmap_bitremap()
  2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
                   ` (2 preceding siblings ...)
  2023-08-28 18:43 ` [PATCH 03/12] bitmap: add test for bitmap_remap() Yury Norov
@ 2023-08-28 18:43 ` Yury Norov
  2023-08-28 18:43 ` [PATCH 05/12] bitmap: update comment for bitmap_{bit,}remap() Yury Norov
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Yury Norov @ 2023-08-28 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes

Similarly to bitmap_remap, test bitmap_bitremap().

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/test_bitmap.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index e1c22d399f24..e9211f9a0e67 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -378,6 +378,19 @@ static void __init test_weight(void)
 	}
 }
 
+static __always_inline void __init __test_bitremap(unsigned long *dst, unsigned long *src,
+				unsigned long *old, unsigned long *new, unsigned long nbits)
+{
+	unsigned long oldbit, newbit;
+
+	bitmap_zero(dst, nbits);
+
+	for_each_set_bit(oldbit, src, nbits) {
+		newbit = bitmap_bitremap(oldbit, old, new, nbits);
+		__set_bit(newbit, dst);
+	}
+}
+
 static void __init test_remap(void)
 {
 	DECLARE_BITMAP(dst, 8);
@@ -402,6 +415,9 @@ static void __init test_remap(void)
 	bitmap_remap(dst, src, old, new, 8);
 	expect_eq_bitmap(exp0, dst, 8);
 
+	__test_bitremap(dst, src, old, new, 8);
+	expect_eq_bitmap(exp0, dst, 8);
+
 	/*
 	 * When old mapping is the same as new, source bits are copied to dst.
 	 * Real code must use bitmap_copy() if it's known in advance.
@@ -409,6 +425,9 @@ static void __init test_remap(void)
 	bitmap_remap(dst, src, old, old, 8);
 	expect_eq_bitmap(src, dst, 8);
 
+	__test_bitremap(dst, src, old, old, 8);
+	expect_eq_bitmap(src, dst, 8);
+
 	bitmap_remap(dst, src, new, new, 8);
 	expect_eq_bitmap(src, dst, 8);
 
@@ -419,23 +438,38 @@ static void __init test_remap(void)
 	bitmap_remap(dst, src, empty, new, 8);
 	expect_eq_bitmap(src, dst, 8);
 
+	__test_bitremap(dst, src, empty, new, 8);
+	expect_eq_bitmap(src, dst, 8);
+
 	bitmap_remap(dst, src, old, empty, 8);
 	expect_eq_bitmap(src, dst, 8);
 
+	__test_bitremap(dst, src, old, empty, 8);
+	expect_eq_bitmap(src, dst, 8);
+
 	bitmap_remap(dst, src, empty, empty, 8);
 	expect_eq_bitmap(src, dst, 8);
 
+	__test_bitremap(dst, src, empty, empty, 8);
+	expect_eq_bitmap(src, dst, 8);
+
 	/* Set extra bit in old map to test carry logic */
 	set_bit(5, old);
 	bitmap_remap(dst, src, old, new, 8);
 	expect_eq_bitmap(exp1, dst, 8);
 
+	__test_bitremap(dst, src, old, new, 8);
+	expect_eq_bitmap(exp1, dst, 8);
+
 	/* Map old bits to #7 */
 	bitmap_zero(new, 8);
 	set_bit(7, new);
 	bitmap_remap(dst, src, old, new, 8);
 	expect_eq_bitmap(exp2, dst, 8);
 
+	__test_bitremap(dst, src, old, new, 8);
+	expect_eq_bitmap(exp2, dst, 8);
+
 	bitmap_fill(perf_src, 1000);
 	bitmap_set(perf_old, 0, 500);
 	bitmap_clear(perf_old, 500, 500);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 05/12] bitmap: update comment for bitmap_{bit,}remap()
  2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
                   ` (3 preceding siblings ...)
  2023-08-28 18:43 ` [PATCH 04/12] bitmap: add test for bitmap_bitremap() Yury Norov
@ 2023-08-28 18:43 ` Yury Norov
  2023-08-28 18:43 ` [PATCH 06/12] bitmap: add small_cont_nbits() optimization for bitmap_remap() Yury Norov
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Yury Norov @ 2023-08-28 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes

Add an illustrated example for bitmap_remap(), and drop duplicated wording
in bitmap_bitremap(). This exact example is tested in lib/test_bitmap.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/bitmap.c | 54 +++++++++++++++++++++++++---------------------------
 1 file changed, 26 insertions(+), 28 deletions(-)

diff --git a/lib/bitmap.c b/lib/bitmap.c
index 65c64911c92f..30c375bffe8b 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -1002,10 +1002,30 @@ static int bitmap_pos_to_ord(const unsigned long *buf, unsigned int pos, unsigne
  *
  * Let @old and @new define a mapping of bit positions, such that
  * whatever position is held by the n-th set bit in @old is mapped
- * to the n-th set bit in @new.  In the more general case, allowing
- * for the possibility that the weight 'w' of @new is less than the
- * weight of @old, map the position of the n-th set bit in @old to
- * the position of the m-th set bit in @new, where m == n % w.
+ * to the n-th set bit in @new. For example lets say that @old has
+ * bits 2 through 4 set, and @new has bits 3 through 5 set:
+ *
+ *	old: 00011100
+ *	     |||///||
+ *	new: 00111000
+ *
+ * This defines the mapping of bit position 2 to 3, 3 to 4 and 4 to 5,
+ * and of all other bit positions unchanged. So if say @src comes into
+ * this routine with bits 1, 3 and 5 set, then @dst should leave with
+ * bits 1, 4 and 5 set:
+ *
+ *	src: 00101010
+ *	       v v v
+ *	old: 00011100
+ *	     |||///||
+ *	new: 00111000
+ *	       vv  v
+ *	dst: 00110010
+ *
+ * In the more general case, allowing for the possibility that the weight
+ * 'w' of @new is less than the weight of @old, map the position of the
+ * n-th set bit in @old to the position of the m-th set bit in @new, where
+ * m == n % w.
  *
  * If either of the @old and @new bitmaps are empty, or if @src and
  * @dst point to the same location, then this routine copies @src
@@ -1016,13 +1036,6 @@ static int bitmap_pos_to_ord(const unsigned long *buf, unsigned int pos, unsigne
  *
  * Apply the above specified mapping to @src, placing the result in
  * @dst, clearing any bits previously set in @dst.
- *
- * For example, lets say that @old has bits 4 through 7 set, and
- * @new has bits 12 through 15 set.  This defines the mapping of bit
- * position 4 to 12, 5 to 13, 6 to 14 and 7 to 15, and of all other
- * bit positions unchanged.  So if say @src comes into this routine
- * with bits 1, 5 and 7 set, then @dst should leave with bits 1,
- * 13 and 15 set.
  */
 void bitmap_remap(unsigned long *dst, const unsigned long *src,
 		const unsigned long *old, const unsigned long *new,
@@ -1053,24 +1066,9 @@ EXPORT_SYMBOL(bitmap_remap);
  *	@new: defines range of map
  *	@bits: number of bits in each of these bitmaps
  *
- * Let @old and @new define a mapping of bit positions, such that
- * whatever position is held by the n-th set bit in @old is mapped
- * to the n-th set bit in @new.  In the more general case, allowing
- * for the possibility that the weight 'w' of @new is less than the
- * weight of @old, map the position of the n-th set bit in @old to
- * the position of the m-th set bit in @new, where m == n % w.
- *
- * The positions of unset bits in @old are mapped to themselves
- * (the identity map).
- *
- * Apply the above specified mapping to bit position @oldbit, returning
- * the new bit position.
+ * A special case of bitmap_remap(), when a single bit remapping is needed.
  *
- * For example, lets say that @old has bits 4 through 7 set, and
- * @new has bits 12 through 15 set.  This defines the mapping of bit
- * position 4 to 12, 5 to 13, 6 to 14 and 7 to 15, and of all other
- * bit positions unchanged.  So if say @oldbit is 5, then this routine
- * returns 13.
+ * Returns: position of remapped bit
  */
 int bitmap_bitremap(int oldbit, const unsigned long *old,
 				const unsigned long *new, int bits)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 06/12] bitmap: add small_cont_nbits() optimization for bitmap_remap()
  2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
                   ` (4 preceding siblings ...)
  2023-08-28 18:43 ` [PATCH 05/12] bitmap: update comment for bitmap_{bit,}remap() Yury Norov
@ 2023-08-28 18:43 ` Yury Norov
  2023-08-28 18:43 ` [PATCH 07/12] bitmap: add small_const_nbits() optimization for bitmap_bitremap() Yury Norov
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Yury Norov @ 2023-08-28 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes

When nbits is less than BITS_PER_LONG, we can handle trivial cases
inline.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/bitmap.h | 66 ++++++++++++++++++++++++++++++++++++++++--
 lib/bitmap.c           | 49 ++-----------------------------
 2 files changed, 66 insertions(+), 49 deletions(-)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 6acbdd2abd0c..486451b80339 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -161,6 +161,8 @@ bool __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
 void __bitmap_replace(unsigned long *dst,
 		      const unsigned long *old, const unsigned long *new,
 		      const unsigned long *mask, unsigned int nbits);
+void __bitmap_remap(unsigned long *dst, const unsigned long *src,
+		const unsigned long *old, const unsigned long *new, unsigned int nbits);
 bool __bitmap_intersects(const unsigned long *bitmap1,
 			 const unsigned long *bitmap2, unsigned int nbits);
 bool __bitmap_subset(const unsigned long *bitmap1,
@@ -211,8 +213,7 @@ int bitmap_parselist(const char *buf, unsigned long *maskp,
 			int nmaskbits);
 int bitmap_parselist_user(const char __user *ubuf, unsigned int ulen,
 			unsigned long *dst, int nbits);
-void bitmap_remap(unsigned long *dst, const unsigned long *src,
-		const unsigned long *old, const unsigned long *new, unsigned int nbits);
+
 int bitmap_bitremap(int oldbit,
 		const unsigned long *old, const unsigned long *new, int bits);
 void bitmap_onto(unsigned long *dst, const unsigned long *orig,
@@ -671,6 +672,67 @@ static inline int bitmap_find_free_region(unsigned long *bitmap, unsigned int bi
 	return -ENOMEM;
 }
 
+/**
+ * bitmap_remap - Apply map defined by a pair of bitmaps to another bitmap
+ *	@dst: remapped result
+ *	@src: subset to be remapped
+ *	@old: defines domain of map
+ *	@new: defines range of map
+ *	@nbits: number of bits in each of these bitmaps
+ *
+ * Let @old and @new define a mapping of bit positions, such that
+ * whatever position is held by the n-th set bit in @old is mapped
+ * to the n-th set bit in @new. For example lets say that @old has
+ * bits 2 through 4 set, and @new has bits 3 through 5 set:
+ *
+ *	old: 00011100
+ *	     |||///||
+ *	new: 00111000
+ *
+ * This defines the mapping of bit position 2 to 3, 3 to 4 and 4 to 5,
+ * and of all other bit positions unchanged. So if say @src comes into
+ * this routine with bits 1, 3 and 5 set, then @dst should leave with
+ * bits 1, 4 and 5 set:
+ *
+ *	src: 00101010
+ *	       v v v
+ *	old: 00011100
+ *	     |||///||
+ *	new: 00111000
+ *	       vv  v
+ *	dst: 00110010
+ *
+ * In the more general case, allowing for the possibility that the weight
+ * 'w' of @new is less than the weight of @old, map the position of the
+ * n-th set bit in @old to the position of the m-th set bit in @new, where
+ * m == n % w.
+ *
+ * If either of the @old and @new bitmaps are empty, or if @src and
+ * @dst point to the same location, then this routine copies @src
+ * to @dst.
+ *
+ * The positions of unset bits in @old are mapped to themselves
+ * (the identity map).
+ *
+ * Apply the above specified mapping to @src, placing the result in
+ * @dst, clearing any bits previously set in @dst.
+ */
+static inline void bitmap_remap(unsigned long *dst, const unsigned long *src,
+		const unsigned long *old, const unsigned long *new, unsigned int nbits)
+{
+	if (small_const_nbits(nbits)) {
+		if ((*src & BITMAP_LAST_WORD_MASK(nbits)) == 0 ||
+		    (*old & BITMAP_LAST_WORD_MASK(nbits)) == 0 ||
+		    (*new & BITMAP_LAST_WORD_MASK(nbits)) == 0 ||
+		    ((*new ^ *old) & BITMAP_LAST_WORD_MASK(nbits)) == 0) {
+			*dst = *src & BITMAP_LAST_WORD_MASK(nbits);
+			return;
+		}
+	}
+
+	__bitmap_remap(dst, src, old, new, nbits);
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __LINUX_BITMAP_H */
diff --git a/lib/bitmap.c b/lib/bitmap.c
index 30c375bffe8b..2ac48d9bcbc0 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -992,52 +992,7 @@ static int bitmap_pos_to_ord(const unsigned long *buf, unsigned int pos, unsigne
 	return bitmap_weight(buf, pos);
 }
 
-/**
- * bitmap_remap - Apply map defined by a pair of bitmaps to another bitmap
- *	@dst: remapped result
- *	@src: subset to be remapped
- *	@old: defines domain of map
- *	@new: defines range of map
- *	@nbits: number of bits in each of these bitmaps
- *
- * Let @old and @new define a mapping of bit positions, such that
- * whatever position is held by the n-th set bit in @old is mapped
- * to the n-th set bit in @new. For example lets say that @old has
- * bits 2 through 4 set, and @new has bits 3 through 5 set:
- *
- *	old: 00011100
- *	     |||///||
- *	new: 00111000
- *
- * This defines the mapping of bit position 2 to 3, 3 to 4 and 4 to 5,
- * and of all other bit positions unchanged. So if say @src comes into
- * this routine with bits 1, 3 and 5 set, then @dst should leave with
- * bits 1, 4 and 5 set:
- *
- *	src: 00101010
- *	       v v v
- *	old: 00011100
- *	     |||///||
- *	new: 00111000
- *	       vv  v
- *	dst: 00110010
- *
- * In the more general case, allowing for the possibility that the weight
- * 'w' of @new is less than the weight of @old, map the position of the
- * n-th set bit in @old to the position of the m-th set bit in @new, where
- * m == n % w.
- *
- * If either of the @old and @new bitmaps are empty, or if @src and
- * @dst point to the same location, then this routine copies @src
- * to @dst.
- *
- * The positions of unset bits in @old are mapped to themselves
- * (the identity map).
- *
- * Apply the above specified mapping to @src, placing the result in
- * @dst, clearing any bits previously set in @dst.
- */
-void bitmap_remap(unsigned long *dst, const unsigned long *src,
+void __bitmap_remap(unsigned long *dst, const unsigned long *src,
 		const unsigned long *old, const unsigned long *new,
 		unsigned int nbits)
 {
@@ -1057,7 +1012,7 @@ void bitmap_remap(unsigned long *dst, const unsigned long *src,
 			set_bit(find_nth_bit(new, nbits, n % w), dst);
 	}
 }
-EXPORT_SYMBOL(bitmap_remap);
+EXPORT_SYMBOL(__bitmap_remap);
 
 /**
  * bitmap_bitremap - Apply map defined by a pair of bitmaps to a single bit
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 07/12] bitmap: add small_const_nbits() optimization for bitmap_bitremap()
  2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
                   ` (5 preceding siblings ...)
  2023-08-28 18:43 ` [PATCH 06/12] bitmap: add small_cont_nbits() optimization for bitmap_remap() Yury Norov
@ 2023-08-28 18:43 ` Yury Norov
  2023-08-28 18:43 ` [PATCH 08/12] bitmap: optiimze bitmap_bitremap() Yury Norov
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Yury Norov @ 2023-08-28 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes

When 'bits' is less than BITS_PER_LONG, we can remap a bit inline.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/bitmap.h | 36 ++++++++++++++++++++++++++++++++++--
 lib/bitmap.c           | 15 ++-------------
 2 files changed, 36 insertions(+), 15 deletions(-)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 486451b80339..da5030a99af5 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -163,6 +163,8 @@ void __bitmap_replace(unsigned long *dst,
 		      const unsigned long *mask, unsigned int nbits);
 void __bitmap_remap(unsigned long *dst, const unsigned long *src,
 		const unsigned long *old, const unsigned long *new, unsigned int nbits);
+int __bitmap_bitremap(int oldbit,
+		const unsigned long *old, const unsigned long *new, int nbits);
 bool __bitmap_intersects(const unsigned long *bitmap1,
 			 const unsigned long *bitmap2, unsigned int nbits);
 bool __bitmap_subset(const unsigned long *bitmap1,
@@ -214,8 +216,6 @@ int bitmap_parselist(const char *buf, unsigned long *maskp,
 int bitmap_parselist_user(const char __user *ubuf, unsigned int ulen,
 			unsigned long *dst, int nbits);
 
-int bitmap_bitremap(int oldbit,
-		const unsigned long *old, const unsigned long *new, int bits);
 void bitmap_onto(unsigned long *dst, const unsigned long *orig,
 		const unsigned long *relmap, unsigned int bits);
 void bitmap_fold(unsigned long *dst, const unsigned long *orig,
@@ -733,6 +733,38 @@ static inline void bitmap_remap(unsigned long *dst, const unsigned long *src,
 	__bitmap_remap(dst, src, old, new, nbits);
 }
 
+/**
+ * bitmap_bitremap - Apply map defined by a pair of bitmaps to a single bit
+ *	@oldbit: bit position to be mapped
+ *	@old: defines domain of map
+ *	@new: defines range of map
+ *	@nbits: number of bits in each of these bitmaps
+ *
+ * A special case of bitmap_remap(), when a single bit remapping is needed.
+ *
+ * Returns: position of remapped bit
+ */
+static inline int bitmap_bitremap(int oldbit,
+		const unsigned long *old, const unsigned long *new, int nbits)
+{
+	if (small_const_nbits(nbits)) {
+		unsigned int w, n;
+
+		if ((*old & BIT(oldbit)) == 0 ||
+		    (*old & BITMAP_LAST_WORD_MASK(nbits)) == 0 ||
+		    (*new & BITMAP_LAST_WORD_MASK(nbits)) == 0 ||
+		    ((*new ^ *old) & BITMAP_LAST_WORD_MASK(nbits)) == 0) {
+			return oldbit;
+		}
+
+		w = bitmap_weight(new, nbits);
+		n = bitmap_weight(old, oldbit);
+		return find_nth_bit(new, nbits, n % w);
+	}
+
+	return __bitmap_bitremap(oldbit, old, new, nbits);
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __LINUX_BITMAP_H */
diff --git a/lib/bitmap.c b/lib/bitmap.c
index 2ac48d9bcbc0..9ecdc74cb6b4 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -1014,18 +1014,7 @@ void __bitmap_remap(unsigned long *dst, const unsigned long *src,
 }
 EXPORT_SYMBOL(__bitmap_remap);
 
-/**
- * bitmap_bitremap - Apply map defined by a pair of bitmaps to a single bit
- *	@oldbit: bit position to be mapped
- *	@old: defines domain of map
- *	@new: defines range of map
- *	@bits: number of bits in each of these bitmaps
- *
- * A special case of bitmap_remap(), when a single bit remapping is needed.
- *
- * Returns: position of remapped bit
- */
-int bitmap_bitremap(int oldbit, const unsigned long *old,
+int __bitmap_bitremap(int oldbit, const unsigned long *old,
 				const unsigned long *new, int bits)
 {
 	int w = bitmap_weight(new, bits);
@@ -1035,7 +1024,7 @@ int bitmap_bitremap(int oldbit, const unsigned long *old,
 	else
 		return find_nth_bit(new, bits, n % w);
 }
-EXPORT_SYMBOL(bitmap_bitremap);
+EXPORT_SYMBOL(__bitmap_bitremap);
 
 #ifdef CONFIG_NUMA
 /**
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 08/12] bitmap: optiimze bitmap_bitremap()
  2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
                   ` (6 preceding siblings ...)
  2023-08-28 18:43 ` [PATCH 07/12] bitmap: add small_const_nbits() optimization for bitmap_bitremap() Yury Norov
@ 2023-08-28 18:43 ` Yury Norov
  2023-08-28 18:43 ` [PATCH 09/12] bitmap: optimize bitmap_remap() when 'new' is empty map Yury Norov
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Yury Norov @ 2023-08-28 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes

When 'new' map is empty, we can skip remapping entirely and return
the old bit.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/bitmap.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/lib/bitmap.c b/lib/bitmap.c
index 9ecdc74cb6b4..2e8deeb8bf99 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -1017,12 +1017,17 @@ EXPORT_SYMBOL(__bitmap_remap);
 int __bitmap_bitremap(int oldbit, const unsigned long *old,
 				const unsigned long *new, int bits)
 {
-	int w = bitmap_weight(new, bits);
-	int n = bitmap_pos_to_ord(old, oldbit, bits);
-	if (n < 0 || w == 0)
+	int w, n;
+
+	w = bitmap_weight(new, bits);
+	if (w == 0)
+		return oldbit;
+
+	n = bitmap_pos_to_ord(old, oldbit, bits);
+	if (n < 0)
 		return oldbit;
-	else
-		return find_nth_bit(new, bits, n % w);
+
+	return find_nth_bit(new, bits, n % w);
 }
 EXPORT_SYMBOL(__bitmap_bitremap);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 09/12] bitmap: optimize bitmap_remap() when 'new' is empty map
  2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
                   ` (7 preceding siblings ...)
  2023-08-28 18:43 ` [PATCH 08/12] bitmap: optiimze bitmap_bitremap() Yury Norov
@ 2023-08-28 18:43 ` Yury Norov
  2023-08-28 18:43 ` [PATCH 10/12] bitmap: separate handling of identity and remapping parts in bitmap_remap() Yury Norov
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Yury Norov @ 2023-08-28 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes

When 'new' map is empty, we can skip remapping entirely and replace it
with bitmap_copy().

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/bitmap.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/lib/bitmap.c b/lib/bitmap.c
index 2e8deeb8bf99..50385d61e6ea 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -1003,6 +1003,11 @@ void __bitmap_remap(unsigned long *dst, const unsigned long *src,
 	bitmap_zero(dst, nbits);
 
 	w = bitmap_weight(new, nbits);
+	if (w == 0) {
+		bitmap_copy(dst, src, nbits);
+		return;
+	}
+
 	for_each_set_bit(oldbit, src, nbits) {
 		int n = bitmap_pos_to_ord(old, oldbit, nbits);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 10/12] bitmap: separate handling of identity and remapping parts in bitmap_remap()
  2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
                   ` (8 preceding siblings ...)
  2023-08-28 18:43 ` [PATCH 09/12] bitmap: optimize bitmap_remap() when 'new' is empty map Yury Norov
@ 2023-08-28 18:43 ` Yury Norov
  2023-08-28 18:43 ` [PATCH 11/12] bitmap: defer calculating weight of 'new' " Yury Norov
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Yury Norov @ 2023-08-28 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes

For unset bits in 'old' map (identity mapping), 'src' bits must be copied
to 'dst'. Doing that separately from remapping in a for-loop has some
advantages:
 - implicitly initialize 'dst' without calling bitmap_zero();
 - optimize performance of handling identity parts, because per-word
   bitmap_andnot() is faster than per-bit set_bit() in a for-loop;
 - make inner part of the loop unconditional;
 - when 'old' map is empty, new logic simiply skips the for-loop part.

While here, replace set_bit() with a non-atomic version, because the whole
function is non-atomic anyways. If atomicity required, it should be handled
by user.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/bitmap.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/lib/bitmap.c b/lib/bitmap.c
index 50385d61e6ea..f62ea97e942c 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -996,11 +996,10 @@ void __bitmap_remap(unsigned long *dst, const unsigned long *src,
 		const unsigned long *old, const unsigned long *new,
 		unsigned int nbits)
 {
-	unsigned int oldbit, w;
+	unsigned int oldbit, w, n;
 
 	if (dst == src)		/* following doesn't handle inplace remaps */
 		return;
-	bitmap_zero(dst, nbits);
 
 	w = bitmap_weight(new, nbits);
 	if (w == 0) {
@@ -1008,13 +1007,13 @@ void __bitmap_remap(unsigned long *dst, const unsigned long *src,
 		return;
 	}
 
-	for_each_set_bit(oldbit, src, nbits) {
-		int n = bitmap_pos_to_ord(old, oldbit, nbits);
+	/* Identity part */
+	bitmap_andnot(dst, src, old, nbits);
 
-		if (n < 0 || w == 0)
-			set_bit(oldbit, dst);	/* identity map */
-		else
-			set_bit(find_nth_bit(new, nbits, n % w), dst);
+	/* Remapping part */
+	for_each_and_bit(oldbit, src, old, nbits) {
+		n = bitmap_weight(old, oldbit);
+		__set_bit(find_nth_bit(new, nbits, n % w), dst);
 	}
 }
 EXPORT_SYMBOL(__bitmap_remap);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 11/12] bitmap: defer calculating weight of 'new' in bitmap_remap()
  2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
                   ` (9 preceding siblings ...)
  2023-08-28 18:43 ` [PATCH 10/12] bitmap: separate handling of identity and remapping parts in bitmap_remap() Yury Norov
@ 2023-08-28 18:43 ` Yury Norov
  2023-08-28 18:43 ` [PATCH 12/12] bitmap: don't count bits from the beginning " Yury Norov
  2023-08-29  7:33 ` [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Rasmus Villemoes
  12 siblings, 0 replies; 18+ messages in thread
From: Yury Norov @ 2023-08-28 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes

When 'old' map is empty, we don't need to calculate weight of 'new' map,
because all 'src' bits are simply copied to dst.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/bitmap.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/lib/bitmap.c b/lib/bitmap.c
index f62ea97e942c..1fca60d54cb4 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -996,23 +996,24 @@ void __bitmap_remap(unsigned long *dst, const unsigned long *src,
 		const unsigned long *old, const unsigned long *new,
 		unsigned int nbits)
 {
-	unsigned int oldbit, w, n;
+	unsigned int oldbit, w = 0, n;
 
 	if (dst == src)		/* following doesn't handle inplace remaps */
 		return;
 
-	w = bitmap_weight(new, nbits);
-	if (w == 0) {
-		bitmap_copy(dst, src, nbits);
-		return;
-	}
-
 	/* Identity part */
 	bitmap_andnot(dst, src, old, nbits);
 
 	/* Remapping part */
 	for_each_and_bit(oldbit, src, old, nbits) {
 		n = bitmap_weight(old, oldbit);
+		if (w == 0) { /* if not initialized */
+			w = bitmap_weight(new, nbits);
+			if (w == 0) { /* if empty */
+				bitmap_copy(dst, src, nbits);
+				return;
+			}
+		}
 		__set_bit(find_nth_bit(new, nbits, n % w), dst);
 	}
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 12/12] bitmap: don't count bits from the beginning in bitmap_remap()
  2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
                   ` (10 preceding siblings ...)
  2023-08-28 18:43 ` [PATCH 11/12] bitmap: defer calculating weight of 'new' " Yury Norov
@ 2023-08-28 18:43 ` Yury Norov
  2023-08-29  7:33 ` [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Rasmus Villemoes
  12 siblings, 0 replies; 18+ messages in thread
From: Yury Norov @ 2023-08-28 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes

In the remapping part, we count bits in 'old' and 'new' masks from the
beginning for every bit. Complexity of this approach is O(N^2). We can
do it better in O(N), if switch bitmap_remap() to using _from versions
of bitmap_weight() and find_nth_bit().

On kvm/x86_64 it works 9x faster than the current version for the 1000-bit
bitmap in lib/test_bitmap. Because popular distros like Ubuntu has 1024-bit
nodemasks, ~10x performance improvement is expected where bitmap_remap()
is used.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/bitmap.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/lib/bitmap.c b/lib/bitmap.c
index 1fca60d54cb4..8bca6bcf2bc5 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -996,7 +996,8 @@ void __bitmap_remap(unsigned long *dst, const unsigned long *src,
 		const unsigned long *old, const unsigned long *new,
 		unsigned int nbits)
 {
-	unsigned int oldbit, w = 0, n;
+	unsigned int newbit, oldbit, w = 0, n;
+	unsigned int _oldbit, _newbit;
 
 	if (dst == src)		/* following doesn't handle inplace remaps */
 		return;
@@ -1005,8 +1006,17 @@ void __bitmap_remap(unsigned long *dst, const unsigned long *src,
 	bitmap_andnot(dst, src, old, nbits);
 
 	/* Remapping part */
+	_oldbit = _newbit = 0;
 	for_each_and_bit(oldbit, src, old, nbits) {
-		n = bitmap_weight(old, oldbit);
+		n = bitmap_weight_from(old, oldbit, _oldbit);
+		newbit = find_nth_bit_from(new, nbits, _newbit, n);
+		if (newbit < nbits)
+			goto set_bit;
+
+		/*
+		 * If we're out of bits in 'new' map, wrap
+		 * around and start from the beginning.
+		 */
 		if (w == 0) { /* if not initialized */
 			w = bitmap_weight(new, nbits);
 			if (w == 0) { /* if empty */
@@ -1014,7 +1024,13 @@ void __bitmap_remap(unsigned long *dst, const unsigned long *src,
 				return;
 			}
 		}
-		__set_bit(find_nth_bit(new, nbits, n % w), dst);
+
+		n -= bitmap_weight_from(new, nbits, _newbit);
+		newbit = find_nth_bit(new, nbits, n % w);
+set_bit:
+		__set_bit(newbit, dst);
+		_oldbit = oldbit;
+		_newbit = newbit;
 	}
 }
 EXPORT_SYMBOL(__bitmap_remap);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 02/12] bitmap: add bitmap_weight_from()
  2023-08-28 18:43 ` [PATCH 02/12] bitmap: add bitmap_weight_from() Yury Norov
@ 2023-08-29  7:22   ` Rasmus Villemoes
  0 siblings, 0 replies; 18+ messages in thread
From: Rasmus Villemoes @ 2023-08-29  7:22 UTC (permalink / raw)
  To: Yury Norov, linux-kernel; +Cc: Andy Shevchenko

On 28/08/2023 20.43, Yury Norov wrote:
> Add a _from flavor for bitmap_weight(). It's useful when traversing
> bitmaps in a loop to not count bits from the beginning every time.
> 
> The test for bitmap_weight{,_from}() is added as well.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  include/linux/bitmap.h | 14 ++++++++++++++
>  lib/bitmap.c           | 25 +++++++++++++++++++++++++
>  lib/test_bitmap.c      | 18 ++++++++++++++++++
>  3 files changed, 57 insertions(+)
> 
> diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> index 692d0a1dbe92..6acbdd2abd0c 100644
> --- a/include/linux/bitmap.h
> +++ b/include/linux/bitmap.h
> @@ -168,6 +168,8 @@ bool __bitmap_subset(const unsigned long *bitmap1,
>  unsigned int __bitmap_weight(const unsigned long *bitmap, unsigned int nbits);
>  unsigned int __bitmap_weight_and(const unsigned long *bitmap1,
>  				 const unsigned long *bitmap2, unsigned int nbits);
> +unsigned int __bitmap_weight_from(const unsigned long *bitmap,
> +					unsigned int bits, unsigned int start);
>  void __bitmap_set(unsigned long *map, unsigned int start, int len);
>  void __bitmap_clear(unsigned long *map, unsigned int start, int len);
>  
> @@ -446,6 +448,18 @@ unsigned long bitmap_weight_and(const unsigned long *src1,
>  	return __bitmap_weight_and(src1, src2, nbits);
>  }
>  
> +static __always_inline
> +unsigned int bitmap_weight_from(const unsigned long *src, unsigned int nbits, unsigned int start)
> +{
> +	if (unlikely(start >= nbits))
> +		return 0;
> +
> +	if (small_const_nbits(nbits - start) && nbits / BITS_PER_LONG == start / BITS_PER_LONG)
> +		return hweight_long(*src & GENMASK(nbits-1, start));

This must be buggy. If I pass compile-time constants nbits==96 and
start==64, the whole if() is true, and we call GENMASK with total
garbage arguments, and access src[0] and not src[1] as that start-value
would suggest.

Don't optimize for irrelevant cases. The expected use of this function
must be that start is some runtime thing. So just make the whole if()
just test whether nbits is small_const, and if so, I think the
hweight_long() line is actually right as-is (though I can never remember
GENMASK argument order).
>  
> +#define BITMAP_WEIGHT_FROM(FETCH, bits, start)					\
> +({										\
> +	unsigned long __bits = (bits), val, idx, w = 0, __start = (start), mask;\
> +										\
> +	mask = BITMAP_FIRST_WORD_MASK(__start);					\
> +	idx = __start / BITS_PER_LONG;						\
> +										\
> +	for (val = (FETCH) & mask; idx < __bits / BITS_PER_LONG; val = (FETCH))	{\
> +		w += hweight_long(val);						\
> +		idx++;								\
> +	}									\
> +										\
> +	if (__bits % BITS_PER_LONG)						\
> +		w += hweight_long((val) & BITMAP_LAST_WORD_MASK(__bits));	\
> +										\
> +	w;									\
> +})
>

Why does the entire function body need to be an unholy statement expression?

Rasmus


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 00/12] bitmap: rework bitmap_{bit,}remap()
  2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
                   ` (11 preceding siblings ...)
  2023-08-28 18:43 ` [PATCH 12/12] bitmap: don't count bits from the beginning " Yury Norov
@ 2023-08-29  7:33 ` Rasmus Villemoes
  2023-08-29 13:38   ` Andy Shevchenko
  12 siblings, 1 reply; 18+ messages in thread
From: Rasmus Villemoes @ 2023-08-29  7:33 UTC (permalink / raw)
  To: Yury Norov, linux-kernel; +Cc: Andy Shevchenko

On 28/08/2023 20.43, Yury Norov wrote:
> This series adds a test, const-time optimizaton and fixes O(N^2)
> complexity problem for the functions. It's based on discussion in
> bitmap: optimize bitmap_remap() series [1], but there's much more work
> here, so I decided to give it a separete run, and don't name it as v2.
> 
> bitmap_remap() API has just one user in generic code, and few more in
> drivers, so this may look like an overkill. But the work gives ~10x
> better performance for a 1000-bit bitmaps, which is typical for nodemasks
> in major distros like Ubuntu.

Can you find just _one_ project on Debian Code Search or elsewhere that
actually uses mbind(2), that could possibly ever trigger the use of that
bitmap_remap stuff? Also, the bitmap may be order 10, but that's just
because the kitchen sink distros are silly, real machines have nowhere
near that number of nodes, so even if mbind is used, the bitmaps
involved will never actually have anything beyond index ~64.

I think this is all total overkill for non-existing problems, and when
it takes me 20 seconds to find the first bug, I really don't think it's
worth the churn. I'm not giving a thorough review on the rest of the
series, nor commenting on followups.

Rasmus


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 00/12] bitmap: rework bitmap_{bit,}remap()
  2023-08-29  7:33 ` [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Rasmus Villemoes
@ 2023-08-29 13:38   ` Andy Shevchenko
  2023-08-29 13:50     ` Yury Norov
  0 siblings, 1 reply; 18+ messages in thread
From: Andy Shevchenko @ 2023-08-29 13:38 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: Yury Norov, linux-kernel

On Tue, Aug 29, 2023 at 09:33:29AM +0200, Rasmus Villemoes wrote:
> On 28/08/2023 20.43, Yury Norov wrote:
> > This series adds a test, const-time optimizaton and fixes O(N^2)
> > complexity problem for the functions. It's based on discussion in
> > bitmap: optimize bitmap_remap() series [1], but there's much more work
> > here, so I decided to give it a separete run, and don't name it as v2.
> > 
> > bitmap_remap() API has just one user in generic code, and few more in
> > drivers, so this may look like an overkill. But the work gives ~10x
> > better performance for a 1000-bit bitmaps, which is typical for nodemasks
> > in major distros like Ubuntu.
> 
> Can you find just _one_ project on Debian Code Search or elsewhere that
> actually uses mbind(2), that could possibly ever trigger the use of that
> bitmap_remap stuff? Also, the bitmap may be order 10, but that's just
> because the kitchen sink distros are silly, real machines have nowhere
> near that number of nodes, so even if mbind is used, the bitmaps
> involved will never actually have anything beyond index ~64.
> 
> I think this is all total overkill for non-existing problems, and when
> it takes me 20 seconds to find the first bug, I really don't think it's
> worth the churn. I'm not giving a thorough review on the rest of the
> series, nor commenting on followups.

I posted one patch to replace these APIs with something else, more particular
for GPIO case(s). Have you chance to look at that? With that taking in, I'm
fully agree on the above statement (as we lose the user of this complicated
thingy which is a niche of the NUMA as you mentioned already).

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 00/12] bitmap: rework bitmap_{bit,}remap()
  2023-08-29 13:38   ` Andy Shevchenko
@ 2023-08-29 13:50     ` Yury Norov
  2023-08-29 14:56       ` Andy Shevchenko
  0 siblings, 1 reply; 18+ messages in thread
From: Yury Norov @ 2023-08-29 13:50 UTC (permalink / raw)
  To: Andy Shevchenko; +Cc: Rasmus Villemoes, linux-kernel

On Tue, Aug 29, 2023 at 6:39 AM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Tue, Aug 29, 2023 at 09:33:29AM +0200, Rasmus Villemoes wrote:
> > On 28/08/2023 20.43, Yury Norov wrote:
> > > This series adds a test, const-time optimizaton and fixes O(N^2)
> > > complexity problem for the functions. It's based on discussion in
> > > bitmap: optimize bitmap_remap() series [1], but there's much more work
> > > here, so I decided to give it a separete run, and don't name it as v2.
> > >
> > > bitmap_remap() API has just one user in generic code, and few more in
> > > drivers, so this may look like an overkill. But the work gives ~10x
> > > better performance for a 1000-bit bitmaps, which is typical for nodemasks
> > > in major distros like Ubuntu.
> >
> > Can you find just _one_ project on Debian Code Search or elsewhere that
> > actually uses mbind(2), that could possibly ever trigger the use of that
> > bitmap_remap stuff? Also, the bitmap may be order 10, but that's just
> > because the kitchen sink distros are silly, real machines have nowhere
> > near that number of nodes, so even if mbind is used, the bitmaps
> > involved will never actually have anything beyond index ~64.
> >
> > I think this is all total overkill for non-existing problems, and when
> > it takes me 20 seconds to find the first bug, I really don't think it's
> > worth the churn. I'm not giving a thorough review on the rest of the
> > series, nor commenting on followups.
>
> I posted one patch to replace these APIs with something else, more particular
> for GPIO case(s). Have you chance to look at that? With that taking in, I'm
> fully agree on the above statement (as we lose the user of this complicated
> thingy which is a niche of the NUMA as you mentioned already).

I saw the code in a comment in some thread but not as a separate patch, and
AFAIK you said it's a work-in-progress.

Sorry if I missed your submission. Can you please send the patch or point me to
the previous submission?

Also, if after your change bitmap_remap would become a NUMA-specific, should
protect it with NUMA guards?

Thanks,
Yury

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 00/12] bitmap: rework bitmap_{bit,}remap()
  2023-08-29 13:50     ` Yury Norov
@ 2023-08-29 14:56       ` Andy Shevchenko
  0 siblings, 0 replies; 18+ messages in thread
From: Andy Shevchenko @ 2023-08-29 14:56 UTC (permalink / raw)
  To: Yury Norov; +Cc: Rasmus Villemoes, linux-kernel

On Tue, Aug 29, 2023 at 06:50:08AM -0700, Yury Norov wrote:
> On Tue, Aug 29, 2023 at 6:39 AM Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> wrote:
> > On Tue, Aug 29, 2023 at 09:33:29AM +0200, Rasmus Villemoes wrote:

...

> > I posted one patch to replace these APIs with something else, more particular
> > for GPIO case(s). Have you chance to look at that? With that taking in, I'm
> > fully agree on the above statement (as we lose the user of this complicated
> > thingy which is a niche of the NUMA as you mentioned already).
> 
> I saw the code in a comment in some thread but not as a separate patch, and
> AFAIK you said it's a work-in-progress.

The patch itself is self-contained, the only problem it has no users as is.
The work-in-progress for the test cases, but before I continue with that I want
to hear if it's accepted approach.

> Sorry if I missed your submission. Can you please send the patch or point me to
> the previous submission?
> 
> Also, if after your change bitmap_remap would become a NUMA-specific, should
> protect it with NUMA guards?

Rasmus' idea was to move that completely out of the bitmap namespace and scope.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-08-29 14:57 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-28 18:43 [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Yury Norov
2023-08-28 18:43 ` [PATCH 01/12] bitmap: add find_nth_bit_from() Yury Norov
2023-08-28 18:43 ` [PATCH 02/12] bitmap: add bitmap_weight_from() Yury Norov
2023-08-29  7:22   ` Rasmus Villemoes
2023-08-28 18:43 ` [PATCH 03/12] bitmap: add test for bitmap_remap() Yury Norov
2023-08-28 18:43 ` [PATCH 04/12] bitmap: add test for bitmap_bitremap() Yury Norov
2023-08-28 18:43 ` [PATCH 05/12] bitmap: update comment for bitmap_{bit,}remap() Yury Norov
2023-08-28 18:43 ` [PATCH 06/12] bitmap: add small_cont_nbits() optimization for bitmap_remap() Yury Norov
2023-08-28 18:43 ` [PATCH 07/12] bitmap: add small_const_nbits() optimization for bitmap_bitremap() Yury Norov
2023-08-28 18:43 ` [PATCH 08/12] bitmap: optiimze bitmap_bitremap() Yury Norov
2023-08-28 18:43 ` [PATCH 09/12] bitmap: optimize bitmap_remap() when 'new' is empty map Yury Norov
2023-08-28 18:43 ` [PATCH 10/12] bitmap: separate handling of identity and remapping parts in bitmap_remap() Yury Norov
2023-08-28 18:43 ` [PATCH 11/12] bitmap: defer calculating weight of 'new' " Yury Norov
2023-08-28 18:43 ` [PATCH 12/12] bitmap: don't count bits from the beginning " Yury Norov
2023-08-29  7:33 ` [PATCH 00/12] bitmap: rework bitmap_{bit,}remap() Rasmus Villemoes
2023-08-29 13:38   ` Andy Shevchenko
2023-08-29 13:50     ` Yury Norov
2023-08-29 14:56       ` Andy Shevchenko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.