* [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations
@ 2024-10-20 4:01 Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 01/10] lib/min_heap: Introduce non-inline versions of min heap API functions Kuan-Wei Chiu
` (11 more replies)
0 siblings, 12 replies; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-10-20 4:01 UTC (permalink / raw)
To: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc, Kuan-Wei Chiu
Add non-inline versions of the min heap API functions in lib/min_heap.c
and updates all users outside of kernel/events/core.c to use these
non-inline versions. To mitigate the performance impact of indirect
function calls caused by the non-inline versions of the swap and
compare functions, a builtin swap has been introduced that swaps
elements based on their size. Additionally, it micro-optimizes the
efficiency of the min heap by pre-scaling the counter, following the
same approach as in lib/sort.c. Documentation for the min heap API has
also been added to the core-api section.
Regards,
Kuan-Wei
---
Changes in v2:
- Added a builtin swap to reduce the performance impact of indirect
function calls.
- Cleaned up duplicate min_heap_callbacks in bcachefs.
- Wrapped documentation at 80 columns.
- Updated Example Usage.
- Included documentation explaining that NULL can be passed to the
swp in min_heap_callbacks to use the builtin swap.
v1: https://lore.kernel.org/lkml/20241013184703.659652-1-visitorckw@gmail.com
Kuan-Wei Chiu (10):
lib/min_heap: Introduce non-inline versions of min heap API functions
lib min_heap: Optimize min heap by prescaling counters for better
performance
lib min_heap: Avoid indirect function call by providing default swap
lib/test_min_heap: Update min_heap_callbacks to use default builtin
swap
perf/core: Update min_heap_callbacks to use default builtin swap
dm vdo: Update min_heap_callbacks to use default builtin swap
bcache: Update min_heap_callbacks to use default builtin swap
bcachefs: Clean up duplicate min_heap_callbacks declarations
bcachefs: Update min_heap_callbacks to use default builtin swap
Documentation/core-api: Add min heap API introduction
Documentation/core-api/index.rst | 1 +
Documentation/core-api/min_heap.rst | 300 +++++++++++++++++++++++
drivers/md/bcache/Kconfig | 1 +
drivers/md/bcache/alloc.c | 11 +-
drivers/md/bcache/bset.c | 14 +-
drivers/md/bcache/extents.c | 10 +-
drivers/md/bcache/movinggc.c | 10 +-
drivers/md/dm-vdo/Kconfig | 1 +
drivers/md/dm-vdo/repair.c | 2 +-
drivers/md/dm-vdo/slab-depot.c | 10 +-
fs/bcachefs/Kconfig | 1 +
fs/bcachefs/clock.c | 25 +-
fs/bcachefs/ec.c | 19 +-
include/linux/min_heap.h | 357 ++++++++++++++++++++++------
kernel/events/core.c | 15 +-
lib/Kconfig | 3 +
lib/Kconfig.debug | 1 +
lib/Makefile | 1 +
lib/min_heap.c | 70 ++++++
lib/test_min_heap.c | 16 +-
20 files changed, 694 insertions(+), 174 deletions(-)
create mode 100644 Documentation/core-api/min_heap.rst
create mode 100644 lib/min_heap.c
--
2.34.1
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 01/10] lib/min_heap: Introduce non-inline versions of min heap API functions
2024-10-20 4:01 [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Kuan-Wei Chiu
@ 2024-10-20 4:01 ` Kuan-Wei Chiu
2024-11-26 13:27 ` Geert Uytterhoeven
2024-10-20 4:01 ` [PATCH v2 02/10] lib min_heap: Optimize min heap by prescaling counters for better performance Kuan-Wei Chiu
` (10 subsequent siblings)
11 siblings, 1 reply; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-10-20 4:01 UTC (permalink / raw)
To: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc, Kuan-Wei Chiu
All current min heap API functions are marked with '__always_inline'.
However, as the number of users increases, inlining these functions
everywhere leads to a increase in kernel size.
In performance-critical paths, such as when perf events are enabled and
min heap functions are called on every context switch, it is important
to retain the inline versions for optimal performance. To balance this,
the original inline functions are kept, and additional non-inline
versions of the functions have been added in lib/min_heap.c.
Link: https://lore.kernel.org/20240522161048.8d8bbc7b153b4ecd92c50666@linux-foundation.org
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
Code size for bcache:
Before:
text data bss dec hex filename
7257 302 0 7559 1d87 ./drivers/md/bcache/alloc.o
10786 515 0 11301 2c25 ./drivers/md/bcache/bset.o
33246 1596 8 34850 8822 ./drivers/md/bcache/btree.o
7195 256 0 7451 1d1b ./drivers/md/bcache/extents.o
3524 174 0 3698 e72 ./drivers/md/bcache/movinggc.o
17424 2816 0 20240 4f10 ./drivers/md/bcache/sysfs.o
8994 328 0 9322 246a ./drivers/md/bcache/writeback.o
After:
text data bss dec hex filename
6339 300 0 6639 19ef ./drivers/md/bcache/alloc.o
10428 537 0 10965 2ad5 ./drivers/md/bcache/bset.o
33134 1596 8 34738 87b2 ./drivers/md/bcache/btree.o
6619 264 0 6883 1ae3 ./drivers/md/bcache/extents.o
2958 152 0 3110 c26 ./drivers/md/bcache/movinggc.o
17408 2816 0 20224 4f00 ./drivers/md/bcache/sysfs.o
8962 328 0 9290 244a ./drivers/md/bcache/writeback.o
Code size for bcachefs:
Before:
text data bss dec hex filename
2286 155 0 2441 989 ./fs/bcachefs/clock.o
41481 1634 20 43135 a87f ./fs/bcachefs/ec.o
After:
text data bss dec hex filename
1928 132 0 2060 80c ./fs/bcachefs/clock.o
40259 1624 20 41903 a3af ./fs/bcachefs/ec.o
Code size for dm-vdo:
Before:
text data bss dec hex filename
14047 264 0 14311 37e7 ./drivers/md/dm-vdo/repair.o
37432 944 0 38376 95e8 ./drivers/md/dm-vdo/slab-depot.o
After:
text data bss dec hex filename
13697 264 0 13961 3689 ./drivers/md/dm-vdo/repair.o
37074 960 0 38034 9492 ./drivers/md/dm-vdo/slab-depot.o
Code size for test_min_heap:
Before:
text data bss dec hex filename
5499 171 0 5670 1626 ./lib/test_min_heap.o
After:
text data bss dec hex filename
2581 96 0 2677 a75 ./lib/test_min_heap.o
drivers/md/bcache/Kconfig | 1 +
drivers/md/dm-vdo/Kconfig | 1 +
fs/bcachefs/Kconfig | 1 +
include/linux/min_heap.h | 129 +++++++++++++++++++++++++-------------
kernel/events/core.c | 6 +-
lib/Kconfig | 3 +
lib/Kconfig.debug | 1 +
lib/Makefile | 1 +
lib/min_heap.c | 70 +++++++++++++++++++++
9 files changed, 167 insertions(+), 46 deletions(-)
create mode 100644 lib/min_heap.c
diff --git a/drivers/md/bcache/Kconfig b/drivers/md/bcache/Kconfig
index b2d10063d35f..d4697e79d5a3 100644
--- a/drivers/md/bcache/Kconfig
+++ b/drivers/md/bcache/Kconfig
@@ -5,6 +5,7 @@ config BCACHE
select BLOCK_HOLDER_DEPRECATED if SYSFS
select CRC64
select CLOSURES
+ select MIN_HEAP
help
Allows a block device to be used as cache for other devices; uses
a btree for indexing and the layout is optimized for SSDs.
diff --git a/drivers/md/dm-vdo/Kconfig b/drivers/md/dm-vdo/Kconfig
index 111ecd2c2a24..2400b2bc4bc7 100644
--- a/drivers/md/dm-vdo/Kconfig
+++ b/drivers/md/dm-vdo/Kconfig
@@ -7,6 +7,7 @@ config DM_VDO
select DM_BUFIO
select LZ4_COMPRESS
select LZ4_DECOMPRESS
+ select MIN_HEAP
help
This device mapper target presents a block device with
deduplication, compression and thin-provisioning.
diff --git a/fs/bcachefs/Kconfig b/fs/bcachefs/Kconfig
index 5bac803ea367..ab6c95b895b3 100644
--- a/fs/bcachefs/Kconfig
+++ b/fs/bcachefs/Kconfig
@@ -24,6 +24,7 @@ config BCACHEFS_FS
select XXHASH
select SRCU
select SYMBOLIC_ERRNAME
+ select MIN_HEAP
help
The bcachefs filesystem - a modern, copy on write filesystem, with
support for multiple devices, compression, checksumming, etc.
diff --git a/include/linux/min_heap.h b/include/linux/min_heap.h
index 43a7b9dcf15e..0abb21173979 100644
--- a/include/linux/min_heap.h
+++ b/include/linux/min_heap.h
@@ -40,7 +40,7 @@ struct min_heap_callbacks {
/* Initialize a min-heap. */
static __always_inline
-void __min_heap_init(min_heap_char *heap, void *data, int size)
+void __min_heap_init_inline(min_heap_char *heap, void *data, int size)
{
heap->nr = 0;
heap->size = size;
@@ -50,33 +50,33 @@ void __min_heap_init(min_heap_char *heap, void *data, int size)
heap->data = heap->preallocated;
}
-#define min_heap_init(_heap, _data, _size) \
- __min_heap_init((min_heap_char *)_heap, _data, _size)
+#define min_heap_init_inline(_heap, _data, _size) \
+ __min_heap_init_inline((min_heap_char *)_heap, _data, _size)
/* Get the minimum element from the heap. */
static __always_inline
-void *__min_heap_peek(struct min_heap_char *heap)
+void *__min_heap_peek_inline(struct min_heap_char *heap)
{
return heap->nr ? heap->data : NULL;
}
-#define min_heap_peek(_heap) \
- (__minheap_cast(_heap) __min_heap_peek((min_heap_char *)_heap))
+#define min_heap_peek_inline(_heap) \
+ (__minheap_cast(_heap) __min_heap_peek_inline((min_heap_char *)_heap))
/* Check if the heap is full. */
static __always_inline
-bool __min_heap_full(min_heap_char *heap)
+bool __min_heap_full_inline(min_heap_char *heap)
{
return heap->nr == heap->size;
}
-#define min_heap_full(_heap) \
- __min_heap_full((min_heap_char *)_heap)
+#define min_heap_full_inline(_heap) \
+ __min_heap_full_inline((min_heap_char *)_heap)
/* Sift the element at pos down the heap. */
static __always_inline
-void __min_heap_sift_down(min_heap_char *heap, int pos, size_t elem_size,
- const struct min_heap_callbacks *func, void *args)
+void __min_heap_sift_down_inline(min_heap_char *heap, int pos, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args)
{
void *left, *right;
void *data = heap->data;
@@ -108,13 +108,14 @@ void __min_heap_sift_down(min_heap_char *heap, int pos, size_t elem_size,
}
}
-#define min_heap_sift_down(_heap, _pos, _func, _args) \
- __min_heap_sift_down((min_heap_char *)_heap, _pos, __minheap_obj_size(_heap), _func, _args)
+#define min_heap_sift_down_inline(_heap, _pos, _func, _args) \
+ __min_heap_sift_down_inline((min_heap_char *)_heap, _pos, __minheap_obj_size(_heap), \
+ _func, _args)
/* Sift up ith element from the heap, O(log2(nr)). */
static __always_inline
-void __min_heap_sift_up(min_heap_char *heap, size_t elem_size, size_t idx,
- const struct min_heap_callbacks *func, void *args)
+void __min_heap_sift_up_inline(min_heap_char *heap, size_t elem_size, size_t idx,
+ const struct min_heap_callbacks *func, void *args)
{
void *data = heap->data;
size_t parent;
@@ -128,27 +129,28 @@ void __min_heap_sift_up(min_heap_char *heap, size_t elem_size, size_t idx,
}
}
-#define min_heap_sift_up(_heap, _idx, _func, _args) \
- __min_heap_sift_up((min_heap_char *)_heap, __minheap_obj_size(_heap), _idx, _func, _args)
+#define min_heap_sift_up_inline(_heap, _idx, _func, _args) \
+ __min_heap_sift_up_inline((min_heap_char *)_heap, __minheap_obj_size(_heap), _idx, \
+ _func, _args)
/* Floyd's approach to heapification that is O(nr). */
static __always_inline
-void __min_heapify_all(min_heap_char *heap, size_t elem_size,
- const struct min_heap_callbacks *func, void *args)
+void __min_heapify_all_inline(min_heap_char *heap, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args)
{
int i;
for (i = heap->nr / 2 - 1; i >= 0; i--)
- __min_heap_sift_down(heap, i, elem_size, func, args);
+ __min_heap_sift_down_inline(heap, i, elem_size, func, args);
}
-#define min_heapify_all(_heap, _func, _args) \
- __min_heapify_all((min_heap_char *)_heap, __minheap_obj_size(_heap), _func, _args)
+#define min_heapify_all_inline(_heap, _func, _args) \
+ __min_heapify_all_inline((min_heap_char *)_heap, __minheap_obj_size(_heap), _func, _args)
/* Remove minimum element from the heap, O(log2(nr)). */
static __always_inline
-bool __min_heap_pop(min_heap_char *heap, size_t elem_size,
- const struct min_heap_callbacks *func, void *args)
+bool __min_heap_pop_inline(min_heap_char *heap, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args)
{
void *data = heap->data;
@@ -158,13 +160,13 @@ bool __min_heap_pop(min_heap_char *heap, size_t elem_size,
/* Place last element at the root (position 0) and then sift down. */
heap->nr--;
memcpy(data, data + (heap->nr * elem_size), elem_size);
- __min_heap_sift_down(heap, 0, elem_size, func, args);
+ __min_heap_sift_down_inline(heap, 0, elem_size, func, args);
return true;
}
-#define min_heap_pop(_heap, _func, _args) \
- __min_heap_pop((min_heap_char *)_heap, __minheap_obj_size(_heap), _func, _args)
+#define min_heap_pop_inline(_heap, _func, _args) \
+ __min_heap_pop_inline((min_heap_char *)_heap, __minheap_obj_size(_heap), _func, _args)
/*
* Remove the minimum element and then push the given element. The
@@ -172,22 +174,21 @@ bool __min_heap_pop(min_heap_char *heap, size_t elem_size,
* efficient than a pop followed by a push that does 2.
*/
static __always_inline
-void __min_heap_pop_push(min_heap_char *heap,
- const void *element, size_t elem_size,
- const struct min_heap_callbacks *func,
- void *args)
+void __min_heap_pop_push_inline(min_heap_char *heap, const void *element, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args)
{
memcpy(heap->data, element, elem_size);
- __min_heap_sift_down(heap, 0, elem_size, func, args);
+ __min_heap_sift_down_inline(heap, 0, elem_size, func, args);
}
-#define min_heap_pop_push(_heap, _element, _func, _args) \
- __min_heap_pop_push((min_heap_char *)_heap, _element, __minheap_obj_size(_heap), _func, _args)
+#define min_heap_pop_push_inline(_heap, _element, _func, _args) \
+ __min_heap_pop_push_inline((min_heap_char *)_heap, _element, __minheap_obj_size(_heap), \
+ _func, _args)
/* Push an element on to the heap, O(log2(nr)). */
static __always_inline
-bool __min_heap_push(min_heap_char *heap, const void *element, size_t elem_size,
- const struct min_heap_callbacks *func, void *args)
+bool __min_heap_push_inline(min_heap_char *heap, const void *element, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args)
{
void *data = heap->data;
int pos;
@@ -201,18 +202,19 @@ bool __min_heap_push(min_heap_char *heap, const void *element, size_t elem_size,
heap->nr++;
/* Sift child at pos up. */
- __min_heap_sift_up(heap, elem_size, pos, func, args);
+ __min_heap_sift_up_inline(heap, elem_size, pos, func, args);
return true;
}
-#define min_heap_push(_heap, _element, _func, _args) \
- __min_heap_push((min_heap_char *)_heap, _element, __minheap_obj_size(_heap), _func, _args)
+#define min_heap_push_inline(_heap, _element, _func, _args) \
+ __min_heap_push_inline((min_heap_char *)_heap, _element, __minheap_obj_size(_heap), \
+ _func, _args)
/* Remove ith element from the heap, O(log2(nr)). */
static __always_inline
-bool __min_heap_del(min_heap_char *heap, size_t elem_size, size_t idx,
- const struct min_heap_callbacks *func, void *args)
+bool __min_heap_del_inline(min_heap_char *heap, size_t elem_size, size_t idx,
+ const struct min_heap_callbacks *func, void *args)
{
void *data = heap->data;
@@ -224,12 +226,53 @@ bool __min_heap_del(min_heap_char *heap, size_t elem_size, size_t idx,
if (idx == heap->nr)
return true;
func->swp(data + (idx * elem_size), data + (heap->nr * elem_size), args);
- __min_heap_sift_up(heap, elem_size, idx, func, args);
- __min_heap_sift_down(heap, idx, elem_size, func, args);
+ __min_heap_sift_up_inline(heap, elem_size, idx, func, args);
+ __min_heap_sift_down_inline(heap, idx, elem_size, func, args);
return true;
}
+#define min_heap_del_inline(_heap, _idx, _func, _args) \
+ __min_heap_del_inline((min_heap_char *)_heap, __minheap_obj_size(_heap), _idx, \
+ _func, _args)
+
+void __min_heap_init(min_heap_char *heap, void *data, int size);
+void *__min_heap_peek(struct min_heap_char *heap);
+bool __min_heap_full(min_heap_char *heap);
+void __min_heap_sift_down(min_heap_char *heap, int pos, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args);
+void __min_heap_sift_up(min_heap_char *heap, size_t elem_size, size_t idx,
+ const struct min_heap_callbacks *func, void *args);
+void __min_heapify_all(min_heap_char *heap, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args);
+bool __min_heap_pop(min_heap_char *heap, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args);
+void __min_heap_pop_push(min_heap_char *heap, const void *element, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args);
+bool __min_heap_push(min_heap_char *heap, const void *element, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args);
+bool __min_heap_del(min_heap_char *heap, size_t elem_size, size_t idx,
+ const struct min_heap_callbacks *func, void *args);
+
+#define min_heap_init(_heap, _data, _size) \
+ __min_heap_init((min_heap_char *)_heap, _data, _size)
+#define min_heap_peek(_heap) \
+ (__minheap_cast(_heap) __min_heap_peek((min_heap_char *)_heap))
+#define min_heap_full(_heap) \
+ __min_heap_full((min_heap_char *)_heap)
+#define min_heap_sift_down(_heap, _pos, _func, _args) \
+ __min_heap_sift_down((min_heap_char *)_heap, _pos, __minheap_obj_size(_heap), _func, _args)
+#define min_heap_sift_up(_heap, _idx, _func, _args) \
+ __min_heap_sift_up((min_heap_char *)_heap, __minheap_obj_size(_heap), _idx, _func, _args)
+#define min_heapify_all(_heap, _func, _args) \
+ __min_heapify_all((min_heap_char *)_heap, __minheap_obj_size(_heap), _func, _args)
+#define min_heap_pop(_heap, _func, _args) \
+ __min_heap_pop((min_heap_char *)_heap, __minheap_obj_size(_heap), _func, _args)
+#define min_heap_pop_push(_heap, _element, _func, _args) \
+ __min_heap_pop_push((min_heap_char *)_heap, _element, __minheap_obj_size(_heap), \
+ _func, _args)
+#define min_heap_push(_heap, _element, _func, _args) \
+ __min_heap_push((min_heap_char *)_heap, _element, __minheap_obj_size(_heap), _func, _args)
#define min_heap_del(_heap, _idx, _func, _args) \
__min_heap_del((min_heap_char *)_heap, __minheap_obj_size(_heap), _idx, _func, _args)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index e3589c4287cb..cbf365e67f6e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3870,7 +3870,7 @@ static noinline int visit_groups_merge(struct perf_event_context *ctx,
perf_assert_pmu_disabled((*evt)->pmu_ctx->pmu);
}
- min_heapify_all(&event_heap, &perf_min_heap, NULL);
+ min_heapify_all_inline(&event_heap, &perf_min_heap, NULL);
while (event_heap.nr) {
ret = func(*evt, data);
@@ -3879,9 +3879,9 @@ static noinline int visit_groups_merge(struct perf_event_context *ctx,
*evt = perf_event_groups_next(*evt, pmu);
if (*evt)
- min_heap_sift_down(&event_heap, 0, &perf_min_heap, NULL);
+ min_heap_sift_down_inline(&event_heap, 0, &perf_min_heap, NULL);
else
- min_heap_pop(&event_heap, &perf_min_heap, NULL);
+ min_heap_pop_inline(&event_heap, &perf_min_heap, NULL);
}
return 0;
diff --git a/lib/Kconfig b/lib/Kconfig
index b38849af6f13..037a84731b7d 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -777,3 +777,6 @@ config POLYNOMIAL
config FIRMWARE_TABLE
bool
+
+config MIN_HEAP
+ bool
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 7312ae7c3cc5..c9e6ce184044 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2279,6 +2279,7 @@ config TEST_LIST_SORT
config TEST_MIN_HEAP
tristate "Min heap test"
depends on DEBUG_KERNEL || m
+ select MIN_HEAP
help
Enable this to turn on min heap function tests. This test is
executed only once during system boot (so affects only boot time),
diff --git a/lib/Makefile b/lib/Makefile
index 773adf88af41..e7ffee03e186 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -39,6 +39,7 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
lib-$(CONFIG_PRINTK) += dump_stack.o
lib-$(CONFIG_SMP) += cpumask.o
+lib-$(CONFIG_MIN_HEAP) += min_heap.o
lib-y += kobject.o klist.o
obj-y += lockref.o
diff --git a/lib/min_heap.c b/lib/min_heap.c
new file mode 100644
index 000000000000..4485372ff3b1
--- /dev/null
+++ b/lib/min_heap.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/export.h>
+#include <linux/min_heap.h>
+
+void __min_heap_init(min_heap_char *heap, void *data, int size)
+{
+ __min_heap_init_inline(heap, data, size);
+}
+EXPORT_SYMBOL(__min_heap_init);
+
+void *__min_heap_peek(struct min_heap_char *heap)
+{
+ return __min_heap_peek_inline(heap);
+}
+EXPORT_SYMBOL(__min_heap_peek);
+
+bool __min_heap_full(min_heap_char *heap)
+{
+ return __min_heap_full_inline(heap);
+}
+EXPORT_SYMBOL(__min_heap_full);
+
+void __min_heap_sift_down(min_heap_char *heap, int pos, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args)
+{
+ __min_heap_sift_down_inline(heap, pos, elem_size, func, args);
+}
+EXPORT_SYMBOL(__min_heap_sift_down);
+
+void __min_heap_sift_up(min_heap_char *heap, size_t elem_size, size_t idx,
+ const struct min_heap_callbacks *func, void *args)
+{
+ __min_heap_sift_up_inline(heap, elem_size, idx, func, args);
+}
+EXPORT_SYMBOL(__min_heap_sift_up);
+
+void __min_heapify_all(min_heap_char *heap, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args)
+{
+ __min_heapify_all_inline(heap, elem_size, func, args);
+}
+EXPORT_SYMBOL(__min_heapify_all);
+
+bool __min_heap_pop(min_heap_char *heap, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args)
+{
+ return __min_heap_pop_inline(heap, elem_size, func, args);
+}
+EXPORT_SYMBOL(__min_heap_pop);
+
+void __min_heap_pop_push(min_heap_char *heap, const void *element, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args)
+{
+ __min_heap_pop_push_inline(heap, element, elem_size, func, args);
+}
+EXPORT_SYMBOL(__min_heap_pop_push);
+
+bool __min_heap_push(min_heap_char *heap, const void *element, size_t elem_size,
+ const struct min_heap_callbacks *func, void *args)
+{
+ return __min_heap_push_inline(heap, element, elem_size, func, args);
+}
+EXPORT_SYMBOL(__min_heap_push);
+
+bool __min_heap_del(min_heap_char *heap, size_t elem_size, size_t idx,
+ const struct min_heap_callbacks *func, void *args)
+{
+ return __min_heap_del_inline(heap, elem_size, idx, func, args);
+}
+EXPORT_SYMBOL(__min_heap_del);
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 02/10] lib min_heap: Optimize min heap by prescaling counters for better performance
2024-10-20 4:01 [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 01/10] lib/min_heap: Introduce non-inline versions of min heap API functions Kuan-Wei Chiu
@ 2024-10-20 4:01 ` Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 03/10] lib min_heap: Avoid indirect function call by providing default swap Kuan-Wei Chiu
` (9 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-10-20 4:01 UTC (permalink / raw)
To: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc, Kuan-Wei Chiu
Improve the efficiency of the min heap by prescaling counters,
eliminating the need to repeatedly compute 'index * element_size' when
accessing elements. By doing so, we avoid the overhead associated with
recalculating the byte offset for each heap operation.
However, with prescaling, the calculation for the parent element's
location is no longer as simple as '(i - 1) / 2'. To address this, we
copy the parent function from 'lib/sort.c', which calculates the parent
offset in a branchless manner without using any division instructions.
This optimization should result in a more efficient heap implementation
by reducing the computational overhead of finding parent and child
offsets.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
Tested with test_min_heap module.
include/linux/min_heap.h | 73 +++++++++++++++++++++++++++-------------
1 file changed, 49 insertions(+), 24 deletions(-)
diff --git a/include/linux/min_heap.h b/include/linux/min_heap.h
index 0abb21173979..41b092a14238 100644
--- a/include/linux/min_heap.h
+++ b/include/linux/min_heap.h
@@ -38,6 +38,32 @@ struct min_heap_callbacks {
void (*swp)(void *lhs, void *rhs, void *args);
};
+/**
+ * parent - given the offset of the child, find the offset of the parent.
+ * @i: the offset of the heap element whose parent is sought. Non-zero.
+ * @lsbit: a precomputed 1-bit mask, equal to "size & -size"
+ * @size: size of each element
+ *
+ * In terms of array indexes, the parent of element j = @i/@size is simply
+ * (j-1)/2. But when working in byte offsets, we can't use implicit
+ * truncation of integer divides.
+ *
+ * Fortunately, we only need one bit of the quotient, not the full divide.
+ * @size has a least significant bit. That bit will be clear if @i is
+ * an even multiple of @size, and set if it's an odd multiple.
+ *
+ * Logically, we're doing "if (i & lsbit) i -= size;", but since the
+ * branch is unpredictable, it's done with a bit of clever branch-free
+ * code instead.
+ */
+__attribute_const__ __always_inline
+static size_t parent(size_t i, unsigned int lsbit, size_t size)
+{
+ i -= size;
+ i -= size & -(i & lsbit);
+ return i / 2;
+}
+
/* Initialize a min-heap. */
static __always_inline
void __min_heap_init_inline(min_heap_char *heap, void *data, int size)
@@ -78,33 +104,30 @@ static __always_inline
void __min_heap_sift_down_inline(min_heap_char *heap, int pos, size_t elem_size,
const struct min_heap_callbacks *func, void *args)
{
- void *left, *right;
+ const unsigned long lsbit = elem_size & -elem_size;
void *data = heap->data;
- void *root = data + pos * elem_size;
- int i = pos, j;
+ /* pre-scale counters for performance */
+ size_t a = pos * elem_size;
+ size_t b, c, d;
+ size_t n = heap->nr * elem_size;
/* Find the sift-down path all the way to the leaves. */
- for (;;) {
- if (i * 2 + 2 >= heap->nr)
- break;
- left = data + (i * 2 + 1) * elem_size;
- right = data + (i * 2 + 2) * elem_size;
- i = func->less(left, right, args) ? i * 2 + 1 : i * 2 + 2;
- }
+ for (b = a; c = 2 * b + elem_size, (d = c + elem_size) < n;)
+ b = func->less(data + c, data + d, args) ? c : d;
/* Special case for the last leaf with no sibling. */
- if (i * 2 + 2 == heap->nr)
- i = i * 2 + 1;
+ if (d == n)
+ b = c;
/* Backtrack to the correct location. */
- while (i != pos && func->less(root, data + i * elem_size, args))
- i = (i - 1) / 2;
+ while (b != a && func->less(data + a, data + b, args))
+ b = parent(b, lsbit, elem_size);
/* Shift the element into its correct place. */
- j = i;
- while (i != pos) {
- i = (i - 1) / 2;
- func->swp(data + i * elem_size, data + j * elem_size, args);
+ c = b;
+ while (b != a) {
+ b = parent(b, lsbit, elem_size);
+ func->swp(data + b, data + c, args);
}
}
@@ -117,15 +140,17 @@ static __always_inline
void __min_heap_sift_up_inline(min_heap_char *heap, size_t elem_size, size_t idx,
const struct min_heap_callbacks *func, void *args)
{
+ const unsigned long lsbit = elem_size & -elem_size;
void *data = heap->data;
- size_t parent;
+ /* pre-scale counters for performance */
+ size_t a = idx * elem_size, b;
- while (idx) {
- parent = (idx - 1) / 2;
- if (func->less(data + parent * elem_size, data + idx * elem_size, args))
+ while (a) {
+ b = parent(a, lsbit, elem_size);
+ if (func->less(data + b, data + a, args))
break;
- func->swp(data + parent * elem_size, data + idx * elem_size, args);
- idx = parent;
+ func->swp(data + a, data + b, args);
+ a = b;
}
}
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 03/10] lib min_heap: Avoid indirect function call by providing default swap
2024-10-20 4:01 [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 01/10] lib/min_heap: Introduce non-inline versions of min heap API functions Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 02/10] lib min_heap: Optimize min heap by prescaling counters for better performance Kuan-Wei Chiu
@ 2024-10-20 4:01 ` Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 04/10] lib/test_min_heap: Update min_heap_callbacks to use default builtin swap Kuan-Wei Chiu
` (8 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-10-20 4:01 UTC (permalink / raw)
To: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc, Kuan-Wei Chiu
The non-inline min heap API can result in an indirect function call to
the custom swap function. This becomes particularly costly when
CONFIG_MITIGATION_RETPOLINE is enabled, as indirect function calls are
expensive in this case.
To address this, copy the code from lib/sort.c and provide a default
builtin swap implementation that performs element swaps based on the
element size. This change allows most users to avoid the overhead of
indirect function calls, improving efficiency.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
include/linux/min_heap.h | 159 ++++++++++++++++++++++++++++++++++++++-
1 file changed, 156 insertions(+), 3 deletions(-)
diff --git a/include/linux/min_heap.h b/include/linux/min_heap.h
index 41b092a14238..e781727c8916 100644
--- a/include/linux/min_heap.h
+++ b/include/linux/min_heap.h
@@ -38,6 +38,147 @@ struct min_heap_callbacks {
void (*swp)(void *lhs, void *rhs, void *args);
};
+/**
+ * is_aligned - is this pointer & size okay for word-wide copying?
+ * @base: pointer to data
+ * @size: size of each element
+ * @align: required alignment (typically 4 or 8)
+ *
+ * Returns true if elements can be copied using word loads and stores.
+ * The size must be a multiple of the alignment, and the base address must
+ * be if we do not have CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.
+ *
+ * For some reason, gcc doesn't know to optimize "if (a & mask || b & mask)"
+ * to "if ((a | b) & mask)", so we do that by hand.
+ */
+__attribute_const__ __always_inline
+static bool is_aligned(const void *base, size_t size, unsigned char align)
+{
+ unsigned char lsbits = (unsigned char)size;
+
+ (void)base;
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+ lsbits |= (unsigned char)(uintptr_t)base;
+#endif
+ return (lsbits & (align - 1)) == 0;
+}
+
+/**
+ * swap_words_32 - swap two elements in 32-bit chunks
+ * @a: pointer to the first element to swap
+ * @b: pointer to the second element to swap
+ * @n: element size (must be a multiple of 4)
+ *
+ * Exchange the two objects in memory. This exploits base+index addressing,
+ * which basically all CPUs have, to minimize loop overhead computations.
+ *
+ * For some reason, on x86 gcc 7.3.0 adds a redundant test of n at the
+ * bottom of the loop, even though the zero flag is still valid from the
+ * subtract (since the intervening mov instructions don't alter the flags).
+ * Gcc 8.1.0 doesn't have that problem.
+ */
+static __always_inline
+void swap_words_32(void *a, void *b, size_t n)
+{
+ do {
+ u32 t = *(u32 *)(a + (n -= 4));
+ *(u32 *)(a + n) = *(u32 *)(b + n);
+ *(u32 *)(b + n) = t;
+ } while (n);
+}
+
+/**
+ * swap_words_64 - swap two elements in 64-bit chunks
+ * @a: pointer to the first element to swap
+ * @b: pointer to the second element to swap
+ * @n: element size (must be a multiple of 8)
+ *
+ * Exchange the two objects in memory. This exploits base+index
+ * addressing, which basically all CPUs have, to minimize loop overhead
+ * computations.
+ *
+ * We'd like to use 64-bit loads if possible. If they're not, emulating
+ * one requires base+index+4 addressing which x86 has but most other
+ * processors do not. If CONFIG_64BIT, we definitely have 64-bit loads,
+ * but it's possible to have 64-bit loads without 64-bit pointers (e.g.
+ * x32 ABI). Are there any cases the kernel needs to worry about?
+ */
+static __always_inline
+void swap_words_64(void *a, void *b, size_t n)
+{
+ do {
+#ifdef CONFIG_64BIT
+ u64 t = *(u64 *)(a + (n -= 8));
+ *(u64 *)(a + n) = *(u64 *)(b + n);
+ *(u64 *)(b + n) = t;
+#else
+ /* Use two 32-bit transfers to avoid base+index+4 addressing */
+ u32 t = *(u32 *)(a + (n -= 4));
+ *(u32 *)(a + n) = *(u32 *)(b + n);
+ *(u32 *)(b + n) = t;
+
+ t = *(u32 *)(a + (n -= 4));
+ *(u32 *)(a + n) = *(u32 *)(b + n);
+ *(u32 *)(b + n) = t;
+#endif
+ } while (n);
+}
+
+/**
+ * swap_bytes - swap two elements a byte at a time
+ * @a: pointer to the first element to swap
+ * @b: pointer to the second element to swap
+ * @n: element size
+ *
+ * This is the fallback if alignment doesn't allow using larger chunks.
+ */
+static __always_inline
+void swap_bytes(void *a, void *b, size_t n)
+{
+ do {
+ char t = ((char *)a)[--n];
+ ((char *)a)[n] = ((char *)b)[n];
+ ((char *)b)[n] = t;
+ } while (n);
+}
+
+/*
+ * The values are arbitrary as long as they can't be confused with
+ * a pointer, but small integers make for the smallest compare
+ * instructions.
+ */
+#define SWAP_WORDS_64 ((void (*)(void *, void *, void *))0)
+#define SWAP_WORDS_32 ((void (*)(void *, void *, void *))1)
+#define SWAP_BYTES ((void (*)(void *, void *, void *))2)
+
+/*
+ * Selects the appropriate swap function based on the element size.
+ */
+static __always_inline
+void *select_swap_func(const void *base, size_t size)
+{
+ if (is_aligned(base, size, 8))
+ return SWAP_WORDS_64;
+ else if (is_aligned(base, size, 4))
+ return SWAP_WORDS_32;
+ else
+ return SWAP_BYTES;
+}
+
+static __always_inline
+void do_swap(void *a, void *b, size_t size, void (*swap_func)(void *lhs, void *rhs, void *args),
+ void *priv)
+{
+ if (swap_func == SWAP_WORDS_64)
+ swap_words_64(a, b, size);
+ else if (swap_func == SWAP_WORDS_32)
+ swap_words_32(a, b, size);
+ else if (swap_func == SWAP_BYTES)
+ swap_bytes(a, b, size);
+ else
+ swap_func(a, b, priv);
+}
+
/**
* parent - given the offset of the child, find the offset of the parent.
* @i: the offset of the heap element whose parent is sought. Non-zero.
@@ -106,11 +247,15 @@ void __min_heap_sift_down_inline(min_heap_char *heap, int pos, size_t elem_size,
{
const unsigned long lsbit = elem_size & -elem_size;
void *data = heap->data;
+ void (*swp)(void *lhs, void *rhs, void *args) = func->swp;
/* pre-scale counters for performance */
size_t a = pos * elem_size;
size_t b, c, d;
size_t n = heap->nr * elem_size;
+ if (!swp)
+ swp = select_swap_func(data, elem_size);
+
/* Find the sift-down path all the way to the leaves. */
for (b = a; c = 2 * b + elem_size, (d = c + elem_size) < n;)
b = func->less(data + c, data + d, args) ? c : d;
@@ -127,7 +272,7 @@ void __min_heap_sift_down_inline(min_heap_char *heap, int pos, size_t elem_size,
c = b;
while (b != a) {
b = parent(b, lsbit, elem_size);
- func->swp(data + b, data + c, args);
+ do_swap(data + b, data + c, elem_size, swp, args);
}
}
@@ -142,14 +287,18 @@ void __min_heap_sift_up_inline(min_heap_char *heap, size_t elem_size, size_t idx
{
const unsigned long lsbit = elem_size & -elem_size;
void *data = heap->data;
+ void (*swp)(void *lhs, void *rhs, void *args) = func->swp;
/* pre-scale counters for performance */
size_t a = idx * elem_size, b;
+ if (!swp)
+ swp = select_swap_func(data, elem_size);
+
while (a) {
b = parent(a, lsbit, elem_size);
if (func->less(data + b, data + a, args))
break;
- func->swp(data + a, data + b, args);
+ do_swap(data + a, data + b, elem_size, swp, args);
a = b;
}
}
@@ -242,15 +391,19 @@ bool __min_heap_del_inline(min_heap_char *heap, size_t elem_size, size_t idx,
const struct min_heap_callbacks *func, void *args)
{
void *data = heap->data;
+ void (*swp)(void *lhs, void *rhs, void *args) = func->swp;
if (WARN_ONCE(heap->nr <= 0, "Popping an empty heap"))
return false;
+ if (!swp)
+ swp = select_swap_func(data, elem_size);
+
/* Place last element at the root (position 0) and then sift down. */
heap->nr--;
if (idx == heap->nr)
return true;
- func->swp(data + (idx * elem_size), data + (heap->nr * elem_size), args);
+ do_swap(data + (idx * elem_size), data + (heap->nr * elem_size), elem_size, swp, args);
__min_heap_sift_up_inline(heap, elem_size, idx, func, args);
__min_heap_sift_down_inline(heap, idx, elem_size, func, args);
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 04/10] lib/test_min_heap: Update min_heap_callbacks to use default builtin swap
2024-10-20 4:01 [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Kuan-Wei Chiu
` (2 preceding siblings ...)
2024-10-20 4:01 ` [PATCH v2 03/10] lib min_heap: Avoid indirect function call by providing default swap Kuan-Wei Chiu
@ 2024-10-20 4:01 ` Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 05/10] perf/core: " Kuan-Wei Chiu
` (7 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-10-20 4:01 UTC (permalink / raw)
To: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc, Kuan-Wei Chiu
Replace the swp function pointer in the min_heap_callbacks of
test_min_heap with NULL, allowing direct usage of the default builtin
swap implementation. This modification simplifies the code and improves
performance by removing unnecessary function indirection.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
Tested with test_min_heap module.
lib/test_min_heap.c | 16 ++++------------
1 file changed, 4 insertions(+), 12 deletions(-)
diff --git a/lib/test_min_heap.c b/lib/test_min_heap.c
index 64c877e73b64..e6fbb798558b 100644
--- a/lib/test_min_heap.c
+++ b/lib/test_min_heap.c
@@ -23,14 +23,6 @@ static __init bool greater_than(const void *lhs, const void *rhs, void __always_
return *(int *)lhs > *(int *)rhs;
}
-static __init void swap_ints(void *lhs, void *rhs, void __always_unused *args)
-{
- int temp = *(int *)lhs;
-
- *(int *)lhs = *(int *)rhs;
- *(int *)rhs = temp;
-}
-
static __init int pop_verify_heap(bool min_heap,
struct min_heap_test *heap,
const struct min_heap_callbacks *funcs)
@@ -72,7 +64,7 @@ static __init int test_heapify_all(bool min_heap)
};
struct min_heap_callbacks funcs = {
.less = min_heap ? less_than : greater_than,
- .swp = swap_ints,
+ .swp = NULL,
};
int i, err;
@@ -104,7 +96,7 @@ static __init int test_heap_push(bool min_heap)
};
struct min_heap_callbacks funcs = {
.less = min_heap ? less_than : greater_than,
- .swp = swap_ints,
+ .swp = NULL,
};
int i, temp, err;
@@ -136,7 +128,7 @@ static __init int test_heap_pop_push(bool min_heap)
};
struct min_heap_callbacks funcs = {
.less = min_heap ? less_than : greater_than,
- .swp = swap_ints,
+ .swp = NULL,
};
int i, temp, err;
@@ -175,7 +167,7 @@ static __init int test_heap_del(bool min_heap)
heap.nr = ARRAY_SIZE(values);
struct min_heap_callbacks funcs = {
.less = min_heap ? less_than : greater_than,
- .swp = swap_ints,
+ .swp = NULL,
};
int i, err;
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 05/10] perf/core: Update min_heap_callbacks to use default builtin swap
2024-10-20 4:01 [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Kuan-Wei Chiu
` (3 preceding siblings ...)
2024-10-20 4:01 ` [PATCH v2 04/10] lib/test_min_heap: Update min_heap_callbacks to use default builtin swap Kuan-Wei Chiu
@ 2024-10-20 4:01 ` Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 06/10] dm vdo: " Kuan-Wei Chiu
` (6 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-10-20 4:01 UTC (permalink / raw)
To: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc, Kuan-Wei Chiu
After introducing the default builtin swap implementation, update the
min_heap_callbacks to replace the swp function pointer with NULL. This
change allows the min heap to directly utilize the builtin swap,
simplifying the code.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
kernel/events/core.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index cbf365e67f6e..0e9be5b323e4 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3778,18 +3778,11 @@ static bool perf_less_group_idx(const void *l, const void *r, void __always_unus
return le->group_index < re->group_index;
}
-static void swap_ptr(void *l, void *r, void __always_unused *args)
-{
- void **lp = l, **rp = r;
-
- swap(*lp, *rp);
-}
-
DEFINE_MIN_HEAP(struct perf_event *, perf_event_min_heap);
static const struct min_heap_callbacks perf_min_heap = {
.less = perf_less_group_idx,
- .swp = swap_ptr,
+ .swp = NULL,
};
static void __heap_add(struct perf_event_min_heap *heap, struct perf_event *event)
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 06/10] dm vdo: Update min_heap_callbacks to use default builtin swap
2024-10-20 4:01 [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Kuan-Wei Chiu
` (4 preceding siblings ...)
2024-10-20 4:01 ` [PATCH v2 05/10] perf/core: " Kuan-Wei Chiu
@ 2024-10-20 4:01 ` Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 07/10] bcache: " Kuan-Wei Chiu
` (5 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-10-20 4:01 UTC (permalink / raw)
To: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc, Kuan-Wei Chiu
Replace the swp function pointer in the min_heap_callbacks of dm-vdo
with NULL, allowing direct usage of the default builtin swap
implementation. This modification simplifies the code and improves
performance by removing unnecessary function indirection.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
drivers/md/dm-vdo/repair.c | 2 +-
drivers/md/dm-vdo/slab-depot.c | 10 +---------
2 files changed, 2 insertions(+), 10 deletions(-)
diff --git a/drivers/md/dm-vdo/repair.c b/drivers/md/dm-vdo/repair.c
index ffff2c999518..8c006fb3afcf 100644
--- a/drivers/md/dm-vdo/repair.c
+++ b/drivers/md/dm-vdo/repair.c
@@ -166,7 +166,7 @@ static void swap_mappings(void *item1, void *item2, void __always_unused *args)
static const struct min_heap_callbacks repair_min_heap = {
.less = mapping_is_less_than,
- .swp = swap_mappings,
+ .swp = NULL,
};
static struct numbered_block_mapping *sort_next_heap_element(struct repair_completion *repair)
diff --git a/drivers/md/dm-vdo/slab-depot.c b/drivers/md/dm-vdo/slab-depot.c
index 274f9ccd072f..ccf7b2eac131 100644
--- a/drivers/md/dm-vdo/slab-depot.c
+++ b/drivers/md/dm-vdo/slab-depot.c
@@ -3301,17 +3301,9 @@ static bool slab_status_is_less_than(const void *item1, const void *item2,
return info1->slab_number < info2->slab_number;
}
-static void swap_slab_statuses(void *item1, void *item2, void __always_unused *args)
-{
- struct slab_status *info1 = item1;
- struct slab_status *info2 = item2;
-
- swap(*info1, *info2);
-}
-
static const struct min_heap_callbacks slab_status_min_heap = {
.less = slab_status_is_less_than,
- .swp = swap_slab_statuses,
+ .swp = NULL,
};
/* Inform the slab actor that a action has finished on some slab; used by apply_to_slabs(). */
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 07/10] bcache: Update min_heap_callbacks to use default builtin swap
2024-10-20 4:01 [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Kuan-Wei Chiu
` (5 preceding siblings ...)
2024-10-20 4:01 ` [PATCH v2 06/10] dm vdo: " Kuan-Wei Chiu
@ 2024-10-20 4:01 ` Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 08/10] bcachefs: Clean up duplicate min_heap_callbacks declarations Kuan-Wei Chiu
` (4 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-10-20 4:01 UTC (permalink / raw)
To: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc, Kuan-Wei Chiu
Replace the swp function pointer in the min_heap_callbacks of bcache
with NULL, allowing direct usage of the default builtin swap
implementation. This modification simplifies the code and improves
performance by removing unnecessary function indirection.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
drivers/md/bcache/alloc.c | 11 ++---------
drivers/md/bcache/bset.c | 14 +++-----------
drivers/md/bcache/extents.c | 10 +---------
drivers/md/bcache/movinggc.c | 10 +---------
4 files changed, 7 insertions(+), 38 deletions(-)
diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index da50f6661bae..8998e61efa40 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -189,23 +189,16 @@ static inline bool new_bucket_min_cmp(const void *l, const void *r, void *args)
return new_bucket_prio(ca, *lhs) < new_bucket_prio(ca, *rhs);
}
-static inline void new_bucket_swap(void *l, void *r, void __always_unused *args)
-{
- struct bucket **lhs = l, **rhs = r;
-
- swap(*lhs, *rhs);
-}
-
static void invalidate_buckets_lru(struct cache *ca)
{
struct bucket *b;
const struct min_heap_callbacks bucket_max_cmp_callback = {
.less = new_bucket_max_cmp,
- .swp = new_bucket_swap,
+ .swp = NULL,
};
const struct min_heap_callbacks bucket_min_cmp_callback = {
.less = new_bucket_min_cmp,
- .swp = new_bucket_swap,
+ .swp = NULL,
};
ca->heap.nr = 0;
diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c
index bd97d8626887..68258a16e125 100644
--- a/drivers/md/bcache/bset.c
+++ b/drivers/md/bcache/bset.c
@@ -1093,14 +1093,6 @@ static inline bool new_btree_iter_cmp(const void *l, const void *r, void __alway
return bkey_cmp(_l->k, _r->k) <= 0;
}
-static inline void new_btree_iter_swap(void *iter1, void *iter2, void __always_unused *args)
-{
- struct btree_iter_set *_iter1 = iter1;
- struct btree_iter_set *_iter2 = iter2;
-
- swap(*_iter1, *_iter2);
-}
-
static inline bool btree_iter_end(struct btree_iter *iter)
{
return !iter->heap.nr;
@@ -1111,7 +1103,7 @@ void bch_btree_iter_push(struct btree_iter *iter, struct bkey *k,
{
const struct min_heap_callbacks callbacks = {
.less = new_btree_iter_cmp,
- .swp = new_btree_iter_swap,
+ .swp = NULL,
};
if (k != end)
@@ -1157,7 +1149,7 @@ static inline struct bkey *__bch_btree_iter_next(struct btree_iter *iter,
struct bkey *ret = NULL;
const struct min_heap_callbacks callbacks = {
.less = cmp,
- .swp = new_btree_iter_swap,
+ .swp = NULL,
};
if (!btree_iter_end(iter)) {
@@ -1231,7 +1223,7 @@ static void btree_mergesort(struct btree_keys *b, struct bset *out,
: bch_ptr_invalid;
const struct min_heap_callbacks callbacks = {
.less = b->ops->sort_cmp,
- .swp = new_btree_iter_swap,
+ .swp = NULL,
};
/* Heapify the iterator, using our comparison function */
diff --git a/drivers/md/bcache/extents.c b/drivers/md/bcache/extents.c
index a7221e5dbe81..4b84fda1530a 100644
--- a/drivers/md/bcache/extents.c
+++ b/drivers/md/bcache/extents.c
@@ -266,20 +266,12 @@ static bool new_bch_extent_sort_cmp(const void *l, const void *r, void __always_
return !(c ? c > 0 : _l->k < _r->k);
}
-static inline void new_btree_iter_swap(void *iter1, void *iter2, void __always_unused *args)
-{
- struct btree_iter_set *_iter1 = iter1;
- struct btree_iter_set *_iter2 = iter2;
-
- swap(*_iter1, *_iter2);
-}
-
static struct bkey *bch_extent_sort_fixup(struct btree_iter *iter,
struct bkey *tmp)
{
const struct min_heap_callbacks callbacks = {
.less = new_bch_extent_sort_cmp,
- .swp = new_btree_iter_swap,
+ .swp = NULL,
};
while (iter->heap.nr > 1) {
struct btree_iter_set *top = iter->heap.data, *i = top + 1;
diff --git a/drivers/md/bcache/movinggc.c b/drivers/md/bcache/movinggc.c
index 7f482729c56d..ef6abf33f926 100644
--- a/drivers/md/bcache/movinggc.c
+++ b/drivers/md/bcache/movinggc.c
@@ -190,14 +190,6 @@ static bool new_bucket_cmp(const void *l, const void *r, void __always_unused *a
return GC_SECTORS_USED(*_l) >= GC_SECTORS_USED(*_r);
}
-static void new_bucket_swap(void *l, void *r, void __always_unused *args)
-{
- struct bucket **_l = l;
- struct bucket **_r = r;
-
- swap(*_l, *_r);
-}
-
static unsigned int bucket_heap_top(struct cache *ca)
{
struct bucket *b;
@@ -212,7 +204,7 @@ void bch_moving_gc(struct cache_set *c)
unsigned long sectors_to_move, reserve_sectors;
const struct min_heap_callbacks callbacks = {
.less = new_bucket_cmp,
- .swp = new_bucket_swap,
+ .swp = NULL,
};
if (!c->copy_gc_enabled)
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 08/10] bcachefs: Clean up duplicate min_heap_callbacks declarations
2024-10-20 4:01 [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Kuan-Wei Chiu
` (6 preceding siblings ...)
2024-10-20 4:01 ` [PATCH v2 07/10] bcache: " Kuan-Wei Chiu
@ 2024-10-20 4:01 ` Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 09/10] bcachefs: Update min_heap_callbacks to use default builtin swap Kuan-Wei Chiu
` (3 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-10-20 4:01 UTC (permalink / raw)
To: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc, Kuan-Wei Chiu
Refactor the bcachefs code to remove multiple redundant declarations of
min_heap_callbacks, ensuring that each unique declaration appears only
once.
Link: https://lore.kernel.org/20241017095520.GV16066@noisy.programming.kicks-ass.net
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
fs/bcachefs/clock.c | 19 +++++--------------
fs/bcachefs/ec.c | 19 +++++--------------
2 files changed, 10 insertions(+), 28 deletions(-)
diff --git a/fs/bcachefs/clock.c b/fs/bcachefs/clock.c
index 1d6b691e8da6..1fcfcb5fd44f 100644
--- a/fs/bcachefs/clock.c
+++ b/fs/bcachefs/clock.c
@@ -22,13 +22,13 @@ static inline void io_timer_swp(void *l, void *r, void __always_unused *args)
swap(*_l, *_r);
}
+static const struct min_heap_callbacks callbacks = {
+ .less = io_timer_cmp,
+ .swp = io_timer_swp,
+};
+
void bch2_io_timer_add(struct io_clock *clock, struct io_timer *timer)
{
- const struct min_heap_callbacks callbacks = {
- .less = io_timer_cmp,
- .swp = io_timer_swp,
- };
-
spin_lock(&clock->timer_lock);
if (time_after_eq64((u64) atomic64_read(&clock->now), timer->expire)) {
@@ -48,11 +48,6 @@ void bch2_io_timer_add(struct io_clock *clock, struct io_timer *timer)
void bch2_io_timer_del(struct io_clock *clock, struct io_timer *timer)
{
- const struct min_heap_callbacks callbacks = {
- .less = io_timer_cmp,
- .swp = io_timer_swp,
- };
-
spin_lock(&clock->timer_lock);
for (size_t i = 0; i < clock->timers.nr; i++)
@@ -142,10 +137,6 @@ void bch2_kthread_io_clock_wait(struct io_clock *clock,
static struct io_timer *get_expired_timer(struct io_clock *clock, u64 now)
{
struct io_timer *ret = NULL;
- const struct min_heap_callbacks callbacks = {
- .less = io_timer_cmp,
- .swp = io_timer_swp,
- };
if (clock->timers.nr &&
time_after_eq64(now, clock->timers.data[0]->expire)) {
diff --git a/fs/bcachefs/ec.c b/fs/bcachefs/ec.c
index e410cfe37b1a..c88f10d0606f 100644
--- a/fs/bcachefs/ec.c
+++ b/fs/bcachefs/ec.c
@@ -1057,6 +1057,11 @@ static inline void ec_stripes_heap_swap(void *l, void *r, void *h)
ec_stripes_heap_set_backpointer(_h, j);
}
+static const struct min_heap_callbacks callbacks = {
+ .less = ec_stripes_heap_cmp,
+ .swp = ec_stripes_heap_swap,
+};
+
static void heap_verify_backpointer(struct bch_fs *c, size_t idx)
{
ec_stripes_heap *h = &c->ec_stripes_heap;
@@ -1069,11 +1074,6 @@ static void heap_verify_backpointer(struct bch_fs *c, size_t idx)
void bch2_stripes_heap_del(struct bch_fs *c,
struct stripe *m, size_t idx)
{
- const struct min_heap_callbacks callbacks = {
- .less = ec_stripes_heap_cmp,
- .swp = ec_stripes_heap_swap,
- };
-
mutex_lock(&c->ec_stripes_heap_lock);
heap_verify_backpointer(c, idx);
@@ -1084,11 +1084,6 @@ void bch2_stripes_heap_del(struct bch_fs *c,
void bch2_stripes_heap_insert(struct bch_fs *c,
struct stripe *m, size_t idx)
{
- const struct min_heap_callbacks callbacks = {
- .less = ec_stripes_heap_cmp,
- .swp = ec_stripes_heap_swap,
- };
-
mutex_lock(&c->ec_stripes_heap_lock);
BUG_ON(min_heap_full(&c->ec_stripes_heap));
@@ -1107,10 +1102,6 @@ void bch2_stripes_heap_insert(struct bch_fs *c,
void bch2_stripes_heap_update(struct bch_fs *c,
struct stripe *m, size_t idx)
{
- const struct min_heap_callbacks callbacks = {
- .less = ec_stripes_heap_cmp,
- .swp = ec_stripes_heap_swap,
- };
ec_stripes_heap *h = &c->ec_stripes_heap;
bool do_deletes;
size_t i;
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 09/10] bcachefs: Update min_heap_callbacks to use default builtin swap
2024-10-20 4:01 [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Kuan-Wei Chiu
` (7 preceding siblings ...)
2024-10-20 4:01 ` [PATCH v2 08/10] bcachefs: Clean up duplicate min_heap_callbacks declarations Kuan-Wei Chiu
@ 2024-10-20 4:01 ` Kuan-Wei Chiu
2024-10-20 4:02 ` [PATCH v2 10/10] Documentation/core-api: Add min heap API introduction Kuan-Wei Chiu
` (2 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-10-20 4:01 UTC (permalink / raw)
To: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc, Kuan-Wei Chiu
Replace the swp function pointer in the min_heap_callbacks of bcachefs
with NULL, allowing direct usage of the default builtin swap
implementation. This modification simplifies the code and improves
performance by removing unnecessary function indirection.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
fs/bcachefs/clock.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/fs/bcachefs/clock.c b/fs/bcachefs/clock.c
index 1fcfcb5fd44f..1f8e035d7119 100644
--- a/fs/bcachefs/clock.c
+++ b/fs/bcachefs/clock.c
@@ -14,17 +14,9 @@ static inline bool io_timer_cmp(const void *l, const void *r, void __always_unus
return (*_l)->expire < (*_r)->expire;
}
-static inline void io_timer_swp(void *l, void *r, void __always_unused *args)
-{
- struct io_timer **_l = (struct io_timer **)l;
- struct io_timer **_r = (struct io_timer **)r;
-
- swap(*_l, *_r);
-}
-
static const struct min_heap_callbacks callbacks = {
.less = io_timer_cmp,
- .swp = io_timer_swp,
+ .swp = NULL,
};
void bch2_io_timer_add(struct io_clock *clock, struct io_timer *timer)
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 10/10] Documentation/core-api: Add min heap API introduction
2024-10-20 4:01 [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Kuan-Wei Chiu
` (8 preceding siblings ...)
2024-10-20 4:01 ` [PATCH v2 09/10] bcachefs: Update min_heap_callbacks to use default builtin swap Kuan-Wei Chiu
@ 2024-10-20 4:02 ` Kuan-Wei Chiu
2024-10-28 6:10 ` Bagas Sanjaya
2024-10-21 9:33 ` [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Bagas Sanjaya
2024-10-26 12:44 ` Kuan-Wei Chiu
11 siblings, 1 reply; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-10-20 4:02 UTC (permalink / raw)
To: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc, Kuan-Wei Chiu
Introduce an overview of the min heap API, detailing its usage and
functionality. The documentation aims to provide developers with a
clear understanding of how to implement and utilize min heaps within
the Linux kernel, enhancing the overall accessibility of this data
structure.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
Changes in v2:
- Wrapped lines at 80 columns.
- Updated Example Usage.
- Added documentation for the ability to pass NULL to the swp in
min_heap_callbacks to use the builtin swap.
Documentation/core-api/index.rst | 1 +
Documentation/core-api/min_heap.rst | 300 ++++++++++++++++++++++++++++
2 files changed, 301 insertions(+)
create mode 100644 Documentation/core-api/min_heap.rst
diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index 6a875743dd4b..563b8fc0002f 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -52,6 +52,7 @@ Library functionality that is used throughout the kernel.
wrappers/atomic_bitops
floating-point
union_find
+ min_heap
Low level entry and exit
========================
diff --git a/Documentation/core-api/min_heap.rst b/Documentation/core-api/min_heap.rst
new file mode 100644
index 000000000000..0c636c8b7aa5
--- /dev/null
+++ b/Documentation/core-api/min_heap.rst
@@ -0,0 +1,300 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============
+Min Heap API
+============
+
+Introduction
+============
+
+The Min Heap API provides a set of functions and macros for managing min-heaps
+in the Linux kernel. A min-heap is a binary tree structure where the value of
+each node is less than or equal to the values of its children, ensuring that
+the smallest element is always at the root.
+
+This document provides a guide to the Min Heap API, detailing how to define and
+use min-heaps. Users should not directly call functions with **__min_heap_*()**
+prefixes, but should instead use the provided macro wrappers.
+
+In addition to the standard version of the functions, the API also includes a
+set of inline versions for performance-critical scenarios. These inline
+functions have the same names as their non-inline counterparts but include an
+**_inline** suffix. For example, **__min_heap_init_inline** and its
+corresponding macro wrapper **min_heap_init_inline**. The inline versions allow
+custom comparison and swap functions to be called directly, rather than through
+indirect function calls. This can significantly reduce overhead, especially
+when CONFIG_MITIGATION_RETPOLINE is enabled, as indirect function calls become
+more expensive. As with the non-inline versions, it is important to use the
+macro wrappers for inline functions instead of directly calling the functions
+themselves.
+
+Data Structures
+===============
+
+Min-Heap Definition
+-------------------
+
+The core data structure for representing a min-heap is defined using the
+**MIN_HEAP_PREALLOCATED** and **DEFINE_MIN_HEAP** macros. These macros allow
+you to define a min-heap with a preallocated buffer or dynamically allocated
+memory.
+
+Example:
+
+.. code-block:: c
+
+ #define MIN_HEAP_PREALLOCATED(_type, _name, _nr)
+ struct _name {
+ int nr; /* Number of elements in the heap */
+ int size; /* Maximum number of elements that can be held */
+ _type *data; /* Pointer to the heap data */
+ _type preallocated[_nr]; /* Static preallocated array */
+ }
+
+ #define DEFINE_MIN_HEAP(_type, _name) MIN_HEAP_PREALLOCATED(_type, _name, 0)
+
+A typical heap structure will include a counter for the number of elements
+(`nr`), the maximum capacity of the heap (`size`), and a pointer to an array of
+elements (`data`). Optionally, you can specify a static array for preallocated
+heap storage using **MIN_HEAP_PREALLOCATED**.
+
+Min Heap Callbacks
+------------------
+
+The **struct min_heap_callbacks** provides customization options for ordering
+elements in the heap and swapping them. It contains two function pointers:
+
+.. code-block:: c
+
+ struct min_heap_callbacks {
+ bool (*less)(const void *lhs, const void *rhs, void *args);
+ void (*swp)(void *lhs, void *rhs, void *args);
+ };
+
+- **less** is the comparison function used to establish the order of elements.
+- **swp** is a function for swapping elements in the heap. If swp is set to
+ NULL, the default swap function will be used, which swaps the elements based on their size
+
+Macro Wrappers
+==============
+
+The following macro wrappers are provided for interacting with the heap in a
+user-friendly manner. Each macro corresponds to a function that operates on the
+heap, and they abstract away direct calls to internal functions.
+
+Each macro accepts various parameters that are detailed below.
+
+Heap Initialization
+--------------------
+
+.. code-block:: c
+
+ min_heap_init(heap, data, size);
+
+- **heap**: A pointer to the min-heap structure to be initialized.
+- **data**: A pointer to the buffer where the heap elements will be stored. If
+ `NULL`, the preallocated buffer within the heap structure will be used.
+- **size**: The maximum number of elements the heap can hold.
+
+This macro initializes the heap, setting its initial state. If `data` is
+`NULL`, the preallocated memory inside the heap structure will be used for
+storage. Otherwise, the user-provided buffer is used. The operation is **O(1)**.
+
+**Inline Version:** min_heap_init_inline(heap, data, size)
+
+Accessing the Top Element
+-------------------------
+
+.. code-block:: c
+
+ element = min_heap_peek(heap);
+
+- **heap**: A pointer to the min-heap from which to retrieve the smallest
+ element.
+
+This macro returns a pointer to the smallest element (the root) of the heap, or
+`NULL` if the heap is empty. The operation is **O(1)**.
+
+**Inline Version:** min_heap_peek_inline(heap)
+
+Heap Insertion
+--------------
+
+.. code-block:: c
+
+ success = min_heap_push(heap, element, callbacks, args);
+
+- **heap**: A pointer to the min-heap into which the element should be inserted.
+- **element**: A pointer to the element to be inserted into the heap.
+- **callbacks**: A pointer to a `struct min_heap_callbacks` providing the
+ `less` and `swp` functions.
+- **args**: Optional arguments passed to the `less` and `swp` functions.
+
+This macro inserts an element into the heap. It returns `true` if the insertion
+was successful and `false` if the heap is full. The operation is **O(log n)**.
+
+**Inline Version:** min_heap_push_inline(heap, element, callbacks, args)
+
+Heap Removal
+------------
+
+.. code-block:: c
+
+ success = min_heap_pop(heap, callbacks, args);
+
+- **heap**: A pointer to the min-heap from which to remove the smallest element.
+- **callbacks**: A pointer to a `struct min_heap_callbacks` providing the
+ `less` and `swp` functions.
+- **args**: Optional arguments passed to the `less` and `swp` functions.
+
+This macro removes the smallest element (the root) from the heap. It returns
+`true` if the element was successfully removed, or `false` if the heap is
+empty. The operation is **O(log n)**.
+
+**Inline Version:** min_heap_pop_inline(heap, callbacks, args)
+
+Heap Maintenance
+----------------
+
+You can use the following macros to maintain the heap's structure:
+
+.. code-block:: c
+
+ min_heap_sift_down(heap, pos, callbacks, args);
+
+- **heap**: A pointer to the min-heap.
+- **pos**: The index from which to start sifting down.
+- **callbacks**: A pointer to a `struct min_heap_callbacks` providing the
+ `less` and `swp` functions.
+- **args**: Optional arguments passed to the `less` and `swp` functions.
+
+This macro restores the heap property by moving the element at the specified
+index (`pos`) down the heap until it is in the correct position. The operation
+is **O(log n)**.
+
+**Inline Version:** min_heap_sift_down_inline(heap, pos, callbacks, args)
+
+.. code-block:: c
+
+ min_heap_sift_up(heap, idx, callbacks, args);
+
+- **heap**: A pointer to the min-heap.
+- **idx**: The index of the element to sift up.
+- **callbacks**: A pointer to a `struct min_heap_callbacks` providing the
+ `less` and `swp` functions.
+- **args**: Optional arguments passed to the `less` and `swp` functions.
+
+This macro restores the heap property by moving the element at the specified
+index (`idx`) up the heap. The operation is **O(log n)**.
+
+**Inline Version:** min_heap_sift_up_inline(heap, idx, callbacks, args)
+
+.. code-block:: c
+
+ min_heapify_all(heap, callbacks, args);
+
+- **heap**: A pointer to the min-heap.
+- **callbacks**: A pointer to a `struct min_heap_callbacks` providing the
+ `less` and `swp` functions.
+- **args**: Optional arguments passed to the `less` and `swp` functions.
+
+This macro ensures that the entire heap satisfies the heap property. It is
+called when the heap is built from scratch or after many modifications. The
+operation is **O(n)**.
+
+**Inline Version:** min_heapify_all_inline(heap, callbacks, args)
+
+Removing Specific Elements
+--------------------------
+
+.. code-block:: c
+
+ success = min_heap_del(heap, idx, callbacks, args);
+
+- **heap**: A pointer to the min-heap.
+- **idx**: The index of the element to delete.
+- **callbacks**: A pointer to a `struct min_heap_callbacks` providing the
+ `less` and `swp` functions.
+- **args**: Optional arguments passed to the `less` and `swp` functions.
+
+This macro removes an element at the specified index (`idx`) from the heap and
+restores the heap property. The operation is **O(log n)**.
+
+**Inline Version:** min_heap_del_inline(heap, idx, callbacks, args)
+
+Other Utilities
+===============
+
+- **min_heap_full(heap)**: Checks whether the heap is full.
+ Complexity: **O(1)**.
+
+.. code-block:: c
+
+ bool full = min_heap_full(heap);
+
+- `heap`: A pointer to the min-heap to check.
+
+This macro returns `true` if the heap is full, otherwise `false`.
+
+**Inline Version:** min_heap_full_inline(heap)
+
+- **min_heap_empty(heap)**: Checks whether the heap is empty.
+ Complexity: **O(1)**.
+
+.. code-block:: c
+
+ bool empty = min_heap_empty(heap);
+
+- `heap`: A pointer to the min-heap to check.
+
+This macro returns `true` if the heap is empty, otherwise `false`.
+
+**Inline Version:** min_heap_empty_inline(heap)
+
+Example Usage
+=============
+
+An example usage of the min-heap API would involve defining a heap structure,
+initializing it, and inserting and removing elements as needed.
+
+.. code-block:: c
+
+ #include <linux/min_heap.h>
+
+ int my_less_function(const void *lhs, const void *rhs, void *args) {
+ return (*(int *)lhs < *(int *)rhs);
+ }
+
+ struct min_heap_callbacks heap_cb = {
+ .less = my_less_function, /* Comparison function for heap order */
+ .swp = NULL, /* Use default swap function */
+ };
+
+ void example_usage(void) {
+ /* Pre-populate the buffer with elements */
+ int buffer[5] = {5, 2, 8, 1, 3};
+ /* Declare a min-heap */
+ DEFINE_MIN_HEAP(int, my_heap);
+
+ /* Initialize the heap with preallocated buffer and size */
+ min_heap_init(&my_heap, buffer, 5);
+
+ /* Build the heap using min_heapify_all */
+ my_heap.nr = 5; /* Set the number of elements in the heap */
+ min_heapify_all(&my_heap, &heap_cb, NULL);
+
+ /* Peek at the top element (should be 1 in this case) */
+ int *top = min_heap_peek(&my_heap);
+ pr_info("Top element: %d\n", *top);
+
+ /* Pop the top element (1) and get the new top (2) */
+ min_heap_pop(&my_heap, &heap_cb, NULL);
+ top = min_heap_peek(&my_heap);
+ pr_info("New top element: %d\n", *top);
+
+ /* Insert a new element (0) and recheck the top */
+ int new_element = 0;
+ min_heap_push(&my_heap, &new_element, &heap_cb, NULL);
+ top = min_heap_peek(&my_heap);
+ pr_info("Top element after insertion: %d\n", *top);
+ }
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations
2024-10-20 4:01 [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Kuan-Wei Chiu
` (9 preceding siblings ...)
2024-10-20 4:02 ` [PATCH v2 10/10] Documentation/core-api: Add min heap API introduction Kuan-Wei Chiu
@ 2024-10-21 9:33 ` Bagas Sanjaya
2024-10-21 13:47 ` Kuan-Wei Chiu
2024-10-26 12:44 ` Kuan-Wei Chiu
11 siblings, 1 reply; 19+ messages in thread
From: Bagas Sanjaya @ 2024-10-21 9:33 UTC (permalink / raw)
To: Kuan-Wei Chiu, colyli, kent.overstreet, msakai, corbet, peterz,
mingo, acme, namhyung, akpm
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc
[-- Attachment #1: Type: text/plain, Size: 804 bytes --]
On Sun, Oct 20, 2024 at 12:01:50PM +0800, Kuan-Wei Chiu wrote:
> Add non-inline versions of the min heap API functions in lib/min_heap.c
> and updates all users outside of kernel/events/core.c to use these
> non-inline versions. To mitigate the performance impact of indirect
> function calls caused by the non-inline versions of the swap and
> compare functions, a builtin swap has been introduced that swaps
> elements based on their size. Additionally, it micro-optimizes the
> efficiency of the min heap by pre-scaling the counter, following the
> same approach as in lib/sort.c. Documentation for the min heap API has
> also been added to the core-api section.
What tree (and commit) this series is based on?
Confused...
--
An old man doll... just what I always wanted! - Clara
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations
2024-10-21 9:33 ` [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Bagas Sanjaya
@ 2024-10-21 13:47 ` Kuan-Wei Chiu
2024-10-28 5:04 ` Bagas Sanjaya
0 siblings, 1 reply; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-10-21 13:47 UTC (permalink / raw)
To: Bagas Sanjaya
Cc: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm, mark.rutland, alexander.shishkin, jolsa, irogers,
adrian.hunter, kan.liang, willy, jserv, linux-kernel,
linux-bcache, dm-devel, linux-bcachefs, linux-perf-users,
linux-doc
On Mon, Oct 21, 2024 at 04:33:37PM +0700, Bagas Sanjaya wrote:
> On Sun, Oct 20, 2024 at 12:01:50PM +0800, Kuan-Wei Chiu wrote:
> > Add non-inline versions of the min heap API functions in lib/min_heap.c
> > and updates all users outside of kernel/events/core.c to use these
> > non-inline versions. To mitigate the performance impact of indirect
> > function calls caused by the non-inline versions of the swap and
> > compare functions, a builtin swap has been introduced that swaps
> > elements based on their size. Additionally, it micro-optimizes the
> > efficiency of the min heap by pre-scaling the counter, following the
> > same approach as in lib/sort.c. Documentation for the min heap API has
> > also been added to the core-api section.
>
> What tree (and commit) this series is based on?
>
> Confused...
>
This patchset is based on Linus' tree, commit 715ca9dd687f ("Merge tag
'io_uring-6.12-20241019' of git://git.kernel.dk/linux"). Since it
touches multiple subsystems, I'm not entirely sure which tree I should
base it on. Should it be linux-next, perhaps?
Regards,
Kuan-Wei
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations
2024-10-20 4:01 [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Kuan-Wei Chiu
` (10 preceding siblings ...)
2024-10-21 9:33 ` [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Bagas Sanjaya
@ 2024-10-26 12:44 ` Kuan-Wei Chiu
11 siblings, 0 replies; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-10-26 12:44 UTC (permalink / raw)
To: kent.overstreet
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc, colyli, msakai,
corbet, peterz, mingo, acme, namhyung, akpm
On Sun, Oct 20, 2024 at 12:01:50PM +0800, Kuan-Wei Chiu wrote:
> Add non-inline versions of the min heap API functions in lib/min_heap.c
> and updates all users outside of kernel/events/core.c to use these
> non-inline versions. To mitigate the performance impact of indirect
> function calls caused by the non-inline versions of the swap and
> compare functions, a builtin swap has been introduced that swaps
> elements based on their size. Additionally, it micro-optimizes the
> efficiency of the min heap by pre-scaling the counter, following the
> same approach as in lib/sort.c. Documentation for the min heap API has
> also been added to the core-api section.
>
Hi Kent,
FWIW, here are the bcachefs CI test results for this patch series:
https://evilpiepirate.org/~testdashboard/ci?user=visitorckw&branch=min-heap-update
Regards,
Kuan-Wei
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations
2024-10-21 13:47 ` Kuan-Wei Chiu
@ 2024-10-28 5:04 ` Bagas Sanjaya
0 siblings, 0 replies; 19+ messages in thread
From: Bagas Sanjaya @ 2024-10-28 5:04 UTC (permalink / raw)
To: Kuan-Wei Chiu
Cc: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm, mark.rutland, alexander.shishkin, jolsa, irogers,
adrian.hunter, kan.liang, willy, jserv, linux-kernel,
linux-bcache, dm-devel, linux-bcachefs, linux-perf-users,
linux-doc
[-- Attachment #1: Type: text/plain, Size: 1322 bytes --]
On Mon, Oct 21, 2024 at 09:47:45PM +0800, Kuan-Wei Chiu wrote:
> On Mon, Oct 21, 2024 at 04:33:37PM +0700, Bagas Sanjaya wrote:
> > On Sun, Oct 20, 2024 at 12:01:50PM +0800, Kuan-Wei Chiu wrote:
> > > Add non-inline versions of the min heap API functions in lib/min_heap.c
> > > and updates all users outside of kernel/events/core.c to use these
> > > non-inline versions. To mitigate the performance impact of indirect
> > > function calls caused by the non-inline versions of the swap and
> > > compare functions, a builtin swap has been introduced that swaps
> > > elements based on their size. Additionally, it micro-optimizes the
> > > efficiency of the min heap by pre-scaling the counter, following the
> > > same approach as in lib/sort.c. Documentation for the min heap API has
> > > also been added to the core-api section.
> >
> > What tree (and commit) this series is based on?
> >
> > Confused...
> >
> This patchset is based on Linus' tree, commit 715ca9dd687f ("Merge tag
> 'io_uring-6.12-20241019' of git://git.kernel.dk/linux"). Since it
> touches multiple subsystems, I'm not entirely sure which tree I should
> base it on. Should it be linux-next, perhaps?
Nope.
Anyway, series applied for docs review. Thanks!
--
An old man doll... just what I always wanted! - Clara
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 10/10] Documentation/core-api: Add min heap API introduction
2024-10-20 4:02 ` [PATCH v2 10/10] Documentation/core-api: Add min heap API introduction Kuan-Wei Chiu
@ 2024-10-28 6:10 ` Bagas Sanjaya
0 siblings, 0 replies; 19+ messages in thread
From: Bagas Sanjaya @ 2024-10-28 6:10 UTC (permalink / raw)
To: Kuan-Wei Chiu, colyli, kent.overstreet, msakai, corbet, peterz,
mingo, acme, namhyung, akpm
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
kan.liang, willy, jserv, linux-kernel, linux-bcache, dm-devel,
linux-bcachefs, linux-perf-users, linux-doc
[-- Attachment #1: Type: text/plain, Size: 504 bytes --]
On Sun, Oct 20, 2024 at 12:02:00PM +0800, Kuan-Wei Chiu wrote:
> Introduce an overview of the min heap API, detailing its usage and
> functionality. The documentation aims to provide developers with a
> clear understanding of how to implement and utilize min heaps within
> the Linux kernel, enhancing the overall accessibility of this data
> structure.
>
The doc LGTM, thanks!
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
--
An old man doll... just what I always wanted! - Clara
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 01/10] lib/min_heap: Introduce non-inline versions of min heap API functions
2024-10-20 4:01 ` [PATCH v2 01/10] lib/min_heap: Introduce non-inline versions of min heap API functions Kuan-Wei Chiu
@ 2024-11-26 13:27 ` Geert Uytterhoeven
2024-11-27 2:59 ` Kuan-Wei Chiu
0 siblings, 1 reply; 19+ messages in thread
From: Geert Uytterhoeven @ 2024-11-26 13:27 UTC (permalink / raw)
To: Kuan-Wei Chiu
Cc: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm, mark.rutland, alexander.shishkin, jolsa, irogers,
adrian.hunter, kan.liang, willy, jserv, linux-kernel,
linux-bcache, dm-devel, linux-bcachefs, linux-perf-users,
linux-doc, open list:KERNEL SELFTEST FRAMEWORK
Hi Kuan-Wei,
On Sun, Oct 20, 2024 at 6:02 AM Kuan-Wei Chiu <visitorckw@gmail.com> wrote:
> All current min heap API functions are marked with '__always_inline'.
> However, as the number of users increases, inlining these functions
> everywhere leads to a increase in kernel size.
>
> In performance-critical paths, such as when perf events are enabled and
> min heap functions are called on every context switch, it is important
> to retain the inline versions for optimal performance. To balance this,
> the original inline functions are kept, and additional non-inline
> versions of the functions have been added in lib/min_heap.c.
>
> Link: https://lore.kernel.org/20240522161048.8d8bbc7b153b4ecd92c50666@linux-foundation.org
> Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Thanks for your patch, which is now commit 92a8b224b833e82d
("lib/min_heap: introduce non-inline versions of min heap API
functions") upstream.
> --- a/include/linux/min_heap.h
> +++ b/include/linux/min_heap.h
> @@ -50,33 +50,33 @@ void __min_heap_init(min_heap_char *heap, void *data, int size)
> heap->data = heap->preallocated;
> }
>
> -#define min_heap_init(_heap, _data, _size) \
> - __min_heap_init((min_heap_char *)_heap, _data, _size)
> +#define min_heap_init_inline(_heap, _data, _size) \
> + __min_heap_init_inline((min_heap_char *)_heap, _data, _size)
Casting macro parameters without any further checks prevents the
compiler from detecting silly mistakes. Would it be possible to
add safety-nets here and below, using e.g. container_of() or typeof()
checks?
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -777,3 +777,6 @@ config POLYNOMIAL
>
> config FIRMWARE_TABLE
> bool
> +
> +config MIN_HEAP
> + bool
Perhaps tristate? See also below.
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -2279,6 +2279,7 @@ config TEST_LIST_SORT
> config TEST_MIN_HEAP
> tristate "Min heap test"
> depends on DEBUG_KERNEL || m
> + select MIN_HEAP
Ideally, tests should not select functionality, to prevent increasing the
attack vector by merely enabling (modular) tests.
In this particular case, just using "depends on MIN_HEAP" is not an
option, as MIN_HEAP is not user-visible, and thus cannot be enabled
by the user on its own. However, making MIN_HEAP tristate could be
a first step for the modular case.
The builtin case is harder to fix, as e.g.
depends on MIN_HEAP || COMPILE_TEST
select MIN_HEAP if COMPILE_TEST
would still trigger a recursive dependency error.
Alternatively, the test could just keep on using the inline variants,
unless CONFIG_MIN_HEAP=y? Or event test both for the latter?
> help
> Enable this to turn on min heap function tests. This test is
> executed only once during system boot (so affects only boot time),
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 01/10] lib/min_heap: Introduce non-inline versions of min heap API functions
2024-11-26 13:27 ` Geert Uytterhoeven
@ 2024-11-27 2:59 ` Kuan-Wei Chiu
2024-11-27 8:04 ` Geert Uytterhoeven
0 siblings, 1 reply; 19+ messages in thread
From: Kuan-Wei Chiu @ 2024-11-27 2:59 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm, mark.rutland, alexander.shishkin, jolsa, irogers,
adrian.hunter, kan.liang, willy, jserv, linux-kernel,
linux-bcache, dm-devel, linux-bcachefs, linux-perf-users,
linux-doc, open list:KERNEL SELFTEST FRAMEWORK
Hi Geert,
On Tue, Nov 26, 2024 at 02:27:09PM +0100, Geert Uytterhoeven wrote:
> Hi Kuan-Wei,
>
> On Sun, Oct 20, 2024 at 6:02 AM Kuan-Wei Chiu <visitorckw@gmail.com> wrote:
> > All current min heap API functions are marked with '__always_inline'.
> > However, as the number of users increases, inlining these functions
> > everywhere leads to a increase in kernel size.
> >
> > In performance-critical paths, such as when perf events are enabled and
> > min heap functions are called on every context switch, it is important
> > to retain the inline versions for optimal performance. To balance this,
> > the original inline functions are kept, and additional non-inline
> > versions of the functions have been added in lib/min_heap.c.
> >
> > Link: https://lore.kernel.org/20240522161048.8d8bbc7b153b4ecd92c50666@linux-foundation.org
> > Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
>
> Thanks for your patch, which is now commit 92a8b224b833e82d
> ("lib/min_heap: introduce non-inline versions of min heap API
> functions") upstream.
>
> > --- a/include/linux/min_heap.h
> > +++ b/include/linux/min_heap.h
>
> > @@ -50,33 +50,33 @@ void __min_heap_init(min_heap_char *heap, void *data, int size)
> > heap->data = heap->preallocated;
> > }
> >
> > -#define min_heap_init(_heap, _data, _size) \
> > - __min_heap_init((min_heap_char *)_heap, _data, _size)
> > +#define min_heap_init_inline(_heap, _data, _size) \
> > + __min_heap_init_inline((min_heap_char *)_heap, _data, _size)
>
> Casting macro parameters without any further checks prevents the
> compiler from detecting silly mistakes. Would it be possible to
> add safety-nets here and below, using e.g. container_of() or typeof()
> checks?
IIUC, the concern is that passing a pointer that is not of type
min_heap might lead to compiler errors being missed. To address this,
one possible solution could be to expand the members of struct min_heap
into individual parameters for the function.
diff --git a/include/linux/min_heap.h b/include/linux/min_heap.h
index e781727c8916..ebd577003f0b 100644
--- a/include/linux/min_heap.h
+++ b/include/linux/min_heap.h
@@ -207,18 +207,20 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
/* Initialize a min-heap. */
static __always_inline
-void __min_heap_init_inline(min_heap_char *heap, void *data, int size)
+void __min_heap_init_inline(int *heap_nr, int *heap_size, void **heap_data,
+ void *heap_preallocated, void *data, int size)
{
- heap->nr = 0;
- heap->size = size;
+ *heap_nr = 0;
+ *heap_size = size;
if (data)
- heap->data = data;
+ *heap_data = data;
else
- heap->data = heap->preallocated;
+ *heap_data = heap_preallocated;
}
#define min_heap_init_inline(_heap, _data, _size) \
- __min_heap_init_inline((min_heap_char *)_heap, _data, _size)
+ __min_heap_init_inline(&(_heap)->nr, &(_heap)->size, (void**)&(_heap)->data, \
+ &(_heap)->preallocated, _data, _size)
/* Get the minimum element from the heap. */
static __always_inline
Alternatively, we could use container_of() for type safety.
diff --git a/include/linux/min_heap.h b/include/linux/min_heap.h
index e781727c8916..fb96b1b82fb0 100644
--- a/include/linux/min_heap.h
+++ b/include/linux/min_heap.h
@@ -218,7 +218,7 @@ void __min_heap_init_inline(min_heap_char *heap, void *data, int size)
}
#define min_heap_init_inline(_heap, _data, _size) \
- __min_heap_init_inline((min_heap_char *)_heap, _data, _size)
+ __min_heap_init_inline(container_of(&(_heap)->nr, min_heap_char, nr), _data, _size)
/* Get the minimum element from the heap. */
static __always_inline
The first approach has better readability, while the second minimizes
the changes needed. Please let me know your thoughts.
>
> > --- a/lib/Kconfig
> > +++ b/lib/Kconfig
> > @@ -777,3 +777,6 @@ config POLYNOMIAL
> >
> > config FIRMWARE_TABLE
> > bool
> > +
> > +config MIN_HEAP
> > + bool
>
> Perhaps tristate? See also below.
>
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -2279,6 +2279,7 @@ config TEST_LIST_SORT
> > config TEST_MIN_HEAP
> > tristate "Min heap test"
> > depends on DEBUG_KERNEL || m
> > + select MIN_HEAP
>
> Ideally, tests should not select functionality, to prevent increasing the
> attack vector by merely enabling (modular) tests.
>
Makes sense. Thanks for catching this.
> In this particular case, just using "depends on MIN_HEAP" is not an
> option, as MIN_HEAP is not user-visible, and thus cannot be enabled
> by the user on its own. However, making MIN_HEAP tristate could be
> a first step for the modular case.
>
> The builtin case is harder to fix, as e.g.
>
> depends on MIN_HEAP || COMPILE_TEST
> select MIN_HEAP if COMPILE_TEST
>
> would still trigger a recursive dependency error.
>
> Alternatively, the test could just keep on using the inline variants,
> unless CONFIG_MIN_HEAP=y? Or event test both for the latter?
>
I think that having min_heap_test continue using the inline variants
might be the simplest solution?
Regards,
Kuan-Wei
> > help
> > Enable this to turn on min heap function tests. This test is
> > executed only once during system boot (so affects only boot time),
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2 01/10] lib/min_heap: Introduce non-inline versions of min heap API functions
2024-11-27 2:59 ` Kuan-Wei Chiu
@ 2024-11-27 8:04 ` Geert Uytterhoeven
0 siblings, 0 replies; 19+ messages in thread
From: Geert Uytterhoeven @ 2024-11-27 8:04 UTC (permalink / raw)
To: Kuan-Wei Chiu
Cc: colyli, kent.overstreet, msakai, corbet, peterz, mingo, acme,
namhyung, akpm, mark.rutland, alexander.shishkin, jolsa, irogers,
adrian.hunter, kan.liang, willy, jserv, linux-kernel,
linux-bcache, dm-devel, linux-bcachefs, linux-perf-users,
linux-doc, open list:KERNEL SELFTEST FRAMEWORK
Hi Kuan-Wei,
On Wed, Nov 27, 2024 at 3:59 AM Kuan-Wei Chiu <visitorckw@gmail.com> wrote:
> On Tue, Nov 26, 2024 at 02:27:09PM +0100, Geert Uytterhoeven wrote:
> > On Sun, Oct 20, 2024 at 6:02 AM Kuan-Wei Chiu <visitorckw@gmail.com> wrote:
> > > All current min heap API functions are marked with '__always_inline'.
> > > However, as the number of users increases, inlining these functions
> > > everywhere leads to a increase in kernel size.
> > >
> > > In performance-critical paths, such as when perf events are enabled and
> > > min heap functions are called on every context switch, it is important
> > > to retain the inline versions for optimal performance. To balance this,
> > > the original inline functions are kept, and additional non-inline
> > > versions of the functions have been added in lib/min_heap.c.
> > >
> > > Link: https://lore.kernel.org/20240522161048.8d8bbc7b153b4ecd92c50666@linux-foundation.org
> > > Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> > > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
> >
> > Thanks for your patch, which is now commit 92a8b224b833e82d
> > ("lib/min_heap: introduce non-inline versions of min heap API
> > functions") upstream.
> >
> > > --- a/include/linux/min_heap.h
> > > +++ b/include/linux/min_heap.h
> >
> > > @@ -50,33 +50,33 @@ void __min_heap_init(min_heap_char *heap, void *data, int size)
> > > heap->data = heap->preallocated;
> > > }
> > >
> > > -#define min_heap_init(_heap, _data, _size) \
> > > - __min_heap_init((min_heap_char *)_heap, _data, _size)
> > > +#define min_heap_init_inline(_heap, _data, _size) \
> > > + __min_heap_init_inline((min_heap_char *)_heap, _data, _size)
> >
> > Casting macro parameters without any further checks prevents the
> > compiler from detecting silly mistakes. Would it be possible to
> > add safety-nets here and below, using e.g. container_of() or typeof()
> > checks?
>
> IIUC, the concern is that passing a pointer that is not of type
> min_heap might lead to compiler errors being missed. To address this,
Exactly.
> one possible solution could be to expand the members of struct min_heap
> into individual parameters for the function.
>
> diff --git a/include/linux/min_heap.h b/include/linux/min_heap.h
> index e781727c8916..ebd577003f0b 100644
> --- a/include/linux/min_heap.h
> +++ b/include/linux/min_heap.h
> @@ -207,18 +207,20 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
>
> /* Initialize a min-heap. */
> static __always_inline
> -void __min_heap_init_inline(min_heap_char *heap, void *data, int size)
> +void __min_heap_init_inline(int *heap_nr, int *heap_size, void **heap_data,
> + void *heap_preallocated, void *data, int size)
> {
Which means all functions and users must be updated now, and possibly
again in the future (when there is ever a need to change the struct).
> Alternatively, we could use container_of() for type safety.
>
> diff --git a/include/linux/min_heap.h b/include/linux/min_heap.h
> index e781727c8916..fb96b1b82fb0 100644
> --- a/include/linux/min_heap.h
> +++ b/include/linux/min_heap.h
> @@ -218,7 +218,7 @@ void __min_heap_init_inline(min_heap_char *heap, void *data, int size)
> }
>
> #define min_heap_init_inline(_heap, _data, _size) \
> - __min_heap_init_inline((min_heap_char *)_heap, _data, _size)
> + __min_heap_init_inline(container_of(&(_heap)->nr, min_heap_char, nr), _data, _size)
>
> /* Get the minimum element from the heap. */
> static __always_inline
>
> The first approach has better readability, while the second minimizes
> the changes needed. Please let me know your thoughts.
container_of() is a well-known idiom in the Linux kernel, so I would go
for this solution, for min_heap_init_inline() and all other functions
currently using such a cast.
> > > --- a/lib/Kconfig
> > > +++ b/lib/Kconfig
> > > @@ -777,3 +777,6 @@ config POLYNOMIAL
> > >
> > > config FIRMWARE_TABLE
> > > bool
> > > +
> > > +config MIN_HEAP
> > > + bool
> >
> > Perhaps tristate? See also below.
> >
> > > --- a/lib/Kconfig.debug
> > > +++ b/lib/Kconfig.debug
> > > @@ -2279,6 +2279,7 @@ config TEST_LIST_SORT
> > > config TEST_MIN_HEAP
> > > tristate "Min heap test"
> > > depends on DEBUG_KERNEL || m
> > > + select MIN_HEAP
> >
> > Ideally, tests should not select functionality, to prevent increasing the
> > attack vector by merely enabling (modular) tests.
> >
> Makes sense. Thanks for catching this.
>
> > In this particular case, just using "depends on MIN_HEAP" is not an
> > option, as MIN_HEAP is not user-visible, and thus cannot be enabled
> > by the user on its own. However, making MIN_HEAP tristate could be
> > a first step for the modular case.
> >
> > The builtin case is harder to fix, as e.g.
> >
> > depends on MIN_HEAP || COMPILE_TEST
> > select MIN_HEAP if COMPILE_TEST
> >
> > would still trigger a recursive dependency error.
> >
> > Alternatively, the test could just keep on using the inline variants,
> > unless CONFIG_MIN_HEAP=y? Or event test both for the latter?
> >
> I think that having min_heap_test continue using the inline variants
> might be the simplest solution?
As lib/min_heap.c contains just wrappers around the inline functions,
that would be fine for me. If/when lib/min_heap.c gains more
functionality, tests for that can be added to lib/test_min_heap.c
inside an #ifdef CONFIG_MIN_HEAP/#endif block.
Thanks!
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2024-11-27 8:04 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-20 4:01 [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 01/10] lib/min_heap: Introduce non-inline versions of min heap API functions Kuan-Wei Chiu
2024-11-26 13:27 ` Geert Uytterhoeven
2024-11-27 2:59 ` Kuan-Wei Chiu
2024-11-27 8:04 ` Geert Uytterhoeven
2024-10-20 4:01 ` [PATCH v2 02/10] lib min_heap: Optimize min heap by prescaling counters for better performance Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 03/10] lib min_heap: Avoid indirect function call by providing default swap Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 04/10] lib/test_min_heap: Update min_heap_callbacks to use default builtin swap Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 05/10] perf/core: " Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 06/10] dm vdo: " Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 07/10] bcache: " Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 08/10] bcachefs: Clean up duplicate min_heap_callbacks declarations Kuan-Wei Chiu
2024-10-20 4:01 ` [PATCH v2 09/10] bcachefs: Update min_heap_callbacks to use default builtin swap Kuan-Wei Chiu
2024-10-20 4:02 ` [PATCH v2 10/10] Documentation/core-api: Add min heap API introduction Kuan-Wei Chiu
2024-10-28 6:10 ` Bagas Sanjaya
2024-10-21 9:33 ` [PATCH v2 00/10] Enhance min heap API with non-inline functions and optimizations Bagas Sanjaya
2024-10-21 13:47 ` Kuan-Wei Chiu
2024-10-28 5:04 ` Bagas Sanjaya
2024-10-26 12:44 ` Kuan-Wei Chiu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).