public inbox for linux-bcache@vger.kernel.org
From: Kuan-Wei Chiu <visitorckw@gmail.com>
To: Robert Pang <robertpang@google.com>
Cc: Coly Li <colyli@kernel.org>,
	Kent Overstreet <kent.overstreet@linux.dev>,
	linux-bcache@vger.kernel.org
Subject: Re: [PATCH 2/3] lib min_heap: add alternative APIs that use the conventional top-down strategy to sift down elements
Date: Fri, 6 Jun 2025 20:52:41 +0800
Message-ID: <aELkmbQpRFejwtIl@visitorckw-System-Product-Name>
In-Reply-To: <20250606071959.1685079-3-robertpang@google.com>

On Fri, Jun 06, 2025 at 12:19:44AM -0700, Robert Pang wrote:
> Add these min_heap functions that re-introduce the conventional top-down
> strategy to sift down elements. This strategy offers significant performance
> improvements for data that are mostly identical. [1]
> 
> - heapify_all_top_down
> - heap_pop_top_down
> - heap_pop_push_top_down
> - heap_del_top_down
> 
> [1] https://lore.kernel.org/linux-bcache/wtfuhfntbi6yorxqtpcs4vg5w67mvyckp2a6jmxuzt2hvbw65t@gznwsae5653d/T/#m155a21be72ff0cc57d825affbcafc77ac5c2dd0d

Nit: I'd prefer using a Link: tag here.

> 
> Signed-off-by: Robert Pang <robertpang@google.com>
> ---
>  include/linux/min_heap.h | 75 ++++++++++++++++++++++++++++++++++++++++
>  lib/min_heap.c           |  7 ++++
>  2 files changed, 82 insertions(+)
> 
> diff --git a/include/linux/min_heap.h b/include/linux/min_heap.h
> index 1fe6772170e7..149069317bb3 100644
> --- a/include/linux/min_heap.h
> +++ b/include/linux/min_heap.h
> @@ -494,4 +494,79 @@ bool __min_heap_del(min_heap_char *heap, size_t elem_size, size_t idx,
>  	__min_heap_del(container_of(&(_heap)->nr, min_heap_char, nr),	\
>  		       __minheap_obj_size(_heap), _idx, _func, _args, __min_heap_sift_down)
>  
> +static __always_inline
> +void __min_heap_sift_down_top_down_inline(min_heap_char *heap, int pos, size_t elem_size,
> +					  const struct min_heap_callbacks *func, void *args)
> +{
> +	void *data = heap->data;
> +	void (*swp)(void *lhs, void *rhs, void *args) = func->swp;
> +	/* pre-scale counters for performance */
> +	size_t a = pos * elem_size;
> +	size_t b, c, d, smallest;
> +	size_t n = heap->nr * elem_size;
> +
> +	if (!swp)
> +		swp = select_swap_func(data, elem_size);
> +
> +	for (;;) {
> +		if (2 * a + elem_size >= n)
> +			break;
> +
> +		c = 2 * a + elem_size;
> +		b = a;
> +		smallest = b;
> +		if (func->less(data + c, data + smallest, args))
> +			smallest = c;
> +
> +		if (c + elem_size < n) {
> +			d = c + elem_size;
> +			if (func->less(data + d, data + smallest, args))
> +				smallest = d;
> +		}
> +		if (smallest == b)
> +			break;
> +		do_swap(data + smallest, data + b, elem_size, swp, args);
> +		a = (smallest == c) ? c : d;
> +	}
> +}

The logic looks correct, but we actually only need variables a, b, and
c. The use of d and the extra nested if seem unnecessary. I think the
following version is shorter and easier to understand:

for (;;) {
	b = 2 * a + elem_size;
	c = b + elem_size;
	smallest = a;

	if (b >= n)
		break;

	if (func->less(data + b, data + smallest, args))
		smallest = b;

	if (c < n && func->less(data + c, data + smallest, args))
		smallest = c;

	if (smallest == a)
		break;

	do_swap(data + a, data + smallest, elem_size, swp, args);
	a = smallest;
}
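
To try the simplified loop outside the kernel, here is a minimal
userspace sketch. It assumes plain int elements and uses element indices
instead of the pre-scaled byte offsets; less_int(), swap_int(),
heapify_all() and is_min_heap() are hypothetical helpers for
illustration, not part of include/linux/min_heap.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for func->less(): plain int comparison. */
static int less_int(const int *lhs, const int *rhs)
{
	return *lhs < *rhs;
}

/* Hypothetical stand-in for do_swap(). */
static void swap_int(int *lhs, int *rhs)
{
	int tmp = *lhs;

	*lhs = *rhs;
	*rhs = tmp;
}

/* Top-down sift-down using only a, b and c, as in the loop above. */
static void sift_down_top_down(int *data, size_t nr, size_t pos)
{
	size_t a = pos, b, c, smallest;

	for (;;) {
		b = 2 * a + 1;	/* left child */
		c = b + 1;	/* right child */
		smallest = a;

		if (b >= nr)	/* a is a leaf */
			break;

		if (less_int(&data[b], &data[smallest]))
			smallest = b;

		if (c < nr && less_int(&data[c], &data[smallest]))
			smallest = c;

		if (smallest == a)	/* heap property already holds */
			break;

		swap_int(&data[a], &data[smallest]);
		a = smallest;
	}
}

/* Build a min-heap by sifting down every non-leaf node, last one first. */
static void heapify_all(int *data, size_t nr)
{
	assert(data || nr == 0);

	for (size_t i = nr / 2; i-- > 0;)
		sift_down_top_down(data, nr, i);
}

/* Return 1 if no element is smaller than its parent. */
static int is_min_heap(const int *data, size_t nr)
{
	for (size_t i = 1; i < nr; i++)
		if (data[i] < data[(i - 1) / 2])
			return 0;
	return 1;
}
```

Calling heapify_all() on { 5, 3, 8, 1, 9, 2, 7 } leaves the minimum, 1,
at index 0, and is_min_heap() then returns 1.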

> +
> +#define min_heap_sift_down_top_down_inline(_heap, _pos, _func, _args)	\
> +	__min_heap_sift_down_top_down_inline(container_of(&(_heap)->nr, min_heap_char, nr),	\
> +					     _pos, __minheap_obj_size(_heap), _func, _args)
> +#define min_heapify_all_top_down_inline(_heap, _func, _args)	\
> +	__min_heapify_all_inline(container_of(&(_heap)->nr, min_heap_char, nr),	\
> +				 __minheap_obj_size(_heap), _func, _args,	\
> +				 __min_heap_sift_down_top_down_inline)
> +#define min_heap_pop_top_down_inline(_heap, _func, _args)	\
> +	__min_heap_pop_inline(container_of(&(_heap)->nr, min_heap_char, nr),	\
> +			      __minheap_obj_size(_heap), _func, _args,	\
> +			      __min_heap_sift_down_top_down_inline)
> +#define min_heap_pop_push_top_down_inline(_heap, _element, _func, _args)	\
> +	__min_heap_pop_push_inline(container_of(&(_heap)->nr, min_heap_char, nr), _element,	\
> +				   __minheap_obj_size(_heap), _func, _args,	\
> +				   __min_heap_sift_down_top_down_inline)
> +#define min_heap_del_top_down_inline(_heap, _idx, _func, _args)	\
> +	__min_heap_del_inline(container_of(&(_heap)->nr, min_heap_char, nr),	\
> +			      __minheap_obj_size(_heap), _idx, _func, _args,	\
> +			      __min_heap_sift_down_top_down_inline))
> +
> +void __min_heap_sift_down_top_down(min_heap_char *heap, int pos, size_t elem_size,
> +                                   const struct min_heap_callbacks *func, void *args);
> +
> +#define min_heap_sift_down_top_down(_heap, _pos, _func, _args)	\
> +	__min_heap_sift_down(container_of(&(_heap)->nr, min_heap_char, nr), _pos,	\
> +			     __minheap_obj_size(_heap), _func, _args)
> +#define min_heapify_all_top_down(_heap, _func, _args)	\
> +	__min_heapify_all(container_of(&(_heap)->nr, min_heap_char, nr),	\
> +			  __minheap_obj_size(_heap), _func, _args, __min_heap_sift_down_top_down)
> +#define min_heap_pop_top_down(_heap, _func, _args)	\
> +	__min_heap_pop(container_of(&(_heap)->nr, min_heap_char, nr),	\
> +		       __minheap_obj_size(_heap), _func, _args, __min_heap_sift_down_top_down)
> +#define min_heap_pop_push_top_down(_heap, _element, _func, _args)	\
> +	__min_heap_pop_push(container_of(&(_heap)->nr, min_heap_char, nr), _element,	\
> +			    __minheap_obj_size(_heap), _func, _args, __min_heap_sift_down_top_down)
> +#define min_heap_del_top_down(_heap, _idx, _func, _args)	\
> +	__min_heap_del(container_of(&(_heap)->nr, min_heap_char, nr),	\
> +		       __minheap_obj_size(_heap), _idx, _func, _args, __min_heap_sift_down_top_down)
> +

I think we should document in Documentation/core-api/min_heap.rst why
the *_top_down variants exist and how to choose between them and the
existing variants. Otherwise, it could be confusing for future users.
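
As a rough illustration of what such documentation could explain, the
sketch below counts comparisons for the two strategies on identical
keys. It assumes the existing sift-down follows the usual bottom-up
approach (walk to a leaf along the smaller child, then backtrack); both
routines are simplified, index-based userspace versions on int arrays
and not the code in include/linux/min_heap.h. All names here are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static size_t nr_cmp;	/* less_int() calls since the last reset */

static int less_int(int lhs, int rhs)
{
	nr_cmp++;
	return lhs < rhs;
}

static void swap_int(int *lhs, int *rhs)
{
	int tmp = *lhs;

	*lhs = *rhs;
	*rhs = tmp;
}

/* Top-down: compare the parent against its children at each level. */
static void sift_down_top_down(int *data, size_t nr, size_t pos)
{
	size_t a = pos, b, c, smallest;

	for (;;) {
		b = 2 * a + 1;
		c = b + 1;
		smallest = a;
		if (b >= nr)
			break;
		if (less_int(data[b], data[smallest]))
			smallest = b;
		if (c < nr && less_int(data[c], data[smallest]))
			smallest = c;
		if (smallest == a)
			break;
		swap_int(&data[a], &data[smallest]);
		a = smallest;
	}
}

/* Bottom-up: follow the smaller child to a leaf, then backtrack. */
static void sift_down_bottom_up(int *data, size_t nr, size_t pos)
{
	size_t a = pos, b = pos, c, d;

	/* Descend while the current node has two children. */
	for (;;) {
		c = 2 * b + 1;
		d = c + 1;
		if (d >= nr)
			break;
		b = less_int(data[c], data[d]) ? c : d;
	}
	/* A last leaf without a sibling extends the path by one step. */
	if (d == nr)
		b = c;

	/* Walk back up to the slot where data[a] belongs. */
	while (b != a && less_int(data[a], data[b]))
		b = (b - 1) / 2;

	/* Rotate the path so data[a] lands in that slot. */
	c = b;
	while (b != a) {
		b = (b - 1) / 2;
		swap_int(&data[b], &data[c]);
	}
}

/* Run one sift-down from the root and return the comparison count. */
static size_t count_cmp(void (*sift)(int *, size_t, size_t),
			int *data, size_t nr)
{
	nr_cmp = 0;
	sift(data, nr, 0);
	return nr_cmp;
}

/* Return 1 if no element is smaller than its parent. */
static int is_min_heap(const int *data, size_t nr)
{
	for (size_t i = 1; i < nr; i++)
		if (data[i] < data[(i - 1) / 2])
			return 0;
	return 1;
}
```

With all keys equal, the top-down version stops at the root after
comparing it with its two children, while the bottom-up version still
descends to a leaf and then swaps along the whole path, so its cost
grows with the height of the heap. When the root really has to travel
far down, the bottom-up walk tends to need fewer comparisons, which is
the trade-off the documentation should spell out.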

Regards,
Kuan-Wei

>  #endif /* _LINUX_MIN_HEAP_H */
> diff --git a/lib/min_heap.c b/lib/min_heap.c
> index 4ec425788783..a10d3a7cc525 100644
> --- a/lib/min_heap.c
> +++ b/lib/min_heap.c
> @@ -27,6 +27,13 @@ void __min_heap_sift_down(min_heap_char *heap, int pos, size_t elem_size,
>  }
>  EXPORT_SYMBOL(__min_heap_sift_down);
>  
> +void __min_heap_sift_down_top_down(min_heap_char *heap, int pos, size_t elem_size,
> +				   const struct min_heap_callbacks *func, void *args)
> +{
> +	__min_heap_sift_down_top_down_inline(heap, pos, elem_size, func, args);
> +}
> +EXPORT_SYMBOL(__min_heap_sift_down_top_down);
> +
>  void __min_heap_sift_up(min_heap_char *heap, size_t elem_size, size_t idx,
>  			const struct min_heap_callbacks *func, void *args)
>  {
> -- 
> 2.50.0.rc1.591.g9c95f17f64-goog
> 

Thread overview: 7+ messages
2025-06-06  7:19 [PATCH 0/3] bcache: Fix the tail IO latency regression due to the use of lib min_heap Robert Pang
2025-06-06  7:19 ` [PATCH 1/3] lib min_heap: refactor min_heap to allow the alternative sift-down function to be used Robert Pang
2025-06-06  7:19 ` [PATCH 2/3] lib min_heap: add alternative APIs that use the conventional top-down strategy to sift down elements Robert Pang
2025-06-06 12:52   ` Kuan-Wei Chiu [this message]
2025-06-06  7:19 ` [PATCH 3/3] bcache: Fix the tail IO latency regression due to the use of lib min_heap Robert Pang
2025-06-06 13:01   ` Kuan-Wei Chiu
2025-06-10 12:44     ` Robert Pang
