From: Arunpravin Paneer Selvam <arunpravin.paneerselvam@amd.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: christian.koenig@amd.com, matthew.auld@intel.com,
	jani.nikula@linux.intel.com, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org, alexander.deucher@amd.com
Subject: Re: [PATCH v5 1/2] drm/buddy: Optimize free block management with RB tree
Date: Tue, 2 Sep 2025 11:39:57 +0530
Message-ID: <5f999a32-db26-4964-8152-ac06da8beea4@amd.com>
In-Reply-To: <20250901194151.GJ4067720@noisy.programming.kicks-ass.net>


On 9/2/2025 1:11 AM, Peter Zijlstra wrote:
> On Tue, Sep 02, 2025 at 12:26:04AM +0530, Arunpravin Paneer Selvam wrote:
>> Replace the freelist (O(n)) used for free block management with a
>> red-black tree, providing more efficient O(log n) search, insert,
>> and delete operations. This improves scalability and performance
>> when managing large numbers of free blocks per order (e.g., hundreds
>> or thousands).
> Did you consider the interval tree?

In this allocator, free blocks are tracked individually by order, not as
arbitrary ranges, and the operations are keyed insert/delete/lookup, for
which an rbtree is sufficient and simpler, AFAIK. An interval tree buys
overlap queries over ranges, which this code never performs.
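
For example, the cheapest keyed lookup the allocator needs stays a
one-liner (sketch only; it mirrors the rbtree_last_entry() helper in the
patch, with the hypothetical name lowest_free_block):

static inline struct drm_buddy_block *
lowest_free_block(struct drm_buddy *mm, unsigned int order)
{
	/* In-order leftmost node == lowest offset for this order. */
	struct rb_node *node = rb_first(&mm->free_tree[order]);

	return node ? rb_entry(node, struct drm_buddy_block, rb) : NULL;
}

Nothing here needs the augmented subtree-max bookkeeping an interval
tree maintains on every insert and erase.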

>
>
>> @@ -41,23 +43,53 @@ static void drm_block_free(struct drm_buddy *mm,
>>   	kmem_cache_free(slab_blocks, block);
>>   }
>>   
>> -static void list_insert_sorted(struct drm_buddy *mm,
>> -			       struct drm_buddy_block *block)
>> +static void rbtree_insert(struct drm_buddy *mm,
>> +			  struct drm_buddy_block *block)
>>   {
>> +	struct rb_root *root = &mm->free_tree[drm_buddy_block_order(block)];
>> +	struct rb_node **link = &root->rb_node;
>> +	struct rb_node *parent = NULL;
>>   	struct drm_buddy_block *node;
>> -	struct list_head *head;
>> +	u64 offset;
>> +
>> +	offset = drm_buddy_block_offset(block);
>>   
>> -	head = &mm->free_list[drm_buddy_block_order(block)];
>> -	if (list_empty(head)) {
>> -		list_add(&block->link, head);
>> -		return;
>> +	while (*link) {
>> +		parent = *link;
>> +		node = rb_entry(parent, struct drm_buddy_block, rb);
>> +
>> +		if (offset < drm_buddy_block_offset(node))
>> +			link = &parent->rb_left;
>> +		else
>> +			link = &parent->rb_right;
>>   	}
>>   
>> -	list_for_each_entry(node, head, link)
>> -		if (drm_buddy_block_offset(block) < drm_buddy_block_offset(node))
>> -			break;
>> +	rb_link_node(&block->rb, parent, link);
>> +	rb_insert_color(&block->rb, root);
>> +}
> static inline bool __drm_bb_less(const struct drm_buddy_block *a,
> 				 const struct drm_buddy_block *b)
> {
> 	return drm_buddy_block_offset(a) < drm_buddy_block_offset(b);
> }
>
> #define __node_2_drm_bb(node) rb_entry((node), struct drm_buddy_block, rb)
>
> static inline bool rb_drm_bb_less(struct rb_node *a, const struct rb_node *b)
> {
> 	return __drm_bb_less(__node_2_drm_bb(a), __node_2_drm_bb(b));
> }
>
> static void rbtree_insert(struct drm_buddy *mm, struct drm_buddy_block *block)
> {
> 	rb_add(&block->rb, &mm->free_tree[drm_buddy_block_order(block)], rb_drm_bb_less);
> }
>
>> +
>> +static void rbtree_remove(struct drm_buddy *mm,
>> +			  struct drm_buddy_block *block)
>> +{
>> +	struct rb_root *root;
>> +
>> +	root = &mm->free_tree[drm_buddy_block_order(block)];
>> +	rb_erase(&block->rb, root);
>>   
>> -	__list_add(&block->link, node->link.prev, &node->link);
>> +	RB_CLEAR_NODE(&block->rb);
>> +}
>> +
>> +static inline struct drm_buddy_block *
>> +rbtree_last_entry(struct drm_buddy *mm, unsigned int order)
>> +{
>> +	struct rb_node *node = rb_last(&mm->free_tree[order]);
>> +
>> +	return node ? rb_entry(node, struct drm_buddy_block, rb) : NULL;
>> +}
> rb_add_cached() caches the leftmost entry, if you invert the key, the
> last is first.

With inversion, the in-tree ordering changes from natural ascending
offsets to descending, which can break assumptions in existing buddy
allocator code that expects ascending order.
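
For reference, a rough sketch of what the inverted, cached variant would
look like (hypothetical names; free_tree[] would have to become struct
rb_root_cached, and every erase would go through rb_erase_cached()):

static inline bool rb_drm_bb_greater(struct rb_node *a, const struct rb_node *b)
{
	/* Inverted key: the highest offset sorts leftmost. */
	return drm_buddy_block_offset(__node_2_drm_bb(a)) >
	       drm_buddy_block_offset(__node_2_drm_bb(b));
}

static void rbtree_insert_cached(struct drm_buddy *mm,
				 struct drm_buddy_block *block)
{
	rb_add_cached(&block->rb,
		      &mm->free_tree[drm_buddy_block_order(block)],
		      rb_drm_bb_greater);
}

static inline struct drm_buddy_block *
rbtree_last_entry_cached(struct drm_buddy *mm, unsigned int order)
{
	/* Cached leftmost is now the highest-offset block: O(1). */
	struct rb_node *node = rb_first_cached(&mm->free_tree[order]);

	return node ? rb_entry(node, struct drm_buddy_block, rb) : NULL;
}

That makes the highest-offset lookup O(1), but every in-order walk then
visits blocks from high to low offset, which is exactly the ordering
change we would rather not audit the rest of the allocator for.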

>
>> diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
>> index 8d2ba3749866..17190bb4837c 100644
>> --- a/include/linux/rbtree.h
>> +++ b/include/linux/rbtree.h
>> @@ -79,6 +79,62 @@ static inline void rb_link_node_rcu(struct rb_node *node, struct rb_node *parent
>>   	   ____ptr ? rb_entry(____ptr, type, member) : NULL; \
>>   	})
>>   
>> +/**
>> + * rbtree_for_each_entry - iterate in-order over rb_root of given type
>> + *
>> + * @pos:	the 'type *' to use as a loop cursor.
>> + * @root:	'rb_root *' of the rbtree.
>> + * @member:	the name of the rb_node field within 'type'.
>> + */
>> +#define rbtree_for_each_entry(pos, root, member) \
>> +	for ((pos) = rb_entry_safe(rb_first(root), typeof(*(pos)), member); \
>> +	     (pos); \
>> +	     (pos) = rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member))
>> +
>> +/**
>> + * rbtree_reverse_for_each_entry - iterate in reverse in-order over rb_root
>> + * of given type
>> + *
>> + * @pos:	the 'type *' to use as a loop cursor.
>> + * @root:	'rb_root *' of the rbtree.
>> + * @member:	the name of the rb_node field within 'type'.
>> + */
>> +#define rbtree_reverse_for_each_entry(pos, root, member) \
>> +	for ((pos) = rb_entry_safe(rb_last(root), typeof(*(pos)), member); \
>> +	     (pos); \
>> +	     (pos) = rb_entry_safe(rb_prev(&(pos)->member), typeof(*(pos)), member))
>> +
>> +/**
>> + * rbtree_for_each_entry_safe - iterate in-order over rb_root safe against removal
>> + *
>> + * @pos:	the 'type *' to use as a loop cursor
>> + * @n:		another 'type *' to use as temporary storage
>> + * @root:	'rb_root *' of the rbtree
>> + * @member:	the name of the rb_node field within 'type'
>> + */
>> +#define rbtree_for_each_entry_safe(pos, n, root, member) \
>> +	for ((pos) = rb_entry_safe(rb_first(root), typeof(*(pos)), member), \
>> +	     (n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member) : NULL; \
>> +	     (pos); \
>> +	     (pos) = (n), \
>> +	     (n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member) : NULL)
>> +
>> +/**
>> + * rbtree_reverse_for_each_entry_safe - iterate in reverse in-order over rb_root
>> + * safe against removal
>> + *
>> + * @pos:	the struct type * to use as a loop cursor.
>> + * @n:		another struct type * to use as temporary storage.
>> + * @root:	pointer to struct rb_root to iterate.
>> + * @member:	name of the rb_node field within the struct.
>> + */
>> +#define rbtree_reverse_for_each_entry_safe(pos, n, root, member) \
>> +	for ((pos) = rb_entry_safe(rb_last(root), typeof(*(pos)), member), \
>> +	     (n) = (pos) ? rb_entry_safe(rb_prev(&(pos)->member), typeof(*(pos)), member) : NULL; \
>> +	     (pos); \
>> +	     (pos) = (n), \
>> +	     (n) = (pos) ? rb_entry_safe(rb_prev(&(pos)->member), typeof(*(pos)), member) : NULL)
>> +
> Not really a fan of these. That's typically a sign you're doing it
> wrong. Full tree iteration is actually slower than linked list.

I understand your concern about full-tree iteration being slower than a
list walk. In our current use cases, though, the cost is not on the hot
path, and performance is comparable to, or even better than, list
traversal. We occasionally need to walk the full set of blocks to
perform specific operations, and these macros make that code simpler
and less error-prone. They aren't meant to replace targeted lookups or
bounded walks, just to cover the cases where a full traversal is
necessary.
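
For example (illustrative only, not part of the patch), draining one
order's tree during teardown with the _safe variant, using the
rbtree_remove() and drm_block_free() helpers from the patch:

static void drain_free_tree(struct drm_buddy *mm, unsigned int order)
{
	struct drm_buddy_block *block, *tmp;

	/* tmp holds the successor, so erasing block is safe. */
	rbtree_for_each_entry_safe(block, tmp, &mm->free_tree[order], rb) {
		rbtree_remove(mm, block);
		drm_block_free(mm, block);
	}
}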

Thanks,

Arun.

