Intel-XE Archive on lore.kernel.org
From: "Christian König" <christian.koenig@amd.com>
To: Arunpravin Paneer Selvam <arunpravin.paneerselvam@amd.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: matthew.auld@intel.com, jani.nikula@linux.intel.com,
	samuel.pitoiset@gmail.com, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, alexander.deucher@amd.com,
	stable@vger.kernel.org
Subject: Re: [PATCH v5 1/3] drm/buddy: Optimize free block management with RB tree
Date: Wed, 10 Sep 2025 16:22:53 +0200
Message-ID: <12f56eda-a6a5-47bf-b89e-0cfa8443a439@amd.com>
In-Reply-To: <fcbf6ae8-f9ab-4723-8df4-16d2f0f62c3f@amd.com>

On 10.09.25 14:37, Arunpravin Paneer Selvam wrote:
> Hi Christian,
> 
> On 9/9/2025 9:55 PM, Christian König wrote:
>> On 09.09.25 16:05, Peter Zijlstra wrote:
>>> On Tue, Sep 09, 2025 at 02:04:30PM +0200, Christian König wrote:
>>>> Hi Arun,
>>>>
>>>> On 09.09.25 11:56, Arunpravin Paneer Selvam wrote:
>>>> [SNIP]
>>>>
>>>>> +/**
>>>>> + * rbtree_for_each_entry_safe - iterate in-order over rb_root safe against removal
>>>>> + *
>>>>> + * @pos:    the 'type *' to use as a loop cursor
>>>>> + * @n:        another 'type *' to use as temporary storage
>>>>> + * @root:    'rb_root *' of the rbtree
>>>>> + * @member:    the name of the rb_node field within 'type'
>>>>> + */
>>>>> +#define rbtree_for_each_entry_safe(pos, n, root, member) \
>>>>> +    for ((pos) = rb_entry_safe(rb_first(root), typeof(*(pos)), member), \
>>>>> +         (n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member) : NULL; \
>>>>> +         (pos); \
>>>>> +         (pos) = (n), \
>>>>> +         (n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member) : NULL)
>>>> As far as I know, exactly that operation does not work on an R/B tree.
>>>>
>>>> See, the _safe() variants of the for_each_ macros are usually used to iterate over a container while being able to remove entries.
>>>>
>>>> But because of the potential rebalance, storing just the next entry is not sufficient to do that on an R/B tree, as far as I know.
>>>>
>>>> Please explain how exactly you want to use this macro.
> Thanks for the pointer. Yes, this will not work on an RB tree. We need a reverse safe variant for use in the force_merge() function, similar to the
> list_for_each_entry_safe_reverse() macro in list.h. The reason is that in force_merge(), we remove the block from the free tree before invoking
> drm_buddy_free(), which merges and frees buddy blocks to form a larger block.
>>> So I don't much like these iterators; I've said so before. Either we
>>> should introduce a properly threaded rb-tree (where the NULL child
>>> pointers encode a linked list), or simply keep a list_head next to the
>>> rb_node and use that.
>> I agree, something is clearly fishy here.
>>
>>> The rb_{next,prev}() things are O(ln n), in the worst case they do a
>>> full traversal up the tree and a full traversal down the other branch.
>> Yeah, from the logic that is exactly what is supposed to happen in the __force_merge() function.
>>
>> The question is rather why that function exists in the first place. The operation doesn't look logical to me.
>>
>> For drm_buddy_reset_clear() and drm_buddy_fini() we should use rbtree_postorder_for_each_entry_safe() instead.
>>
>> And during normal allocation __force_merge() should never be used.
> In normal allocation, the force_merge() function is used when no free blocks of the requested order are available. In such cases,
> smaller blocks must be merged on demand to satisfy the allocation. Mainly, this does not involve traversing the entire tree to
> merge all blocks, but only merging as needed. For example, if the requested order is 6, and the minimum order is 5, drm_buddy_alloc_blocks()
> will first attempt to allocate an order-6 block. If none are available, it will try to allocate two order-5 blocks. If those are also unavailable, it will
> invoke force_merge() to merge lower order blocks (4, 3, 2, 1, 0) in order to coalesce into a higher-order block of order 5.

Yeah, and exactly that is what should never be necessary in the first place.

The idea of a buddy allocator is that blocks are merged when they are freed and not on demand.
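
To illustrate merge-on-free, here is a minimal userspace sketch. It is not the drm_buddy code: the toy_* names, the per-order bitmap and MAX_ORDER are all assumptions made up for this example. The point is only that toy_free() coalesces with the buddy immediately, so a later allocation never has to merge anything on demand.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ORDER 4                     /* hypothetical: manages 1 << 4 = 16 pages */
#define NPAGES    (1u << MAX_ORDER)

/* free_map[o][i] is true when the order-o block starting at page (i << o) is free. */
static bool free_map[MAX_ORDER + 1][NPAGES];

/* Start with a single free block covering the whole range. */
static void toy_init(void)
{
	for (int o = 0; o <= MAX_ORDER; o++)
		for (unsigned int i = 0; i < NPAGES; i++)
			free_map[o][i] = false;
	free_map[MAX_ORDER][0] = true;
}

/*
 * Allocate one block of the given order, splitting a larger free block
 * if needed. Returns the first page of the block, or -1 on failure.
 */
static int toy_alloc(int order)
{
	assert(order >= 0 && order <= MAX_ORDER);
	for (int o = order; o <= MAX_ORDER; o++) {
		for (unsigned int idx = 0; idx < (NPAGES >> o); idx++) {
			if (!free_map[o][idx])
				continue;
			free_map[o][idx] = false;
			/* Split down: keep the lower half, release the upper half. */
			while (o > order) {
				o--;
				idx <<= 1;
				free_map[o][idx + 1] = true;
			}
			return (int)(idx << order);
		}
	}
	return -1;
}

/*
 * Free a block and merge eagerly: as long as the buddy (index ^ 1) is
 * also free, take it off the free map and move up one order.
 */
static void toy_free(unsigned int page, int order)
{
	unsigned int idx = page >> order;

	assert(order >= 0 && order <= MAX_ORDER);
	while (order < MAX_ORDER && free_map[order][idx ^ 1]) {
		free_map[order][idx ^ 1] = false;
		idx >>= 1;
		order++;
	}
	free_map[order][idx] = true;
}
```

In this sketch, after toy_init(), allocating two order-0 blocks and one order-1 block and then freeing all three leaves a single free block of MAX_ORDER again, without any separate merge pass.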

The only use case for force_merge() I can see is when cleared blocks are merged with non-cleared ones, but that is orthogonal to the discussion here.

> 
> In drm_buddy_fini(), force_merge() is called to ensure all blocks are merged before tearing down the allocator. This guarantees that all
> mm->roots are freed and not held by the driver at shutdown. If any blocks remain allocated, drm_buddy_fini() will issue a warning.
> 
> drm_buddy_reset_clear(), which is invoked at device suspend/resume, is an ideal place to call force_merge(). This ensures that all
> possible blocks are merged before resetting the clear state, thereby reducing fragmentation and improving allocation efficiency after resume.

That's where rbtree_postorder_for_each_entry_safe() should be used.
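
As a rough illustration of why post-order works for teardown, here is a userspace sketch on a plain unbalanced binary tree with parent pointers. It does not use the kernel rbtree API; struct toy_node and the toy_* helpers are hypothetical names invented for this example. Children are always visited before their parent, and the successor is looked up before the current node is freed, so no node is touched after it has been released. This only holds as long as nothing restructures the tree during the walk; erasing with a rebalance from inside such a loop is a different matter.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type; stands in for an entry embedding an rb_node. */
struct toy_node {
	struct toy_node *parent, *left, *right;
	int key;
};

/* Plain (unbalanced) binary search tree insert; returns the new root. */
static struct toy_node *toy_insert(struct toy_node *root, int key)
{
	struct toy_node *n = calloc(1, sizeof(*n));
	struct toy_node **link = &root, *parent = NULL;

	assert(n);
	n->key = key;
	while (*link) {
		parent = *link;
		link = key < parent->key ? &parent->left : &parent->right;
	}
	n->parent = parent;
	*link = n;
	return root;
}

/* Descend to the first node a post-order walk of this subtree visits. */
static struct toy_node *toy_left_deepest(struct toy_node *n)
{
	for (;;) {
		if (n->left)
			n = n->left;
		else if (n->right)
			n = n->right;
		else
			return n;
	}
}

/* Post-order successor: the right sibling's subtree first, then the parent. */
static struct toy_node *toy_next_postorder(struct toy_node *n)
{
	struct toy_node *p = n->parent;

	if (p && n == p->left && p->right)
		return toy_left_deepest(p->right);
	return p;
}

/*
 * Free every node in post-order. The keys are recorded in 'visited' in
 * the order they were freed; the number of freed nodes is returned.
 */
static int toy_teardown(struct toy_node *root, int *visited)
{
	struct toy_node *pos, *next;
	int count = 0;

	for (pos = root ? toy_left_deepest(root) : NULL; pos; pos = next) {
		next = toy_next_postorder(pos);	/* fetch before freeing pos */
		visited[count++] = pos->key;
		free(pos);
	}
	return count;
}
```

The walk never looks up an in-order neighbour, so the whole teardown is linear in the number of nodes.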

> I tried using the rbtree_postorder_for_each_entry_safe() macro in force_merge() and it works, but we also need a reverse variant,
> since in normal allocation we don't want to disturb the lower addresses.

I don't get what you mean here.

Regards,
Christian.

> 
> Thanks,
> Arun.
>>
>>> That said; given 'next' will remain an existing node, only the 'pos'
>>> node gets removed, rb_next() will still work correctly, even in the face
>>> of rebalance.
>> Good to know!
>>
>> Regards,
>> Christian.
> 


