From: Arunpravin Paneer Selvam <arunpravin.paneerselvam@amd.com>
To: Matthew Auld <matthew.auld@intel.com>,
christian.koenig@amd.com, dri-devel@lists.freedesktop.org,
amd-gfx@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
intel-xe@lists.freedesktop.org
Cc: alexander.deucher@amd.com, jani.nikula@linux.intel.com,
peterz@infradead.org, samuel.pitoiset@gmail.com
Subject: Re: [PATCH v7 3/3] drm/buddy: Add KUnit tests for allocator performance under fragmentation
Date: Wed, 1 Oct 2025 09:29:47 +0530 [thread overview]
Message-ID: <f17b2883-125c-41b7-8be6-714da5597abd@amd.com> (raw)
In-Reply-To: <f2db7484-efd4-459a-93b2-b6097d7e86f5@intel.com>
On 9/26/2025 4:30 PM, Matthew Auld wrote:
> On 23/09/2025 10:02, Arunpravin Paneer Selvam wrote:
>> Add KUnit test cases that create severe memory fragmentation and
>> measure allocation/free performance.
>>
>> The tests simulate two scenarios -
>>
>> 1. Allocation under severe fragmentation
>> - Allocate the entire 4 GiB space as 8 KiB blocks with 64 KiB alignment,
>> split them into two groups and free with mixed flags to block coalescing.
>> - Repeatedly allocate and free 64 KiB blocks while timing the loop.
>> - Freelist runtime: 76475 ms (76.5 seconds), soft-lockup triggered.
>> RB-tree runtime: 186 ms.
>>
>> 2. Reverse free order under fragmentation
>> - Create a similarly fragmented space, free half the blocks, reverse
>> the order of the remainder, and release them with the cleared
>> flag.
>> - Freelist runtime: 85620 ms(85.6 seconds).
>> RB-tree runtime: 114 ms.
>>
>> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
>> ---
>> drivers/gpu/drm/tests/drm_buddy_test.c | 110 +++++++++++++++++++++++++
>> 1 file changed, 110 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/tests/drm_buddy_test.c b/drivers/gpu/drm/tests/drm_buddy_test.c
>> index 7a0e523651f0..19b49fb6ec19 100644
>> --- a/drivers/gpu/drm/tests/drm_buddy_test.c
>> +++ b/drivers/gpu/drm/tests/drm_buddy_test.c
>> @@ -21,6 +21,115 @@ static inline u64 get_size(int order, u64 chunk_size)
>> return (1 << order) * chunk_size;
>> }
>> +static void drm_test_buddy_fragmentation_performance(struct kunit *test)
>> +{
>> + const unsigned long max_acceptable_time_ms = 1000;
>> + struct drm_buddy_block *block, *tmp;
>> + int num_blocks, i, ret, count = 0;
>> + LIST_HEAD(allocated_blocks);
>> + unsigned long elapsed_ms;
>> + LIST_HEAD(reverse_list);
>> + LIST_HEAD(test_blocks);
>> + LIST_HEAD(clear_list);
>> + LIST_HEAD(dirty_list);
>> + LIST_HEAD(free_list);
>> + struct drm_buddy mm;
>> + u64 mm_size = SZ_4G;
>> + ktime_t start, end;
>> +
>> + /*
>> + * Allocation under severe fragmentation
>> + *
>> + * Create severe fragmentation by allocating the entire 4 GiB address space
>> + * as tiny 8 KiB blocks but forcing a 64 KiB alignment. The resulting pattern
>> + * leaves many scattered holes. Split the allocations into two groups and
>> + * return them with different flags to block coalescing, then repeatedly
>> + * allocate and free 64 KiB blocks while timing the loop. This stresses how
>> + * quickly the allocator can satisfy larger, aligned requests from a pool of
>> + * highly fragmented space.
>> + */
>> + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_init(&mm, mm_size, SZ_4K),
>> + "buddy_init failed\n");
>> +
>> + num_blocks = mm_size / SZ_64K;
>> +
>> + start = ktime_get();
>> + /* Allocate with maximum fragmentation - 8K blocks with 64K
>> alignment */
>> + for (i = 0; i < num_blocks; i++)
>> + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, SZ_8K, SZ_64K,
>> + &allocated_blocks, 0),
>> + "buddy_alloc hit an error size=%u\n", SZ_8K);
>> +
>> + list_for_each_entry_safe(block, tmp, &allocated_blocks, link) {
>> + if (count % 4 == 0 || count % 4 == 3)
>> + list_move_tail(&block->link, &clear_list);
>> + else
>> + list_move_tail(&block->link, &dirty_list);
>> + count++;
>> + }
>> +
>> + /* Free with different flags to ensure no coalescing */
>> + drm_buddy_free_list(&mm, &clear_list, DRM_BUDDY_CLEARED);
>> + drm_buddy_free_list(&mm, &dirty_list, 0);
>> +
>> + for (i = 0; i < num_blocks; i++)
>> + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, SZ_64K, SZ_64K,
>> + &test_blocks, 0),
>> + "buddy_alloc hit an error size=%u\n", SZ_64K);
>> + drm_buddy_free_list(&mm, &test_blocks, 0);
>> +
>> + end = ktime_get();
>> + elapsed_ms = ktime_to_ms(ktime_sub(end, start));
>> + /* Performance validation */
>> + KUNIT_EXPECT_LT_MSG(test, elapsed_ms, max_acceptable_time_ms,
>> + "Fragmented allocation took %lu ms (max acceptable: %lu ms)",
>> + elapsed_ms, max_acceptable_time_ms);
>> + drm_buddy_fini(&mm);
>> +
>> + /*
>> + * Reverse free order under fragmentation
>> + *
>> + * Construct a fragmented 4 GiB space by allocating every 8 KiB block with
>> + * 64 KiB alignment, creating a dense scatter of small regions. Half of the
>> + * blocks are selectively freed to form sparse gaps, while the remaining
>> + * allocations are preserved, reordered in reverse, and released back with
>> + * the cleared flag. This models a pathological reverse-ordered free pattern
>> + * and measures how quickly the allocator can merge and reclaim space when
>> + * deallocation occurs in the opposite order of allocation, exposing the
>> + * cost difference between a linear freelist scan and an ordered tree lookup.
>> + */
>> + ret = drm_buddy_init(&mm, mm_size, SZ_4K);
>> + KUNIT_ASSERT_EQ(test, ret, 0);
>> +
>> + start = ktime_get();
>> + /* Allocate maximum fragmentation */
>> + for (i = 0; i < num_blocks; i++)
>> + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, SZ_8K, SZ_64K,
>> + &allocated_blocks, 0),
>> + "buddy_alloc hit an error size=%u\n", SZ_8K);
>> +
>> + list_for_each_entry_safe(block, tmp, &allocated_blocks, link) {
>> + if (count % 2 == 0)
>> + list_move_tail(&block->link, &free_list);
>> + count++;
>> + }
>> + drm_buddy_free_list(&mm, &free_list, DRM_BUDDY_CLEARED);
>> +
>> + list_for_each_entry_safe_reverse(block, tmp, &allocated_blocks, link)
>> + list_move(&block->link, &reverse_list);
>> + drm_buddy_free_list(&mm, &reverse_list, DRM_BUDDY_CLEARED);
>> +
>> + end = ktime_get();
>> + elapsed_ms = ktime_to_ms(ktime_sub(end, start));
>> +
>> + /* Performance validation */
>> + KUNIT_EXPECT_LT_MSG(test, elapsed_ms, max_acceptable_time_ms,
>> + "Reverse-ordered free took %lu ms (max acceptable: %lu ms)",
>> + elapsed_ms, max_acceptable_time_ms);
>
> Sorry for the delay. We are pretty sure these time asserts are not
> going to be flaky over many thousands of runs across different types
> of machines (maybe some underpowered atom)?
Yes, correct. I have updated the performance test to avoid the hard-coded
timing thresholds. The test now measures and reports execution time instead
of enforcing a 1000 ms limit, since run times vary across machines. This
keeps the test portable and stable while still exposing the performance
data for regression tracking.
Regards,
Arun.
>
> Assuming not a concern,
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
>
>> +
>> + drm_buddy_fini(&mm);
>> +}
>> +
>> static void drm_test_buddy_alloc_range_bias(struct kunit *test)
>> {
>> u32 mm_size, size, ps, bias_size, bias_start, bias_end, bias_rem;
>> @@ -772,6 +881,7 @@ static struct kunit_case drm_buddy_tests[] = {
>> KUNIT_CASE(drm_test_buddy_alloc_contiguous),
>> KUNIT_CASE(drm_test_buddy_alloc_clear),
>> KUNIT_CASE(drm_test_buddy_alloc_range_bias),
>> + KUNIT_CASE(drm_test_buddy_fragmentation_performance),
>> {}
>> };
>