AMD-GFX Archive on lore.kernel.org
From: Arunpravin Paneer Selvam <arunpravin.paneerselvam@amd.com>
To: Matthew Auld <matthew.auld@intel.com>,
	christian.koenig@amd.com, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org
Cc: alexander.deucher@amd.com, jani.nikula@linux.intel.com,
	peterz@infradead.org, samuel.pitoiset@gmail.com
Subject: Re: [PATCH v7 3/3] drm/buddy: Add KUnit tests for allocator performance under fragmentation
Date: Wed, 1 Oct 2025 09:29:47 +0530	[thread overview]
Message-ID: <f17b2883-125c-41b7-8be6-714da5597abd@amd.com> (raw)
In-Reply-To: <f2db7484-efd4-459a-93b2-b6097d7e86f5@intel.com>



On 9/26/2025 4:30 PM, Matthew Auld wrote:
> On 23/09/2025 10:02, Arunpravin Paneer Selvam wrote:
>> Add KUnit test cases that create severe memory fragmentation and
>> measure allocation/free performance.
>>
>> The tests simulate two scenarios -
>>
>> 1. Allocation under severe fragmentation
>>     - Allocate the entire 4 GiB space as 8 KiB blocks with 64 KiB alignment,
>>       split them into two groups and free with mixed flags to block coalescing.
>>     - Repeatedly allocate and free 64 KiB blocks while timing the loop.
>>     - Freelist runtime: 76475 ms (76.5 seconds), soft-lockup triggered.
>>       RB-tree runtime: 186 ms.
>>
>> 2. Reverse free order under fragmentation
>>     - Create a similarly fragmented space, free half the blocks, reverse
>>       the order of the remainder, and release them with the cleared flag.
>>     - Freelist runtime: 85620 ms (85.6 seconds).
>>       RB-tree runtime: 114 ms.
>>
>> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
>> ---
>>   drivers/gpu/drm/tests/drm_buddy_test.c | 110 +++++++++++++++++++++++++
>>   1 file changed, 110 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/tests/drm_buddy_test.c b/drivers/gpu/drm/tests/drm_buddy_test.c
>> index 7a0e523651f0..19b49fb6ec19 100644
>> --- a/drivers/gpu/drm/tests/drm_buddy_test.c
>> +++ b/drivers/gpu/drm/tests/drm_buddy_test.c
>> @@ -21,6 +21,115 @@ static inline u64 get_size(int order, u64 chunk_size)
>>       return (1 << order) * chunk_size;
>>   }
>> +static void drm_test_buddy_fragmentation_performance(struct kunit *test)
>> +{
>> +    const unsigned long max_acceptable_time_ms = 1000;
>> +    struct drm_buddy_block *block, *tmp;
>> +    int num_blocks, i, ret, count = 0;
>> +    LIST_HEAD(allocated_blocks);
>> +    unsigned long elapsed_ms;
>> +    LIST_HEAD(reverse_list);
>> +    LIST_HEAD(test_blocks);
>> +    LIST_HEAD(clear_list);
>> +    LIST_HEAD(dirty_list);
>> +    LIST_HEAD(free_list);
>> +    struct drm_buddy mm;
>> +    u64 mm_size = SZ_4G;
>> +    ktime_t start, end;
>> +
>> +    /*
>> +     * Allocation under severe fragmentation
>> +     *
>> +     * Create severe fragmentation by allocating the entire 4 GiB address space
>> +     * as tiny 8 KiB blocks but forcing a 64 KiB alignment. The resulting pattern
>> +     * leaves many scattered holes. Split the allocations into two groups and
>> +     * return them with different flags to block coalescing, then repeatedly
>> +     * allocate and free 64 KiB blocks while timing the loop. This stresses how
>> +     * quickly the allocator can satisfy larger, aligned requests from a pool of
>> +     * highly fragmented space.
>> +     */
>> +    KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_init(&mm, mm_size, SZ_4K),
>> +                   "buddy_init failed\n");
>> +
>> +    num_blocks = mm_size / SZ_64K;
>> +
>> +    start = ktime_get();
>> +    /* Allocate with maximum fragmentation - 8K blocks with 64K alignment */
>> +    for (i = 0; i < num_blocks; i++)
>> +        KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, SZ_8K, SZ_64K,
>> +                                    &allocated_blocks, 0),
>> +                    "buddy_alloc hit an error size=%u\n", SZ_8K);
>> +
>> +    list_for_each_entry_safe(block, tmp, &allocated_blocks, link) {
>> +        if (count % 4 == 0 || count % 4 == 3)
>> +            list_move_tail(&block->link, &clear_list);
>> +        else
>> +            list_move_tail(&block->link, &dirty_list);
>> +        count++;
>> +    }
>> +
>> +    /* Free with different flags to ensure no coalescing */
>> +    drm_buddy_free_list(&mm, &clear_list, DRM_BUDDY_CLEARED);
>> +    drm_buddy_free_list(&mm, &dirty_list, 0);
>> +
>> +    for (i = 0; i < num_blocks; i++)
>> +        KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, SZ_64K, SZ_64K,
>> +                                    &test_blocks, 0),
>> +                    "buddy_alloc hit an error size=%u\n", SZ_64K);
>> +    drm_buddy_free_list(&mm, &test_blocks, 0);
>> +
>> +    end = ktime_get();
>> +    elapsed_ms = ktime_to_ms(ktime_sub(end, start));
>> +    /* Performance validation */
>> +    KUNIT_EXPECT_LT_MSG(test, elapsed_ms, max_acceptable_time_ms,
>> +                "Fragmented allocation took %lu ms (max acceptable: %lu ms)",
>> +                elapsed_ms, max_acceptable_time_ms);
>> +    drm_buddy_fini(&mm);
>> +
>> +    /*
>> +     * Reverse free order under fragmentation
>> +     *
>> +     * Construct a fragmented 4 GiB space by allocating every 8 KiB block with
>> +     * 64 KiB alignment, creating a dense scatter of small regions. Half of the
>> +     * blocks are selectively freed to form sparse gaps, while the remaining
>> +     * allocations are preserved, reordered in reverse, and released back with
>> +     * the cleared flag. This models a pathological reverse-ordered free pattern
>> +     * and measures how quickly the allocator can merge and reclaim space when
>> +     * deallocation occurs in the opposite order of allocation, exposing the
>> +     * cost difference between a linear freelist scan and an ordered tree lookup.
>> +     */
>> +    ret = drm_buddy_init(&mm, mm_size, SZ_4K);
>> +    KUNIT_ASSERT_EQ(test, ret, 0);
>> +
>> +    start = ktime_get();
>> +    /* Allocate maximum fragmentation */
>> +    for (i = 0; i < num_blocks; i++)
>> +        KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, SZ_8K, SZ_64K,
>> +                                    &allocated_blocks, 0),
>> +                    "buddy_alloc hit an error size=%u\n", SZ_8K);
>> +
>> +    list_for_each_entry_safe(block, tmp, &allocated_blocks, link) {
>> +        if (count % 2 == 0)
>> +            list_move_tail(&block->link, &free_list);
>> +        count++;
>> +    }
>> +    drm_buddy_free_list(&mm, &free_list, DRM_BUDDY_CLEARED);
>> +
>> +    list_for_each_entry_safe_reverse(block, tmp, &allocated_blocks, link)
>> +        list_move(&block->link, &reverse_list);
>> +    drm_buddy_free_list(&mm, &reverse_list, DRM_BUDDY_CLEARED);
>> +
>> +    end = ktime_get();
>> +    elapsed_ms = ktime_to_ms(ktime_sub(end, start));
>> +
>> +    /* Performance validation */
>> +    KUNIT_EXPECT_LT_MSG(test, elapsed_ms, max_acceptable_time_ms,
>> +                "Reverse-ordered free took %lu ms (max acceptable: %lu ms)",
>> +                elapsed_ms, max_acceptable_time_ms);
>
> Sorry for the delay. We are pretty sure these time asserts are not 
> going to be flaky over many thousands of runs across different types 
> of machines (maybe some underpowered atom)?
Yes, correct. I have updated the performance test to drop the hard-coded
timing thresholds. The test now measures and reports execution time
instead of enforcing a 1000 ms limit, since run times vary widely across
machines. This keeps the test portable and stable while still exposing
the performance data for regression tracking.

Regards,
Arun.
>
> Assuming not a concern,
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
>
>> +
>> +    drm_buddy_fini(&mm);
>> +}
>> +
>>   static void drm_test_buddy_alloc_range_bias(struct kunit *test)
>>   {
>>       u32 mm_size, size, ps, bias_size, bias_start, bias_end, bias_rem;
>> @@ -772,6 +881,7 @@ static struct kunit_case drm_buddy_tests[] = {
>>       KUNIT_CASE(drm_test_buddy_alloc_contiguous),
>>       KUNIT_CASE(drm_test_buddy_alloc_clear),
>>       KUNIT_CASE(drm_test_buddy_alloc_range_bias),
>> +    KUNIT_CASE(drm_test_buddy_fragmentation_performance),
>>       {}
>>   };
>



Thread overview: 9+ messages
2025-09-23  9:02 [PATCH v7 1/3] drm/buddy: Optimize free block management with RB tree Arunpravin Paneer Selvam
2025-09-23  9:02 ` [PATCH v7 2/3] drm/buddy: Separate clear and dirty free block trees Arunpravin Paneer Selvam
2025-09-26 17:08   ` Matthew Auld
2025-10-04  8:30     ` Arunpravin Paneer Selvam
2025-09-23  9:02 ` [PATCH v7 3/3] drm/buddy: Add KUnit tests for allocator performance under fragmentation Arunpravin Paneer Selvam
2025-09-26 11:00   ` Matthew Auld
2025-10-01  3:59     ` Arunpravin Paneer Selvam [this message]
2025-09-26  5:41 ` [PATCH v7 1/3] drm/buddy: Optimize free block management with RB tree Arunpravin Paneer Selvam
2025-09-26 16:05 ` Matthew Auld
