From: Matthew Auld <matthew.auld@intel.com>
To: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>,
christian.koenig@amd.com, dri-devel@lists.freedesktop.org,
amd-gfx@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
intel-xe@lists.freedesktop.org
Cc: alexander.deucher@amd.com, jani.nikula@linux.intel.com,
peterz@infradead.org, samuel.pitoiset@gmail.com
Subject: Re: [PATCH v7 3/3] drm/buddy: Add KUnit tests for allocator performance under fragmentation
Date: Fri, 26 Sep 2025 12:00:12 +0100 [thread overview]
Message-ID: <f2db7484-efd4-459a-93b2-b6097d7e86f5@intel.com> (raw)
In-Reply-To: <20250923090242.60649-3-Arunpravin.PaneerSelvam@amd.com>
On 23/09/2025 10:02, Arunpravin Paneer Selvam wrote:
> Add KUnit test cases that create severe memory fragmentation and
> measure allocation/free performance.
>
> The tests simulate two scenarios -
>
> 1. Allocation under severe fragmentation
> - Allocate the entire 4 GiB space as 8 KiB blocks with 64 KiB alignment,
> split them into two groups and free with mixed flags to block coalescing.
> - Repeatedly allocate and free 64 KiB blocks while timing the loop.
>      - Freelist runtime: 76475 ms (76.5 s), soft lockup triggered.
> RB-tree runtime: 186 ms.
>
> 2. Reverse free order under fragmentation
> - Create a similarly fragmented space, free half the blocks, reverse
> the order of the remainder, and release them with the cleared flag.
>      - Freelist runtime: 85620 ms (85.6 s).
> RB-tree runtime: 114 ms.
>
> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
> ---
> drivers/gpu/drm/tests/drm_buddy_test.c | 110 +++++++++++++++++++++++++
> 1 file changed, 110 insertions(+)
>
> diff --git a/drivers/gpu/drm/tests/drm_buddy_test.c b/drivers/gpu/drm/tests/drm_buddy_test.c
> index 7a0e523651f0..19b49fb6ec19 100644
> --- a/drivers/gpu/drm/tests/drm_buddy_test.c
> +++ b/drivers/gpu/drm/tests/drm_buddy_test.c
> @@ -21,6 +21,115 @@ static inline u64 get_size(int order, u64 chunk_size)
> return (1 << order) * chunk_size;
> }
>
> +static void drm_test_buddy_fragmentation_performance(struct kunit *test)
> +{
> + const unsigned long max_acceptable_time_ms = 1000;
> + struct drm_buddy_block *block, *tmp;
> + int num_blocks, i, ret, count = 0;
> + LIST_HEAD(allocated_blocks);
> + unsigned long elapsed_ms;
> + LIST_HEAD(reverse_list);
> + LIST_HEAD(test_blocks);
> + LIST_HEAD(clear_list);
> + LIST_HEAD(dirty_list);
> + LIST_HEAD(free_list);
> + struct drm_buddy mm;
> + u64 mm_size = SZ_4G;
> + ktime_t start, end;
> +
> + /*
> + * Allocation under severe fragmentation
> + *
> + * Create severe fragmentation by allocating the entire 4 GiB address space
> + * as tiny 8 KiB blocks but forcing a 64 KiB alignment. The resulting pattern
> + * leaves many scattered holes. Split the allocations into two groups and
> + * return them with different flags to block coalescing, then repeatedly
> + * allocate and free 64 KiB blocks while timing the loop. This stresses how
> + * quickly the allocator can satisfy larger, aligned requests from a pool of
> + * highly fragmented space.
> + */
> + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_init(&mm, mm_size, SZ_4K),
> + "buddy_init failed\n");
> +
> + num_blocks = mm_size / SZ_64K;
> +
> + start = ktime_get();
> + /* Allocate with maximum fragmentation - 8K blocks with 64K alignment */
> + for (i = 0; i < num_blocks; i++)
> + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, SZ_8K, SZ_64K,
> + &allocated_blocks, 0),
> + "buddy_alloc hit an error size=%u\n", SZ_8K);
> +
> + list_for_each_entry_safe(block, tmp, &allocated_blocks, link) {
> + if (count % 4 == 0 || count % 4 == 3)
> + list_move_tail(&block->link, &clear_list);
> + else
> + list_move_tail(&block->link, &dirty_list);
> + count++;
> + }
> +
> + /* Free with different flags to ensure no coalescing */
> + drm_buddy_free_list(&mm, &clear_list, DRM_BUDDY_CLEARED);
> + drm_buddy_free_list(&mm, &dirty_list, 0);
> +
> + for (i = 0; i < num_blocks; i++)
> + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, SZ_64K, SZ_64K,
> + &test_blocks, 0),
> + "buddy_alloc hit an error size=%u\n", SZ_64K);
> + drm_buddy_free_list(&mm, &test_blocks, 0);
> +
> + end = ktime_get();
> + elapsed_ms = ktime_to_ms(ktime_sub(end, start));
> + /* Performance validation */
> + KUNIT_EXPECT_LT_MSG(test, elapsed_ms, max_acceptable_time_ms,
> + "Fragmented allocation took %lu ms (max acceptable: %lu ms)",
> + elapsed_ms, max_acceptable_time_ms);
> + drm_buddy_fini(&mm);
> +
> + /*
> + * Reverse free order under fragmentation
> + *
> + * Construct a fragmented 4 GiB space by allocating every 8 KiB block with
> + * 64 KiB alignment, creating a dense scatter of small regions. Half of the
> + * blocks are selectively freed to form sparse gaps, while the remaining
> + * allocations are preserved, reordered in reverse, and released back with
> + * the cleared flag. This models a pathological reverse-ordered free pattern
> + * and measures how quickly the allocator can merge and reclaim space when
> + * deallocation occurs in the opposite order of allocation, exposing the
> + * cost difference between a linear freelist scan and an ordered tree lookup.
> + */
> + ret = drm_buddy_init(&mm, mm_size, SZ_4K);
> + KUNIT_ASSERT_EQ(test, ret, 0);
> +
> + start = ktime_get();
> + /* Allocate maximum fragmentation */
> + for (i = 0; i < num_blocks; i++)
> + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, SZ_8K, SZ_64K,
> + &allocated_blocks, 0),
> + "buddy_alloc hit an error size=%u\n", SZ_8K);
> +
> + list_for_each_entry_safe(block, tmp, &allocated_blocks, link) {
> + if (count % 2 == 0)
> + list_move_tail(&block->link, &free_list);
> + count++;
> + }
> + drm_buddy_free_list(&mm, &free_list, DRM_BUDDY_CLEARED);
> +
> + list_for_each_entry_safe_reverse(block, tmp, &allocated_blocks, link)
> + list_move(&block->link, &reverse_list);
> + drm_buddy_free_list(&mm, &reverse_list, DRM_BUDDY_CLEARED);
> +
> + end = ktime_get();
> + elapsed_ms = ktime_to_ms(ktime_sub(end, start));
> +
> + /* Performance validation */
> + KUNIT_EXPECT_LT_MSG(test, elapsed_ms, max_acceptable_time_ms,
> + "Reverse-ordered free took %lu ms (max acceptable: %lu ms)",
> + elapsed_ms, max_acceptable_time_ms);
Sorry for the delay. Are we sure these time asserts won't turn out flaky
over many thousands of runs across different types of machines (maybe
some underpowered Atom)?
Assuming not a concern,
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> +
> + drm_buddy_fini(&mm);
> +}
> +
> static void drm_test_buddy_alloc_range_bias(struct kunit *test)
> {
> u32 mm_size, size, ps, bias_size, bias_start, bias_end, bias_rem;
> @@ -772,6 +881,7 @@ static struct kunit_case drm_buddy_tests[] = {
> KUNIT_CASE(drm_test_buddy_alloc_contiguous),
> KUNIT_CASE(drm_test_buddy_alloc_clear),
> KUNIT_CASE(drm_test_buddy_alloc_range_bias),
> + KUNIT_CASE(drm_test_buddy_fragmentation_performance),
> {}
> };
>
Thread overview: 9+ messages
2025-09-23 9:02 [PATCH v7 1/3] drm/buddy: Optimize free block management with RB tree Arunpravin Paneer Selvam
2025-09-23 9:02 ` [PATCH v7 2/3] drm/buddy: Separate clear and dirty free block trees Arunpravin Paneer Selvam
2025-09-26 17:08 ` Matthew Auld
2025-10-04 8:30 ` Arunpravin Paneer Selvam
2025-09-23 9:02 ` [PATCH v7 3/3] drm/buddy: Add KUnit tests for allocator performance under fragmentation Arunpravin Paneer Selvam
2025-09-26 11:00 ` Matthew Auld [this message]
2025-10-01 3:59 ` Arunpravin Paneer Selvam
2025-09-26 5:41 ` [PATCH v7 1/3] drm/buddy: Optimize free block management with RB tree Arunpravin Paneer Selvam
2025-09-26 16:05 ` Matthew Auld