[PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker

* [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
@ 2026-05-27 11:29 Arunpravin Paneer Selvam
  2026-05-27 11:29 ` [PATCH v4 2/2] gpu/tests/buddy: add clear-tracker allocation latency benchmarks Arunpravin Paneer Selvam
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Arunpravin Paneer Selvam @ 2026-05-27 11:29 UTC
  To: matthew.auld, christian.koenig, dri-devel, intel-gfx, intel-xe,
	amd-gfx
  Cc: alexander.deucher, Arunpravin Paneer Selvam

The current buddy allocator maintains separate clear_tree[] and
dirty_tree[] rbtrees per order, preventing coalescing between cleared
and dirty buddies. Under mixed workloads, this creates a merge barrier:
adjacent buddies frequently end up split across trees, forcing reliance
on __force_merge() during allocation.

__force_merge() performs an O(N x max_order) scan under the VRAM manager
lock, leading to allocation stalls and failures for large contiguous
requests even when sufficient total free memory is available.

Solution

Replace the dual-tree design with:
- A single free_tree[order] rbtree per order that holds every free
  block, whether cleared or dirty
- A lightweight out-of-band clear tracker (gpu_clear_tracker)

Cleared address ranges are tracked out of band in an augmented interval
rbtree, enabling O(log E) lookup of the largest cleared extent, where E
is the number of cleared extents.
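
The lookup described above can be sketched in user-space C. This is a
minimal illustration, not the patch's code: struct extent, extent_insert()
and extent_find() are hypothetical names, the tree is a plain unbalanced
binary search tree rather than the kernel rbtree, and alignment is ignored.
It only shows how a per-node "largest size in this subtree" field lets the
search follow a single root-to-leaf path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a cleared extent node. */
struct extent {
	uint64_t start, end;		/* half-open range [start, end) */
	uint64_t subtree_max_size;	/* largest extent size in this subtree */
	struct extent *left, *right;
};

static uint64_t extent_size(const struct extent *e)
{
	return e->end - e->start;
}

/*
 * Unbalanced BST insert keyed by start address. Every node on the
 * insertion path has its subtree_max_size raised if the new extent is
 * larger, so the augment stays valid without a second pass.
 */
static struct extent *extent_insert(struct extent *root, struct extent *e)
{
	struct extent **link = &root;

	assert(e->end > e->start);
	e->left = e->right = NULL;
	e->subtree_max_size = extent_size(e);

	while (*link) {
		struct extent *cur = *link;

		if (cur->subtree_max_size < e->subtree_max_size)
			cur->subtree_max_size = e->subtree_max_size;
		link = e->start < cur->start ? &cur->left : &cur->right;
	}
	*link = e;

	return root;
}

/*
 * Return the highest-address extent of at least min_size bytes, or NULL.
 * Subtrees whose subtree_max_size is too small are never entered.
 */
static struct extent *extent_find(struct extent *node, uint64_t min_size)
{
	while (node && node->subtree_max_size >= min_size) {
		if (node->right && node->right->subtree_max_size >= min_size)
			node = node->right;
		else if (extent_size(node) >= min_size)
			return node;
		else
			node = node->left;
	}

	return NULL;
}
```

In a balanced tree the path length is O(log E); the unbalanced tree used
here keeps the sketch short and does not have that guarantee.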

Buddy coalescing is now unconditional in __gpu_buddy_free(), regardless
of clear/dirty state. This removes the merge barrier and eliminates the
need for __force_merge().
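
The effect of removing the merge barrier can be shown with a toy model.
The sketch below is an assumption-laden illustration, not the allocator's
code: struct toy_buddy, toy_free() and the clear_barrier switch are
hypothetical, blocks are identified by (order, index) instead of offsets,
and a merged block is treated as clear only if both halves were clear.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ORDER 4

/* Hypothetical toy allocator state: one flag pair per (order, index). */
struct toy_buddy {
	bool free_map[TOY_MAX_ORDER + 1][1 << TOY_MAX_ORDER];
	bool clear_map[TOY_MAX_ORDER + 1][1 << TOY_MAX_ORDER];
};

/*
 * Free the block at (order, index) and merge upwards while its buddy is
 * free. With clear_barrier set, merging stops when the buddy's clear state
 * differs (the old dual-tree behaviour); without it, merging is
 * unconditional. Returns the order of the resulting free block.
 */
static unsigned int toy_free(struct toy_buddy *mm, unsigned int order,
			     unsigned int index, bool is_clear,
			     bool clear_barrier)
{
	assert(order <= TOY_MAX_ORDER);
	assert(index < (1u << (TOY_MAX_ORDER - order)));

	while (order < TOY_MAX_ORDER) {
		unsigned int buddy = index ^ 1;	/* sibling at the same order */

		if (!mm->free_map[order][buddy])
			break;
		if (clear_barrier && mm->clear_map[order][buddy] != is_clear)
			break;

		/* Absorb the buddy and move up to the parent block. */
		mm->free_map[order][buddy] = false;
		is_clear = is_clear && mm->clear_map[order][buddy];
		index >>= 1;
		order++;
	}

	mm->free_map[order][index] = true;
	mm->clear_map[order][index] = is_clear;

	return order;
}
```

Freeing a clear block and then its dirty buddy leaves two order-0 blocks
when clear_barrier is true, and one order-1 block when it is false.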

Benefits

- Correct high-order allocations after mixed clear/dirty workloads
- Elimination of O(N x max_order) merge cost from the allocation path
- O(log E) cleared-extent lookup replacing O(N) scans
- Predictable allocation latency under fragmentation
- Reduced complexity with a single tree per order

Test:
dEQP-VK.memory.allocation.basic.size_8KiB.reverse.count_4000

The data below is from /sys/kernel/debug/dri/1/amdgpu_vram_mm:

Base (dual-tree), before VKCTS test:
  order- 6 free:   6 MiB,  blocks: 26
  order- 5 free:   1 MiB,  blocks: 15
  order- 4 free: 960 KiB,  blocks: 15
  order- 3 free:   5 MiB,  blocks: 171
  order- 2 free:   2 MiB,  blocks: 176
  order- 1 free:   1 MiB,  blocks: 165
  order- 0 free:  16 KiB,  blocks: 4

Base (dual-tree), after VKCTS test:
  order- 6 free: 768 KiB,  blocks: 3
  order- 5 free: 499 MiB,  blocks: 3999
  order- 4 free: 250 MiB,  blocks: 4001
  order- 3 free: 129 MiB,  blocks: 4157
  order- 2 free:  65 MiB,  blocks: 4161
  order- 1 free:  63 MiB,  blocks: 8138
  order- 0 free:  20 KiB,  blocks: 5

Clear tracker, before VKCTS test:
  order- 6 free:   4 MiB,  blocks: 19
  order- 5 free:   2 MiB,  blocks: 18
  order- 4 free: 704 KiB,  blocks: 11
  order- 3 free:   5 MiB,  blocks: 168
  order- 2 free:   2 MiB,  blocks: 174
  order- 1 free:   1 MiB,  blocks: 167
  order- 0 free:  32 KiB,  blocks: 8

Clear tracker, after VKCTS test:
  order- 6 free:   4 MiB,  blocks: 19
  order- 5 free:   2 MiB,  blocks: 18
  order- 4 free: 704 KiB,  blocks: 11
  order- 3 free:   5 MiB,  blocks: 168
  order- 2 free:   2 MiB,  blocks: 174
  order- 1 free:   1 MiB,  blocks: 167
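  order- 0 free:  28 KiB,  blocks: 7

With the dual-tree base, the test leaves thousands of free blocks
unmerged at orders 1 through 5. With the clear tracker, the free-block
distribution after the test is essentially unchanged from the pre-test
state.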

v2:
 - Code-style cleanup and minor refactoring
 - Renamed locals for clarity

v3:
 - Keep cleared blocks inside free_tree[] instead of floating them.
 - Add subtree_has_dirty rbtree augment for O(log N) dirty-first walk.
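
The dirty-first walk from the v3 note can be sketched as follows. This is
a simplified user-space illustration under stated assumptions: struct
toy_block, toy_augment_compute() and toy_last_dirty() are hypothetical
names, and the tree is linked by hand instead of using the kernel rbtree
and its augment callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a free block in an offset-ordered tree. */
struct toy_block {
	uint64_t offset;
	bool dirty;			/* this block is not cleared */
	bool subtree_has_dirty;		/* any block in this subtree is dirty */
	struct toy_block *left, *right;
};

/* Recompute the augment from the node and its children (children first). */
static void toy_augment_compute(struct toy_block *n)
{
	assert(n);
	n->subtree_has_dirty = n->dirty ||
		(n->left && n->left->subtree_has_dirty) ||
		(n->right && n->right->subtree_has_dirty);
}

/*
 * Return the highest-offset dirty block, or NULL if every block is clear.
 * A right subtree without dirty blocks is skipped entirely.
 */
static struct toy_block *toy_last_dirty(struct toy_block *n)
{
	while (n) {
		if (n->right && n->right->subtree_has_dirty) {
			n = n->right;
			continue;
		}
		if (n->dirty)
			return n;
		n = n->left;
	}

	return NULL;
}
```

Each step either returns or moves one level down, so the walk is bounded
by the tree height.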

v4:
 - Fixed checkpatch warnings.
 - Optimized gpu_buddy_reset_clear() to a single post-order walk that
   flips block headers and recomputes the rbtree augment in one pass.
 - Propagate subtree_max_size top-down in insert_extent() so ancestors
   are not left with stale values on no-rotation inserts. (sashiko)
 - Drop the whole extent in gpu_clear_tracker_mark_dirty() when the
   inside-split allocation fails, avoiding a stale clear claim. (sashiko)
 - Make gpu_clear_tracker_find() alignment-aware and fall back to the
   dirty tree on steered failure to avoid spurious -ENOSPC. (sashiko)
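
The alignment-aware check in the last item amounts to the following
arithmetic. The helper name extent_fits_aligned() is hypothetical, and
the sketch assumes a power-of-two min_size and no overflow when rounding
up; it illustrates why an extent that is large enough may still be
unusable for a naturally aligned block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the half-open extent [start, end) contains a block of
 * min_size bytes whose start is a multiple of min_size.
 */
static bool extent_fits_aligned(uint64_t start, uint64_t end,
				uint64_t min_size)
{
	uint64_t aligned_start;

	/* min_size must be a non-zero power of two. */
	assert(min_size && (min_size & (min_size - 1)) == 0);

	/* Round start up to the next multiple of min_size. */
	aligned_start = (start + min_size - 1) & ~(min_size - 1);

	return aligned_start <= end && end - aligned_start >= min_size;
}
```

For example, the extent [0x1000, 0x3000) is 0x2000 bytes long but holds
no 0x2000-aligned block of that size, while [0x2000, 0x4000) does.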

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
---
 drivers/gpu/buddy.c                | 1164 ++++++++++++++++++----------
 drivers/gpu/drm/drm_buddy.c        |   12 +-
 drivers/gpu/tests/gpu_buddy_test.c |   18 +-
 include/linux/gpu_buddy.h          |   64 +-
 4 files changed, 829 insertions(+), 429 deletions(-)

diff --git a/drivers/gpu/buddy.c b/drivers/gpu/buddy.c
index eb1457376307..dca66cc43959 100644
--- a/drivers/gpu/buddy.c
+++ b/drivers/gpu/buddy.c
@@ -8,6 +8,7 @@
 #include <linux/kmemleak.h>
 #include <linux/module.h>
 #include <linux/sizes.h>
+#include <linux/slab.h>
 
 #include <linux/gpu_buddy.h>
 
@@ -35,6 +36,364 @@
 
 static struct kmem_cache *slab_blocks;
 
+static struct kmem_cache *slab_extents;
+
+/*
+ * Clear tracker
+ * -------------
+ *
+ * The clear tracker maintains an augmented interval rbtree of contiguous
+ * cleared (zeroed) address ranges, decoupled from the buddy free trees.
+ * Each node covers a maximal coalesced run; adjacent extents are merged
+ * on insertion so the tree always holds the smallest possible number of
+ * extents.  The augmentation field @subtree_max_size lets the allocator
+ * locate the largest cleared extent in O(log E).
+ */
+
+static u64 extent_size(struct gpu_clear_extent *clear_extent)
+{
+	return clear_extent->end - clear_extent->start;
+}
+
+RB_DECLARE_CALLBACKS_MAX(static, gpu_clear_augment_cb,
+			 struct gpu_clear_extent, rb,
+			 u64, subtree_max_size,
+			 extent_size)
+
+static struct gpu_clear_extent *extent_alloc(void)
+{
+	return kmem_cache_zalloc(slab_extents, GFP_KERNEL);
+}
+
+static void extent_free(struct gpu_clear_extent *clear_extent)
+{
+	kmem_cache_free(slab_extents, clear_extent);
+}
+
+/* Return the rightmost extent whose start is strictly below @offset. */
+static struct gpu_clear_extent *
+prev_extent(struct gpu_clear_tracker *clear_tracker, u64 offset)
+{
+	struct rb_node *rb = clear_tracker->root.rb_node;
+	struct gpu_clear_extent *clear_extent = NULL;
+
+	while (rb) {
+		struct gpu_clear_extent *tmp_extent =
+			rb_entry(rb, struct gpu_clear_extent, rb);
+
+		if (tmp_extent->start < offset) {
+			clear_extent = tmp_extent;
+			rb = rb->rb_right;
+		} else {
+			rb = rb->rb_left;
+		}
+	}
+
+	return clear_extent;
+}
+
+/* Return the leftmost extent whose start is at or above @offset. */
+static struct gpu_clear_extent *
+next_extent(struct gpu_clear_tracker *clear_tracker, u64 offset)
+{
+	struct rb_node *rb = clear_tracker->root.rb_node;
+	struct gpu_clear_extent *clear_extent = NULL;
+
+	while (rb) {
+		struct gpu_clear_extent *tmp_extent =
+			rb_entry(rb, struct gpu_clear_extent, rb);
+
+		if (tmp_extent->start >= offset) {
+			clear_extent = tmp_extent;
+			rb = rb->rb_left;
+		} else {
+			rb = rb->rb_right;
+		}
+	}
+
+	return clear_extent;
+}
+
+static void insert_extent(struct gpu_clear_tracker *clear_tracker,
+			  struct gpu_clear_extent *clear_extent)
+{
+	struct rb_node **link = &clear_tracker->root.rb_node;
+	struct rb_node *parent = NULL;
+	u64 size = extent_size(clear_extent);
+
+	while (*link) {
+		struct gpu_clear_extent *tmp_extent;
+
+		parent = *link;
+		tmp_extent = rb_entry(parent, struct gpu_clear_extent, rb);
+
+		if (tmp_extent->subtree_max_size < size)
+			tmp_extent->subtree_max_size = size;
+
+		if (clear_extent->start < tmp_extent->start)
+			link = &parent->rb_left;
+		else
+			link = &parent->rb_right;
+	}
+
+	clear_extent->subtree_max_size = size;
+	rb_link_node(&clear_extent->rb, parent, link);
+	rb_insert_augmented(&clear_extent->rb, &clear_tracker->root, &gpu_clear_augment_cb);
+}
+
+static void remove_extent(struct gpu_clear_tracker *clear_tracker,
+			  struct gpu_clear_extent *clear_extent)
+{
+	rb_erase_augmented(&clear_extent->rb, &clear_tracker->root, &gpu_clear_augment_cb);
+	RB_CLEAR_NODE(&clear_extent->rb);
+}
+
+static void gpu_clear_tracker_init(struct gpu_clear_tracker *clear_tracker)
+{
+	clear_tracker->root = RB_ROOT;
+	clear_tracker->total_clear = 0;
+}
+
+static void gpu_clear_tracker_fini(struct gpu_clear_tracker *clear_tracker)
+{
+	struct rb_node *rb;
+
+	while ((rb = rb_first(&clear_tracker->root))) {
+		struct gpu_clear_extent *clear_extent =
+			rb_entry(rb, struct gpu_clear_extent, rb);
+
+		remove_extent(clear_tracker, clear_extent);
+		extent_free(clear_extent);
+	}
+
+	clear_tracker->total_clear = 0;
+}
+
+/*
+ * Mark the range [start, start + size) as cleared. Merge with the neighbour on
+ * each side if they are contiguous, so the tree never holds two adjacent ranges.
+ */
+static void gpu_clear_tracker_mark_clear(struct gpu_clear_tracker *clear_tracker,
+					 u64 start, u64 size)
+{
+	struct gpu_clear_extent *left, *right, *clear_extent;
+	u64 end = start + size;
+
+	if (!size)
+		return;
+
+	/* Find contiguous neighbours, if any. */
+	left = prev_extent(clear_tracker, start);
+	if (left && left->end != start)
+		left = NULL;
+
+	right = next_extent(clear_tracker, end);
+	if (right && right->start != end)
+		right = NULL;
+
+	if (left && right) {
+		/* Merge left + new + right into a single extent. */
+		remove_extent(clear_tracker, left);
+		remove_extent(clear_tracker, right);
+		left->end = right->end;
+		extent_free(right);
+		insert_extent(clear_tracker, left);
+	} else if (left) {
+		/* Extend left neighbour rightwards. */
+		remove_extent(clear_tracker, left);
+		left->end = end;
+		insert_extent(clear_tracker, left);
+	} else if (right) {
+		/* Extend right neighbour leftwards. */
+		remove_extent(clear_tracker, right);
+		right->start = start;
+		insert_extent(clear_tracker, right);
+	} else {
+		/* Standalone extent. */
+		clear_extent = extent_alloc();
+		if (!clear_extent)
+			return;
+
+		clear_extent->start = start;
+		clear_extent->end   = end;
+		insert_extent(clear_tracker, clear_extent);
+	}
+
+	clear_tracker->total_clear += size;
+}
+
+/*
+ * Mark the range [start, start + size) as dirty. Remove the range from every
+ * overlapping clear extent, splitting one extent in two if the dirty range
+ * falls strictly inside it.
+ */
+static void gpu_clear_tracker_mark_dirty(struct gpu_clear_tracker *clear_tracker,
+					 u64 start, u64 size)
+{
+	struct gpu_clear_extent *clear_extent, *next;
+	u64 end = start + size;
+
+	if (!size)
+		return;
+
+	clear_extent = prev_extent(clear_tracker, start + 1);
+	if (!clear_extent)
+		clear_extent = next_extent(clear_tracker, start);
+
+	while (clear_extent && clear_extent->start < end) {
+		struct rb_node *next_node = rb_next(&clear_extent->rb);
+		u64 extent_start = clear_extent->start;
+		u64 extent_end = clear_extent->end;
+
+		if (next_node)
+			next = rb_entry(next_node, struct gpu_clear_extent, rb);
+		else
+			next = NULL;
+
+		/* Skip a non-overlapping neighbour returned by prev_extent(). */
+		if (extent_end <= start) {
+			clear_extent = next;
+			continue;
+		}
+
+		if (extent_start < start && extent_end > end) {
+			/* Dirty range falls strictly inside: split into left + right. */
+			struct gpu_clear_extent *right = extent_alloc();
+
+			if (!right) {
+				remove_extent(clear_tracker, clear_extent);
+				extent_free(clear_extent);
+
+				clear_tracker->total_clear -=
+					(extent_end - extent_start);
+
+				clear_extent = next;
+				continue;
+			}
+
+			remove_extent(clear_tracker, clear_extent);
+
+			clear_extent->end = start;
+			right->start = end;
+			right->end   = extent_end;
+
+			insert_extent(clear_tracker, clear_extent);
+			insert_extent(clear_tracker, right);
+
+			clear_tracker->total_clear -= size;
+		} else if (extent_start >= start && extent_end <= end) {
+			/* Extent fully covered: drop it. */
+			remove_extent(clear_tracker, clear_extent);
+			extent_free(clear_extent);
+
+			clear_tracker->total_clear -= (extent_end - extent_start);
+		} else if (extent_start < start) {
+			/* Extent overlaps from the left: trim its right end. */
+			remove_extent(clear_tracker, clear_extent);
+			clear_extent->end = start;
+			insert_extent(clear_tracker, clear_extent);
+
+			clear_tracker->total_clear -= (extent_end - start);
+		} else {
+			/* Extent overlaps from the right: trim its left end. */
+			remove_extent(clear_tracker, clear_extent);
+			clear_extent->start = end;
+			insert_extent(clear_tracker, clear_extent);
+
+			clear_tracker->total_clear -= (end - extent_start);
+		}
+
+		clear_extent = next;
+	}
+}
+
+/*
+ * Returns true if the range [start, start + size) lies entirely within
+ * a single clear extent in the tracker, i.e. the whole range is known
+ * to be cleared.
+ */
+static bool gpu_clear_tracker_is_clear(struct gpu_clear_tracker *clear_tracker,
+				       u64 start, u64 size)
+{
+	struct gpu_clear_extent *clear_extent;
+	u64 end = start + size;
+
+	clear_extent = prev_extent(clear_tracker, start + 1);
+	if (!clear_extent)
+		return false;
+
+	return clear_extent->start <= start && clear_extent->end >= end;
+}
+
+static struct rb_node *
+clear_tracker_descend_right(struct rb_node *node, u64 min_size)
+{
+	while (node->rb_right) {
+		struct gpu_clear_extent *tmp_extent;
+
+		tmp_extent = rb_entry(node->rb_right, struct gpu_clear_extent, rb);
+
+		if (tmp_extent->subtree_max_size < min_size)
+			break;
+		node = node->rb_right;
+	}
+
+	return node;
+}
+
+static struct gpu_clear_extent *
+gpu_clear_tracker_find(struct gpu_clear_tracker *clear_tracker, u64 min_size)
+{
+	struct rb_node *rb = clear_tracker->root.rb_node;
+	struct gpu_clear_extent *root_extent;
+	struct rb_node *parent;
+
+	if (WARN_ON(!min_size || !is_power_of_2(min_size)))
+		return NULL;
+
+	if (!rb)
+		return NULL;
+
+	root_extent = rb_entry(rb, struct gpu_clear_extent, rb);
+	if (root_extent->subtree_max_size < min_size)
+		return NULL;
+
+	rb = clear_tracker_descend_right(rb, min_size);
+
+	while (rb) {
+		struct gpu_clear_extent *clear_extent;
+		u64 aligned_start;
+
+		clear_extent = rb_entry(rb, struct gpu_clear_extent, rb);
+		aligned_start = ALIGN(clear_extent->start, min_size);
+
+		/* Check if a naturally aligned min_size block fits. */
+		if (aligned_start <= clear_extent->end &&
+		    clear_extent->end - aligned_start >= min_size)
+			return clear_extent;
+
+		if (rb->rb_left) {
+			struct gpu_clear_extent *tmp_extent;
+
+			tmp_extent = rb_entry(rb->rb_left, struct gpu_clear_extent, rb);
+			if (tmp_extent->subtree_max_size >= min_size) {
+				rb = clear_tracker_descend_right(rb->rb_left, min_size);
+				continue;
+			}
+		}
+
+		/* Walk up until we exit a node via its right child. */
+		parent = rb_parent(rb);
+		while (parent && parent->rb_right != rb) {
+			rb = parent;
+			parent = rb_parent(rb);
+		}
+		rb = parent;
+	}
+
+	return NULL;
+}
+
 static unsigned int
 gpu_buddy_block_state(struct gpu_buddy_block *block)
 {
@@ -67,10 +426,93 @@ static unsigned int gpu_buddy_block_offset_alignment(struct gpu_buddy_block *blo
 	return __ffs64(offset);
 }
 
-RB_DECLARE_CALLBACKS_MAX(static, gpu_buddy_augment_cb,
-			 struct gpu_buddy_block, rb,
-			 unsigned int, subtree_max_alignment,
-			 gpu_buddy_block_offset_alignment);
+static inline bool
+gpu_buddy_block_is_dirty(struct gpu_buddy_block *block)
+{
+	return !gpu_buddy_block_is_clear(block);
+}
+
+static inline void gpu_buddy_augment_compute(struct gpu_buddy_block *block)
+{
+	struct gpu_buddy_block *right;
+	struct gpu_buddy_block *left;
+	unsigned int max_align;
+	bool has_dirty;
+
+	max_align = gpu_buddy_block_offset_alignment(block);
+	has_dirty = gpu_buddy_block_is_dirty(block);
+
+	left = rb_entry_safe(block->rb.rb_left, struct gpu_buddy_block, rb);
+	if (left) {
+		if (left->subtree_max_alignment > max_align)
+			max_align = left->subtree_max_alignment;
+
+		has_dirty |= left->subtree_has_dirty;
+	}
+
+	right = rb_entry_safe(block->rb.rb_right, struct gpu_buddy_block, rb);
+	if (right) {
+		if (right->subtree_max_alignment > max_align)
+			max_align = right->subtree_max_alignment;
+
+		has_dirty |= right->subtree_has_dirty;
+	}
+
+	block->subtree_max_alignment = max_align;
+	block->subtree_has_dirty = has_dirty;
+}
+
+static void gpu_buddy_augment_propagate(struct rb_node *rb, struct rb_node *stop)
+{
+	while (rb != stop) {
+		struct gpu_buddy_block *block;
+		unsigned int old_align;
+		bool old_has_dirty;
+
+		block = rb_entry(rb, struct gpu_buddy_block, rb);
+		old_align = block->subtree_max_alignment;
+		old_has_dirty = block->subtree_has_dirty;
+
+		gpu_buddy_augment_compute(block);
+		if (block->subtree_max_alignment == old_align &&
+		    block->subtree_has_dirty == old_has_dirty)
+			break;
+
+		rb = rb_parent(&block->rb);
+	}
+}
+
+static void gpu_buddy_augment_copy(struct rb_node *rb_old, struct rb_node *rb_new)
+{
+	struct gpu_buddy_block *old;
+	struct gpu_buddy_block *new;
+
+	old = rb_entry(rb_old, struct gpu_buddy_block, rb);
+	new = rb_entry(rb_new, struct gpu_buddy_block, rb);
+
+	new->subtree_max_alignment = old->subtree_max_alignment;
+	new->subtree_has_dirty = old->subtree_has_dirty;
+}
+
+static void gpu_buddy_augment_rotate(struct rb_node *rb_old, struct rb_node *rb_new)
+{
+	struct gpu_buddy_block *old;
+	struct gpu_buddy_block *new;
+
+	old = rb_entry(rb_old, struct gpu_buddy_block, rb);
+	new = rb_entry(rb_new, struct gpu_buddy_block, rb);
+
+	new->subtree_max_alignment = old->subtree_max_alignment;
+	new->subtree_has_dirty = old->subtree_has_dirty;
+
+	gpu_buddy_augment_compute(old);
+}
+
+static const struct rb_augment_callbacks gpu_buddy_augment_cb = {
+	.propagate = gpu_buddy_augment_propagate,
+	.copy      = gpu_buddy_augment_copy,
+	.rotate    = gpu_buddy_augment_rotate,
+};
 
 static struct gpu_buddy_block *gpu_block_alloc(struct gpu_buddy *mm,
 					       struct gpu_buddy_block *parent,
@@ -101,13 +543,6 @@ static void gpu_block_free(struct gpu_buddy *mm,
 	kmem_cache_free(slab_blocks, block);
 }
 
-static enum gpu_buddy_free_tree
-get_block_tree(struct gpu_buddy_block *block)
-{
-	return gpu_buddy_block_is_clear(block) ?
-	       GPU_BUDDY_CLEAR_TREE : GPU_BUDDY_DIRTY_TREE;
-}
-
 static struct gpu_buddy_block *
 rbtree_get_free_block(const struct rb_node *node)
 {
@@ -120,24 +555,61 @@ rbtree_last_free_block(struct rb_root *root)
 	return rbtree_get_free_block(rb_last(root));
 }
 
-static bool rbtree_is_empty(struct rb_root *root)
+/*
+ * Find the rightmost (highest-offset) free block in @root that is itself
+ * dirty, by descending the tree using the subtree_has_dirty augment to
+ * skip subtrees that contain only cleared blocks.  Returns NULL if no
+ * dirty block exists in the tree.
+ */
+static struct gpu_buddy_block *
+rbtree_last_dirty_free_block(struct rb_root *root)
 {
-	return RB_EMPTY_ROOT(root);
+	struct gpu_buddy_block *block = NULL;
+	struct rb_node *node = root->rb_node;
+
+	while (node) {
+		struct gpu_buddy_block *right_block;
+		struct gpu_buddy_block *node_block;
+
+		node_block = rbtree_get_free_block(node);
+		right_block = rbtree_get_free_block(node->rb_right);
+
+		/*
+		 * Prefer the rightmost subtree that contains a dirty block;
+		 * fall back to the current node if it is itself dirty;
+		 * otherwise descend left.
+		 */
+		if (right_block && right_block->subtree_has_dirty) {
+			node = node->rb_right;
+			continue;
+		}
+
+		if (gpu_buddy_block_is_dirty(node_block)) {
+			block = node_block;
+			break;
+		}
+
+		node = node->rb_left;
+	}
+
+	return block;
 }
 
 static void rbtree_insert(struct gpu_buddy *mm,
-			  struct gpu_buddy_block *block,
-			  enum gpu_buddy_free_tree tree)
+			  struct gpu_buddy_block *block)
 {
 	struct rb_node **link, *parent = NULL;
-	unsigned int block_alignment, order;
 	struct gpu_buddy_block *node;
+	unsigned int block_alignment;
 	struct rb_root *root;
+	unsigned int order;
+	bool block_dirty;
 
 	order = gpu_buddy_block_order(block);
 	block_alignment = gpu_buddy_block_offset_alignment(block);
+	block_dirty = gpu_buddy_block_is_dirty(block);
 
-	root = &mm->free_trees[tree][order];
+	root = &mm->free_tree[order];
 	link = &root->rb_node;
 
 	while (*link) {
@@ -147,10 +619,12 @@ static void rbtree_insert(struct gpu_buddy *mm,
 		 * Manual augmentation update during insertion traversal. Required
 		 * because rb_insert_augmented() only calls rotate callback during
 		 * rotations. This ensures all ancestors on the insertion path have
-		 * correct subtree_max_alignment values.
+		 * correct subtree_max_alignment / subtree_has_dirty values.
 		 */
 		if (node->subtree_max_alignment < block_alignment)
 			node->subtree_max_alignment = block_alignment;
+		if (block_dirty)
+			node->subtree_has_dirty = true;
 
 		if (gpu_buddy_block_offset(block) < gpu_buddy_block_offset(node))
 			link = &parent->rb_left;
@@ -159,6 +633,7 @@ static void rbtree_insert(struct gpu_buddy *mm,
 	}
 
 	block->subtree_max_alignment = block_alignment;
+	block->subtree_has_dirty = block_dirty;
 	rb_link_node(&block->rb, parent, link);
 	rb_insert_augmented(&block->rb, root, &gpu_buddy_augment_cb);
 }
@@ -167,26 +642,11 @@ static void rbtree_remove(struct gpu_buddy *mm,
 			  struct gpu_buddy_block *block)
 {
 	unsigned int order = gpu_buddy_block_order(block);
-	enum gpu_buddy_free_tree tree;
-	struct rb_root *root;
-
-	tree = get_block_tree(block);
-	root = &mm->free_trees[tree][order];
 
-	rb_erase_augmented(&block->rb, root, &gpu_buddy_augment_cb);
+	rb_erase_augmented(&block->rb, &mm->free_tree[order], &gpu_buddy_augment_cb);
 	RB_CLEAR_NODE(&block->rb);
 }
 
-static void clear_reset(struct gpu_buddy_block *block)
-{
-	block->header &= ~GPU_BUDDY_HEADER_CLEAR;
-}
-
-static void mark_cleared(struct gpu_buddy_block *block)
-{
-	block->header |= GPU_BUDDY_HEADER_CLEAR;
-}
-
 static void mark_allocated(struct gpu_buddy *mm,
 			   struct gpu_buddy_block *block)
 {
@@ -199,13 +659,17 @@ static void mark_allocated(struct gpu_buddy *mm,
 static void mark_free(struct gpu_buddy *mm,
 		      struct gpu_buddy_block *block)
 {
-	enum gpu_buddy_free_tree tree;
-
 	block->header &= ~GPU_BUDDY_HEADER_STATE;
 	block->header |= GPU_BUDDY_FREE;
 
-	tree = get_block_tree(block);
-	rbtree_insert(mm, block, tree);
+	if (gpu_clear_tracker_is_clear(&mm->clear,
+				       gpu_buddy_block_offset(block),
+				       gpu_buddy_block_size(mm, block)))
+		block->header |= GPU_BUDDY_HEADER_CLEAR;
+	else
+		block->header &= ~GPU_BUDDY_HEADER_CLEAR;
+
+	rbtree_insert(mm, block);
 }
 
 static void mark_split(struct gpu_buddy *mm,
@@ -243,36 +707,18 @@ __get_buddy(struct gpu_buddy_block *block)
 }
 
 static unsigned int __gpu_buddy_free(struct gpu_buddy *mm,
-				     struct gpu_buddy_block *block,
-				     bool force_merge)
+				     struct gpu_buddy_block *block)
 {
 	struct gpu_buddy_block *parent;
 	unsigned int order;
 
 	while ((parent = block->parent)) {
-		struct gpu_buddy_block *buddy;
-
-		buddy = __get_buddy(block);
+		struct gpu_buddy_block *buddy = __get_buddy(block);
 
 		if (!gpu_buddy_block_is_free(buddy))
 			break;
 
-		if (!force_merge) {
-			/*
-			 * Check the block and its buddy clear state and exit
-			 * the loop if they both have the dissimilar state.
-			 */
-			if (gpu_buddy_block_is_clear(block) !=
-			    gpu_buddy_block_is_clear(buddy))
-				break;
-
-			if (gpu_buddy_block_is_clear(block))
-				mark_cleared(parent);
-		}
-
 		rbtree_remove(mm, buddy);
-		if (force_merge && gpu_buddy_block_is_clear(buddy))
-			mm->clear_avail -= gpu_buddy_block_size(mm, buddy);
 
 		gpu_block_free(mm, block);
 		gpu_block_free(mm, buddy);
@@ -286,66 +732,15 @@ static unsigned int __gpu_buddy_free(struct gpu_buddy *mm,
 	return order;
 }
 
-static int __force_merge(struct gpu_buddy *mm,
-			 u64 start,
-			 u64 end,
-			 unsigned int min_order)
+static void undo_partial_split(struct gpu_buddy *mm,
+			       struct gpu_buddy_block *block)
 {
-	unsigned int tree, order;
-	int i;
+	struct gpu_buddy_block *buddy = __get_buddy(block);
 
-	if (!min_order)
-		return -ENOMEM;
-
-	if (min_order > mm->max_order)
-		return -EINVAL;
-
-	for_each_free_tree(tree) {
-		for (i = min_order - 1; i >= 0; i--) {
-			struct rb_node *iter = rb_last(&mm->free_trees[tree][i]);
-
-			while (iter) {
-				struct gpu_buddy_block *block, *buddy;
-				u64 block_start, block_end;
-
-				block = rbtree_get_free_block(iter);
-				iter = rb_prev(iter);
-
-				if (!block || !block->parent)
-					continue;
-
-				block_start = gpu_buddy_block_offset(block);
-				block_end = block_start + gpu_buddy_block_size(mm, block) - 1;
-
-				if (!contains(start, end, block_start, block_end))
-					continue;
-
-				buddy = __get_buddy(block);
-				if (!gpu_buddy_block_is_free(buddy))
-					continue;
-
-				gpu_buddy_assert(gpu_buddy_block_is_clear(block) !=
-						 gpu_buddy_block_is_clear(buddy));
-
-				/*
-				 * Advance to the next node when the current node is the buddy,
-				 * as freeing the block will also remove its buddy from the tree.
-				 */
-				if (iter == &buddy->rb)
-					iter = rb_prev(iter);
-
-				rbtree_remove(mm, block);
-				if (gpu_buddy_block_is_clear(block))
-					mm->clear_avail -= gpu_buddy_block_size(mm, block);
-
-				order = __gpu_buddy_free(mm, block, true);
-				if (order >= min_order)
-					return 0;
-			}
-		}
-	}
-
-	return -ENOMEM;
+	if (buddy &&
+	    gpu_buddy_block_is_free(block) &&
+	    gpu_buddy_block_is_free(buddy))
+		__gpu_buddy_free(mm, block);
 }
 
 /**
@@ -362,7 +757,7 @@ static int __force_merge(struct gpu_buddy *mm,
  */
 int gpu_buddy_init(struct gpu_buddy *mm, u64 size, u64 chunk_size)
 {
-	unsigned int i, j, root_count = 0;
+	unsigned int root_count = 0;
 	u64 offset = 0;
 
 	if (size < chunk_size)
@@ -384,22 +779,13 @@ int gpu_buddy_init(struct gpu_buddy *mm, u64 size, u64 chunk_size)
 
 	BUG_ON(mm->max_order > GPU_BUDDY_MAX_ORDER);
 
-	mm->free_trees = kmalloc_array(GPU_BUDDY_MAX_FREE_TREES,
-				       sizeof(*mm->free_trees),
-				       GFP_KERNEL);
-	if (!mm->free_trees)
+	mm->free_tree = kcalloc(mm->max_order + 1,
+				sizeof(struct rb_root),
+				GFP_KERNEL);
+	if (!mm->free_tree)
 		return -ENOMEM;
 
-	for_each_free_tree(i) {
-		mm->free_trees[i] = kmalloc_array(mm->max_order + 1,
-						  sizeof(struct rb_root),
-						  GFP_KERNEL);
-		if (!mm->free_trees[i])
-			goto out_free_tree;
-
-		for (j = 0; j <= mm->max_order; ++j)
-			mm->free_trees[i][j] = RB_ROOT;
-	}
+	gpu_clear_tracker_init(&mm->clear);
 
 	mm->n_roots = hweight64(size);
 
@@ -447,9 +833,8 @@ int gpu_buddy_init(struct gpu_buddy *mm, u64 size, u64 chunk_size)
 		gpu_block_free(mm, mm->roots[root_count]);
 	kfree(mm->roots);
 out_free_tree:
-	while (i--)
-		kfree(mm->free_trees[i]);
-	kfree(mm->free_trees);
+	gpu_clear_tracker_fini(&mm->clear);
+	kfree(mm->free_tree);
 	return -ENOMEM;
 }
 EXPORT_SYMBOL(gpu_buddy_init);
@@ -463,7 +848,7 @@ EXPORT_SYMBOL(gpu_buddy_init);
  */
 void gpu_buddy_fini(struct gpu_buddy *mm)
 {
-	u64 root_size, size, start;
+	u64 root_size, size;
 	unsigned int order;
 	int i;
 
@@ -471,22 +856,17 @@ void gpu_buddy_fini(struct gpu_buddy *mm)
 
 	for (i = 0; i < mm->n_roots; ++i) {
 		order = ilog2(size) - ilog2(mm->chunk_size);
-		start = gpu_buddy_block_offset(mm->roots[i]);
-		__force_merge(mm, start, start + size, order);
+		root_size = mm->chunk_size << order;
 
 		gpu_buddy_assert(gpu_buddy_block_is_free(mm->roots[i]));
-
 		gpu_block_free(mm, mm->roots[i]);
-
-		root_size = mm->chunk_size << order;
 		size -= root_size;
 	}
 
 	gpu_buddy_assert(mm->avail == mm->size);
 
-	for_each_free_tree(i)
-		kfree(mm->free_trees[i]);
-	kfree(mm->free_trees);
+	gpu_clear_tracker_fini(&mm->clear);
+	kfree(mm->free_tree);
 	kfree(mm->roots);
 }
 EXPORT_SYMBOL(gpu_buddy_fini);
@@ -512,13 +892,6 @@ static int split_block(struct gpu_buddy *mm,
 	}
 
 	mark_split(mm, block);
-
-	if (gpu_buddy_block_is_clear(block)) {
-		mark_cleared(block->left);
-		mark_cleared(block->right);
-		clear_reset(block);
-	}
-
 	mark_free(mm, block->left);
 	mark_free(mm, block->right);
 
@@ -536,42 +909,33 @@ static int split_block(struct gpu_buddy *mm,
  */
 void gpu_buddy_reset_clear(struct gpu_buddy *mm, bool is_clear)
 {
-	enum gpu_buddy_free_tree src_tree, dst_tree;
-	u64 root_size, size, start;
-	unsigned int order;
-	int i;
+	unsigned int i;
 
 	gpu_buddy_driver_lock_held(mm);
-	size = mm->size;
-	for (i = 0; i < mm->n_roots; ++i) {
-		order = ilog2(size) - ilog2(mm->chunk_size);
-		start = gpu_buddy_block_offset(mm->roots[i]);
-		__force_merge(mm, start, start + size, order);
-
-		root_size = mm->chunk_size << order;
-		size -= root_size;
-	}
 
-	src_tree = is_clear ? GPU_BUDDY_DIRTY_TREE : GPU_BUDDY_CLEAR_TREE;
-	dst_tree = is_clear ? GPU_BUDDY_CLEAR_TREE : GPU_BUDDY_DIRTY_TREE;
+	gpu_clear_tracker_fini(&mm->clear);
+	gpu_clear_tracker_init(&mm->clear);
 
 	for (i = 0; i <= mm->max_order; ++i) {
-		struct rb_root *root = &mm->free_trees[src_tree][i];
 		struct gpu_buddy_block *block, *tmp;
 
-		rbtree_postorder_for_each_entry_safe(block, tmp, root, rb) {
-			rbtree_remove(mm, block);
+		rbtree_postorder_for_each_entry_safe(block, tmp,
+						     &mm->free_tree[i], rb) {
 			if (is_clear) {
-				mark_cleared(block);
-				mm->clear_avail += gpu_buddy_block_size(mm, block);
-			} else {
-				clear_reset(block);
-				mm->clear_avail -= gpu_buddy_block_size(mm, block);
+				if (!gpu_buddy_block_is_clear(block))
+					block->header |= GPU_BUDDY_HEADER_CLEAR;
+				gpu_clear_tracker_mark_clear(&mm->clear,
+							     gpu_buddy_block_offset(block),
+							     gpu_buddy_block_size(mm, block));
+			} else if (gpu_buddy_block_is_clear(block)) {
+				block->header &= ~GPU_BUDDY_HEADER_CLEAR;
 			}
 
-			rbtree_insert(mm, block, dst_tree);
+			gpu_buddy_augment_compute(block);
 		}
 	}
+
+	mm->clear_avail = mm->clear.total_clear;
 }
 EXPORT_SYMBOL(gpu_buddy_reset_clear);
 
@@ -584,13 +948,23 @@ EXPORT_SYMBOL(gpu_buddy_reset_clear);
 void gpu_buddy_free_block(struct gpu_buddy *mm,
 			  struct gpu_buddy_block *block)
 {
+	bool was_clear = gpu_buddy_block_is_clear(block);
+	u64 size   = gpu_buddy_block_size(mm, block);
+	u64 offset = gpu_buddy_block_offset(block);
+
 	gpu_buddy_driver_lock_held(mm);
+
 	BUG_ON(!gpu_buddy_block_is_allocated(block));
-	mm->avail += gpu_buddy_block_size(mm, block);
-	if (gpu_buddy_block_is_clear(block))
-		mm->clear_avail += gpu_buddy_block_size(mm, block);
 
-	__gpu_buddy_free(mm, block, false);
+	block->header &= ~GPU_BUDDY_HEADER_CLEAR;
+	mm->avail += size;
+
+	if (was_clear) {
+		gpu_clear_tracker_mark_clear(&mm->clear, offset, size);
+		mm->clear_avail = mm->clear.total_clear;
+	}
+
+	__gpu_buddy_free(mm, block);
 }
 EXPORT_SYMBOL(gpu_buddy_free_block);
 
@@ -604,10 +978,15 @@ static void __gpu_buddy_free_list(struct gpu_buddy *mm,
 	gpu_buddy_assert(!(mark_dirty && mark_clear));
 
 	list_for_each_entry_safe(block, on, objects, link) {
+		/*
+		 * Propagate the caller's clear/dirty intent onto the block header
+		 * before handing it to gpu_buddy_free_block(), which will then
+		 * update the clear tracker accordingly.
+		 */
 		if (mark_clear)
-			mark_cleared(block);
+			block->header |= GPU_BUDDY_HEADER_CLEAR;
 		else if (mark_dirty)
-			clear_reset(block);
+			block->header &= ~GPU_BUDDY_HEADER_CLEAR;
 		gpu_buddy_free_block(mm, block);
 		cond_resched();
 	}
@@ -643,23 +1022,14 @@ void gpu_buddy_free_list(struct gpu_buddy *mm,
 }
 EXPORT_SYMBOL(gpu_buddy_free_list);
 
-static bool block_incompatible(struct gpu_buddy_block *block, unsigned int flags)
-{
-	bool needs_clear = flags & GPU_BUDDY_CLEAR_ALLOCATION;
-
-	return needs_clear != gpu_buddy_block_is_clear(block);
-}
-
 static struct gpu_buddy_block *
 __alloc_range_bias(struct gpu_buddy *mm,
 		   u64 start, u64 end,
 		   unsigned int order,
-		   unsigned long flags,
-		   bool fallback)
+		   unsigned long flags)
 {
 	u64 req_size = mm->chunk_size << order;
 	struct gpu_buddy_block *block;
-	struct gpu_buddy_block *buddy;
 	LIST_HEAD(dfs);
 	int err;
 	int i;
@@ -702,9 +1072,6 @@ __alloc_range_bias(struct gpu_buddy *mm,
 				continue;
 		}
 
-		if (!fallback && block_incompatible(block, flags))
-			continue;
-
 		if (contains(start, end, block_start, block_end) &&
 		    order == gpu_buddy_block_order(block)) {
 			/*
@@ -722,68 +1089,55 @@ __alloc_range_bias(struct gpu_buddy *mm,
 				goto err_undo;
 		}
 
-		list_add(&block->right->tmp_link, &dfs);
 		list_add(&block->left->tmp_link, &dfs);
+		list_add(&block->right->tmp_link, &dfs);
 	} while (1);
 
 	return ERR_PTR(-ENOSPC);
 
 err_undo:
-	/*
-	 * We really don't want to leave around a bunch of split blocks, since
-	 * bigger is better, so make sure we merge everything back before we
-	 * free the allocated blocks.
-	 */
-	buddy = __get_buddy(block);
-	if (buddy &&
-	    (gpu_buddy_block_is_free(block) &&
-	     gpu_buddy_block_is_free(buddy)))
-		__gpu_buddy_free(mm, block, false);
+	undo_partial_split(mm, block);
 	return ERR_PTR(err);
 }
 
-static struct gpu_buddy_block *
-__gpu_buddy_alloc_range_bias(struct gpu_buddy *mm,
-			     u64 start, u64 end,
-			     unsigned int order,
-			     unsigned long flags)
-{
-	struct gpu_buddy_block *block;
-	bool fallback = false;
-
-	block = __alloc_range_bias(mm, start, end, order,
-				   flags, fallback);
-	if (IS_ERR(block))
-		return __alloc_range_bias(mm, start, end, order,
-					  flags, !fallback);
-
-	return block;
-}
-
 static struct gpu_buddy_block *
 get_maxblock(struct gpu_buddy *mm,
 	     unsigned int order,
-	     enum gpu_buddy_free_tree tree)
+	     unsigned long flags)
 {
-	struct gpu_buddy_block *max_block = NULL, *block = NULL;
-	struct rb_root *root;
+	struct gpu_buddy_block *max_block;
+	struct gpu_buddy_block *block;
+	bool prefer_clear;
 	unsigned int i;
 
+	max_block = NULL;
+	prefer_clear = flags & GPU_BUDDY_CLEAR_ALLOCATION;
+
 	for (i = order; i <= mm->max_order; ++i) {
-		root = &mm->free_trees[tree][i];
-		block = rbtree_last_free_block(root);
+		if (prefer_clear)
+			block = rbtree_last_free_block(&mm->free_tree[i]);
+		else
+			block = rbtree_last_dirty_free_block(&mm->free_tree[i]);
+
 		if (!block)
 			continue;
 
-		if (!max_block) {
+		if (!max_block ||
+		    gpu_buddy_block_offset(block) > gpu_buddy_block_offset(max_block))
 			max_block = block;
+	}
+
+	if (max_block || prefer_clear)
+		return max_block;
+
+	for (i = order; i <= mm->max_order; ++i) {
+		block = rbtree_last_free_block(&mm->free_tree[i]);
+		if (!block)
 			continue;
-		}
 
-		if (gpu_buddy_block_offset(block) >
-		    gpu_buddy_block_offset(max_block)) {
+		if (!max_block ||
+		    gpu_buddy_block_offset(block) > gpu_buddy_block_offset(max_block))
 			max_block = block;
-		}
 	}
 
 	return max_block;
@@ -795,45 +1149,37 @@ alloc_from_freetree(struct gpu_buddy *mm,
 		    unsigned long flags)
 {
 	struct gpu_buddy_block *block = NULL;
-	struct rb_root *root;
-	enum gpu_buddy_free_tree tree;
 	unsigned int tmp;
 	int err;
 
-	tree = (flags & GPU_BUDDY_CLEAR_ALLOCATION) ?
-		GPU_BUDDY_CLEAR_TREE : GPU_BUDDY_DIRTY_TREE;
-
 	if (flags & GPU_BUDDY_TOPDOWN_ALLOCATION) {
-		block = get_maxblock(mm, order, tree);
+		block = get_maxblock(mm, order, flags);
 		if (block)
-			/* Store the obtained block order */
 			tmp = gpu_buddy_block_order(block);
-	} else {
+	} else if (!(flags & GPU_BUDDY_CLEAR_ALLOCATION)) {
 		for (tmp = order; tmp <= mm->max_order; ++tmp) {
-			/* Get RB tree root for this order and tree */
-			root = &mm->free_trees[tree][tmp];
-			block = rbtree_last_free_block(root);
+			block = rbtree_last_dirty_free_block(&mm->free_tree[tmp]);
 			if (block)
 				break;
 		}
-	}
-
-	if (!block) {
-		/* Try allocating from the other tree */
-		tree = (tree == GPU_BUDDY_CLEAR_TREE) ?
-			GPU_BUDDY_DIRTY_TREE : GPU_BUDDY_CLEAR_TREE;
-
+		if (!block) {
+			for (tmp = order; tmp <= mm->max_order; ++tmp) {
+				block = rbtree_last_free_block(&mm->free_tree[tmp]);
+				if (block)
+					break;
+			}
+		}
+	} else {
 		for (tmp = order; tmp <= mm->max_order; ++tmp) {
-			root = &mm->free_trees[tree][tmp];
-			block = rbtree_last_free_block(root);
+			block = rbtree_last_free_block(&mm->free_tree[tmp]);
 			if (block)
 				break;
 		}
-
-		if (!block)
-			return ERR_PTR(-ENOSPC);
 	}
 
+	if (!block)
+		return ERR_PTR(-ENOSPC);
+
 	BUG_ON(!gpu_buddy_block_is_free(block));
 
 	while (tmp != order) {
@@ -841,14 +1187,18 @@ alloc_from_freetree(struct gpu_buddy *mm,
 		if (unlikely(err))
 			goto err_undo;
 
-		block = block->right;
+		if (!(flags & GPU_BUDDY_CLEAR_ALLOCATION) &&
+		    gpu_buddy_block_is_clear(block->right))
+			block = block->left;
+		else
+			block = block->right;
 		tmp--;
 	}
 	return block;
 
 err_undo:
 	if (tmp != order)
-		__gpu_buddy_free(mm, block, false);
+		__gpu_buddy_free(mm, block);
 	return ERR_PTR(err);
 }
 
@@ -869,12 +1219,11 @@ static bool gpu_buddy_subtree_can_satisfy(struct rb_node *node,
 
 static struct gpu_buddy_block *
 gpu_buddy_find_block_aligned(struct gpu_buddy *mm,
-			     enum gpu_buddy_free_tree tree,
 			     unsigned int order,
 			     unsigned int alignment,
 			     unsigned long flags)
 {
-	struct rb_root *root = &mm->free_trees[tree][order];
+	struct rb_root *root = &mm->free_tree[order];
 	struct rb_node *rb = root->rb_node;
 
 	while (rb) {
@@ -912,8 +1261,6 @@ gpu_buddy_offset_aligned_allocation(struct gpu_buddy *mm,
 {
 	struct gpu_buddy_block *block = NULL;
 	unsigned int order, tmp, alignment;
-	struct gpu_buddy_block *buddy;
-	enum gpu_buddy_free_tree tree;
 	unsigned long pages;
 	int err;
 
@@ -921,19 +1268,8 @@ gpu_buddy_offset_aligned_allocation(struct gpu_buddy *mm,
 	pages = size >> ilog2(mm->chunk_size);
 	order = fls(pages) - 1;
 
-	tree = (flags & GPU_BUDDY_CLEAR_ALLOCATION) ?
-		GPU_BUDDY_CLEAR_TREE : GPU_BUDDY_DIRTY_TREE;
-
 	for (tmp = order; tmp <= mm->max_order; ++tmp) {
-		block = gpu_buddy_find_block_aligned(mm, tree, tmp,
-						     alignment, flags);
-		if (!block) {
-			tree = (tree == GPU_BUDDY_CLEAR_TREE) ?
-				GPU_BUDDY_DIRTY_TREE : GPU_BUDDY_CLEAR_TREE;
-			block = gpu_buddy_find_block_aligned(mm, tree, tmp,
-							     alignment, flags);
-		}
-
+		block = gpu_buddy_find_block_aligned(mm, tmp, alignment, flags);
 		if (block)
 			break;
 	}
@@ -960,27 +1296,18 @@ gpu_buddy_offset_aligned_allocation(struct gpu_buddy *mm,
 	return block;
 
 err_undo:
-	/*
-	 * We really don't want to leave around a bunch of split blocks, since
-	 * bigger is better, so make sure we merge everything back before we
-	 * free the allocated blocks.
-	 */
-	buddy = __get_buddy(block);
-	if (buddy &&
-	    (gpu_buddy_block_is_free(block) &&
-	     gpu_buddy_block_is_free(buddy)))
-		__gpu_buddy_free(mm, block, false);
+	undo_partial_split(mm, block);
 	return ERR_PTR(err);
 }
 
 static int __alloc_range(struct gpu_buddy *mm,
 			 struct list_head *dfs,
 			 u64 start, u64 size,
+			 unsigned long flags,
 			 struct list_head *blocks,
 			 u64 *total_allocated_on_err)
 {
 	struct gpu_buddy_block *block;
-	struct gpu_buddy_block *buddy;
 	u64 total_allocated = 0;
 	LIST_HEAD(allocated);
 	u64 end;
@@ -1013,16 +1340,25 @@ static int __alloc_range(struct gpu_buddy *mm,
 
 		if (contains(start, end, block_start, block_end)) {
 			if (gpu_buddy_block_is_free(block)) {
+				u64 bsize = gpu_buddy_block_size(mm, block);
+				u64 boff  = gpu_buddy_block_offset(block);
+
 				mark_allocated(mm, block);
-				total_allocated += gpu_buddy_block_size(mm, block);
-				mm->avail -= gpu_buddy_block_size(mm, block);
-				if (gpu_buddy_block_is_clear(block))
-					mm->clear_avail -= gpu_buddy_block_size(mm, block);
+				total_allocated += bsize;
+				mm->avail -= bsize;
+
+				block->header &= ~GPU_BUDDY_HEADER_CLEAR;
+				if (gpu_clear_tracker_is_clear(&mm->clear,
+							       boff, bsize)) {
+					if (flags & GPU_BUDDY_CLEAR_ALLOCATION)
+						block->header |= GPU_BUDDY_HEADER_CLEAR;
+				}
+				gpu_clear_tracker_mark_dirty(&mm->clear,
+							     boff, bsize);
+				mm->clear_avail = mm->clear.total_clear;
+
 				list_add_tail(&block->link, &allocated);
 				continue;
-			} else if (!mm->clear_avail) {
-				err = -ENOSPC;
-				goto err_free;
 			}
 		}
 
@@ -1046,16 +1382,7 @@ static int __alloc_range(struct gpu_buddy *mm,
 	return 0;
 
 err_undo:
-	/*
-	 * We really don't want to leave around a bunch of split blocks, since
-	 * bigger is better, so make sure we merge everything back before we
-	 * free the allocated blocks.
-	 */
-	buddy = __get_buddy(block);
-	if (buddy &&
-	    (gpu_buddy_block_is_free(block) &&
-	     gpu_buddy_block_is_free(buddy)))
-		__gpu_buddy_free(mm, block, false);
+	undo_partial_split(mm, block);
 
 err_free:
 	if (err == -ENOSPC && total_allocated_on_err) {
@@ -1071,6 +1398,7 @@ static int __alloc_range(struct gpu_buddy *mm,
 static int __gpu_buddy_alloc_range(struct gpu_buddy *mm,
 				   u64 start,
 				   u64 size,
+				   unsigned long flags,
 				   u64 *total_allocated_on_err,
 				   struct list_head *blocks)
 {
@@ -1080,20 +1408,23 @@ static int __gpu_buddy_alloc_range(struct gpu_buddy *mm,
 	for (i = 0; i < mm->n_roots; ++i)
 		list_add_tail(&mm->roots[i]->tmp_link, &dfs);
 
-	return __alloc_range(mm, &dfs, start, size,
+	return __alloc_range(mm, &dfs, start, size, flags,
 			     blocks, total_allocated_on_err);
 }
 
 static int __alloc_contig_try_harder(struct gpu_buddy *mm,
 				     u64 size,
 				     u64 min_block_size,
+				     unsigned long flags,
 				     struct list_head *blocks)
 {
 	u64 rhs_offset, lhs_offset, lhs_size, filled;
 	struct gpu_buddy_block *block;
-	unsigned int tree, order;
 	LIST_HEAD(blocks_lhs);
+	struct rb_root *root;
+	struct rb_node *iter;
 	unsigned long pages;
+	unsigned int order;
 	u64 modify_size;
 	int err;
 
@@ -1103,45 +1434,40 @@ static int __alloc_contig_try_harder(struct gpu_buddy *mm,
 	if (order == 0)
 		return -ENOSPC;
 
-	for_each_free_tree(tree) {
-		struct rb_root *root;
-		struct rb_node *iter;
-
-		root = &mm->free_trees[tree][order];
-		if (rbtree_is_empty(root))
-			continue;
+	root = &mm->free_tree[order];
+	if (RB_EMPTY_ROOT(root))
+		return -ENOSPC;
 
-		iter = rb_last(root);
-		while (iter) {
-			block = rbtree_get_free_block(iter);
-
-			/* Allocate blocks traversing RHS */
-			rhs_offset = gpu_buddy_block_offset(block);
-			err =  __gpu_buddy_alloc_range(mm, rhs_offset, size,
-						       &filled, blocks);
-			if (!err || err != -ENOSPC)
-				return err;
-
-			lhs_size = max((size - filled), min_block_size);
-			if (!IS_ALIGNED(lhs_size, min_block_size))
-				lhs_size = round_up(lhs_size, min_block_size);
-
-			/* Allocate blocks traversing LHS */
-			lhs_offset = gpu_buddy_block_offset(block) - lhs_size;
-			err =  __gpu_buddy_alloc_range(mm, lhs_offset, lhs_size,
-						       NULL, &blocks_lhs);
-			if (!err) {
-				list_splice(&blocks_lhs, blocks);
-				return 0;
-			} else if (err != -ENOSPC) {
-				gpu_buddy_free_list_internal(mm, blocks);
-				return err;
-			}
-			/* Free blocks for the next iteration */
+	iter = rb_last(root);
+	while (iter) {
+		block = rbtree_get_free_block(iter);
+
+		/* Allocate blocks traversing RHS */
+		rhs_offset = gpu_buddy_block_offset(block);
+		err = __gpu_buddy_alloc_range(mm, rhs_offset, size,
+					      flags, &filled, blocks);
+		if (err != -ENOSPC)
+			return err;
+
+		lhs_size = max((size - filled), min_block_size);
+		if (!IS_ALIGNED(lhs_size, min_block_size))
+			lhs_size = round_up(lhs_size, min_block_size);
+
+		/* Allocate blocks traversing LHS */
+		lhs_offset = gpu_buddy_block_offset(block) - lhs_size;
+		err = __gpu_buddy_alloc_range(mm, lhs_offset, lhs_size,
+					      flags, NULL, &blocks_lhs);
+		if (!err) {
+			list_splice(&blocks_lhs, blocks);
+			return 0;
+		} else if (err != -ENOSPC) {
 			gpu_buddy_free_list_internal(mm, blocks);
-
-			iter = rb_prev(iter);
+			return err;
 		}
+		/* Free blocks for the next iteration */
+		gpu_buddy_free_list_internal(mm, blocks);
+
+		iter = rb_prev(iter);
 	}
 
 	return -ENOSPC;
@@ -1175,6 +1501,7 @@ int gpu_buddy_block_trim(struct gpu_buddy *mm,
 	struct gpu_buddy_block *block;
 	u64 block_start, block_end;
 	LIST_HEAD(dfs);
+	bool was_clear;
 	u64 new_start;
 	int err;
 
@@ -1217,22 +1544,38 @@ int gpu_buddy_block_trim(struct gpu_buddy *mm,
 	}
 
 	list_del(&block->link);
+
+	was_clear = gpu_buddy_block_is_clear(block);
+	block->header &= ~GPU_BUDDY_HEADER_CLEAR;
+
+	if (was_clear) {
+		gpu_clear_tracker_mark_clear(&mm->clear,
+					     gpu_buddy_block_offset(block),
+					     gpu_buddy_block_size(mm, block));
+		mm->clear_avail = mm->clear.total_clear;
+	}
+
 	mark_free(mm, block);
 	mm->avail += gpu_buddy_block_size(mm, block);
-	if (gpu_buddy_block_is_clear(block))
-		mm->clear_avail += gpu_buddy_block_size(mm, block);
 
 	/* Prevent recursively freeing this node */
 	parent = block->parent;
 	block->parent = NULL;
 
 	list_add(&block->tmp_link, &dfs);
-	err =  __alloc_range(mm, &dfs, new_start, new_size, blocks, NULL);
+	err = __alloc_range(mm, &dfs, new_start, new_size,
+			    was_clear ? GPU_BUDDY_CLEAR_ALLOCATION : 0,
+			    blocks, NULL);
 	if (err) {
 		mark_allocated(mm, block);
 		mm->avail -= gpu_buddy_block_size(mm, block);
-		if (gpu_buddy_block_is_clear(block))
-			mm->clear_avail -= gpu_buddy_block_size(mm, block);
+		if (was_clear) {
+			gpu_clear_tracker_mark_dirty(&mm->clear,
+						     gpu_buddy_block_offset(block),
+						     gpu_buddy_block_size(mm, block));
+			mm->clear_avail = mm->clear.total_clear;
+			block->header |= GPU_BUDDY_HEADER_CLEAR;
+		}
 		list_add(&block->link, blocks);
 	}
 
@@ -1241,6 +1584,21 @@ int gpu_buddy_block_trim(struct gpu_buddy *mm,
 }
 EXPORT_SYMBOL(gpu_buddy_block_trim);
 
+static bool clear_steer_window(struct gpu_buddy *mm, u64 min_sz,
+			       u64 *start, u64 *end, unsigned long *flags)
+{
+	struct gpu_clear_extent *ext =
+		gpu_clear_tracker_find(&mm->clear, min_sz);
+
+	if (!ext)
+		return false;
+
+	*start  = ext->start;
+	*end    = ext->end;
+	*flags |= GPU_BUDDY_RANGE_ALLOCATION;
+	return true;
+}
+
 static struct gpu_buddy_block *
 __gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
 			 u64 start, u64 end,
@@ -1248,18 +1606,32 @@ __gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
 			 unsigned int order,
 			 unsigned long flags)
 {
-	if (flags & GPU_BUDDY_RANGE_ALLOCATION)
+	struct gpu_buddy_block *block;
+	bool steered = false;
+
+	/* Steer cleared allocations to a cleared extent that fits the order */
+	if (!(flags & GPU_BUDDY_RANGE_ALLOCATION) &&
+	    (flags & GPU_BUDDY_CLEAR_ALLOCATION) && mm->clear_avail)
+		steered = clear_steer_window(mm, mm->chunk_size << order,
+					     &start, &end, &flags);
+
+	if (flags & GPU_BUDDY_RANGE_ALLOCATION) {
 		/* Allocate traversing within the range */
-		return  __gpu_buddy_alloc_range_bias(mm, start, end,
-						     order, flags);
-	else if (size < min_block_size)
+		block = __alloc_range_bias(mm, start, end, order, flags);
+		if (!IS_ERR(block) || !steered)
+			return block;
+
+		flags &= ~GPU_BUDDY_RANGE_ALLOCATION;
+	}
+
+	if (size < min_block_size)
 		/* Allocate from an offset-aligned region without size rounding */
 		return gpu_buddy_offset_aligned_allocation(mm, size,
 							   min_block_size,
 							   flags);
-	else
-		/* Allocate from freetree */
-		return alloc_from_freetree(mm, order, flags);
+
+	/* Allocate from freetree */
+	return alloc_from_freetree(mm, order, flags);
 }
 
 /**
@@ -1320,7 +1692,7 @@ int gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
 		if (!IS_ALIGNED(start | end, min_block_size))
 			return -EINVAL;
 
-		return __gpu_buddy_alloc_range(mm, start, size, NULL, blocks);
+		return __gpu_buddy_alloc_range(mm, start, size, flags, NULL, blocks);
 	}
 
 	original_size = size;
@@ -1346,7 +1718,8 @@ int gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
 		if ((flags & GPU_BUDDY_CONTIGUOUS_ALLOCATION) &&
 		    !(flags & GPU_BUDDY_RANGE_ALLOCATION))
 			return __alloc_contig_try_harder(mm, original_size,
-							 original_min_size, blocks);
+							 original_min_size,
+							 flags, blocks);
 
 		return -EINVAL;
 	}
@@ -1361,8 +1734,6 @@ int gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
 		BUG_ON(size >= min_block_size && order < min_order);
 
 		do {
-			unsigned int fallback_order;
-
 			block = __gpu_buddy_alloc_blocks(mm, start,
 							 end,
 							 size,
@@ -1372,48 +1743,46 @@ int gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
 			if (!IS_ERR(block))
 				break;
 
-			if (size < min_block_size) {
-				fallback_order = order;
-			} else if (order == min_order) {
-				fallback_order = min_order;
-			} else {
+			if (size >= min_block_size && order > min_order) {
 				order--;
 				continue;
 			}
 
-			/* Try allocation through force merge method */
-			if (mm->clear_avail &&
-			    !__force_merge(mm, start, end, fallback_order)) {
-				block = __gpu_buddy_alloc_blocks(mm, start,
-								 end,
-								 size,
-								 min_block_size,
-								 fallback_order,
-								 flags);
-				if (!IS_ERR(block)) {
-					order = fallback_order;
-					break;
-				}
-			}
-
 			/*
 			 * Try contiguous block allocation through
 			 * try harder method.
 			 */
 			if (flags & GPU_BUDDY_CONTIGUOUS_ALLOCATION &&
-			    !(flags & GPU_BUDDY_RANGE_ALLOCATION))
-				return __alloc_contig_try_harder(mm,
-								 original_size,
-								 original_min_size,
-								 blocks);
+			    !(flags & GPU_BUDDY_RANGE_ALLOCATION)) {
+				err = __alloc_contig_try_harder(mm,
+								original_size,
+								original_min_size,
+								flags,
+								blocks);
+				if (!err)
+					return 0;
+				if (err != -ENOSPC)
+					return err;
+				goto err_free;
+			}
 			err = -ENOSPC;
 			goto err_free;
 		} while (1);
 
 		mark_allocated(mm, block);
 		mm->avail -= gpu_buddy_block_size(mm, block);
-		if (gpu_buddy_block_is_clear(block))
-			mm->clear_avail -= gpu_buddy_block_size(mm, block);
+
+		block->header &= ~GPU_BUDDY_HEADER_CLEAR;
+		if (flags & GPU_BUDDY_CLEAR_ALLOCATION &&
+		    gpu_clear_tracker_is_clear(&mm->clear,
+					       gpu_buddy_block_offset(block),
+					       gpu_buddy_block_size(mm, block)))
+			block->header |= GPU_BUDDY_HEADER_CLEAR;
+
+		gpu_clear_tracker_mark_dirty(&mm->clear,
+					     gpu_buddy_block_offset(block),
+					     gpu_buddy_block_size(mm, block));
+		mm->clear_avail = mm->clear.total_clear;
 		kmemleak_update_trace(block);
 		list_add_tail(&block->link, &allocated);
 
@@ -1492,31 +1861,30 @@ void gpu_buddy_print(struct gpu_buddy *mm)
 	for (order = mm->max_order; order >= 0; order--) {
 		struct gpu_buddy_block *block, *tmp;
 		struct rb_root *root;
-		u64 count = 0, free;
-		unsigned int tree;
-
-		for_each_free_tree(tree) {
-			root = &mm->free_trees[tree][order];
+		u64 count = 0, clear = 0, free;
 
-			rbtree_postorder_for_each_entry_safe(block, tmp, root, rb) {
-				BUG_ON(!gpu_buddy_block_is_free(block));
-				count++;
-			}
+		root = &mm->free_tree[order];
+		rbtree_postorder_for_each_entry_safe(block, tmp, root, rb) {
+			BUG_ON(!gpu_buddy_block_is_free(block));
+			count++;
+			if (gpu_buddy_block_is_clear(block))
+				clear++;
 		}
 
 		free = count * (mm->chunk_size << order);
 		if (free < SZ_1M)
-			pr_info("order-%2d free: %8llu KiB, blocks: %llu\n",
-				order, free >> 10, count);
+			pr_info("order-%2d free: %8llu KiB, blocks: %llu (clear: %llu)\n",
+				order, free >> 10, count, clear);
 		else
-			pr_info("order-%2d free: %8llu MiB, blocks: %llu\n",
-				order, free >> 20, count);
+			pr_info("order-%2d free: %8llu MiB, blocks: %llu (clear: %llu)\n",
+				order, free >> 20, count, clear);
 	}
 }
 EXPORT_SYMBOL(gpu_buddy_print);
 
 static void gpu_buddy_module_exit(void)
 {
+	kmem_cache_destroy(slab_extents);
 	kmem_cache_destroy(slab_blocks);
 }
 
@@ -1526,6 +1894,12 @@ static int __init gpu_buddy_module_init(void)
 	if (!slab_blocks)
 		return -ENOMEM;
 
+	slab_extents = KMEM_CACHE(gpu_clear_extent, 0);
+	if (!slab_extents) {
+		kmem_cache_destroy(slab_blocks);
+		return -ENOMEM;
+	}
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index faa025498de4..a89c392a155a 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -50,15 +50,11 @@ void drm_buddy_print(struct gpu_buddy *mm, struct drm_printer *p)
 		struct gpu_buddy_block *block, *tmp;
 		struct rb_root *root;
 		u64 count = 0, free;
-		unsigned int tree;
 
-		for_each_free_tree(tree) {
-			root = &mm->free_trees[tree][order];
-
-			rbtree_postorder_for_each_entry_safe(block, tmp, root, rb) {
-				BUG_ON(!gpu_buddy_block_is_free(block));
-				count++;
-			}
+		root = &mm->free_tree[order];
+		rbtree_postorder_for_each_entry_safe(block, tmp, root, rb) {
+			BUG_ON(!gpu_buddy_block_is_free(block));
+			count++;
 		}
 
 		drm_printf(p, "order-%2d ", order);
diff --git a/drivers/gpu/tests/gpu_buddy_test.c b/drivers/gpu/tests/gpu_buddy_test.c
index 7df5c2ae83bb..e0d24a4542b2 100644
--- a/drivers/gpu/tests/gpu_buddy_test.c
+++ b/drivers/gpu/tests/gpu_buddy_test.c
@@ -78,15 +78,11 @@ static void gpu_test_buddy_subtree_offset_alignment_stress(struct kunit *test)
 		}
 
 		for (order = mm.max_order; order >= 0 && !root; order--) {
-			for (tree = 0; tree < 2; tree++) {
-				node = mm.free_trees[tree][order].rb_node;
-				if (node) {
-					root = container_of(node,
-							    struct gpu_buddy_block,
-							    rb);
-					break;
-				}
-			}
+			node = mm.free_tree[order].rb_node;
+			if (node)
+				root = container_of(node,
+						    struct gpu_buddy_block,
+						    rb);
 		}
 
 		KUNIT_ASSERT_NOT_NULL(test, root);
@@ -97,8 +93,8 @@ static void gpu_test_buddy_subtree_offset_alignment_stress(struct kunit *test)
 		gpu_buddy_free_list(&mm, &allocated[i], 0);
 
 		for (order = 0; order <= mm.max_order; order++) {
-			for (tree = 0; tree < 2; tree++) {
-				node = mm.free_trees[tree][order].rb_node;
+			{
+				node = mm.free_tree[order].rb_node;
 				if (!node)
 					continue;
 
diff --git a/include/linux/gpu_buddy.h b/include/linux/gpu_buddy.h
index 71941a039648..07da1aa4865b 100644
--- a/include/linux/gpu_buddy.h
+++ b/include/linux/gpu_buddy.h
@@ -67,15 +67,6 @@
  */
 #define GPU_BUDDY_TRIM_DISABLE			BIT(5)
 
-enum gpu_buddy_free_tree {
-	GPU_BUDDY_CLEAR_TREE = 0,
-	GPU_BUDDY_DIRTY_TREE,
-	GPU_BUDDY_MAX_FREE_TREES,
-};
-
-#define for_each_free_tree(tree) \
-	for ((tree) = 0; (tree) < GPU_BUDDY_MAX_FREE_TREES; (tree)++)
-
 /**
  * struct gpu_buddy_block - Block within a buddy allocator
  *
@@ -103,6 +94,14 @@ struct gpu_buddy_block {
 #define   GPU_BUDDY_ALLOCATED	   (1 << 10)
 #define   GPU_BUDDY_FREE	   (2 << 10)
 #define   GPU_BUDDY_SPLIT	   (3 << 10)
+/*
+ * GPU_BUDDY_HEADER_CLEAR has two roles:
+ *  - FREE state:      set when the block's full range is cleared (tracker
+ *                     confirmed).  Cleared free blocks float in the buddy
+ *                     tree and are NOT inserted into free_tree[].
+ *  - ALLOCATED state: set when the block was served from cleared memory,
+ *                     informing the caller that no GPU clear pass is needed.
+ */
 #define GPU_BUDDY_HEADER_CLEAR  GENMASK_ULL(9, 9)
 /* Free to be used, if needed in the future */
 #define GPU_BUDDY_HEADER_UNUSED GENMASK_ULL(8, 6)
@@ -130,11 +129,44 @@ struct gpu_buddy_block {
 /* private: */
 	struct list_head tmp_link;
 	unsigned int subtree_max_alignment;
+	bool subtree_has_dirty;
 };
 
 /* Order-zero must be at least SZ_4K */
 #define GPU_BUDDY_MAX_ORDER (63 - 12)
 
+/**
+ * struct gpu_clear_extent - a contiguous cleared (zeroed) address range
+ *
+ * Tracks a single contiguous address range whose memory content is known
+ * to be zeroed.  Extents are non-overlapping and stored in an augmented
+ * red-black tree sorted by @start.  The augmented value @subtree_max_size
+ * allows O(log N) search for an extent of at least a given size.
+ */
+struct gpu_clear_extent {
+/* private: */
+	struct rb_node	rb;
+	u64		start;
+	u64		end;
+	u64		subtree_max_size;
+};
+
+/**
+ * struct gpu_clear_tracker - tracks cleared (zeroed) address intervals
+ *
+ * Maintains a set of non-overlapping cleared extents as an augmented
+ * red-black tree.  The tracker is embedded inside struct gpu_buddy and
+ * replaces the former dual (clear/dirty) free-tree scheme.
+ *
+ * @total_clear: Total bytes of cleared memory currently tracked.
+ */
+struct gpu_clear_tracker {
+/* private: */
+	struct rb_root	root;
+/* public: */
+	u64		total_clear;
+};
+
 /**
  * struct gpu_buddy - GPU binary buddy allocator
  *
@@ -154,18 +186,20 @@ struct gpu_buddy_block {
  * @avail: Total free space currently available for allocation in bytes.
  * @clear_avail: Free space available in the clear tree (zeroed memory) in bytes.
  *               This is a subset of @avail.
+ * @clear: Tracker of cleared address ranges (decoupled from free_tree).
  * @lock_dep_map: Annotates gpu_buddy API with a driver provided lock.
  */
 struct gpu_buddy {
 /* private: */
+	struct gpu_clear_tracker clear;
 	/*
-	 * Array of red-black trees for free block management.
-	 * Indexed as free_trees[clear/dirty][order] where:
-	 * - Index 0 (GPU_BUDDY_CLEAR_TREE): blocks with zeroed content
-	 * - Index 1 (GPU_BUDDY_DIRTY_TREE): blocks with unknown content
-	 * Each tree holds free blocks of the corresponding order.
+	 * One RB-tree per order containing all free blocks (clear and
+	 * dirty alike).  The augment field subtree_has_dirty lets dirty
+	 * allocations skip subtrees with no dirty inventory in O(log N).
+	 * Cleared free blocks coexist here but are also indexed by the
+	 * @clear tracker for fast CLEAR_ALLOCATION lookups.
 	 */
-	struct rb_root **free_trees;
+	struct rb_root *free_tree;
 	/*
 	 * Array of root blocks representing the top-level blocks of the
 	 * binary tree(s). Multiple roots exist when the total size is not

base-commit: 3c3c5fb9b36836d279ebe370189d68a0a3387362
-- 
2.34.1



* [PATCH v4 2/2] gpu/tests/buddy: add clear-tracker allocation latency benchmarks
  2026-05-27 11:29 [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker Arunpravin Paneer Selvam
@ 2026-05-27 11:29 ` Arunpravin Paneer Selvam
  2026-05-27 14:30 ` ✗ CI.checkpatch: warning for series starting with [v4,1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker Patchwork
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Arunpravin Paneer Selvam @ 2026-05-27 11:29 UTC (permalink / raw)
  To: matthew.auld, christian.koenig, dri-devel, intel-gfx, intel-xe,
	amd-gfx
  Cc: alexander.deucher, Arunpravin Paneer Selvam

Add a gpu_test_buddy_clear_tracker_performance test case that measures
allocation latency, so that the dual-tree / force_merge design can be
compared against the decoupled clear tracker.

Two scenarios are covered.

1. Single contiguous allocation after fragmentation. A 4 GiB pool is
   filled with 4 KiB blocks and freed in alternating clear/dirty order,
   so every buddy pair ends up split across the two trees and cannot
   coalesce at free() time. A single contiguous 4 GiB allocation then
   takes ~61 ms on the dual-tree design (the alloc path has to invoke
   __force_merge() to climb back up to max_order) and ~25 ms with the
   clear tracker (the pool is already coalesced at free() time).

2. Repeated allocations from a fragmented pool. Same 4 GiB pool, freed
   with even-indexed blocks cleared and odd-indexed dirty so every
   adjacent buddy pair sits on opposite sides of the merge barrier.
   16384 x 256 KiB allocations then take ~80 ms on the dual-tree
   design (each alloc pays the __force_merge() cost) and ~39 ms with
   the clear tracker (free-time merging makes each alloc an O(log N)
   split).

v2:
 - Removed unwanted sub tests

v3:
 - Pass GPU_BUDDY_CONTIGUOUS_ALLOCATION on the timed full-pool alloc so
   the benchmark actually exercises the contiguous (force_merge) path
   instead of silently falling back to smaller blocks. (sashiko)

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
---
 drivers/gpu/tests/gpu_buddy_test.c | 128 ++++++++++++++++++++++++++---
 1 file changed, 118 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/tests/gpu_buddy_test.c b/drivers/gpu/tests/gpu_buddy_test.c
index e0d24a4542b2..1669c628847f 100644
--- a/drivers/gpu/tests/gpu_buddy_test.c
+++ b/drivers/gpu/tests/gpu_buddy_test.c
@@ -38,7 +38,7 @@ static void gpu_test_buddy_subtree_offset_alignment_stress(struct kunit *test)
 	};
 	struct list_head allocated[ARRAY_SIZE(alignments)];
 	unsigned int i, max_subtree_align = 0;
-	int ret, tree, order;
+	int ret, order;
 	struct gpu_buddy mm;
 
 	KUNIT_ASSERT_FALSE_MSG(test, gpu_buddy_init(&mm, mm_size, SZ_4K),
@@ -93,15 +93,13 @@ static void gpu_test_buddy_subtree_offset_alignment_stress(struct kunit *test)
 		gpu_buddy_free_list(&mm, &allocated[i], 0);
 
 		for (order = 0; order <= mm.max_order; order++) {
-			{
-				node = mm.free_tree[order].rb_node;
-				if (!node)
-					continue;
-
-				block = container_of(node, struct gpu_buddy_block, rb);
-				max_subtree_align = max(max_subtree_align,
-							block->subtree_max_alignment);
-			}
+			node = mm.free_tree[order].rb_node;
+			if (!node)
+				continue;
+
+			block = container_of(node, struct gpu_buddy_block, rb);
+			max_subtree_align = max(max_subtree_align,
+						block->subtree_max_alignment);
 		}
 
 		KUNIT_EXPECT_GE(test, max_subtree_align, ilog2(alignments[i]));
@@ -285,6 +283,115 @@ static void gpu_test_buddy_fragmentation_performance(struct kunit *test)
 	gpu_buddy_fini(&mm);
 }
 
+static void gpu_test_buddy_clear_tracker_performance(struct kunit *test)
+{
+	struct gpu_buddy_block *block, *tmp;
+	unsigned long elapsed_ms;
+	LIST_HEAD(clear_blocks);
+	LIST_HEAD(dirty_blocks);
+	LIST_HEAD(allocated);
+	struct gpu_buddy mm;
+	LIST_HEAD(results);
+	ktime_t start, end;
+	int i, count;
+
+	/*
+	 * Contiguous alloc latency after alternating clear/dirty fragmentation
+	 *
+	 * Fill a 4 GiB pool with 4 KiB allocations, partition them into
+	 * alternating cleared and dirty sets, then free both.  In the old
+	 * dual-tree design every adjacent buddy pair has one cleared half and
+	 * one dirty half, so the pair sits on opposite sides of the clear/dirty
+	 * merge barrier and cannot be coalesced at free() time.  The pool
+	 * stays fully fragmented and the subsequent contiguous 4 GiB allocation
+	 * has to invoke __force_merge() to climb back up to max_order before
+	 * it can succeed.  With the clear-tracker design buddy pairs coalesce
+	 * unconditionally during free(), so the pool is already at max_order
+	 * before the timed alloc begins and __force_merge() is not needed.
+	 */
+	KUNIT_ASSERT_FALSE_MSG(test, gpu_buddy_init(&mm, SZ_4G, SZ_4K),
+			       "buddy_init failed\n");
+
+	for (i = 0; i < SZ_4G / SZ_4K; i++)
+		KUNIT_ASSERT_FALSE_MSG(test,
+				       gpu_buddy_alloc_blocks(&mm, 0, SZ_4G, SZ_4K, SZ_4K,
+							      &allocated, 0),
+				       "buddy_alloc hit an error size=%u\n", SZ_4K);
+
+	count = 0;
+	list_for_each_entry_safe(block, tmp, &allocated, link) {
+		if (count++ % 2 == 0)
+			list_move_tail(&block->link, &clear_blocks);
+		else
+			list_move_tail(&block->link, &dirty_blocks);
+	}
+
+	gpu_buddy_free_list(&mm, &clear_blocks, GPU_BUDDY_CLEARED);
+	gpu_buddy_free_list(&mm, &dirty_blocks, 0);
+
+	start = ktime_get();
+	KUNIT_ASSERT_FALSE_MSG(test,
+			       gpu_buddy_alloc_blocks(&mm, 0, SZ_4G, SZ_4G, SZ_4K,
+						      &results,
+						      GPU_BUDDY_CONTIGUOUS_ALLOCATION),
+			       "contiguous alloc failed\n");
+	end = ktime_get();
+	elapsed_ms = ktime_to_ms(ktime_sub(end, start));
+
+	kunit_info(test, "Contiguous alloc after fragmentation: %lu ms\n",
+		   elapsed_ms);
+
+	gpu_buddy_free_list(&mm, &results, 0);
+	gpu_buddy_fini(&mm);
+
+	/*
+	 * Repeated alloc throughput from a maximally fragmented pool
+	 *
+	 * Fill a 4 GiB pool with 4 KiB allocations, free even-indexed blocks
+	 * as cleared and odd-indexed blocks as dirty.  The alternating pattern
+	 * ensures every adjacent buddy pair has one cleared half and one dirty
+	 * half, so each pair lands on opposite sides of the old merge barrier.
+	 * Each of the 16 384 x 256 KiB allocations in the timed loop has to
+	 * pay the __force_merge() cost on the alloc path under the old design.
+	 * With the clear-tracker design the pool collapses to one max_order
+	 * block during free(), so each alloc is a simple O(log N) split.
+	 */
+	KUNIT_ASSERT_FALSE_MSG(test, gpu_buddy_init(&mm, SZ_4G, SZ_4K),
+			       "buddy_init failed\n");
+
+	for (i = 0; i < SZ_4G / SZ_4K; i++)
+		KUNIT_ASSERT_FALSE_MSG(test,
+				       gpu_buddy_alloc_blocks(&mm, 0, SZ_4G, SZ_4K, SZ_4K,
+							      &allocated, 0),
+				       "buddy_alloc hit an error size=%u\n", SZ_4K);
+
+	count = 0;
+	list_for_each_entry_safe(block, tmp, &allocated, link) {
+		if (count++ % 2 == 0)
+			list_move_tail(&block->link, &clear_blocks);
+		else
+			list_move_tail(&block->link, &dirty_blocks);
+	}
+
+	gpu_buddy_free_list(&mm, &clear_blocks, GPU_BUDDY_CLEARED);
+	gpu_buddy_free_list(&mm, &dirty_blocks, 0);
+
+	start = ktime_get();
+	for (i = 0; i < SZ_4G / SZ_256K; i++)
+		KUNIT_ASSERT_FALSE_MSG(test,
+				       gpu_buddy_alloc_blocks(&mm, 0, SZ_4G, SZ_256K, SZ_4K,
+							      &results, 0),
+				       "buddy_alloc hit an error size=%u\n", SZ_256K);
+	end = ktime_get();
+	elapsed_ms = ktime_to_ms(ktime_sub(end, start));
+
+	kunit_info(test, "Repeated 256 KiB allocs from fragmented pool: %lu ms\n",
+		   elapsed_ms);
+
+	gpu_buddy_free_list(&mm, &results, 0);
+	gpu_buddy_fini(&mm);
+}
+
 static void gpu_test_buddy_alloc_range_bias(struct kunit *test)
 {
 	u32 mm_size, size, ps, bias_size, bias_start, bias_end, bias_rem;
@@ -1398,6 +1505,7 @@ static struct kunit_case gpu_buddy_tests[] = {
 	KUNIT_CASE(gpu_test_buddy_alloc_range),
 	KUNIT_CASE(gpu_test_buddy_alloc_range_bias),
 	KUNIT_CASE_SLOW(gpu_test_buddy_fragmentation_performance),
+	KUNIT_CASE_SLOW(gpu_test_buddy_clear_tracker_performance),
 	KUNIT_CASE(gpu_test_buddy_alloc_exceeds_max_order),
 	KUNIT_CASE(gpu_test_buddy_offset_aligned_allocation),
 	KUNIT_CASE(gpu_test_buddy_subtree_offset_alignment_stress),
-- 
2.34.1
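
As a rough userspace illustration of the merge barrier that the benchmark
above provokes (this is not the kernel implementation), the sketch below
models a fully freed pool whose order-0 pages alternate between cleared and
dirty, then coalesces it bottom-up. All names here (blk_state,
largest_free_order, alternating_pool_largest_order, model_self_check) are
hypothetical, and folding a merged mixed pair into the "dirty" state is an
assumption of the model rather than something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-block state used only by this toy model. */
enum blk_state {
	BLK_DIRTY,	/* one free block, dirty or mixed (assumption) */
	BLK_CLEAR,	/* one free block, fully cleared */
	BLK_SPLIT,	/* could not be merged into a single block */
};

/*
 * Coalesce a fully freed pool bottom-up and return the largest order at
 * which at least one merged free block exists. blk[] holds one entry per
 * order-0 page, is overwritten in place, and npages must be a power of two.
 * With merge_barrier set, buddies merge only when their states match, which
 * mimics the dual-tree behaviour described in the comments above.
 */
static unsigned int largest_free_order(enum blk_state *blk, size_t npages,
				       bool merge_barrier)
{
	unsigned int order = 0, largest = 0;
	size_t n = npages;

	while (n > 1) {
		bool merged_any = false;

		n /= 2;
		order++;
		for (size_t j = 0; j < n; j++) {
			enum blk_state a = blk[2 * j], b = blk[2 * j + 1];

			if (a == BLK_SPLIT || b == BLK_SPLIT ||
			    (merge_barrier && a != b)) {
				blk[j] = BLK_SPLIT;
				continue;
			}
			blk[j] = (a == BLK_CLEAR && b == BLK_CLEAR) ?
				 BLK_CLEAR : BLK_DIRTY;
			merged_any = true;
		}
		if (merged_any)
			largest = order;
	}
	return largest;
}

/* Build a pool of 2^max_order pages, even pages cleared and odd pages dirty. */
int alternating_pool_largest_order(unsigned int max_order, bool merge_barrier)
{
	size_t npages = (size_t)1 << max_order;
	enum blk_state *blk = malloc(npages * sizeof(*blk));
	int largest;

	if (!blk)
		return -1;
	for (size_t i = 0; i < npages; i++)
		blk[i] = (i % 2 == 0) ? BLK_CLEAR : BLK_DIRTY;
	largest = (int)largest_free_order(blk, npages, merge_barrier);
	free(blk);
	return largest;
}

/* 4 GiB of 4 KiB pages is 2^20 pages, i.e. order 20. */
void model_self_check(void)
{
	assert(alternating_pool_largest_order(20, true) == 0);
	assert(alternating_pool_largest_order(20, false) == 20);
}
```

In this model the alternating pattern leaves the pool stuck at order 0 when
the barrier is enforced and lets it collapse to a single order-20 block when
merging is unconditional, which is the difference the two timed scenarios
are designed to expose.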



* ✗ CI.checkpatch: warning for series starting with [v4,1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
  2026-05-27 11:29 [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker Arunpravin Paneer Selvam
  2026-05-27 11:29 ` [PATCH v4 2/2] gpu/tests/buddy: add clear-tracker allocation latency benchmarks Arunpravin Paneer Selvam
@ 2026-05-27 14:30 ` Patchwork
  2026-05-27 14:32 ` ✓ CI.KUnit: success " Patchwork
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2026-05-27 14:30 UTC (permalink / raw)
  To: Arunpravin Paneer Selvam; +Cc: intel-xe

== Series Details ==

Series: series starting with [v4,1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
URL   : https://patchwork.freedesktop.org/series/167369/
State : warning

== Summary ==

+ KERNEL=/kernel
+ git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt
Cloning into 'mt'...
warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/
+ git -C mt rev-list -n1 origin/master
061140b9bc586ae7f40abc1249c97e1cc72d1b9d
+ cd /kernel
+ git config --global --add safe.directory /kernel
+ git log -n1
commit 4137df3a2adf1592ae3dbd4433cd0de29c7f51fd
Author: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Date:   Wed May 27 16:59:02 2026 +0530

    gpu/tests/buddy: add clear-tracker allocation latency benchmarks
    
    Add gpu_test_buddy_clear_tracker_performance test case that measures
    allocation latency before and after replacing the dual-tree /
    force_merge design with a decoupled clear tracker.
    
    Two scenarios are covered.
    
    1. Single contiguous allocation after fragmentation. A 4 GiB pool is
       filled with 4 KiB blocks and freed in alternating clear/dirty order,
       so every buddy pair ends up split across the two trees and cannot
       coalesce at free() time. A single contiguous 4 GiB allocation then
       takes ~61 ms on the dual-tree design (the alloc path has to invoke
       __force_merge() to climb back up to max_order) and ~25 ms with the
       clear tracker (the pool is already coalesced at free() time).
    
    2. Repeated allocations from a fragmented pool. Same 4 GiB pool, freed
       with even-indexed blocks cleared and odd-indexed dirty so every
       adjacent buddy pair sits on opposite sides of the merge barrier.
       16384 x 256 KiB allocations then take ~80 ms on the dual-tree
       design (each alloc pays the __force_merge() cost) and ~39 ms with
       the clear tracker (free-time merging makes each alloc an O(log N)
       split).
    
    v2:
     - Removed unwanted sub tests
    
    v3:
     - Pass GPU_BUDDY_CONTIGUOUS_ALLOCATION on the timed full-pool alloc so
       the benchmark actually exercises the contiguous (force_merge) path
       instead of silently falling back to smaller blocks. (sashiko)
    
    Cc: Matthew Auld <matthew.auld@intel.com>
    Cc: Christian König <christian.koenig@amd.com>
    Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
+ /mt/dim checkpatch 5390f2273d45bb259d88508828018c0fbbb79d32 drm-intel
9eba0d209827 gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
-:1732: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#1732: FILE: drivers/gpu/buddy.c:1868:
+			BUG_ON(!gpu_buddy_block_is_free(block));

-:1791: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#1791: FILE: drivers/gpu/drm/drm_buddy.c:56:
+			BUG_ON(!gpu_buddy_block_is_free(block));

total: 0 errors, 2 warnings, 0 checks, 1765 lines checked
4137df3a2adf gpu/tests/buddy: add clear-tracker allocation latency benchmarks
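
The AVOID_BUG warnings above ask for a warn-and-recover pattern in place of
BUG_ON(). The sketch below shows the general shape in plain userspace C; it
is not the patch's code. warn_on_once(), struct toy_block,
toy_take_free_block() and the other toy_* helpers are hypothetical
stand-ins, and returning -EINVAL as the recovery path is an assumption,
since the right recovery depends on the callers at the flagged sites. Unlike
the kernel's WARN_ON_ONCE(), which warns once per call site, this stand-in
keeps a single global flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a buddy block; only the free flag is modelled. */
struct toy_block {
	bool is_free;
};

/* Print a warning the first time cond is true and return cond to the caller. */
static bool warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Claim a block that is expected to be free. Instead of crashing on a
 * violated invariant, warn and hand an error back to the caller.
 */
int toy_take_free_block(struct toy_block *block)
{
	if (warn_on_once(!block->is_free, "block is not free"))
		return -EINVAL;

	block->is_free = false;
	return 0;
}

/* Convenience wrapper: build a block in the given state and try to claim it. */
int toy_demo_take(bool initially_free)
{
	struct toy_block block = { .is_free = initially_free };

	return toy_take_free_block(&block);
}

void toy_self_check(void)
{
	assert(toy_demo_take(true) == 0);
	assert(toy_demo_take(false) == -EINVAL);
}
```

Whether an error return is a sufficient recovery at the two flagged sites
would need to be judged against the surrounding code, which is not shown in
the checkpatch output.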




* ✓ CI.KUnit: success for series starting with [v4,1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
  2026-05-27 11:29 [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker Arunpravin Paneer Selvam
  2026-05-27 11:29 ` [PATCH v4 2/2] gpu/tests/buddy: add clear-tracker allocation latency benchmarks Arunpravin Paneer Selvam
  2026-05-27 14:30 ` ✗ CI.checkpatch: warning for series starting with [v4,1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker Patchwork
@ 2026-05-27 14:32 ` Patchwork
  2026-05-27 15:24 ` ✓ Xe.CI.BAT: " Patchwork
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2026-05-27 14:32 UTC (permalink / raw)
  To: Arunpravin Paneer Selvam; +Cc: intel-xe

== Series Details ==

Series: series starting with [v4,1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
URL   : https://patchwork.freedesktop.org/series/167369/
State : success

== Summary ==

+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
[14:30:51] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[14:30:56] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[14:31:27] Starting KUnit Kernel (1/1)...
[14:31:27] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[14:31:27] ================== guc_buf (11 subtests) ===================
[14:31:27] [PASSED] test_smallest
[14:31:27] [PASSED] test_largest
[14:31:27] [PASSED] test_granular
[14:31:27] [PASSED] test_unique
[14:31:27] [PASSED] test_overlap
[14:31:27] [PASSED] test_reusable
[14:31:27] [PASSED] test_too_big
[14:31:27] [PASSED] test_flush
[14:31:27] [PASSED] test_lookup
[14:31:27] [PASSED] test_data
[14:31:27] [PASSED] test_class
[14:31:27] ===================== [PASSED] guc_buf =====================
[14:31:27] =================== guc_dbm (7 subtests) ===================
[14:31:27] [PASSED] test_empty
[14:31:27] [PASSED] test_default
[14:31:27] ======================== test_size  ========================
[14:31:27] [PASSED] 4
[14:31:27] [PASSED] 8
[14:31:27] [PASSED] 32
[14:31:27] [PASSED] 256
[14:31:27] ==================== [PASSED] test_size ====================
[14:31:27] ======================= test_reuse  ========================
[14:31:27] [PASSED] 4
[14:31:27] [PASSED] 8
[14:31:27] [PASSED] 32
[14:31:27] [PASSED] 256
[14:31:27] =================== [PASSED] test_reuse ====================
[14:31:27] =================== test_range_overlap  ====================
[14:31:27] [PASSED] 4
[14:31:27] [PASSED] 8
[14:31:27] [PASSED] 32
[14:31:27] [PASSED] 256
[14:31:27] =============== [PASSED] test_range_overlap ================
[14:31:27] =================== test_range_compact  ====================
[14:31:27] [PASSED] 4
[14:31:27] [PASSED] 8
[14:31:27] [PASSED] 32
[14:31:27] [PASSED] 256
[14:31:27] =============== [PASSED] test_range_compact ================
[14:31:27] ==================== test_range_spare  =====================
[14:31:27] [PASSED] 4
[14:31:27] [PASSED] 8
[14:31:27] [PASSED] 32
[14:31:27] [PASSED] 256
[14:31:27] ================ [PASSED] test_range_spare =================
[14:31:27] ===================== [PASSED] guc_dbm =====================
[14:31:27] =================== guc_idm (6 subtests) ===================
[14:31:27] [PASSED] bad_init
[14:31:27] [PASSED] no_init
[14:31:27] [PASSED] init_fini
[14:31:27] [PASSED] check_used
[14:31:27] [PASSED] check_quota
[14:31:27] [PASSED] check_all
[14:31:27] ===================== [PASSED] guc_idm =====================
[14:31:27] ================== no_relay (3 subtests) ===================
[14:31:27] [PASSED] xe_drops_guc2pf_if_not_ready
[14:31:27] [PASSED] xe_drops_guc2vf_if_not_ready
[14:31:27] [PASSED] xe_rejects_send_if_not_ready
[14:31:27] ==================== [PASSED] no_relay =====================
[14:31:27] ================== pf_relay (14 subtests) ==================
[14:31:27] [PASSED] pf_rejects_guc2pf_too_short
[14:31:27] [PASSED] pf_rejects_guc2pf_too_long
[14:31:27] [PASSED] pf_rejects_guc2pf_no_payload
[14:31:27] [PASSED] pf_fails_no_payload
[14:31:27] [PASSED] pf_fails_bad_origin
[14:31:27] [PASSED] pf_fails_bad_type
[14:31:27] [PASSED] pf_txn_reports_error
[14:31:27] [PASSED] pf_txn_sends_pf2guc
[14:31:27] [PASSED] pf_sends_pf2guc
[14:31:27] [SKIPPED] pf_loopback_nop
[14:31:27] [SKIPPED] pf_loopback_echo
[14:31:27] [SKIPPED] pf_loopback_fail
[14:31:27] [SKIPPED] pf_loopback_busy
[14:31:27] [SKIPPED] pf_loopback_retry
[14:31:27] ==================== [PASSED] pf_relay =====================
[14:31:27] ================== vf_relay (3 subtests) ===================
[14:31:27] [PASSED] vf_rejects_guc2vf_too_short
[14:31:27] [PASSED] vf_rejects_guc2vf_too_long
[14:31:27] [PASSED] vf_rejects_guc2vf_no_payload
[14:31:27] ==================== [PASSED] vf_relay =====================
[14:31:27] ================ pf_gt_config (9 subtests) =================
[14:31:27] [PASSED] fair_contexts_1vf
[14:31:27] [PASSED] fair_doorbells_1vf
[14:31:27] [PASSED] fair_ggtt_1vf
[14:31:27] ====================== fair_vram_1vf  ======================
[14:31:27] [PASSED] 3.50 GiB
[14:31:27] [PASSED] 11.5 GiB
[14:31:27] [PASSED] 15.5 GiB
[14:31:27] [PASSED] 31.5 GiB
[14:31:27] [PASSED] 63.5 GiB
[14:31:27] [PASSED] 1.91 GiB
[14:31:27] ================== [PASSED] fair_vram_1vf ==================
[14:31:27] ================ fair_vram_1vf_admin_only  =================
[14:31:27] [PASSED] 3.50 GiB
[14:31:27] [PASSED] 11.5 GiB
[14:31:27] [PASSED] 15.5 GiB
[14:31:27] [PASSED] 31.5 GiB
[14:31:27] [PASSED] 63.5 GiB
[14:31:27] [PASSED] 1.91 GiB
[14:31:27] ============ [PASSED] fair_vram_1vf_admin_only =============
[14:31:27] ====================== fair_contexts  ======================
[14:31:27] [PASSED] 1 VF
[14:31:27] [PASSED] 2 VFs
[14:31:27] [PASSED] 3 VFs
[14:31:27] [PASSED] 4 VFs
[14:31:27] [PASSED] 5 VFs
[14:31:27] [PASSED] 6 VFs
[14:31:27] [PASSED] 7 VFs
[14:31:27] [PASSED] 8 VFs
[14:31:27] [PASSED] 9 VFs
[14:31:27] [PASSED] 10 VFs
[14:31:27] [PASSED] 11 VFs
[14:31:27] [PASSED] 12 VFs
[14:31:27] [PASSED] 13 VFs
[14:31:27] [PASSED] 14 VFs
[14:31:27] [PASSED] 15 VFs
[14:31:27] [PASSED] 16 VFs
[14:31:27] [PASSED] 17 VFs
[14:31:27] [PASSED] 18 VFs
[14:31:27] [PASSED] 19 VFs
[14:31:27] [PASSED] 20 VFs
[14:31:27] [PASSED] 21 VFs
[14:31:27] [PASSED] 22 VFs
[14:31:27] [PASSED] 23 VFs
[14:31:27] [PASSED] 24 VFs
[14:31:27] [PASSED] 25 VFs
[14:31:27] [PASSED] 26 VFs
[14:31:27] [PASSED] 27 VFs
[14:31:27] [PASSED] 28 VFs
[14:31:27] [PASSED] 29 VFs
[14:31:27] [PASSED] 30 VFs
[14:31:27] [PASSED] 31 VFs
[14:31:27] [PASSED] 32 VFs
[14:31:27] [PASSED] 33 VFs
[14:31:27] [PASSED] 34 VFs
[14:31:27] [PASSED] 35 VFs
[14:31:27] [PASSED] 36 VFs
[14:31:27] [PASSED] 37 VFs
[14:31:27] [PASSED] 38 VFs
[14:31:27] [PASSED] 39 VFs
[14:31:27] [PASSED] 40 VFs
[14:31:27] [PASSED] 41 VFs
[14:31:27] [PASSED] 42 VFs
[14:31:27] [PASSED] 43 VFs
[14:31:27] [PASSED] 44 VFs
[14:31:27] [PASSED] 45 VFs
[14:31:27] [PASSED] 46 VFs
[14:31:27] [PASSED] 47 VFs
[14:31:27] [PASSED] 48 VFs
[14:31:27] [PASSED] 49 VFs
[14:31:27] [PASSED] 50 VFs
[14:31:27] [PASSED] 51 VFs
[14:31:27] [PASSED] 52 VFs
[14:31:27] [PASSED] 53 VFs
[14:31:27] [PASSED] 54 VFs
[14:31:27] [PASSED] 55 VFs
[14:31:27] [PASSED] 56 VFs
[14:31:27] [PASSED] 57 VFs
[14:31:27] [PASSED] 58 VFs
[14:31:27] [PASSED] 59 VFs
[14:31:27] [PASSED] 60 VFs
[14:31:27] [PASSED] 61 VFs
[14:31:27] [PASSED] 62 VFs
[14:31:27] [PASSED] 63 VFs
[14:31:27] ================== [PASSED] fair_contexts ==================
[14:31:27] ===================== fair_doorbells  ======================
[14:31:27] [PASSED] 1 VF
[14:31:27] [PASSED] 2 VFs
[14:31:27] [PASSED] 3 VFs
[14:31:27] [PASSED] 4 VFs
[14:31:27] [PASSED] 5 VFs
[14:31:27] [PASSED] 6 VFs
[14:31:27] [PASSED] 7 VFs
[14:31:27] [PASSED] 8 VFs
[14:31:27] [PASSED] 9 VFs
[14:31:27] [PASSED] 10 VFs
[14:31:27] [PASSED] 11 VFs
[14:31:27] [PASSED] 12 VFs
[14:31:27] [PASSED] 13 VFs
[14:31:27] [PASSED] 14 VFs
[14:31:27] [PASSED] 15 VFs
[14:31:27] [PASSED] 16 VFs
[14:31:27] [PASSED] 17 VFs
[14:31:27] [PASSED] 18 VFs
[14:31:27] [PASSED] 19 VFs
[14:31:27] [PASSED] 20 VFs
[14:31:27] [PASSED] 21 VFs
[14:31:27] [PASSED] 22 VFs
[14:31:27] [PASSED] 23 VFs
[14:31:27] [PASSED] 24 VFs
[14:31:27] [PASSED] 25 VFs
[14:31:27] [PASSED] 26 VFs
[14:31:27] [PASSED] 27 VFs
[14:31:27] [PASSED] 28 VFs
[14:31:27] [PASSED] 29 VFs
[14:31:27] [PASSED] 30 VFs
[14:31:27] [PASSED] 31 VFs
[14:31:27] [PASSED] 32 VFs
[14:31:27] [PASSED] 33 VFs
[14:31:27] [PASSED] 34 VFs
[14:31:27] [PASSED] 35 VFs
[14:31:27] [PASSED] 36 VFs
[14:31:27] [PASSED] 37 VFs
[14:31:27] [PASSED] 38 VFs
[14:31:27] [PASSED] 39 VFs
[14:31:27] [PASSED] 40 VFs
[14:31:27] [PASSED] 41 VFs
[14:31:27] [PASSED] 42 VFs
[14:31:27] [PASSED] 43 VFs
[14:31:27] [PASSED] 44 VFs
[14:31:27] [PASSED] 45 VFs
[14:31:27] [PASSED] 46 VFs
[14:31:27] [PASSED] 47 VFs
[14:31:27] [PASSED] 48 VFs
[14:31:27] [PASSED] 49 VFs
[14:31:27] [PASSED] 50 VFs
[14:31:27] [PASSED] 51 VFs
[14:31:27] [PASSED] 52 VFs
[14:31:27] [PASSED] 53 VFs
[14:31:27] [PASSED] 54 VFs
[14:31:27] [PASSED] 55 VFs
[14:31:27] [PASSED] 56 VFs
[14:31:27] [PASSED] 57 VFs
[14:31:27] [PASSED] 58 VFs
[14:31:27] [PASSED] 59 VFs
[14:31:27] [PASSED] 60 VFs
[14:31:27] [PASSED] 61 VFs
[14:31:27] [PASSED] 62 VFs
[14:31:27] [PASSED] 63 VFs
[14:31:27] ================= [PASSED] fair_doorbells ==================
[14:31:27] ======================== fair_ggtt  ========================
[14:31:27] [PASSED] 1 VF
[14:31:27] [PASSED] 2 VFs
[14:31:27] [PASSED] 3 VFs
[14:31:27] [PASSED] 4 VFs
[14:31:27] [PASSED] 5 VFs
[14:31:27] [PASSED] 6 VFs
[14:31:27] [PASSED] 7 VFs
[14:31:27] [PASSED] 8 VFs
[14:31:27] [PASSED] 9 VFs
[14:31:28] [PASSED] 10 VFs
[14:31:28] [PASSED] 11 VFs
[14:31:28] [PASSED] 12 VFs
[14:31:28] [PASSED] 13 VFs
[14:31:28] [PASSED] 14 VFs
[14:31:28] [PASSED] 15 VFs
[14:31:28] [PASSED] 16 VFs
[14:31:28] [PASSED] 17 VFs
[14:31:28] [PASSED] 18 VFs
[14:31:28] [PASSED] 19 VFs
[14:31:28] [PASSED] 20 VFs
[14:31:28] [PASSED] 21 VFs
[14:31:28] [PASSED] 22 VFs
[14:31:28] [PASSED] 23 VFs
[14:31:28] [PASSED] 24 VFs
[14:31:28] [PASSED] 25 VFs
[14:31:28] [PASSED] 26 VFs
[14:31:28] [PASSED] 27 VFs
[14:31:28] [PASSED] 28 VFs
[14:31:28] [PASSED] 29 VFs
[14:31:28] [PASSED] 30 VFs
[14:31:28] [PASSED] 31 VFs
[14:31:28] [PASSED] 32 VFs
[14:31:28] [PASSED] 33 VFs
[14:31:28] [PASSED] 34 VFs
[14:31:28] [PASSED] 35 VFs
[14:31:28] [PASSED] 36 VFs
[14:31:28] [PASSED] 37 VFs
[14:31:28] [PASSED] 38 VFs
[14:31:28] [PASSED] 39 VFs
[14:31:28] [PASSED] 40 VFs
[14:31:28] [PASSED] 41 VFs
[14:31:28] [PASSED] 42 VFs
[14:31:28] [PASSED] 43 VFs
[14:31:28] [PASSED] 44 VFs
[14:31:28] [PASSED] 45 VFs
[14:31:28] [PASSED] 46 VFs
[14:31:28] [PASSED] 47 VFs
[14:31:28] [PASSED] 48 VFs
[14:31:28] [PASSED] 49 VFs
[14:31:28] [PASSED] 50 VFs
[14:31:28] [PASSED] 51 VFs
[14:31:28] [PASSED] 52 VFs
[14:31:28] [PASSED] 53 VFs
[14:31:28] [PASSED] 54 VFs
[14:31:28] [PASSED] 55 VFs
[14:31:28] [PASSED] 56 VFs
[14:31:28] [PASSED] 57 VFs
[14:31:28] [PASSED] 58 VFs
[14:31:28] [PASSED] 59 VFs
[14:31:28] [PASSED] 60 VFs
[14:31:28] [PASSED] 61 VFs
[14:31:28] [PASSED] 62 VFs
[14:31:28] [PASSED] 63 VFs
[14:31:28] ==================== [PASSED] fair_ggtt ====================
[14:31:28] ======================== fair_vram  ========================
[14:31:28] [PASSED] 1 VF
[14:31:28] [PASSED] 2 VFs
[14:31:28] [PASSED] 3 VFs
[14:31:28] [PASSED] 4 VFs
[14:31:28] [PASSED] 5 VFs
[14:31:28] [PASSED] 6 VFs
[14:31:28] [PASSED] 7 VFs
[14:31:28] [PASSED] 8 VFs
[14:31:28] [PASSED] 9 VFs
[14:31:28] [PASSED] 10 VFs
[14:31:28] [PASSED] 11 VFs
[14:31:28] [PASSED] 12 VFs
[14:31:28] [PASSED] 13 VFs
[14:31:28] [PASSED] 14 VFs
[14:31:28] [PASSED] 15 VFs
[14:31:28] [PASSED] 16 VFs
[14:31:28] [PASSED] 17 VFs
[14:31:28] [PASSED] 18 VFs
[14:31:28] [PASSED] 19 VFs
[14:31:28] [PASSED] 20 VFs
[14:31:28] [PASSED] 21 VFs
[14:31:28] [PASSED] 22 VFs
[14:31:28] [PASSED] 23 VFs
[14:31:28] [PASSED] 24 VFs
[14:31:28] [PASSED] 25 VFs
[14:31:28] [PASSED] 26 VFs
[14:31:28] [PASSED] 27 VFs
[14:31:28] [PASSED] 28 VFs
[14:31:28] [PASSED] 29 VFs
[14:31:28] [PASSED] 30 VFs
[14:31:28] [PASSED] 31 VFs
[14:31:28] [PASSED] 32 VFs
[14:31:28] [PASSED] 33 VFs
[14:31:28] [PASSED] 34 VFs
[14:31:28] [PASSED] 35 VFs
[14:31:28] [PASSED] 36 VFs
[14:31:28] [PASSED] 37 VFs
[14:31:28] [PASSED] 38 VFs
[14:31:28] [PASSED] 39 VFs
[14:31:28] [PASSED] 40 VFs
[14:31:28] [PASSED] 41 VFs
[14:31:28] [PASSED] 42 VFs
[14:31:28] [PASSED] 43 VFs
[14:31:28] [PASSED] 44 VFs
[14:31:28] [PASSED] 45 VFs
[14:31:28] [PASSED] 46 VFs
[14:31:28] [PASSED] 47 VFs
[14:31:28] [PASSED] 48 VFs
[14:31:28] [PASSED] 49 VFs
[14:31:28] [PASSED] 50 VFs
[14:31:28] [PASSED] 51 VFs
[14:31:28] [PASSED] 52 VFs
[14:31:28] [PASSED] 53 VFs
[14:31:28] [PASSED] 54 VFs
[14:31:28] [PASSED] 55 VFs
[14:31:28] [PASSED] 56 VFs
[14:31:28] [PASSED] 57 VFs
[14:31:28] [PASSED] 58 VFs
[14:31:28] [PASSED] 59 VFs
[14:31:28] [PASSED] 60 VFs
[14:31:28] [PASSED] 61 VFs
[14:31:28] [PASSED] 62 VFs
[14:31:28] [PASSED] 63 VFs
[14:31:28] ==================== [PASSED] fair_vram ====================
[14:31:28] ================== [PASSED] pf_gt_config ===================
[14:31:28] ===================== lmtt (1 subtest) =====================
[14:31:28] ======================== test_ops  =========================
[14:31:28] [PASSED] 2-level
[14:31:28] [PASSED] multi-level
[14:31:28] ==================== [PASSED] test_ops =====================
[14:31:28] ====================== [PASSED] lmtt =======================
[14:31:28] ================= pf_service (11 subtests) =================
[14:31:28] [PASSED] pf_negotiate_any
[14:31:28] [PASSED] pf_negotiate_base_match
[14:31:28] [PASSED] pf_negotiate_base_newer
[14:31:28] [PASSED] pf_negotiate_base_next
[14:31:28] [SKIPPED] pf_negotiate_base_older
[14:31:28] [PASSED] pf_negotiate_base_prev
[14:31:28] [PASSED] pf_negotiate_latest_match
[14:31:28] [PASSED] pf_negotiate_latest_newer
[14:31:28] [PASSED] pf_negotiate_latest_next
[14:31:28] [SKIPPED] pf_negotiate_latest_older
[14:31:28] [SKIPPED] pf_negotiate_latest_prev
[14:31:28] =================== [PASSED] pf_service ====================
[14:31:28] ================= xe_guc_g2g (2 subtests) ==================
[14:31:28] ============== xe_live_guc_g2g_kunit_default  ==============
[14:31:28] ========= [SKIPPED] xe_live_guc_g2g_kunit_default ==========
[14:31:28] ============== xe_live_guc_g2g_kunit_allmem  ===============
[14:31:28] ========== [SKIPPED] xe_live_guc_g2g_kunit_allmem ==========
[14:31:28] =================== [SKIPPED] xe_guc_g2g ===================
[14:31:28] =================== xe_mocs (2 subtests) ===================
[14:31:28] ================ xe_live_mocs_kernel_kunit  ================
[14:31:28] =========== [SKIPPED] xe_live_mocs_kernel_kunit ============
[14:31:28] ================ xe_live_mocs_reset_kunit  =================
[14:31:28] ============ [SKIPPED] xe_live_mocs_reset_kunit ============
[14:31:28] ==================== [SKIPPED] xe_mocs =====================
[14:31:28] ================= xe_migrate (2 subtests) ==================
[14:31:28] ================= xe_migrate_sanity_kunit  =================
[14:31:28] ============ [SKIPPED] xe_migrate_sanity_kunit =============
[14:31:28] ================== xe_validate_ccs_kunit  ==================
[14:31:28] ============= [SKIPPED] xe_validate_ccs_kunit ==============
[14:31:28] =================== [SKIPPED] xe_migrate ===================
[14:31:28] ================== xe_dma_buf (1 subtest) ==================
[14:31:28] ==================== xe_dma_buf_kunit  =====================
[14:31:28] ================ [SKIPPED] xe_dma_buf_kunit ================
[14:31:28] =================== [SKIPPED] xe_dma_buf ===================
[14:31:28] ================= xe_bo_shrink (1 subtest) =================
[14:31:28] =================== xe_bo_shrink_kunit  ====================
[14:31:28] =============== [SKIPPED] xe_bo_shrink_kunit ===============
[14:31:28] ================== [SKIPPED] xe_bo_shrink ==================
[14:31:28] ==================== xe_bo (2 subtests) ====================
[14:31:28] ================== xe_ccs_migrate_kunit  ===================
[14:31:28] ============== [SKIPPED] xe_ccs_migrate_kunit ==============
[14:31:28] ==================== xe_bo_evict_kunit  ====================
[14:31:28] =============== [SKIPPED] xe_bo_evict_kunit ================
[14:31:28] ===================== [SKIPPED] xe_bo ======================
[14:31:28] ==================== args (13 subtests) ====================
[14:31:28] [PASSED] count_args_test
[14:31:28] [PASSED] call_args_example
[14:31:28] [PASSED] call_args_test
[14:31:28] [PASSED] drop_first_arg_example
[14:31:28] [PASSED] drop_first_arg_test
[14:31:28] [PASSED] first_arg_example
[14:31:28] [PASSED] first_arg_test
[14:31:28] [PASSED] last_arg_example
[14:31:28] [PASSED] last_arg_test
[14:31:28] [PASSED] pick_arg_example
[14:31:28] [PASSED] if_args_example
[14:31:28] [PASSED] if_args_test
[14:31:28] [PASSED] sep_comma_example
[14:31:28] ====================== [PASSED] args =======================
[14:31:28] =================== xe_pci (3 subtests) ====================
[14:31:28] ==================== check_graphics_ip  ====================
[14:31:28] [PASSED] 12.00 Xe_LP
[14:31:28] [PASSED] 12.10 Xe_LP+
[14:31:28] [PASSED] 12.55 Xe_HPG
[14:31:28] [PASSED] 12.60 Xe_HPC
[14:31:28] [PASSED] 12.70 Xe_LPG
[14:31:28] [PASSED] 12.71 Xe_LPG
[14:31:28] [PASSED] 12.74 Xe_LPG+
[14:31:28] [PASSED] 20.01 Xe2_HPG
[14:31:28] [PASSED] 20.02 Xe2_HPG
[14:31:28] [PASSED] 20.04 Xe2_LPG
[14:31:28] [PASSED] 30.00 Xe3_LPG
[14:31:28] [PASSED] 30.01 Xe3_LPG
[14:31:28] [PASSED] 30.03 Xe3_LPG
[14:31:28] [PASSED] 30.04 Xe3_LPG
[14:31:28] [PASSED] 30.05 Xe3_LPG
[14:31:28] [PASSED] 35.10 Xe3p_LPG
[14:31:28] [PASSED] 35.11 Xe3p_XPC
[14:31:28] ================ [PASSED] check_graphics_ip ================
[14:31:28] ===================== check_media_ip  ======================
[14:31:28] [PASSED] 12.00 Xe_M
[14:31:28] [PASSED] 12.55 Xe_HPM
[14:31:28] [PASSED] 13.00 Xe_LPM+
[14:31:28] [PASSED] 13.01 Xe2_HPM
[14:31:28] [PASSED] 20.00 Xe2_LPM
[14:31:28] [PASSED] 30.00 Xe3_LPM
[14:31:28] [PASSED] 30.02 Xe3_LPM
[14:31:28] [PASSED] 35.00 Xe3p_LPM
[14:31:28] [PASSED] 35.03 Xe3p_HPM
[14:31:28] ================= [PASSED] check_media_ip ==================
[14:31:28] =================== check_platform_desc  ===================
[14:31:28] [PASSED] 0x9A60 (TIGERLAKE)
[14:31:28] [PASSED] 0x9A68 (TIGERLAKE)
[14:31:28] [PASSED] 0x9A70 (TIGERLAKE)
[14:31:28] [PASSED] 0x9A40 (TIGERLAKE)
[14:31:28] [PASSED] 0x9A49 (TIGERLAKE)
[14:31:28] [PASSED] 0x9A59 (TIGERLAKE)
[14:31:28] [PASSED] 0x9A78 (TIGERLAKE)
[14:31:28] [PASSED] 0x9AC0 (TIGERLAKE)
[14:31:28] [PASSED] 0x9AC9 (TIGERLAKE)
[14:31:28] [PASSED] 0x9AD9 (TIGERLAKE)
[14:31:28] [PASSED] 0x9AF8 (TIGERLAKE)
[14:31:28] [PASSED] 0x4C80 (ROCKETLAKE)
[14:31:28] [PASSED] 0x4C8A (ROCKETLAKE)
[14:31:28] [PASSED] 0x4C8B (ROCKETLAKE)
[14:31:28] [PASSED] 0x4C8C (ROCKETLAKE)
[14:31:28] [PASSED] 0x4C90 (ROCKETLAKE)
[14:31:28] [PASSED] 0x4C9A (ROCKETLAKE)
[14:31:28] [PASSED] 0x4680 (ALDERLAKE_S)
[14:31:28] [PASSED] 0x4682 (ALDERLAKE_S)
[14:31:28] [PASSED] 0x4688 (ALDERLAKE_S)
[14:31:28] [PASSED] 0x468A (ALDERLAKE_S)
[14:31:28] [PASSED] 0x468B (ALDERLAKE_S)
[14:31:28] [PASSED] 0x4690 (ALDERLAKE_S)
[14:31:28] [PASSED] 0x4692 (ALDERLAKE_S)
[14:31:28] [PASSED] 0x4693 (ALDERLAKE_S)
[14:31:28] [PASSED] 0x46A0 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46A1 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46A2 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46A3 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46A6 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46A8 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46AA (ALDERLAKE_P)
[14:31:28] [PASSED] 0x462A (ALDERLAKE_P)
[14:31:28] [PASSED] 0x4626 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x4628 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46B0 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46B1 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46B2 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46B3 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46C0 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46C1 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46C2 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46C3 (ALDERLAKE_P)
[14:31:28] [PASSED] 0x46D0 (ALDERLAKE_N)
[14:31:28] [PASSED] 0x46D1 (ALDERLAKE_N)
[14:31:28] [PASSED] 0x46D2 (ALDERLAKE_N)
[14:31:28] [PASSED] 0x46D3 (ALDERLAKE_N)
[14:31:28] [PASSED] 0x46D4 (ALDERLAKE_N)
[14:31:28] [PASSED] 0xA721 (ALDERLAKE_P)
[14:31:28] [PASSED] 0xA7A1 (ALDERLAKE_P)
[14:31:28] [PASSED] 0xA7A9 (ALDERLAKE_P)
[14:31:28] [PASSED] 0xA7AC (ALDERLAKE_P)
[14:31:28] [PASSED] 0xA7AD (ALDERLAKE_P)
[14:31:28] [PASSED] 0xA720 (ALDERLAKE_P)
[14:31:28] [PASSED] 0xA7A0 (ALDERLAKE_P)
[14:31:28] [PASSED] 0xA7A8 (ALDERLAKE_P)
[14:31:28] [PASSED] 0xA7AA (ALDERLAKE_P)
[14:31:28] [PASSED] 0xA7AB (ALDERLAKE_P)
[14:31:28] [PASSED] 0xA780 (ALDERLAKE_S)
[14:31:28] [PASSED] 0xA781 (ALDERLAKE_S)
[14:31:28] [PASSED] 0xA782 (ALDERLAKE_S)
[14:31:28] [PASSED] 0xA783 (ALDERLAKE_S)
[14:31:28] [PASSED] 0xA788 (ALDERLAKE_S)
[14:31:28] [PASSED] 0xA789 (ALDERLAKE_S)
[14:31:28] [PASSED] 0xA78A (ALDERLAKE_S)
[14:31:28] [PASSED] 0xA78B (ALDERLAKE_S)
[14:31:28] [PASSED] 0x4905 (DG1)
[14:31:28] [PASSED] 0x4906 (DG1)
[14:31:28] [PASSED] 0x4907 (DG1)
[14:31:28] [PASSED] 0x4908 (DG1)
[14:31:28] [PASSED] 0x4909 (DG1)
[14:31:28] [PASSED] 0x56C0 (DG2)
[14:31:28] [PASSED] 0x56C2 (DG2)
[14:31:28] [PASSED] 0x56C1 (DG2)
[14:31:28] [PASSED] 0x7D51 (METEORLAKE)
[14:31:28] [PASSED] 0x7DD1 (METEORLAKE)
[14:31:28] [PASSED] 0x7D41 (METEORLAKE)
[14:31:28] [PASSED] 0x7D67 (METEORLAKE)
[14:31:28] [PASSED] 0xB640 (METEORLAKE)
[14:31:28] [PASSED] 0x56A0 (DG2)
[14:31:28] [PASSED] 0x56A1 (DG2)
[14:31:28] [PASSED] 0x56A2 (DG2)
[14:31:28] [PASSED] 0x56BE (DG2)
[14:31:28] [PASSED] 0x56BF (DG2)
[14:31:28] [PASSED] 0x5690 (DG2)
[14:31:28] [PASSED] 0x5691 (DG2)
[14:31:28] [PASSED] 0x5692 (DG2)
[14:31:28] [PASSED] 0x56A5 (DG2)
[14:31:28] [PASSED] 0x56A6 (DG2)
[14:31:28] [PASSED] 0x56B0 (DG2)
[14:31:28] [PASSED] 0x56B1 (DG2)
[14:31:28] [PASSED] 0x56BA (DG2)
[14:31:28] [PASSED] 0x56BB (DG2)
[14:31:28] [PASSED] 0x56BC (DG2)
[14:31:28] [PASSED] 0x56BD (DG2)
[14:31:28] [PASSED] 0x5693 (DG2)
[14:31:28] [PASSED] 0x5694 (DG2)
[14:31:28] [PASSED] 0x5695 (DG2)
[14:31:28] [PASSED] 0x56A3 (DG2)
[14:31:28] [PASSED] 0x56A4 (DG2)
[14:31:28] [PASSED] 0x56B2 (DG2)
[14:31:28] [PASSED] 0x56B3 (DG2)
[14:31:28] [PASSED] 0x5696 (DG2)
[14:31:28] [PASSED] 0x5697 (DG2)
[14:31:28] [PASSED] 0xB69 (PVC)
[14:31:28] [PASSED] 0xB6E (PVC)
[14:31:28] [PASSED] 0xBD4 (PVC)
[14:31:28] [PASSED] 0xBD5 (PVC)
[14:31:28] [PASSED] 0xBD6 (PVC)
[14:31:28] [PASSED] 0xBD7 (PVC)
[14:31:28] [PASSED] 0xBD8 (PVC)
[14:31:28] [PASSED] 0xBD9 (PVC)
[14:31:28] [PASSED] 0xBDA (PVC)
[14:31:28] [PASSED] 0xBDB (PVC)
[14:31:28] [PASSED] 0xBE0 (PVC)
[14:31:28] [PASSED] 0xBE1 (PVC)
[14:31:28] [PASSED] 0xBE5 (PVC)
[14:31:28] [PASSED] 0x7D40 (METEORLAKE)
[14:31:28] [PASSED] 0x7D45 (METEORLAKE)
[14:31:28] [PASSED] 0x7D55 (METEORLAKE)
[14:31:28] [PASSED] 0x7D60 (METEORLAKE)
[14:31:28] [PASSED] 0x7DD5 (METEORLAKE)
[14:31:28] [PASSED] 0x6420 (LUNARLAKE)
[14:31:28] [PASSED] 0x64A0 (LUNARLAKE)
[14:31:28] [PASSED] 0x64B0 (LUNARLAKE)
[14:31:28] [PASSED] 0xE202 (BATTLEMAGE)
[14:31:28] [PASSED] 0xE209 (BATTLEMAGE)
[14:31:28] [PASSED] 0xE20B (BATTLEMAGE)
[14:31:28] [PASSED] 0xE20C (BATTLEMAGE)
[14:31:28] [PASSED] 0xE20D (BATTLEMAGE)
[14:31:28] [PASSED] 0xE210 (BATTLEMAGE)
[14:31:28] [PASSED] 0xE211 (BATTLEMAGE)
[14:31:28] [PASSED] 0xE212 (BATTLEMAGE)
[14:31:28] [PASSED] 0xE216 (BATTLEMAGE)
[14:31:28] [PASSED] 0xE220 (BATTLEMAGE)
[14:31:28] [PASSED] 0xE221 (BATTLEMAGE)
[14:31:28] [PASSED] 0xE222 (BATTLEMAGE)
[14:31:28] [PASSED] 0xE223 (BATTLEMAGE)
[14:31:28] [PASSED] 0xB080 (PANTHERLAKE)
[14:31:28] [PASSED] 0xB081 (PANTHERLAKE)
[14:31:28] [PASSED] 0xB082 (PANTHERLAKE)
[14:31:28] [PASSED] 0xB083 (PANTHERLAKE)
[14:31:28] [PASSED] 0xB084 (PANTHERLAKE)
[14:31:28] [PASSED] 0xB085 (PANTHERLAKE)
[14:31:28] [PASSED] 0xB086 (PANTHERLAKE)
[14:31:28] [PASSED] 0xB087 (PANTHERLAKE)
[14:31:28] [PASSED] 0xB08F (PANTHERLAKE)
[14:31:28] [PASSED] 0xB090 (PANTHERLAKE)
[14:31:28] [PASSED] 0xB0A0 (PANTHERLAKE)
[14:31:28] [PASSED] 0xB0B0 (PANTHERLAKE)
[14:31:28] [PASSED] 0xFD80 (PANTHERLAKE)
[14:31:28] [PASSED] 0xFD81 (PANTHERLAKE)
[14:31:28] [PASSED] 0xD740 (NOVALAKE_S)
[14:31:28] [PASSED] 0xD741 (NOVALAKE_S)
[14:31:28] [PASSED] 0xD742 (NOVALAKE_S)
[14:31:28] [PASSED] 0xD743 (NOVALAKE_S)
[14:31:28] [PASSED] 0xD744 (NOVALAKE_S)
[14:31:28] [PASSED] 0xD745 (NOVALAKE_S)
[14:31:28] [PASSED] 0x674C (CRESCENTISLAND)
[14:31:28] [PASSED] 0x674D (CRESCENTISLAND)
[14:31:28] [PASSED] 0x674E (CRESCENTISLAND)
[14:31:28] [PASSED] 0x674F (CRESCENTISLAND)
[14:31:28] [PASSED] 0x6750 (CRESCENTISLAND)
[14:31:28] [PASSED] 0xD750 (NOVALAKE_P)
[14:31:28] [PASSED] 0xD751 (NOVALAKE_P)
[14:31:28] [PASSED] 0xD752 (NOVALAKE_P)
[14:31:28] [PASSED] 0xD753 (NOVALAKE_P)
[14:31:28] [PASSED] 0xD754 (NOVALAKE_P)
[14:31:28] [PASSED] 0xD755 (NOVALAKE_P)
[14:31:28] [PASSED] 0xD756 (NOVALAKE_P)
[14:31:28] [PASSED] 0xD757 (NOVALAKE_P)
[14:31:28] [PASSED] 0xD75F (NOVALAKE_P)
[14:31:28] =============== [PASSED] check_platform_desc ===============
[14:31:28] ===================== [PASSED] xe_pci ======================
[14:31:28] =================== xe_rtp (3 subtests) ====================
[14:31:28] =================== xe_rtp_rules_tests  ====================
[14:31:28] [PASSED] no
[14:31:28] [PASSED] yes
[14:31:28] [PASSED] no-and-no
[14:31:28] [PASSED] no-and-yes
[14:31:28] [PASSED] yes-and-no
[14:31:28] [PASSED] yes-and-yes
[14:31:28] [PASSED] no-or-no
[14:31:28] [PASSED] no-or-yes
[14:31:28] [PASSED] yes-or-no
[14:31:28] [PASSED] yes-or-yes
[14:31:28] [PASSED] no-yes-or-yes-no
[14:31:28] [PASSED] no-yes-or-yes-yes
[14:31:28] [PASSED] yes-yes-or-no-yes
[14:31:28] [PASSED] yes-yes-or-yes-yes
[14:31:28] [PASSED] no-no-or-yes-or-no
[14:31:28] [PASSED] or
[14:31:28] [PASSED] or-yes
[14:31:28] [PASSED] or-no
[14:31:28] [PASSED] yes-or
[14:31:28] [PASSED] no-or
[14:31:28] [PASSED] no-or-or-yes
[14:31:28] [PASSED] yes-or-or-no
[14:31:28] [PASSED] no-or-or-no
[14:31:28] [PASSED] missing-context-engine-class
[14:31:28] [PASSED] missing-context-engine-class-or-yes
[14:31:28] [PASSED] missing-context-engine-class-or-or-yes
[14:31:28] =============== [PASSED] xe_rtp_rules_tests ================
[14:31:28] =============== xe_rtp_process_to_sr_tests  ================
[14:31:28] [PASSED] coalesce-same-reg
[14:31:28] [PASSED] no-match-no-add
[14:31:28] [PASSED] two-regs-two-entries
[14:31:28] [PASSED] clr-one-set-other
[14:31:28] [PASSED] set-field
[14:31:28] [PASSED] conflict-duplicate
[14:31:28] [PASSED] conflict-not-disjoint
[14:31:28] [PASSED] conflict-reg-type
[14:31:28] [PASSED] bad-mcr-reg-forced-to-regular
[14:31:28] [PASSED] bad-regular-reg-forced-to-mcr
[14:31:28] =========== [PASSED] xe_rtp_process_to_sr_tests ============
[14:31:28] ================== xe_rtp_process_tests  ===================
[14:31:28] [PASSED] active1
[14:31:28] [PASSED] active2
[14:31:28] [PASSED] active-inactive
[14:31:28] [PASSED] inactive-active
[14:31:28] [PASSED] inactive-active-inactive
[14:31:28] [PASSED] inactive-inactive-inactive
[14:31:28] ============== [PASSED] xe_rtp_process_tests ===============
[14:31:28] ===================== [PASSED] xe_rtp ======================
[14:31:28] ==================== xe_wa (1 subtest) =====================
[14:31:28] ======================== xe_wa_gt  =========================
[14:31:28] [PASSED] TIGERLAKE B0
[14:31:28] [PASSED] DG1 A0
[14:31:28] [PASSED] DG1 B0
[14:31:28] [PASSED] ALDERLAKE_S A0
[14:31:28] [PASSED] ALDERLAKE_S B0
[14:31:28] [PASSED] ALDERLAKE_S C0
[14:31:28] [PASSED] ALDERLAKE_S D0
[14:31:28] [PASSED] ALDERLAKE_P A0
[14:31:28] [PASSED] ALDERLAKE_P B0
[14:31:28] [PASSED] ALDERLAKE_P C0
[14:31:28] [PASSED] ALDERLAKE_S RPLS D0
[14:31:28] [PASSED] ALDERLAKE_P RPLU E0
[14:31:28] [PASSED] DG2 G10 C0
[14:31:28] [PASSED] DG2 G11 B1
[14:31:28] [PASSED] DG2 G12 A1
[14:31:28] [PASSED] METEORLAKE 12.70(Xe_LPG) A0 13.00(Xe_LPM+) A0
[14:31:28] [PASSED] METEORLAKE 12.71(Xe_LPG) A0 13.00(Xe_LPM+) A0
[14:31:28] [PASSED] METEORLAKE 12.74(Xe_LPG+) A0 13.00(Xe_LPM+) A0
[14:31:28] [PASSED] LUNARLAKE 20.04(Xe2_LPG) A0 20.00(Xe2_LPM) A0
[14:31:28] [PASSED] LUNARLAKE 20.04(Xe2_LPG) B0 20.00(Xe2_LPM) A0
[14:31:28] [PASSED] BATTLEMAGE 20.01(Xe2_HPG) A0 13.01(Xe2_HPM) A1
[14:31:28] [PASSED] PANTHERLAKE 30.00(Xe3_LPG) A0 30.00(Xe3_LPM) A0
[14:31:28] ==================== [PASSED] xe_wa_gt =====================
[14:31:28] ====================== [PASSED] xe_wa ======================
[14:31:28] ============================================================
[14:31:28] Testing complete. Ran 624 tests: passed: 606, skipped: 18
[14:31:28] Elapsed time: 36.247s total, 4.211s configuring, 31.369s building, 0.648s running

+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/tests/.kunitconfig
[14:31:28] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[14:31:30] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[14:31:54] Starting KUnit Kernel (1/1)...
[14:31:54] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[14:31:54] ============ drm_test_pick_cmdline (2 subtests) ============
[14:31:54] [PASSED] drm_test_pick_cmdline_res_1920_1080_60
[14:31:54] =============== drm_test_pick_cmdline_named  ===============
[14:31:54] [PASSED] NTSC
[14:31:54] [PASSED] NTSC-J
[14:31:54] [PASSED] PAL
[14:31:54] [PASSED] PAL-M
[14:31:54] =========== [PASSED] drm_test_pick_cmdline_named ===========
[14:31:54] ============== [PASSED] drm_test_pick_cmdline ==============
[14:31:54] == drm_test_atomic_get_connector_for_encoder (1 subtest) ===
[14:31:54] [PASSED] drm_test_drm_atomic_get_connector_for_encoder
[14:31:54] ==== [PASSED] drm_test_atomic_get_connector_for_encoder ====
[14:31:54] =========== drm_validate_clone_mode (2 subtests) ===========
[14:31:54] ============== drm_test_check_in_clone_mode  ===============
[14:31:54] [PASSED] in_clone_mode
[14:31:54] [PASSED] not_in_clone_mode
[14:31:54] ========== [PASSED] drm_test_check_in_clone_mode ===========
[14:31:54] =============== drm_test_check_valid_clones  ===============
[14:31:54] [PASSED] not_in_clone_mode
[14:31:54] [PASSED] valid_clone
[14:31:54] [PASSED] invalid_clone
[14:31:54] =========== [PASSED] drm_test_check_valid_clones ===========
[14:31:54] ============= [PASSED] drm_validate_clone_mode =============
[14:31:54] ============= drm_validate_modeset (1 subtest) =============
[14:31:54] [PASSED] drm_test_check_connector_changed_modeset
[14:31:54] ============== [PASSED] drm_validate_modeset ===============
[14:31:54] ====== drm_test_bridge_get_current_state (2 subtests) ======
[14:31:54] [PASSED] drm_test_drm_bridge_get_current_state_atomic
[14:31:54] [PASSED] drm_test_drm_bridge_get_current_state_legacy
[14:31:54] ======== [PASSED] drm_test_bridge_get_current_state ========
[14:31:54] ====== drm_test_bridge_helper_reset_crtc (3 subtests) ======
[14:31:54] [PASSED] drm_test_drm_bridge_helper_reset_crtc_atomic
[14:31:54] [PASSED] drm_test_drm_bridge_helper_reset_crtc_atomic_disabled
[14:31:54] [PASSED] drm_test_drm_bridge_helper_reset_crtc_legacy
[14:31:54] ======== [PASSED] drm_test_bridge_helper_reset_crtc ========
[14:31:54] ============== drm_bridge_alloc (2 subtests) ===============
[14:31:54] [PASSED] drm_test_drm_bridge_alloc_basic
[14:31:54] [PASSED] drm_test_drm_bridge_alloc_get_put
[14:31:54] ================ [PASSED] drm_bridge_alloc =================
[14:31:54] ============= drm_cmdline_parser (40 subtests) =============
[14:31:54] [PASSED] drm_test_cmdline_force_d_only
[14:31:54] [PASSED] drm_test_cmdline_force_D_only_dvi
[14:31:54] [PASSED] drm_test_cmdline_force_D_only_hdmi
[14:31:54] [PASSED] drm_test_cmdline_force_D_only_not_digital
[14:31:54] [PASSED] drm_test_cmdline_force_e_only
[14:31:54] [PASSED] drm_test_cmdline_res
[14:31:54] [PASSED] drm_test_cmdline_res_vesa
[14:31:54] [PASSED] drm_test_cmdline_res_vesa_rblank
[14:31:54] [PASSED] drm_test_cmdline_res_rblank
[14:31:54] [PASSED] drm_test_cmdline_res_bpp
[14:31:54] [PASSED] drm_test_cmdline_res_refresh
[14:31:54] [PASSED] drm_test_cmdline_res_bpp_refresh
[14:31:54] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced
[14:31:54] [PASSED] drm_test_cmdline_res_bpp_refresh_margins
[14:31:54] [PASSED] drm_test_cmdline_res_bpp_refresh_force_off
[14:31:54] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on
[14:31:54] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_analog
[14:31:54] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_digital
[14:31:54] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced_margins_force_on
[14:31:54] [PASSED] drm_test_cmdline_res_margins_force_on
[14:31:54] [PASSED] drm_test_cmdline_res_vesa_margins
[14:31:54] [PASSED] drm_test_cmdline_name
[14:31:54] [PASSED] drm_test_cmdline_name_bpp
[14:31:54] [PASSED] drm_test_cmdline_name_option
[14:31:54] [PASSED] drm_test_cmdline_name_bpp_option
[14:31:54] [PASSED] drm_test_cmdline_rotate_0
[14:31:54] [PASSED] drm_test_cmdline_rotate_90
[14:31:54] [PASSED] drm_test_cmdline_rotate_180
[14:31:54] [PASSED] drm_test_cmdline_rotate_270
[14:31:54] [PASSED] drm_test_cmdline_hmirror
[14:31:54] [PASSED] drm_test_cmdline_vmirror
[14:31:54] [PASSED] drm_test_cmdline_margin_options
[14:31:54] [PASSED] drm_test_cmdline_multiple_options
[14:31:54] [PASSED] drm_test_cmdline_bpp_extra_and_option
[14:31:54] [PASSED] drm_test_cmdline_extra_and_option
[14:31:54] [PASSED] drm_test_cmdline_freestanding_options
[14:31:54] [PASSED] drm_test_cmdline_freestanding_force_e_and_options
[14:31:54] [PASSED] drm_test_cmdline_panel_orientation
[14:31:54] ================ drm_test_cmdline_invalid  =================
[14:31:54] [PASSED] margin_only
[14:31:54] [PASSED] interlace_only
[14:31:54] [PASSED] res_missing_x
[14:31:54] [PASSED] res_missing_y
[14:31:54] [PASSED] res_bad_y
[14:31:54] [PASSED] res_missing_y_bpp
[14:31:54] [PASSED] res_bad_bpp
[14:31:54] [PASSED] res_bad_refresh
[14:31:54] [PASSED] res_bpp_refresh_force_on_off
[14:31:54] [PASSED] res_invalid_mode
[14:31:54] [PASSED] res_bpp_wrong_place_mode
[14:31:54] [PASSED] name_bpp_refresh
[14:31:54] [PASSED] name_refresh
[14:31:54] [PASSED] name_refresh_wrong_mode
[14:31:54] [PASSED] name_refresh_invalid_mode
[14:31:54] [PASSED] rotate_multiple
[14:31:54] [PASSED] rotate_invalid_val
[14:31:54] [PASSED] rotate_truncated
[14:31:54] [PASSED] invalid_option
[14:31:54] [PASSED] invalid_tv_option
[14:31:54] [PASSED] truncated_tv_option
[14:31:54] ============ [PASSED] drm_test_cmdline_invalid =============
[14:31:54] =============== drm_test_cmdline_tv_options  ===============
[14:31:54] [PASSED] NTSC
[14:31:54] [PASSED] NTSC_443
[14:31:54] [PASSED] NTSC_J
[14:31:54] [PASSED] PAL
[14:31:54] [PASSED] PAL_M
[14:31:54] [PASSED] PAL_N
[14:31:54] [PASSED] SECAM
[14:31:54] [PASSED] MONO_525
[14:31:54] [PASSED] MONO_625
[14:31:54] =========== [PASSED] drm_test_cmdline_tv_options ===========
[14:31:54] =============== [PASSED] drm_cmdline_parser ================
[14:31:54] ========== drmm_connector_hdmi_init (20 subtests) ==========
[14:31:54] [PASSED] drm_test_connector_hdmi_init_valid
[14:31:54] [PASSED] drm_test_connector_hdmi_init_bpc_8
[14:31:54] [PASSED] drm_test_connector_hdmi_init_bpc_10
[14:31:54] [PASSED] drm_test_connector_hdmi_init_bpc_12
[14:31:54] [PASSED] drm_test_connector_hdmi_init_bpc_invalid
[14:31:54] [PASSED] drm_test_connector_hdmi_init_bpc_null
[14:31:54] [PASSED] drm_test_connector_hdmi_init_formats_empty
[14:31:54] [PASSED] drm_test_connector_hdmi_init_formats_no_rgb
[14:31:54] === drm_test_connector_hdmi_init_formats_yuv420_allowed  ===
[14:31:54] [PASSED] supported_formats=0x9 yuv420_allowed=1
[14:31:54] [PASSED] supported_formats=0x9 yuv420_allowed=0
[14:31:54] [PASSED] supported_formats=0x5 yuv420_allowed=1
[14:31:54] [PASSED] supported_formats=0x5 yuv420_allowed=0
[14:31:54] === [PASSED] drm_test_connector_hdmi_init_formats_yuv420_allowed ===
[14:31:54] [PASSED] drm_test_connector_hdmi_init_null_ddc
[14:31:54] [PASSED] drm_test_connector_hdmi_init_null_product
[14:31:54] [PASSED] drm_test_connector_hdmi_init_null_vendor
[14:31:54] [PASSED] drm_test_connector_hdmi_init_product_length_exact
[14:31:54] [PASSED] drm_test_connector_hdmi_init_product_length_too_long
[14:31:54] [PASSED] drm_test_connector_hdmi_init_product_valid
[14:31:54] [PASSED] drm_test_connector_hdmi_init_vendor_length_exact
[14:31:54] [PASSED] drm_test_connector_hdmi_init_vendor_length_too_long
[14:31:54] [PASSED] drm_test_connector_hdmi_init_vendor_valid
[14:31:54] ========= drm_test_connector_hdmi_init_type_valid  =========
[14:31:54] [PASSED] HDMI-A
[14:31:54] [PASSED] HDMI-B
[14:31:54] ===== [PASSED] drm_test_connector_hdmi_init_type_valid =====
[14:31:54] ======== drm_test_connector_hdmi_init_type_invalid  ========
[14:31:54] [PASSED] Unknown
[14:31:54] [PASSED] VGA
[14:31:54] [PASSED] DVI-I
[14:31:54] [PASSED] DVI-D
[14:31:54] [PASSED] DVI-A
[14:31:54] [PASSED] Composite
[14:31:54] [PASSED] SVIDEO
[14:31:54] [PASSED] LVDS
[14:31:54] [PASSED] Component
[14:31:54] [PASSED] DIN
[14:31:54] [PASSED] DP
[14:31:54] [PASSED] TV
[14:31:54] [PASSED] eDP
[14:31:54] [PASSED] Virtual
[14:31:54] [PASSED] DSI
[14:31:54] [PASSED] DPI
[14:31:54] [PASSED] Writeback
[14:31:54] [PASSED] SPI
[14:31:54] [PASSED] USB
[14:31:54] ==== [PASSED] drm_test_connector_hdmi_init_type_invalid ====
[14:31:54] ============ [PASSED] drmm_connector_hdmi_init =============
[14:31:54] ============= drmm_connector_init (3 subtests) =============
[14:31:54] [PASSED] drm_test_drmm_connector_init
[14:31:54] [PASSED] drm_test_drmm_connector_init_null_ddc
[14:31:54] ========= drm_test_drmm_connector_init_type_valid  =========
[14:31:54] [PASSED] Unknown
[14:31:54] [PASSED] VGA
[14:31:54] [PASSED] DVI-I
[14:31:54] [PASSED] DVI-D
[14:31:54] [PASSED] DVI-A
[14:31:54] [PASSED] Composite
[14:31:54] [PASSED] SVIDEO
[14:31:54] [PASSED] LVDS
[14:31:54] [PASSED] Component
[14:31:54] [PASSED] DIN
[14:31:54] [PASSED] DP
[14:31:54] [PASSED] HDMI-A
[14:31:54] [PASSED] HDMI-B
[14:31:54] [PASSED] TV
[14:31:54] [PASSED] eDP
[14:31:54] [PASSED] Virtual
[14:31:54] [PASSED] DSI
[14:31:54] [PASSED] DPI
[14:31:54] [PASSED] Writeback
[14:31:54] [PASSED] SPI
[14:31:54] [PASSED] USB
[14:31:54] ===== [PASSED] drm_test_drmm_connector_init_type_valid =====
[14:31:54] =============== [PASSED] drmm_connector_init ===============
[14:31:54] ========= drm_connector_dynamic_init (6 subtests) ==========
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_init
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_init_null_ddc
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_init_not_added
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_init_properties
[14:31:54] ===== drm_test_drm_connector_dynamic_init_type_valid  ======
[14:31:54] [PASSED] Unknown
[14:31:54] [PASSED] VGA
[14:31:54] [PASSED] DVI-I
[14:31:54] [PASSED] DVI-D
[14:31:54] [PASSED] DVI-A
[14:31:54] [PASSED] Composite
[14:31:54] [PASSED] SVIDEO
[14:31:54] [PASSED] LVDS
[14:31:54] [PASSED] Component
[14:31:54] [PASSED] DIN
[14:31:54] [PASSED] DP
[14:31:54] [PASSED] HDMI-A
[14:31:54] [PASSED] HDMI-B
[14:31:54] [PASSED] TV
[14:31:54] [PASSED] eDP
[14:31:54] [PASSED] Virtual
[14:31:54] [PASSED] DSI
[14:31:54] [PASSED] DPI
[14:31:54] [PASSED] Writeback
[14:31:54] [PASSED] SPI
[14:31:54] [PASSED] USB
[14:31:54] = [PASSED] drm_test_drm_connector_dynamic_init_type_valid ==
[14:31:54] ======== drm_test_drm_connector_dynamic_init_name  =========
[14:31:54] [PASSED] Unknown
[14:31:54] [PASSED] VGA
[14:31:54] [PASSED] DVI-I
[14:31:54] [PASSED] DVI-D
[14:31:54] [PASSED] DVI-A
[14:31:54] [PASSED] Composite
[14:31:54] [PASSED] SVIDEO
[14:31:54] [PASSED] LVDS
[14:31:54] [PASSED] Component
[14:31:54] [PASSED] DIN
[14:31:54] [PASSED] DP
[14:31:54] [PASSED] HDMI-A
[14:31:54] [PASSED] HDMI-B
[14:31:54] [PASSED] TV
[14:31:54] [PASSED] eDP
[14:31:54] [PASSED] Virtual
[14:31:54] [PASSED] DSI
[14:31:54] [PASSED] DPI
[14:31:54] [PASSED] Writeback
[14:31:54] [PASSED] SPI
[14:31:54] [PASSED] USB
[14:31:54] ==== [PASSED] drm_test_drm_connector_dynamic_init_name =====
[14:31:54] =========== [PASSED] drm_connector_dynamic_init ============
[14:31:54] ==== drm_connector_dynamic_register_early (4 subtests) =====
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_register_early_on_list
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_register_early_defer
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_register_early_no_init
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_register_early_no_mode_object
[14:31:54] ====== [PASSED] drm_connector_dynamic_register_early =======
[14:31:54] ======= drm_connector_dynamic_register (7 subtests) ========
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_register_on_list
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_register_no_defer
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_register_no_init
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_register_mode_object
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_register_sysfs
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_register_sysfs_name
[14:31:54] [PASSED] drm_test_drm_connector_dynamic_register_debugfs
[14:31:54] ========= [PASSED] drm_connector_dynamic_register ==========
[14:31:54] = drm_connector_attach_broadcast_rgb_property (2 subtests) =
[14:31:54] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property
[14:31:54] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property_hdmi_connector
[14:31:54] === [PASSED] drm_connector_attach_broadcast_rgb_property ===
[14:31:54] ========== drm_get_tv_mode_from_name (2 subtests) ==========
[14:31:54] ========== drm_test_get_tv_mode_from_name_valid  ===========
[14:31:54] [PASSED] NTSC
[14:31:54] [PASSED] NTSC-443
[14:31:54] [PASSED] NTSC-J
[14:31:54] [PASSED] PAL
[14:31:54] [PASSED] PAL-M
[14:31:54] [PASSED] PAL-N
[14:31:54] [PASSED] SECAM
[14:31:54] [PASSED] Mono
[14:31:54] ====== [PASSED] drm_test_get_tv_mode_from_name_valid =======
[14:31:54] [PASSED] drm_test_get_tv_mode_from_name_truncated
[14:31:54] ============ [PASSED] drm_get_tv_mode_from_name ============
[14:31:54] = drm_test_connector_hdmi_compute_mode_clock (12 subtests) =
[14:31:54] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb
[14:31:54] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc
[14:31:54] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc_vic_1
[14:31:54] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc
[14:31:54] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc_vic_1
[14:31:54] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_double
[14:31:54] = drm_test_connector_hdmi_compute_mode_clock_yuv420_valid  =
[14:31:54] [PASSED] VIC 96
[14:31:54] [PASSED] VIC 97
[14:31:54] [PASSED] VIC 101
[14:31:54] [PASSED] VIC 102
[14:31:54] [PASSED] VIC 106
[14:31:54] [PASSED] VIC 107
[14:31:54] === [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_valid ===
[14:31:54] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_10_bpc
[14:31:54] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_12_bpc
[14:31:54] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_8_bpc
[14:31:54] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_10_bpc
[14:31:54] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_12_bpc
[14:31:54] === [PASSED] drm_test_connector_hdmi_compute_mode_clock ====
[14:31:54] == drm_hdmi_connector_get_broadcast_rgb_name (2 subtests) ==
[14:31:54] === drm_test_drm_hdmi_connector_get_broadcast_rgb_name  ====
[14:31:54] [PASSED] Automatic
[14:31:54] [PASSED] Full
[14:31:54] [PASSED] Limited 16:235
[14:31:54] === [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name ===
[14:31:54] [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name_invalid
[14:31:54] ==== [PASSED] drm_hdmi_connector_get_broadcast_rgb_name ====
[14:31:54] == drm_hdmi_connector_get_output_format_name (2 subtests) ==
[14:31:54] === drm_test_drm_hdmi_connector_get_output_format_name  ====
[14:31:54] [PASSED] RGB
[14:31:54] [PASSED] YUV 4:2:0
[14:31:54] [PASSED] YUV 4:2:2
[14:31:54] [PASSED] YUV 4:4:4
[14:31:54] === [PASSED] drm_test_drm_hdmi_connector_get_output_format_name ===
[14:31:54] [PASSED] drm_test_drm_hdmi_connector_get_output_format_name_invalid
[14:31:54] ==== [PASSED] drm_hdmi_connector_get_output_format_name ====
[14:31:54] ============= drm_damage_helper (21 subtests) ==============
[14:31:54] [PASSED] drm_test_damage_iter_no_damage
[14:31:54] [PASSED] drm_test_damage_iter_no_damage_fractional_src
[14:31:54] [PASSED] drm_test_damage_iter_no_damage_src_moved
[14:31:54] [PASSED] drm_test_damage_iter_no_damage_fractional_src_moved
[14:31:54] [PASSED] drm_test_damage_iter_no_damage_not_visible
[14:31:54] [PASSED] drm_test_damage_iter_no_damage_no_crtc
[14:31:54] [PASSED] drm_test_damage_iter_no_damage_no_fb
[14:31:54] [PASSED] drm_test_damage_iter_simple_damage
[14:31:54] [PASSED] drm_test_damage_iter_single_damage
[14:31:54] [PASSED] drm_test_damage_iter_single_damage_intersect_src
[14:31:54] [PASSED] drm_test_damage_iter_single_damage_outside_src
[14:31:54] [PASSED] drm_test_damage_iter_single_damage_fractional_src
[14:31:54] [PASSED] drm_test_damage_iter_single_damage_intersect_fractional_src
[14:31:54] [PASSED] drm_test_damage_iter_single_damage_outside_fractional_src
[14:31:54] [PASSED] drm_test_damage_iter_single_damage_src_moved
[14:31:54] [PASSED] drm_test_damage_iter_single_damage_fractional_src_moved
[14:31:54] [PASSED] drm_test_damage_iter_damage
[14:31:54] [PASSED] drm_test_damage_iter_damage_one_intersect
[14:31:54] [PASSED] drm_test_damage_iter_damage_one_outside
[14:31:54] [PASSED] drm_test_damage_iter_damage_src_moved
[14:31:54] [PASSED] drm_test_damage_iter_damage_not_visible
[14:31:54] ================ [PASSED] drm_damage_helper ================
[14:31:54] ============== drm_dp_mst_helper (3 subtests) ==============
[14:31:54] ============== drm_test_dp_mst_calc_pbn_mode  ==============
[14:31:54] [PASSED] Clock 154000 BPP 30 DSC disabled
[14:31:54] [PASSED] Clock 234000 BPP 30 DSC disabled
[14:31:54] [PASSED] Clock 297000 BPP 24 DSC disabled
[14:31:54] [PASSED] Clock 332880 BPP 24 DSC enabled
[14:31:54] [PASSED] Clock 324540 BPP 24 DSC enabled
[14:31:54] ========== [PASSED] drm_test_dp_mst_calc_pbn_mode ==========
[14:31:54] ============== drm_test_dp_mst_calc_pbn_div  ===============
[14:31:54] [PASSED] Link rate 2000000 lane count 4
[14:31:54] [PASSED] Link rate 2000000 lane count 2
[14:31:54] [PASSED] Link rate 2000000 lane count 1
[14:31:54] [PASSED] Link rate 1350000 lane count 4
[14:31:54] [PASSED] Link rate 1350000 lane count 2
[14:31:54] [PASSED] Link rate 1350000 lane count 1
[14:31:54] [PASSED] Link rate 1000000 lane count 4
[14:31:54] [PASSED] Link rate 1000000 lane count 2
[14:31:54] [PASSED] Link rate 1000000 lane count 1
[14:31:54] [PASSED] Link rate 810000 lane count 4
[14:31:54] [PASSED] Link rate 810000 lane count 2
[14:31:54] [PASSED] Link rate 810000 lane count 1
[14:31:54] [PASSED] Link rate 540000 lane count 4
[14:31:54] [PASSED] Link rate 540000 lane count 2
[14:31:54] [PASSED] Link rate 540000 lane count 1
[14:31:54] [PASSED] Link rate 270000 lane count 4
[14:31:54] [PASSED] Link rate 270000 lane count 2
[14:31:54] [PASSED] Link rate 270000 lane count 1
[14:31:54] [PASSED] Link rate 162000 lane count 4
[14:31:54] [PASSED] Link rate 162000 lane count 2
[14:31:54] [PASSED] Link rate 162000 lane count 1
[14:31:54] ========== [PASSED] drm_test_dp_mst_calc_pbn_div ===========
[14:31:54] ========= drm_test_dp_mst_sideband_msg_req_decode  =========
[14:31:54] [PASSED] DP_ENUM_PATH_RESOURCES with port number
[14:31:54] [PASSED] DP_POWER_UP_PHY with port number
[14:31:54] [PASSED] DP_POWER_DOWN_PHY with port number
[14:31:54] [PASSED] DP_ALLOCATE_PAYLOAD with SDP stream sinks
[14:31:54] [PASSED] DP_ALLOCATE_PAYLOAD with port number
[14:31:54] [PASSED] DP_ALLOCATE_PAYLOAD with VCPI
[14:31:54] [PASSED] DP_ALLOCATE_PAYLOAD with PBN
[14:31:54] [PASSED] DP_QUERY_PAYLOAD with port number
[14:31:54] [PASSED] DP_QUERY_PAYLOAD with VCPI
[14:31:54] [PASSED] DP_REMOTE_DPCD_READ with port number
[14:31:54] [PASSED] DP_REMOTE_DPCD_READ with DPCD address
[14:31:54] [PASSED] DP_REMOTE_DPCD_READ with max number of bytes
[14:31:54] [PASSED] DP_REMOTE_DPCD_WRITE with port number
[14:31:54] [PASSED] DP_REMOTE_DPCD_WRITE with DPCD address
[14:31:54] [PASSED] DP_REMOTE_DPCD_WRITE with data array
[14:31:54] [PASSED] DP_REMOTE_I2C_READ with port number
[14:31:54] [PASSED] DP_REMOTE_I2C_READ with I2C device ID
[14:31:54] [PASSED] DP_REMOTE_I2C_READ with transactions array
[14:31:54] [PASSED] DP_REMOTE_I2C_WRITE with port number
[14:31:54] [PASSED] DP_REMOTE_I2C_WRITE with I2C device ID
[14:31:54] [PASSED] DP_REMOTE_I2C_WRITE with data array
[14:31:54] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream ID
[14:31:54] [PASSED] DP_QUERY_STREAM_ENC_STATUS with client ID
[14:31:54] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream event
[14:31:54] [PASSED] DP_QUERY_STREAM_ENC_STATUS with valid stream event
[14:31:54] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream behavior
[14:31:54] [PASSED] DP_QUERY_STREAM_ENC_STATUS with a valid stream behavior
[14:31:54] ===== [PASSED] drm_test_dp_mst_sideband_msg_req_decode =====
[14:31:54] ================ [PASSED] drm_dp_mst_helper ================
[14:31:54] ================== drm_exec (7 subtests) ===================
[14:31:54] [PASSED] sanitycheck
[14:31:54] [PASSED] test_lock
[14:31:54] [PASSED] test_lock_unlock
[14:31:54] [PASSED] test_duplicates
[14:31:54] [PASSED] test_prepare
[14:31:54] [PASSED] test_prepare_array
[14:31:54] [PASSED] test_multiple_loops
[14:31:54] ==================== [PASSED] drm_exec =====================
[14:31:54] =========== drm_format_helper_test (17 subtests) ===========
[14:31:54] ============== drm_test_fb_xrgb8888_to_gray8  ==============
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ========== [PASSED] drm_test_fb_xrgb8888_to_gray8 ==========
[14:31:54] ============= drm_test_fb_xrgb8888_to_rgb332  ==============
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb332 ==========
[14:31:54] ============= drm_test_fb_xrgb8888_to_rgb565  ==============
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb565 ==========
[14:31:54] ============ drm_test_fb_xrgb8888_to_xrgb1555  =============
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ======== [PASSED] drm_test_fb_xrgb8888_to_xrgb1555 =========
[14:31:54] ============ drm_test_fb_xrgb8888_to_argb1555  =============
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ======== [PASSED] drm_test_fb_xrgb8888_to_argb1555 =========
[14:31:54] ============ drm_test_fb_xrgb8888_to_rgba5551  =============
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ======== [PASSED] drm_test_fb_xrgb8888_to_rgba5551 =========
[14:31:54] ============= drm_test_fb_xrgb8888_to_rgb888  ==============
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb888 ==========
[14:31:54] ============= drm_test_fb_xrgb8888_to_bgr888  ==============
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ========= [PASSED] drm_test_fb_xrgb8888_to_bgr888 ==========
[14:31:54] ============ drm_test_fb_xrgb8888_to_argb8888  =============
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ======== [PASSED] drm_test_fb_xrgb8888_to_argb8888 =========
[14:31:54] =========== drm_test_fb_xrgb8888_to_xrgb2101010  ===========
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ======= [PASSED] drm_test_fb_xrgb8888_to_xrgb2101010 =======
[14:31:54] =========== drm_test_fb_xrgb8888_to_argb2101010  ===========
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ======= [PASSED] drm_test_fb_xrgb8888_to_argb2101010 =======
[14:31:54] ============== drm_test_fb_xrgb8888_to_mono  ===============
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ========== [PASSED] drm_test_fb_xrgb8888_to_mono ===========
[14:31:54] ==================== drm_test_fb_swab  =====================
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ================ [PASSED] drm_test_fb_swab =================
[14:31:54] ============ drm_test_fb_xrgb8888_to_xbgr8888  =============
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ======== [PASSED] drm_test_fb_xrgb8888_to_xbgr8888 =========
[14:31:54] ============ drm_test_fb_xrgb8888_to_abgr8888  =============
[14:31:54] [PASSED] single_pixel_source_buffer
[14:31:54] [PASSED] single_pixel_clip_rectangle
[14:31:54] [PASSED] well_known_colors
[14:31:54] [PASSED] destination_pitch
[14:31:54] ======== [PASSED] drm_test_fb_xrgb8888_to_abgr8888 =========
[14:31:54] ================= drm_test_fb_clip_offset  =================
[14:31:54] [PASSED] pass through
[14:31:54] [PASSED] horizontal offset
[14:31:54] [PASSED] vertical offset
[14:31:54] [PASSED] horizontal and vertical offset
[14:31:54] [PASSED] horizontal offset (custom pitch)
[14:31:54] [PASSED] vertical offset (custom pitch)
[14:31:54] [PASSED] horizontal and vertical offset (custom pitch)
[14:31:54] ============= [PASSED] drm_test_fb_clip_offset =============
[14:31:54] =================== drm_test_fb_memcpy  ====================
[14:31:54] [PASSED] single_pixel_source_buffer: XR24 little-endian (0x34325258)
[14:31:54] [PASSED] single_pixel_source_buffer: XRA8 little-endian (0x38415258)
[14:31:54] [PASSED] single_pixel_source_buffer: YU24 little-endian (0x34325559)
[14:31:54] [PASSED] single_pixel_clip_rectangle: XB24 little-endian (0x34324258)
[14:31:54] [PASSED] single_pixel_clip_rectangle: XRA8 little-endian (0x38415258)
[14:31:54] [PASSED] single_pixel_clip_rectangle: YU24 little-endian (0x34325559)
[14:31:54] [PASSED] well_known_colors: XB24 little-endian (0x34324258)
[14:31:54] [PASSED] well_known_colors: XRA8 little-endian (0x38415258)
[14:31:54] [PASSED] well_known_colors: YU24 little-endian (0x34325559)
[14:31:54] [PASSED] destination_pitch: XB24 little-endian (0x34324258)
[14:31:54] [PASSED] destination_pitch: XRA8 little-endian (0x38415258)
[14:31:54] [PASSED] destination_pitch: YU24 little-endian (0x34325559)
[14:31:54] =============== [PASSED] drm_test_fb_memcpy ================
[14:31:54] ============= [PASSED] drm_format_helper_test ==============
[14:31:54] ================= drm_format (18 subtests) =================
[14:31:54] [PASSED] drm_test_format_block_width_invalid
[14:31:54] [PASSED] drm_test_format_block_width_one_plane
[14:31:54] [PASSED] drm_test_format_block_width_two_plane
[14:31:54] [PASSED] drm_test_format_block_width_three_plane
[14:31:54] [PASSED] drm_test_format_block_width_tiled
[14:31:54] [PASSED] drm_test_format_block_height_invalid
[14:31:54] [PASSED] drm_test_format_block_height_one_plane
[14:31:54] [PASSED] drm_test_format_block_height_two_plane
[14:31:54] [PASSED] drm_test_format_block_height_three_plane
[14:31:54] [PASSED] drm_test_format_block_height_tiled
[14:31:54] [PASSED] drm_test_format_min_pitch_invalid
[14:31:54] [PASSED] drm_test_format_min_pitch_one_plane_8bpp
[14:31:54] [PASSED] drm_test_format_min_pitch_one_plane_16bpp
[14:31:54] [PASSED] drm_test_format_min_pitch_one_plane_24bpp
[14:31:54] [PASSED] drm_test_format_min_pitch_one_plane_32bpp
[14:31:54] [PASSED] drm_test_format_min_pitch_two_plane
[14:31:54] [PASSED] drm_test_format_min_pitch_three_plane_8bpp
[14:31:54] [PASSED] drm_test_format_min_pitch_tiled
[14:31:54] =================== [PASSED] drm_format ====================
[14:31:54] ============== drm_framebuffer (10 subtests) ===============
[14:31:54] ========== drm_test_framebuffer_check_src_coords  ==========
[14:31:54] [PASSED] Success: source fits into fb
[14:31:54] [PASSED] Fail: overflowing fb with x-axis coordinate
[14:31:54] [PASSED] Fail: overflowing fb with y-axis coordinate
[14:31:54] [PASSED] Fail: overflowing fb with source width
[14:31:54] [PASSED] Fail: overflowing fb with source height
[14:31:54] ====== [PASSED] drm_test_framebuffer_check_src_coords ======
[14:31:54] [PASSED] drm_test_framebuffer_cleanup
[14:31:54] =============== drm_test_framebuffer_create  ===============
[14:31:54] [PASSED] ABGR8888 normal sizes
[14:31:54] [PASSED] ABGR8888 max sizes
[14:31:54] [PASSED] ABGR8888 pitch greater than min required
[14:31:54] [PASSED] ABGR8888 pitch less than min required
[14:31:54] [PASSED] ABGR8888 Invalid width
[14:31:54] [PASSED] ABGR8888 Invalid buffer handle
[14:31:54] [PASSED] No pixel format
[14:31:54] [PASSED] ABGR8888 Width 0
[14:31:54] [PASSED] ABGR8888 Height 0
[14:31:54] [PASSED] ABGR8888 Out of bound height * pitch combination
[14:31:54] [PASSED] ABGR8888 Large buffer offset
[14:31:54] [PASSED] ABGR8888 Buffer offset for inexistent plane
[14:31:54] [PASSED] ABGR8888 Invalid flag
[14:31:54] [PASSED] ABGR8888 Set DRM_MODE_FB_MODIFIERS without modifiers
[14:31:54] [PASSED] ABGR8888 Valid buffer modifier
[14:31:54] [PASSED] ABGR8888 Invalid buffer modifier(DRM_FORMAT_MOD_SAMSUNG_64_32_TILE)
[14:31:54] [PASSED] ABGR8888 Extra pitches without DRM_MODE_FB_MODIFIERS
[14:31:54] [PASSED] ABGR8888 Extra pitches with DRM_MODE_FB_MODIFIERS
[14:31:54] [PASSED] NV12 Normal sizes
[14:31:54] [PASSED] NV12 Max sizes
[14:31:54] [PASSED] NV12 Invalid pitch
[14:31:54] [PASSED] NV12 Invalid modifier/missing DRM_MODE_FB_MODIFIERS flag
[14:31:54] [PASSED] NV12 different  modifier per-plane
[14:31:54] [PASSED] NV12 with DRM_FORMAT_MOD_SAMSUNG_64_32_TILE
[14:31:54] [PASSED] NV12 Valid modifiers without DRM_MODE_FB_MODIFIERS
[14:31:54] [PASSED] NV12 Modifier for inexistent plane
[14:31:54] [PASSED] NV12 Handle for inexistent plane
[14:31:54] [PASSED] NV12 Handle for inexistent plane without DRM_MODE_FB_MODIFIERS
[14:31:54] [PASSED] YVU420 DRM_MODE_FB_MODIFIERS set without modifier
[14:31:54] [PASSED] YVU420 Normal sizes
[14:31:54] [PASSED] YVU420 Max sizes
[14:31:54] [PASSED] YVU420 Invalid pitch
[14:31:54] [PASSED] YVU420 Different pitches
[14:31:54] [PASSED] YVU420 Different buffer offsets/pitches
[14:31:54] [PASSED] YVU420 Modifier set just for plane 0, without DRM_MODE_FB_MODIFIERS
[14:31:54] [PASSED] YVU420 Modifier set just for planes 0, 1, without DRM_MODE_FB_MODIFIERS
[14:31:54] [PASSED] YVU420 Modifier set just for plane 0, 1, with DRM_MODE_FB_MODIFIERS
[14:31:54] [PASSED] YVU420 Valid modifier
[14:31:54] [PASSED] YVU420 Different modifiers per plane
[14:31:54] [PASSED] YVU420 Modifier for inexistent plane
[14:31:54] [PASSED] YUV420_10BIT Invalid modifier(DRM_FORMAT_MOD_LINEAR)
[14:31:54] [PASSED] X0L2 Normal sizes
[14:31:54] [PASSED] X0L2 Max sizes
[14:31:54] [PASSED] X0L2 Invalid pitch
[14:31:54] [PASSED] X0L2 Pitch greater than minimum required
[14:31:54] [PASSED] X0L2 Handle for inexistent plane
[14:31:54] [PASSED] X0L2 Offset for inexistent plane, without DRM_MODE_FB_MODIFIERS set
[14:31:54] [PASSED] X0L2 Modifier without DRM_MODE_FB_MODIFIERS set
[14:31:54] [PASSED] X0L2 Valid modifier
[14:31:54] [PASSED] X0L2 Modifier for inexistent plane
[14:31:54] =========== [PASSED] drm_test_framebuffer_create ===========
[14:31:54] [PASSED] drm_test_framebuffer_free
[14:31:54] [PASSED] drm_test_framebuffer_init
[14:31:54] [PASSED] drm_test_framebuffer_init_bad_format
[14:31:54] [PASSED] drm_test_framebuffer_init_dev_mismatch
[14:31:54] [PASSED] drm_test_framebuffer_lookup
[14:31:54] [PASSED] drm_test_framebuffer_lookup_inexistent
[14:31:54] [PASSED] drm_test_framebuffer_modifiers_not_supported
[14:31:54] ================= [PASSED] drm_framebuffer =================
[14:31:54] ================ drm_gem_shmem (8 subtests) ================
[14:31:54] [PASSED] drm_gem_shmem_test_obj_create
[14:31:54] [PASSED] drm_gem_shmem_test_obj_create_private
[14:31:54] [PASSED] drm_gem_shmem_test_pin_pages
[14:31:54] [PASSED] drm_gem_shmem_test_vmap
[14:31:54] [PASSED] drm_gem_shmem_test_get_sg_table
[14:31:54] [PASSED] drm_gem_shmem_test_get_pages_sgt
[14:31:54] [PASSED] drm_gem_shmem_test_madvise
[14:31:54] [PASSED] drm_gem_shmem_test_purge
[14:31:54] ================== [PASSED] drm_gem_shmem ==================
[14:31:54] === drm_atomic_helper_connector_hdmi_check (27 subtests) ===
[14:31:54] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode
[14:31:54] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode_vic_1
[14:31:54] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode
[14:31:54] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode_vic_1
[14:31:54] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode
[14:31:54] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode_vic_1
[14:31:54] ====== drm_test_check_broadcast_rgb_cea_mode_yuv420  =======
[14:31:54] [PASSED] Automatic
[14:31:54] [PASSED] Full
[14:31:54] [PASSED] Limited 16:235
[14:31:54] == [PASSED] drm_test_check_broadcast_rgb_cea_mode_yuv420 ===
[14:31:54] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_changed
[14:31:54] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_not_changed
[14:31:54] [PASSED] drm_test_check_disable_connector
[14:31:54] [PASSED] drm_test_check_hdmi_funcs_reject_rate
[14:31:54] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_rgb
[14:31:54] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_yuv420
[14:31:54] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv422
[14:31:54] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv420
[14:31:54] [PASSED] drm_test_check_driver_unsupported_fallback_yuv420
[14:31:54] [PASSED] drm_test_check_output_bpc_crtc_mode_changed
[14:31:54] [PASSED] drm_test_check_output_bpc_crtc_mode_not_changed
[14:31:54] [PASSED] drm_test_check_output_bpc_dvi
[14:31:54] [PASSED] drm_test_check_output_bpc_format_vic_1
[14:31:54] [PASSED] drm_test_check_output_bpc_format_display_8bpc_only
[14:31:54] [PASSED] drm_test_check_output_bpc_format_display_rgb_only
[14:31:54] [PASSED] drm_test_check_output_bpc_format_driver_8bpc_only
[14:31:54] [PASSED] drm_test_check_output_bpc_format_driver_rgb_only
[14:31:54] [PASSED] drm_test_check_tmds_char_rate_rgb_8bpc
[14:31:54] [PASSED] drm_test_check_tmds_char_rate_rgb_10bpc
[14:31:54] [PASSED] drm_test_check_tmds_char_rate_rgb_12bpc
[14:31:54] ===== [PASSED] drm_atomic_helper_connector_hdmi_check ======
[14:31:54] === drm_atomic_helper_connector_hdmi_reset (6 subtests) ====
[14:31:54] [PASSED] drm_test_check_broadcast_rgb_value
[14:31:54] [PASSED] drm_test_check_bpc_8_value
[14:31:54] [PASSED] drm_test_check_bpc_10_value
[14:31:54] [PASSED] drm_test_check_bpc_12_value
[14:31:54] [PASSED] drm_test_check_format_value
[14:31:54] [PASSED] drm_test_check_tmds_char_value
[14:31:54] ===== [PASSED] drm_atomic_helper_connector_hdmi_reset ======
[14:31:54] = drm_atomic_helper_connector_hdmi_mode_valid (4 subtests) =
[14:31:54] [PASSED] drm_test_check_mode_valid
[14:31:54] [PASSED] drm_test_check_mode_valid_reject
[14:31:54] [PASSED] drm_test_check_mode_valid_reject_rate
[14:31:54] [PASSED] drm_test_check_mode_valid_reject_max_clock
[14:31:54] === [PASSED] drm_atomic_helper_connector_hdmi_mode_valid ===
[14:31:54] = drm_atomic_helper_connector_hdmi_infoframes (5 subtests) =
[14:31:54] [PASSED] drm_test_check_infoframes
[14:31:54] [PASSED] drm_test_check_reject_avi_infoframe
[14:31:54] [PASSED] drm_test_check_reject_hdr_infoframe_bpc_8
[14:31:54] [PASSED] drm_test_check_reject_hdr_infoframe_bpc_10
[14:31:54] [PASSED] drm_test_check_reject_audio_infoframe
[14:31:54] === [PASSED] drm_atomic_helper_connector_hdmi_infoframes ===
[14:31:54] ================= drm_managed (2 subtests) =================
[14:31:54] [PASSED] drm_test_managed_release_action
[14:31:54] [PASSED] drm_test_managed_run_action
[14:31:54] =================== [PASSED] drm_managed ===================
[14:31:54] =================== drm_mm (6 subtests) ====================
[14:31:54] [PASSED] drm_test_mm_init
[14:31:54] [PASSED] drm_test_mm_debug
[14:31:54] [PASSED] drm_test_mm_align32
[14:31:54] [PASSED] drm_test_mm_align64
[14:31:54] [PASSED] drm_test_mm_lowest
[14:31:54] [PASSED] drm_test_mm_highest
[14:31:54] ===================== [PASSED] drm_mm ======================
[14:31:54] ============= drm_modes_analog_tv (5 subtests) =============
[14:31:54] [PASSED] drm_test_modes_analog_tv_mono_576i
[14:31:54] [PASSED] drm_test_modes_analog_tv_ntsc_480i
[14:31:54] [PASSED] drm_test_modes_analog_tv_ntsc_480i_inlined
[14:31:54] [PASSED] drm_test_modes_analog_tv_pal_576i
[14:31:54] [PASSED] drm_test_modes_analog_tv_pal_576i_inlined
[14:31:54] =============== [PASSED] drm_modes_analog_tv ===============
[14:31:54] ============== drm_plane_helper (2 subtests) ===============
[14:31:54] =============== drm_test_check_plane_state  ================
[14:31:54] [PASSED] clipping_simple
[14:31:54] [PASSED] clipping_rotate_reflect
[14:31:54] [PASSED] positioning_simple
[14:31:54] [PASSED] upscaling
[14:31:54] [PASSED] downscaling
[14:31:54] [PASSED] rounding1
[14:31:54] [PASSED] rounding2
[14:31:54] [PASSED] rounding3
[14:31:54] [PASSED] rounding4
[14:31:54] =========== [PASSED] drm_test_check_plane_state ============
[14:31:54] =========== drm_test_check_invalid_plane_state  ============
[14:31:54] [PASSED] positioning_invalid
[14:31:54] [PASSED] upscaling_invalid
[14:31:54] [PASSED] downscaling_invalid
[14:31:54] ======= [PASSED] drm_test_check_invalid_plane_state ========
[14:31:54] ================ [PASSED] drm_plane_helper =================
[14:31:54] ====== drm_connector_helper_tv_get_modes (1 subtest) =======
[14:31:54] ====== drm_test_connector_helper_tv_get_modes_check  =======
[14:31:54] [PASSED] None
[14:31:54] [PASSED] PAL
[14:31:54] [PASSED] NTSC
[14:31:54] [PASSED] Both, NTSC Default
[14:31:54] [PASSED] Both, PAL Default
[14:31:54] [PASSED] Both, NTSC Default, with PAL on command-line
[14:31:54] [PASSED] Both, PAL Default, with NTSC on command-line
[14:31:54] == [PASSED] drm_test_connector_helper_tv_get_modes_check ===
[14:31:54] ======== [PASSED] drm_connector_helper_tv_get_modes ========
[14:31:54] ================== drm_rect (9 subtests) ===================
[14:31:54] [PASSED] drm_test_rect_clip_scaled_div_by_zero
[14:31:54] [PASSED] drm_test_rect_clip_scaled_not_clipped
[14:31:54] [PASSED] drm_test_rect_clip_scaled_clipped
[14:31:54] [PASSED] drm_test_rect_clip_scaled_signed_vs_unsigned
[14:31:54] ================= drm_test_rect_intersect  =================
[14:31:54] [PASSED] top-left x bottom-right: 2x2+1+1 x 2x2+0+0
[14:31:54] [PASSED] top-right x bottom-left: 2x2+0+0 x 2x2+1-1
[14:31:54] [PASSED] bottom-left x top-right: 2x2+1-1 x 2x2+0+0
[14:31:54] [PASSED] bottom-right x top-left: 2x2+0+0 x 2x2+1+1
[14:31:54] [PASSED] right x left: 2x1+0+0 x 3x1+1+0
[14:31:54] [PASSED] left x right: 3x1+1+0 x 2x1+0+0
[14:31:54] [PASSED] up x bottom: 1x2+0+0 x 1x3+0-1
[14:31:54] [PASSED] bottom x up: 1x3+0-1 x 1x2+0+0
[14:31:54] [PASSED] touching corner: 1x1+0+0 x 2x2+1+1
[14:31:54] [PASSED] touching side: 1x1+0+0 x 1x1+1+0
[14:31:54] [PASSED] equal rects: 2x2+0+0 x 2x2+0+0
[14:31:54] [PASSED] inside another: 2x2+0+0 x 1x1+1+1
[14:31:54] [PASSED] far away: 1x1+0+0 x 1x1+3+6
[14:31:54] [PASSED] points intersecting: 0x0+5+10 x 0x0+5+10
[14:31:54] [PASSED] points not intersecting: 0x0+0+0 x 0x0+5+10
[14:31:54] ============= [PASSED] drm_test_rect_intersect =============
[14:31:54] ================ drm_test_rect_calc_hscale  ================
[14:31:54] [PASSED] normal use
[14:31:54] [PASSED] out of max range
[14:31:54] [PASSED] out of min range
[14:31:54] [PASSED] zero dst
[14:31:54] [PASSED] negative src
[14:31:54] [PASSED] negative dst
[14:31:54] ============ [PASSED] drm_test_rect_calc_hscale ============
[14:31:54] ================ drm_test_rect_calc_vscale  ================
[14:31:54] [PASSED] normal use
[14:31:54] [PASSED] out of max range
[14:31:54] [PASSED] out of min range
[14:31:54] [PASSED] zero dst
[14:31:54] [PASSED] negative src
[14:31:54] [PASSED] negative dst
[14:31:54] ============ [PASSED] drm_test_rect_calc_vscale ============
[14:31:54] ================== drm_test_rect_rotate  ===================
[14:31:54] [PASSED] reflect-x
[14:31:54] [PASSED] reflect-y
[14:31:54] [PASSED] rotate-0
[14:31:54] [PASSED] rotate-90
[14:31:54] [PASSED] rotate-180
[14:31:54] [PASSED] rotate-270
[14:31:54] ============== [PASSED] drm_test_rect_rotate ===============
[14:31:54] ================ drm_test_rect_rotate_inv  =================
[14:31:54] [PASSED] reflect-x
[14:31:54] [PASSED] reflect-y
[14:31:54] [PASSED] rotate-0
[14:31:54] [PASSED] rotate-90
[14:31:54] [PASSED] rotate-180
[14:31:54] [PASSED] rotate-270
[14:31:54] ============ [PASSED] drm_test_rect_rotate_inv =============
[14:31:54] ==================== [PASSED] drm_rect =====================
[14:31:54] ============ drm_sysfb_modeset_test (1 subtest) ============
[14:31:54] ============ drm_test_sysfb_build_fourcc_list  =============
[14:31:54] [PASSED] no native formats
[14:31:54] [PASSED] XRGB8888 as native format
[14:31:54] [PASSED] remove duplicates
[14:31:54] [PASSED] convert alpha formats
[14:31:54] [PASSED] random formats
[14:31:54] ======== [PASSED] drm_test_sysfb_build_fourcc_list =========
[14:31:54] ============= [PASSED] drm_sysfb_modeset_test ==============
[14:31:54] ================== drm_fixp (2 subtests) ===================
[14:31:54] [PASSED] drm_test_int2fixp
[14:31:54] [PASSED] drm_test_sm2fixp
[14:31:54] ==================== [PASSED] drm_fixp =====================
[14:31:54] ============================================================
[14:31:54] Testing complete. Ran 621 tests: passed: 621
[14:31:54] Elapsed time: 25.995s total, 1.781s configuring, 24.048s building, 0.135s running

+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/ttm/tests/.kunitconfig
[14:31:54] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[14:31:56] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[14:32:05] Starting KUnit Kernel (1/1)...
[14:32:05] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[14:32:05] ================= ttm_device (5 subtests) ==================
[14:32:05] [PASSED] ttm_device_init_basic
[14:32:05] [PASSED] ttm_device_init_multiple
[14:32:05] [PASSED] ttm_device_fini_basic
[14:32:05] [PASSED] ttm_device_init_no_vma_man
[14:32:05] ================== ttm_device_init_pools  ==================
[14:32:05] [PASSED] No DMA allocations, no DMA32 required
[14:32:05] [PASSED] DMA allocations, DMA32 required
[14:32:05] [PASSED] No DMA allocations, DMA32 required
[14:32:05] [PASSED] DMA allocations, no DMA32 required
[14:32:05] ============== [PASSED] ttm_device_init_pools ==============
[14:32:05] =================== [PASSED] ttm_device ====================
[14:32:05] ================== ttm_pool (8 subtests) ===================
[14:32:05] ================== ttm_pool_alloc_basic  ===================
[14:32:05] [PASSED] One page
[14:32:05] [PASSED] More than one page
[14:32:05] [PASSED] Above the allocation limit
[14:32:05] [PASSED] One page, with coherent DMA mappings enabled
[14:32:05] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[14:32:05] ============== [PASSED] ttm_pool_alloc_basic ===============
[14:32:05] ============== ttm_pool_alloc_basic_dma_addr  ==============
[14:32:05] [PASSED] One page
[14:32:05] [PASSED] More than one page
[14:32:05] [PASSED] Above the allocation limit
[14:32:05] [PASSED] One page, with coherent DMA mappings enabled
[14:32:05] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[14:32:05] ========== [PASSED] ttm_pool_alloc_basic_dma_addr ==========
[14:32:05] [PASSED] ttm_pool_alloc_order_caching_match
[14:32:05] [PASSED] ttm_pool_alloc_caching_mismatch
[14:32:05] [PASSED] ttm_pool_alloc_order_mismatch
[14:32:05] [PASSED] ttm_pool_free_dma_alloc
[14:32:05] [PASSED] ttm_pool_free_no_dma_alloc
[14:32:05] [PASSED] ttm_pool_fini_basic
[14:32:05] ==================== [PASSED] ttm_pool =====================
[14:32:05] ================ ttm_resource (8 subtests) =================
[14:32:05] ================= ttm_resource_init_basic  =================
[14:32:05] [PASSED] Init resource in TTM_PL_SYSTEM
[14:32:05] [PASSED] Init resource in TTM_PL_VRAM
[14:32:05] [PASSED] Init resource in a private placement
[14:32:05] [PASSED] Init resource in TTM_PL_SYSTEM, set placement flags
[14:32:05] ============= [PASSED] ttm_resource_init_basic =============
[14:32:05] [PASSED] ttm_resource_init_pinned
[14:32:05] [PASSED] ttm_resource_fini_basic
[14:32:05] [PASSED] ttm_resource_manager_init_basic
[14:32:05] [PASSED] ttm_resource_manager_usage_basic
[14:32:05] [PASSED] ttm_resource_manager_set_used_basic
[14:32:05] [PASSED] ttm_sys_man_alloc_basic
[14:32:05] [PASSED] ttm_sys_man_free_basic
[14:32:05] ================== [PASSED] ttm_resource ===================
[14:32:05] =================== ttm_tt (15 subtests) ===================
[14:32:05] ==================== ttm_tt_init_basic  ====================
[14:32:05] [PASSED] Page-aligned size
[14:32:05] [PASSED] Extra pages requested
[14:32:05] ================ [PASSED] ttm_tt_init_basic ================
[14:32:05] [PASSED] ttm_tt_init_misaligned
[14:32:05] [PASSED] ttm_tt_fini_basic
[14:32:05] [PASSED] ttm_tt_fini_sg
[14:32:05] [PASSED] ttm_tt_fini_shmem
[14:32:05] [PASSED] ttm_tt_create_basic
[14:32:05] [PASSED] ttm_tt_create_invalid_bo_type
[14:32:05] [PASSED] ttm_tt_create_ttm_exists
[14:32:05] [PASSED] ttm_tt_create_failed
[14:32:05] [PASSED] ttm_tt_destroy_basic
[14:32:05] [PASSED] ttm_tt_populate_null_ttm
[14:32:05] [PASSED] ttm_tt_populate_populated_ttm
[14:32:05] [PASSED] ttm_tt_unpopulate_basic
[14:32:05] [PASSED] ttm_tt_unpopulate_empty_ttm
[14:32:05] [PASSED] ttm_tt_swapin_basic
[14:32:05] ===================== [PASSED] ttm_tt ======================
[14:32:05] =================== ttm_bo (14 subtests) ===================
[14:32:05] =========== ttm_bo_reserve_optimistic_no_ticket  ===========
[14:32:05] [PASSED] Cannot be interrupted and sleeps
[14:32:05] [PASSED] Cannot be interrupted, locks straight away
[14:32:05] [PASSED] Can be interrupted, sleeps
[14:32:05] ======= [PASSED] ttm_bo_reserve_optimistic_no_ticket =======
[14:32:05] [PASSED] ttm_bo_reserve_locked_no_sleep
[14:32:05] [PASSED] ttm_bo_reserve_no_wait_ticket
[14:32:05] [PASSED] ttm_bo_reserve_double_resv
[14:32:05] [PASSED] ttm_bo_reserve_interrupted
[14:32:05] [PASSED] ttm_bo_reserve_deadlock
[14:32:05] [PASSED] ttm_bo_unreserve_basic
[14:32:05] [PASSED] ttm_bo_unreserve_pinned
[14:32:05] [PASSED] ttm_bo_unreserve_bulk
[14:32:05] [PASSED] ttm_bo_fini_basic
[14:32:05] [PASSED] ttm_bo_fini_shared_resv
[14:32:05] [PASSED] ttm_bo_pin_basic
[14:32:05] [PASSED] ttm_bo_pin_unpin_resource
[14:32:05] [PASSED] ttm_bo_multiple_pin_one_unpin
[14:32:05] ===================== [PASSED] ttm_bo ======================
[14:32:05] ============== ttm_bo_validate (22 subtests) ===============
[14:32:05] ============== ttm_bo_init_reserved_sys_man  ===============
[14:32:05] [PASSED] Buffer object for userspace
[14:32:05] [PASSED] Kernel buffer object
[14:32:05] [PASSED] Shared buffer object
[14:32:05] ========== [PASSED] ttm_bo_init_reserved_sys_man ===========
[14:32:05] ============== ttm_bo_init_reserved_mock_man  ==============
[14:32:05] [PASSED] Buffer object for userspace
[14:32:05] [PASSED] Kernel buffer object
[14:32:05] [PASSED] Shared buffer object
[14:32:05] ========== [PASSED] ttm_bo_init_reserved_mock_man ==========
[14:32:05] [PASSED] ttm_bo_init_reserved_resv
[14:32:05] ================== ttm_bo_validate_basic  ==================
[14:32:05] [PASSED] Buffer object for userspace
[14:32:05] [PASSED] Kernel buffer object
[14:32:05] [PASSED] Shared buffer object
[14:32:05] ============== [PASSED] ttm_bo_validate_basic ==============
[14:32:05] [PASSED] ttm_bo_validate_invalid_placement
[14:32:05] ============= ttm_bo_validate_same_placement  ==============
[14:32:05] [PASSED] System manager
[14:32:05] [PASSED] VRAM manager
[14:32:05] ========= [PASSED] ttm_bo_validate_same_placement ==========
[14:32:05] [PASSED] ttm_bo_validate_failed_alloc
[14:32:05] [PASSED] ttm_bo_validate_pinned
[14:32:05] [PASSED] ttm_bo_validate_busy_placement
[14:32:05] ================ ttm_bo_validate_multihop  =================
[14:32:05] [PASSED] Buffer object for userspace
[14:32:05] [PASSED] Kernel buffer object
[14:32:05] [PASSED] Shared buffer object
[14:32:05] ============ [PASSED] ttm_bo_validate_multihop =============
[14:32:05] ========== ttm_bo_validate_no_placement_signaled  ==========
[14:32:05] [PASSED] Buffer object in system domain, no page vector
[14:32:05] [PASSED] Buffer object in system domain with an existing page vector
[14:32:05] ====== [PASSED] ttm_bo_validate_no_placement_signaled ======
[14:32:05] ======== ttm_bo_validate_no_placement_not_signaled  ========
[14:32:05] [PASSED] Buffer object for userspace
[14:32:05] [PASSED] Kernel buffer object
[14:32:05] [PASSED] Shared buffer object
[14:32:05] ==== [PASSED] ttm_bo_validate_no_placement_not_signaled ====
[14:32:05] [PASSED] ttm_bo_validate_move_fence_signaled
[14:32:05] ========= ttm_bo_validate_move_fence_not_signaled  =========
[14:32:05] [PASSED] Waits for GPU
[14:32:05] [PASSED] Tries to lock straight away
[14:32:05] ===== [PASSED] ttm_bo_validate_move_fence_not_signaled =====
[14:32:05] [PASSED] ttm_bo_validate_swapout
[14:32:05] [PASSED] ttm_bo_validate_happy_evict
[14:32:05] [PASSED] ttm_bo_validate_all_pinned_evict
[14:32:05] [PASSED] ttm_bo_validate_allowed_only_evict
[14:32:05] [PASSED] ttm_bo_validate_deleted_evict
[14:32:05] [PASSED] ttm_bo_validate_busy_domain_evict
[14:32:05] [PASSED] ttm_bo_validate_evict_gutting
[14:32:05] [PASSED] ttm_bo_validate_recrusive_evict
[14:32:05] ================= [PASSED] ttm_bo_validate =================
[14:32:05] ============================================================
[14:32:05] Testing complete. Ran 102 tests: passed: 102
[14:32:05] Elapsed time: 11.529s total, 1.717s configuring, 9.596s building, 0.179s running

+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel



* ✓ Xe.CI.BAT: success for series starting with [v4,1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
  2026-05-27 11:29 [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker Arunpravin Paneer Selvam
                   ` (2 preceding siblings ...)
  2026-05-27 14:32 ` ✓ CI.KUnit: success " Patchwork
@ 2026-05-27 15:24 ` Patchwork
  2026-05-27 19:27 ` ✗ Xe.CI.FULL: failure " Patchwork
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2026-05-27 15:24 UTC (permalink / raw)
  To: Arunpravin Paneer Selvam; +Cc: intel-xe

== Series Details ==

Series: series starting with [v4,1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
URL   : https://patchwork.freedesktop.org/series/167369/
State : success

== Summary ==

CI Bug Log - changes from xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32_BAT -> xe-pw-167369v1_BAT
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Participating hosts (13 -> 13)
------------------------------

  No changes in participating hosts


Changes
-------

  No changes found


Build changes
-------------

  * Linux: xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32 -> xe-pw-167369v1

  IGT_8938: b024a3b67372962ff6e643d3998c5cf5acc07081 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32: 5390f2273d45bb259d88508828018c0fbbb79d32
  xe-pw-167369v1: 167369v1

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/index.html

* ✗ Xe.CI.FULL: failure for series starting with [v4,1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
  2026-05-27 11:29 [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker Arunpravin Paneer Selvam
                   ` (3 preceding siblings ...)
  2026-05-27 15:24 ` ✓ Xe.CI.BAT: " Patchwork
@ 2026-05-27 19:27 ` Patchwork
  2026-05-28 12:33 ` [PATCH v4 1/2] " Arunpravin Paneer Selvam
  2026-05-29 17:41 ` Matthew Auld
  6 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2026-05-27 19:27 UTC (permalink / raw)
  To: Arunpravin Paneer Selvam; +Cc: intel-xe

== Series Details ==

Series: series starting with [v4,1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
URL   : https://patchwork.freedesktop.org/series/167369/
State : failure

== Summary ==

CI Bug Log - changes from xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32_FULL -> xe-pw-167369v1_FULL
====================================================

Summary
-------

  **WARNING**

  Minor unknown changes coming with xe-pw-167369v1_FULL need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in xe-pw-167369v1_FULL, please notify your bug team (I915-ci-infra@lists.freedesktop.org) to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (2 -> 2)
------------------------------

  No changes in participating hosts

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in xe-pw-167369v1_FULL:

### IGT changes ###

#### Warnings ####

  * igt@kms_content_protection@legacy:
    - shard-bmg:          [FAIL][1] ([Intel XE#1178] / [Intel XE#3304] / [Intel XE#7374]) -> [DMESG-FAIL][2] +1 other test dmesg-fail
   [1]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-bmg-1/igt@kms_content_protection@legacy.html
   [2]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-7/igt@kms_content_protection@legacy.html

  
Known issues
------------

  Here are the changes found in xe-pw-167369v1_FULL that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_async_flips@invalid-async-flip@pipe-a-hdmi-a-3:
    - shard-bmg:          [PASS][3] -> [ABORT][4] ([Intel XE#7814]) +1 other test abort
   [3]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-bmg-8/igt@kms_async_flips@invalid-async-flip@pipe-a-hdmi-a-3.html
   [4]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-10/igt@kms_async_flips@invalid-async-flip@pipe-a-hdmi-a-3.html

  * igt@kms_async_flips@invalid-async-flip@pipe-c-hdmi-a-3:
    - shard-bmg:          [PASS][5] -> [DMESG-WARN][6] ([Intel XE#7814]) +5 other tests dmesg-warn
   [5]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-bmg-8/igt@kms_async_flips@invalid-async-flip@pipe-c-hdmi-a-3.html
   [6]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-10/igt@kms_async_flips@invalid-async-flip@pipe-c-hdmi-a-3.html

  * igt@kms_atomic_transition@plane-all-modeset-transition-fencing:
    - shard-bmg:          [PASS][7] -> [INCOMPLETE][8] ([Intel XE#8174])
   [7]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-bmg-5/igt@kms_atomic_transition@plane-all-modeset-transition-fencing.html
   [8]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-2/igt@kms_atomic_transition@plane-all-modeset-transition-fencing.html

  * igt@kms_atomic_transition@plane-all-modeset-transition-fencing@pipe-b-dp-2:
    - shard-bmg:          [PASS][9] -> [INCOMPLETE][10] ([Intel XE#7961] / [Intel XE#8174])
   [9]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-bmg-5/igt@kms_atomic_transition@plane-all-modeset-transition-fencing@pipe-b-dp-2.html
   [10]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-2/igt@kms_atomic_transition@plane-all-modeset-transition-fencing@pipe-b-dp-2.html

  * igt@kms_big_fb@4-tiled-64bpp-rotate-180:
    - shard-bmg:          NOTRUN -> [INCOMPLETE][11] ([Intel XE#5643])
   [11]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-6/igt@kms_big_fb@4-tiled-64bpp-rotate-180.html

  * igt@kms_big_fb@x-tiled-8bpp-rotate-270:
    - shard-bmg:          NOTRUN -> [SKIP][12] ([Intel XE#2327]) +1 other test skip
   [12]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_big_fb@x-tiled-8bpp-rotate-270.html

  * igt@kms_big_fb@y-tiled-32bpp-rotate-180:
    - shard-lnl:          NOTRUN -> [SKIP][13] ([Intel XE#1124]) +1 other test skip
   [13]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_big_fb@y-tiled-32bpp-rotate-180.html

  * igt@kms_big_fb@y-tiled-addfb-size-overflow:
    - shard-lnl:          NOTRUN -> [SKIP][14] ([Intel XE#1428] / [Intel XE#7387])
   [14]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_big_fb@y-tiled-addfb-size-overflow.html

  * igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-0:
    - shard-bmg:          NOTRUN -> [SKIP][15] ([Intel XE#1124]) +3 other tests skip
   [15]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-0.html

  * igt@kms_bw@connected-linear-tiling-3-displays-target-1920x1080p:
    - shard-bmg:          NOTRUN -> [SKIP][16] ([Intel XE#7679]) +1 other test skip
   [16]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@kms_bw@connected-linear-tiling-3-displays-target-1920x1080p.html
    - shard-lnl:          NOTRUN -> [SKIP][17] ([Intel XE#7679])
   [17]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_bw@connected-linear-tiling-3-displays-target-1920x1080p.html

  * igt@kms_ccs@bad-pixel-format-4-tiled-mtl-rc-ccs-cc:
    - shard-bmg:          NOTRUN -> [SKIP][18] ([Intel XE#2887]) +3 other tests skip
   [18]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_ccs@bad-pixel-format-4-tiled-mtl-rc-ccs-cc.html

  * igt@kms_ccs@crc-primary-basic-4-tiled-bmg-ccs@pipe-c-edp-1:
    - shard-lnl:          NOTRUN -> [SKIP][19] ([Intel XE#2669] / [Intel XE#7389]) +3 other tests skip
   [19]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_ccs@crc-primary-basic-4-tiled-bmg-ccs@pipe-c-edp-1.html

  * igt@kms_ccs@missing-ccs-buffer-y-tiled-gen12-rc-ccs-cc:
    - shard-lnl:          NOTRUN -> [SKIP][20] ([Intel XE#2887]) +3 other tests skip
   [20]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_ccs@missing-ccs-buffer-y-tiled-gen12-rc-ccs-cc.html

  * igt@kms_chamelium_color@ctm-blue-to-red:
    - shard-bmg:          NOTRUN -> [SKIP][21] ([Intel XE#2325] / [Intel XE#7358])
   [21]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_chamelium_color@ctm-blue-to-red.html

  * igt@kms_chamelium_edid@dp-edid-change-during-suspend:
    - shard-lnl:          NOTRUN -> [SKIP][22] ([Intel XE#373]) +1 other test skip
   [22]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_chamelium_edid@dp-edid-change-during-suspend.html

  * igt@kms_chamelium_hpd@hdmi-hpd-after-suspend:
    - shard-bmg:          NOTRUN -> [SKIP][23] ([Intel XE#2252]) +1 other test skip
   [23]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_chamelium_hpd@hdmi-hpd-after-suspend.html

  * igt@kms_content_protection@atomic-dpms:
    - shard-lnl:          NOTRUN -> [SKIP][24] ([Intel XE#7642])
   [24]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_content_protection@atomic-dpms.html

  * igt@kms_content_protection@atomic-dpms-hdcp14@pipe-a-dp-2:
    - shard-bmg:          NOTRUN -> [FAIL][25] ([Intel XE#3304] / [Intel XE#7374]) +1 other test fail
   [25]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_content_protection@atomic-dpms-hdcp14@pipe-a-dp-2.html

  * igt@kms_cursor_crc@cursor-random-512x512:
    - shard-bmg:          NOTRUN -> [SKIP][26] ([Intel XE#2321] / [Intel XE#7355])
   [26]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@kms_cursor_crc@cursor-random-512x512.html
    - shard-lnl:          NOTRUN -> [SKIP][27] ([Intel XE#2321] / [Intel XE#7355])
   [27]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_cursor_crc@cursor-random-512x512.html

  * igt@kms_cursor_crc@cursor-rapid-movement-64x21:
    - shard-bmg:          NOTRUN -> [SKIP][28] ([Intel XE#2320]) +1 other test skip
   [28]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_cursor_crc@cursor-rapid-movement-64x21.html

  * igt@kms_dsc@dsc-with-formats:
    - shard-lnl:          NOTRUN -> [SKIP][29] ([Intel XE#2244])
   [29]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_dsc@dsc-with-formats.html
    - shard-bmg:          NOTRUN -> [SKIP][30] ([Intel XE#2244])
   [30]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@kms_dsc@dsc-with-formats.html

  * igt@kms_flip@2x-blocking-absolute-wf_vblank-interruptible:
    - shard-lnl:          NOTRUN -> [SKIP][31] ([Intel XE#1421]) +2 other tests skip
   [31]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_flip@2x-blocking-absolute-wf_vblank-interruptible.html

  * igt@kms_flip@flip-vs-expired-vblank@b-edp1:
    - shard-lnl:          [PASS][32] -> [FAIL][33] ([Intel XE#301])
   [32]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-lnl-7/igt@kms_flip@flip-vs-expired-vblank@b-edp1.html
   [33]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-6/igt@kms_flip@flip-vs-expired-vblank@b-edp1.html

  * igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-16bpp-ytile-upscaling:
    - shard-lnl:          NOTRUN -> [SKIP][34] ([Intel XE#7178] / [Intel XE#7351])
   [34]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-16bpp-ytile-upscaling.html

  * igt@kms_flip_scaled_crc@flip-nv12-linear-to-nv12-linear-reflect-x:
    - shard-bmg:          NOTRUN -> [SKIP][35] ([Intel XE#7179])
   [35]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_flip_scaled_crc@flip-nv12-linear-to-nv12-linear-reflect-x.html

  * igt@kms_frontbuffer_tracking@drrs-1p-primscrn-shrfb-msflip-blt:
    - shard-lnl:          NOTRUN -> [SKIP][36] ([Intel XE#6312] / [Intel XE#651]) +2 other tests skip
   [36]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_frontbuffer_tracking@drrs-1p-primscrn-shrfb-msflip-blt.html

  * igt@kms_frontbuffer_tracking@drrs-argb161616f-draw-blt:
    - shard-bmg:          NOTRUN -> [SKIP][37] ([Intel XE#7061] / [Intel XE#7356])
   [37]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_frontbuffer_tracking@drrs-argb161616f-draw-blt.html

  * igt@kms_frontbuffer_tracking@drrshdr-1p-primscrn-cur-indfb-draw-render:
    - shard-bmg:          NOTRUN -> [SKIP][38] ([Intel XE#2311]) +17 other tests skip
   [38]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_frontbuffer_tracking@drrshdr-1p-primscrn-cur-indfb-draw-render.html

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-indfb-draw-mmap-wc:
    - shard-bmg:          NOTRUN -> [SKIP][39] ([Intel XE#4141]) +4 other tests skip
   [39]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-indfb-draw-mmap-wc.html

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-indfb-draw-render:
    - shard-lnl:          NOTRUN -> [SKIP][40] ([Intel XE#656] / [Intel XE#7905]) +10 other tests skip
   [40]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-indfb-draw-render.html

  * igt@kms_frontbuffer_tracking@fbc-abgr161616f-draw-blt:
    - shard-lnl:          NOTRUN -> [SKIP][41] ([Intel XE#7061] / [Intel XE#7356]) +1 other test skip
   [41]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_frontbuffer_tracking@fbc-abgr161616f-draw-blt.html

  * igt@kms_frontbuffer_tracking@fbcdrrshdr-1p-primscrn-pri-shrfb-draw-render:
    - shard-lnl:          NOTRUN -> [SKIP][42] ([Intel XE#6312]) +1 other test skip
   [42]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_frontbuffer_tracking@fbcdrrshdr-1p-primscrn-pri-shrfb-draw-render.html

  * igt@kms_frontbuffer_tracking@fbcdrrshdr-abgr161616f-draw-blt:
    - shard-bmg:          NOTRUN -> [SKIP][43] ([Intel XE#7061]) +1 other test skip
   [43]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@kms_frontbuffer_tracking@fbcdrrshdr-abgr161616f-draw-blt.html
    - shard-lnl:          NOTRUN -> [SKIP][44] ([Intel XE#7061]) +1 other test skip
   [44]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_frontbuffer_tracking@fbcdrrshdr-abgr161616f-draw-blt.html

  * igt@kms_frontbuffer_tracking@fbchdr-2p-scndscrn-shrfb-msflip-blt:
    - shard-lnl:          NOTRUN -> [SKIP][45] ([Intel XE#7905]) +12 other tests skip
   [45]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_frontbuffer_tracking@fbchdr-2p-scndscrn-shrfb-msflip-blt.html

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-pri-indfb-draw-render:
    - shard-bmg:          NOTRUN -> [SKIP][46] ([Intel XE#2313]) +15 other tests skip
   [46]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-pri-indfb-draw-render.html

  * igt@kms_frontbuffer_tracking@psrhdr-1p-offscreen-pri-indfb-draw-mmap-wc:
    - shard-lnl:          NOTRUN -> [SKIP][47] ([Intel XE#7865]) +7 other tests skip
   [47]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_frontbuffer_tracking@psrhdr-1p-offscreen-pri-indfb-draw-mmap-wc.html

  * igt@kms_hdr@invalid-metadata-sizes:
    - shard-lnl:          NOTRUN -> [SKIP][48] ([Intel XE#1503] / [Intel XE#7915])
   [48]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_hdr@invalid-metadata-sizes.html

  * igt@kms_hdr@invalid-metadata-sizes@pipe-a-edp-1-xrgb2101010:
    - shard-lnl:          NOTRUN -> [SKIP][49] ([Intel XE#7915]) +1 other test skip
   [49]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_hdr@invalid-metadata-sizes@pipe-a-edp-1-xrgb2101010.html

  * igt@kms_hdr@invalid-metadata-sizes@pipe-a-hdmi-a-3-xrgb16161616f:
    - shard-bmg:          NOTRUN -> [SKIP][50] ([Intel XE#7915]) +1 other test skip
   [50]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@kms_hdr@invalid-metadata-sizes@pipe-a-hdmi-a-3-xrgb16161616f.html

  * igt@kms_hdr@static-toggle@pipe-a-hdmi-a-3-xrgb16161616f:
    - shard-bmg:          [PASS][51] -> [SKIP][52] ([Intel XE#7915]) +1 other test skip
   [51]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-bmg-10/igt@kms_hdr@static-toggle@pipe-a-hdmi-a-3-xrgb16161616f.html
   [52]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@kms_hdr@static-toggle@pipe-a-hdmi-a-3-xrgb16161616f.html

  * igt@kms_plane@pixel-format-4-tiled-dg2-rc-ccs-modifier:
    - shard-bmg:          NOTRUN -> [SKIP][53] ([Intel XE#7283]) +1 other test skip
   [53]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_plane@pixel-format-4-tiled-dg2-rc-ccs-modifier.html

  * igt@kms_plane@pixel-format-4-tiled-mtl-rc-ccs-cc-modifier:
    - shard-lnl:          NOTRUN -> [SKIP][54] ([Intel XE#7283])
   [54]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_plane@pixel-format-4-tiled-mtl-rc-ccs-cc-modifier.html

  * igt@kms_plane_multiple@2x-tiling-yf:
    - shard-bmg:          NOTRUN -> [SKIP][55] ([Intel XE#5021] / [Intel XE#7377])
   [55]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@kms_plane_multiple@2x-tiling-yf.html
    - shard-lnl:          NOTRUN -> [SKIP][56] ([Intel XE#4596] / [Intel XE#5854])
   [56]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_plane_multiple@2x-tiling-yf.html

  * igt@kms_plane_scaling@planes-downscale-factor-0-75@pipe-a:
    - shard-bmg:          NOTRUN -> [SKIP][57] ([Intel XE#2763] / [Intel XE#6886]) +4 other tests skip
   [57]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_plane_scaling@planes-downscale-factor-0-75@pipe-a.html

  * igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-5@pipe-c:
    - shard-lnl:          NOTRUN -> [SKIP][58] ([Intel XE#2763] / [Intel XE#6886]) +3 other tests skip
   [58]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-5@pipe-c.html

  * igt@kms_pm_rpm@dpms-non-lpsp:
    - shard-lnl:          NOTRUN -> [SKIP][59] ([Intel XE#1439] / [Intel XE#3141] / [Intel XE#7383])
   [59]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_pm_rpm@dpms-non-lpsp.html

  * igt@kms_psr2_sf@fbc-pr-overlay-plane-move-continuous-exceed-sf:
    - shard-lnl:          NOTRUN -> [SKIP][60] ([Intel XE#2893] / [Intel XE#7304])
   [60]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_psr2_sf@fbc-pr-overlay-plane-move-continuous-exceed-sf.html

  * igt@kms_psr2_sf@pr-plane-move-sf-dmg-area:
    - shard-bmg:          NOTRUN -> [SKIP][61] ([Intel XE#1489]) +2 other tests skip
   [61]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_psr2_sf@pr-plane-move-sf-dmg-area.html

  * igt@kms_psr@fbc-pr-dpms:
    - shard-lnl:          NOTRUN -> [SKIP][62] ([Intel XE#1406]) +1 other test skip
   [62]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_psr@fbc-pr-dpms.html

  * igt@kms_psr@fbc-psr2-suspend:
    - shard-bmg:          NOTRUN -> [SKIP][63] ([Intel XE#2234] / [Intel XE#2850]) +2 other tests skip
   [63]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@kms_psr@fbc-psr2-suspend.html
    - shard-lnl:          NOTRUN -> [SKIP][64] ([Intel XE#1406] / [Intel XE#7345])
   [64]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_psr@fbc-psr2-suspend.html

  * igt@kms_psr@fbc-psr2-suspend@edp-1:
    - shard-lnl:          NOTRUN -> [SKIP][65] ([Intel XE#1406] / [Intel XE#4609])
   [65]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_psr@fbc-psr2-suspend@edp-1.html

  * igt@kms_rotation_crc@primary-yf-tiled-reflect-x-270:
    - shard-bmg:          NOTRUN -> [SKIP][66] ([Intel XE#3904] / [Intel XE#7342])
   [66]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@kms_rotation_crc@primary-yf-tiled-reflect-x-270.html
    - shard-lnl:          NOTRUN -> [SKIP][67] ([Intel XE#3414] / [Intel XE#3904] / [Intel XE#7342])
   [67]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@kms_rotation_crc@primary-yf-tiled-reflect-x-270.html

  * igt@kms_vrr@seamless-rr-switch-virtual@pipe-a-edp-1:
    - shard-lnl:          [PASS][68] -> [FAIL][69] ([Intel XE#2142]) +1 other test fail
   [68]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-lnl-8/igt@kms_vrr@seamless-rr-switch-virtual@pipe-a-edp-1.html
   [69]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-3/igt@kms_vrr@seamless-rr-switch-virtual@pipe-a-edp-1.html

  * igt@xe_eudebug_online@debugger-reopen:
    - shard-lnl:          NOTRUN -> [SKIP][70] ([Intel XE#7636]) +1 other test skip
   [70]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_eudebug_online@debugger-reopen.html

  * igt@xe_eudebug_online@writes-caching-vram-bb-sram-target-vram:
    - shard-bmg:          NOTRUN -> [SKIP][71] ([Intel XE#7636]) +2 other tests skip
   [71]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@xe_eudebug_online@writes-caching-vram-bb-sram-target-vram.html

  * igt@xe_evict@evict-beng-threads-large:
    - shard-lnl:          NOTRUN -> [SKIP][72] ([Intel XE#6540] / [Intel XE#688]) +1 other test skip
   [72]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_evict@evict-beng-threads-large.html

  * igt@xe_evict@evict-mixed-many-threads-small:
    - shard-bmg:          [PASS][73] -> [INCOMPLETE][74] ([Intel XE#6321])
   [73]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-bmg-3/igt@xe_evict@evict-mixed-many-threads-small.html
   [74]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-7/igt@xe_evict@evict-mixed-many-threads-small.html

  * igt@xe_exec_balancer@many-execqueues-cm-virtual-basic:
    - shard-lnl:          NOTRUN -> [SKIP][75] ([Intel XE#7482]) +4 other tests skip
   [75]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_exec_balancer@many-execqueues-cm-virtual-basic.html

  * igt@xe_exec_basic@multigpu-no-exec-userptr:
    - shard-lnl:          NOTRUN -> [SKIP][76] ([Intel XE#1392]) +1 other test skip
   [76]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_exec_basic@multigpu-no-exec-userptr.html
    - shard-bmg:          NOTRUN -> [SKIP][77] ([Intel XE#2322] / [Intel XE#7372]) +2 other tests skip
   [77]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@xe_exec_basic@multigpu-no-exec-userptr.html

  * igt@xe_exec_fault_mode@many-execqueues-multi-queue-userptr-rebind-prefetch:
    - shard-lnl:          NOTRUN -> [SKIP][78] ([Intel XE#7136]) +4 other tests skip
   [78]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_exec_fault_mode@many-execqueues-multi-queue-userptr-rebind-prefetch.html

  * igt@xe_exec_fault_mode@twice-multi-queue-userptr-invalidate-race-imm:
    - shard-bmg:          NOTRUN -> [SKIP][79] ([Intel XE#7136]) +5 other tests skip
   [79]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@xe_exec_fault_mode@twice-multi-queue-userptr-invalidate-race-imm.html

  * igt@xe_exec_multi_queue@many-execs-preempt-mode-fault-priority-smem:
    - shard-lnl:          NOTRUN -> [SKIP][80] ([Intel XE#6874]) +6 other tests skip
   [80]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_exec_multi_queue@many-execs-preempt-mode-fault-priority-smem.html

  * igt@xe_exec_multi_queue@two-queues-priority:
    - shard-bmg:          NOTRUN -> [SKIP][81] ([Intel XE#6874]) +6 other tests skip
   [81]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@xe_exec_multi_queue@two-queues-priority.html

  * igt@xe_exec_reset@multi-queue-cancel-on-secondary:
    - shard-lnl:          NOTRUN -> [SKIP][82] ([Intel XE#7866])
   [82]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_exec_reset@multi-queue-cancel-on-secondary.html

  * igt@xe_exec_threads@threads-multi-queue-cm-fd-userptr-invalidate:
    - shard-bmg:          NOTRUN -> [SKIP][83] ([Intel XE#7138]) +2 other tests skip
   [83]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@xe_exec_threads@threads-multi-queue-cm-fd-userptr-invalidate.html
    - shard-lnl:          NOTRUN -> [SKIP][84] ([Intel XE#7138]) +1 other test skip
   [84]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_exec_threads@threads-multi-queue-cm-fd-userptr-invalidate.html

  * igt@xe_mmap@pci-membarrier-parallel:
    - shard-lnl:          NOTRUN -> [SKIP][85] ([Intel XE#5100] / [Intel XE#7322] / [Intel XE#7408])
   [85]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_mmap@pci-membarrier-parallel.html

  * igt@xe_multigpu_svm@mgpu-coherency-conflict:
    - shard-lnl:          NOTRUN -> [SKIP][86] ([Intel XE#6964])
   [86]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_multigpu_svm@mgpu-coherency-conflict.html

  * igt@xe_multigpu_svm@mgpu-pagefault-basic:
    - shard-bmg:          NOTRUN -> [SKIP][87] ([Intel XE#6964]) +1 other test skip
   [87]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@xe_multigpu_svm@mgpu-pagefault-basic.html

  * igt@xe_noexec_ping_pong@basic:
    - shard-lnl:          NOTRUN -> [SKIP][88] ([Intel XE#6259] / [Intel XE#7324] / [Intel XE#7406])
   [88]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_noexec_ping_pong@basic.html

  * igt@xe_page_reclaim@prl-invalidate-full:
    - shard-lnl:          NOTRUN -> [SKIP][89] ([Intel XE#7793])
   [89]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_page_reclaim@prl-invalidate-full.html
    - shard-bmg:          NOTRUN -> [SKIP][90] ([Intel XE#7793])
   [90]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@xe_page_reclaim@prl-invalidate-full.html

  * igt@xe_pm@s4-d3cold-basic-exec:
    - shard-lnl:          NOTRUN -> [SKIP][91] ([Intel XE#2284] / [Intel XE#366] / [Intel XE#7370])
   [91]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_pm@s4-d3cold-basic-exec.html
    - shard-bmg:          NOTRUN -> [SKIP][92] ([Intel XE#2284] / [Intel XE#7370])
   [92]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@xe_pm@s4-d3cold-basic-exec.html

  * igt@xe_pxp@display-pxp-fb:
    - shard-bmg:          NOTRUN -> [SKIP][93] ([Intel XE#4733] / [Intel XE#7417])
   [93]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@xe_pxp@display-pxp-fb.html

  * igt@xe_query@multigpu-query-uc-fw-version-huc:
    - shard-bmg:          NOTRUN -> [SKIP][94] ([Intel XE#944])
   [94]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@xe_query@multigpu-query-uc-fw-version-huc.html
    - shard-lnl:          NOTRUN -> [SKIP][95] ([Intel XE#944])
   [95]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_query@multigpu-query-uc-fw-version-huc.html

  * igt@xe_sriov_admin@sched-priority-write-readback-vfs-disabled:
    - shard-lnl:          NOTRUN -> [SKIP][96] ([Intel XE#7174])
   [96]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-2/igt@xe_sriov_admin@sched-priority-write-readback-vfs-disabled.html

  * igt@xe_survivability@runtime-survivability:
    - shard-bmg:          [PASS][97] -> [DMESG-WARN][98] ([Intel XE#6627] / [Intel XE#7419])
   [97]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-bmg-8/igt@xe_survivability@runtime-survivability.html
   [98]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-10/igt@xe_survivability@runtime-survivability.html

  * igt@xe_wedged@wedged-mode-toggle:
    - shard-lnl:          [PASS][99] -> [ABORT][100] ([Intel XE#8007])
   [99]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-lnl-4/igt@xe_wedged@wedged-mode-toggle.html
   [100]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-6/igt@xe_wedged@wedged-mode-toggle.html

  
#### Possible fixes ####

  * igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs:
    - shard-bmg:          [INCOMPLETE][101] ([Intel XE#7084] / [Intel XE#8150]) -> [PASS][102] +1 other test pass
   [101]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-bmg-3/igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs.html
   [102]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-4/igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1:
    - shard-lnl:          [FAIL][103] ([Intel XE#301] / [Intel XE#3149]) -> [PASS][104] +1 other test pass
   [103]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-lnl-4/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1.html
   [104]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-1/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1.html

  * igt@kms_flip@flip-vs-expired-vblank@a-edp1:
    - shard-lnl:          [FAIL][105] ([Intel XE#301]) -> [PASS][106]
   [105]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-lnl-7/igt@kms_flip@flip-vs-expired-vblank@a-edp1.html
   [106]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-lnl-6/igt@kms_flip@flip-vs-expired-vblank@a-edp1.html

  * igt@xe_pat@pt-caching:
    - shard-bmg:          [ABORT][107] ([Intel XE#7893]) -> [PASS][108]
   [107]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-bmg-8/igt@xe_pat@pt-caching.html
   [108]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@xe_pat@pt-caching.html

  
#### Warnings ####

  * igt@kms_tiled_display@basic-test-pattern:
    - shard-bmg:          [FAIL][109] ([Intel XE#1729] / [Intel XE#7424]) -> [SKIP][110] ([Intel XE#2426] / [Intel XE#5848])
   [109]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-bmg-5/igt@kms_tiled_display@basic-test-pattern.html
   [110]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-8/igt@kms_tiled_display@basic-test-pattern.html

  * igt@kms_tiled_display@basic-test-pattern-with-chamelium:
    - shard-bmg:          [SKIP][111] ([Intel XE#2426] / [Intel XE#5848]) -> [SKIP][112] ([Intel XE#2509] / [Intel XE#7437])
   [111]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32/shard-bmg-9/igt@kms_tiled_display@basic-test-pattern-with-chamelium.html
   [112]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/shard-bmg-6/igt@kms_tiled_display@basic-test-pattern-with-chamelium.html

  
  [Intel XE#1124]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1124
  [Intel XE#1178]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1178
  [Intel XE#1392]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1392
  [Intel XE#1406]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1406
  [Intel XE#1421]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1421
  [Intel XE#1428]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1428
  [Intel XE#1439]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1439
  [Intel XE#1489]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1489
  [Intel XE#1503]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1503
  [Intel XE#1729]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1729
  [Intel XE#2142]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2142
  [Intel XE#2234]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2234
  [Intel XE#2244]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2244
  [Intel XE#2252]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2252
  [Intel XE#2284]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2284
  [Intel XE#2311]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2311
  [Intel XE#2313]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2313
  [Intel XE#2320]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2320
  [Intel XE#2321]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2321
  [Intel XE#2322]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2322
  [Intel XE#2325]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2325
  [Intel XE#2327]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2327
  [Intel XE#2426]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2426
  [Intel XE#2509]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2509
  [Intel XE#2669]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2669
  [Intel XE#2763]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2763
  [Intel XE#2850]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2850
  [Intel XE#2887]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2887
  [Intel XE#2893]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2893
  [Intel XE#301]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/301
  [Intel XE#3141]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3141
  [Intel XE#3149]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3149
  [Intel XE#3304]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3304
  [Intel XE#3414]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3414
  [Intel XE#366]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/366
  [Intel XE#373]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/373
  [Intel XE#3904]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3904
  [Intel XE#4141]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4141
  [Intel XE#4596]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4596
  [Intel XE#4609]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4609
  [Intel XE#4733]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4733
  [Intel XE#5021]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5021
  [Intel XE#5100]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5100
  [Intel XE#5643]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5643
  [Intel XE#5848]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5848
  [Intel XE#5854]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5854
  [Intel XE#6259]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6259
  [Intel XE#6312]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6312
  [Intel XE#6321]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6321
  [Intel XE#651]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/651
  [Intel XE#6540]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6540
  [Intel XE#656]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/656
  [Intel XE#6627]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6627
  [Intel XE#6874]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6874
  [Intel XE#688]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/688
  [Intel XE#6886]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6886
  [Intel XE#6964]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6964
  [Intel XE#7061]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7061
  [Intel XE#7084]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7084
  [Intel XE#7136]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7136
  [Intel XE#7138]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7138
  [Intel XE#7174]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7174
  [Intel XE#7178]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7178
  [Intel XE#7179]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7179
  [Intel XE#7283]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7283
  [Intel XE#7304]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7304
  [Intel XE#7322]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7322
  [Intel XE#7324]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7324
  [Intel XE#7342]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7342
  [Intel XE#7345]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7345
  [Intel XE#7351]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7351
  [Intel XE#7355]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7355
  [Intel XE#7356]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7356
  [Intel XE#7358]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7358
  [Intel XE#7370]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7370
  [Intel XE#7372]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7372
  [Intel XE#7374]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7374
  [Intel XE#7377]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7377
  [Intel XE#7383]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7383
  [Intel XE#7387]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7387
  [Intel XE#7389]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7389
  [Intel XE#7406]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7406
  [Intel XE#7408]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7408
  [Intel XE#7417]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7417
  [Intel XE#7419]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7419
  [Intel XE#7424]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7424
  [Intel XE#7437]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7437
  [Intel XE#7482]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7482
  [Intel XE#7636]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7636
  [Intel XE#7642]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7642
  [Intel XE#7679]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7679
  [Intel XE#7793]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7793
  [Intel XE#7814]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7814
  [Intel XE#7865]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7865
  [Intel XE#7866]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7866
  [Intel XE#7893]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7893
  [Intel XE#7905]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7905
  [Intel XE#7915]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7915
  [Intel XE#7961]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7961
  [Intel XE#8007]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/8007
  [Intel XE#8150]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/8150
  [Intel XE#8174]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/8174
  [Intel XE#944]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/944


Build changes
-------------

  * Linux: xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32 -> xe-pw-167369v1

  IGT_8938: b024a3b67372962ff6e643d3998c5cf5acc07081 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  xe-5137-5390f2273d45bb259d88508828018c0fbbb79d32: 5390f2273d45bb259d88508828018c0fbbb79d32
  xe-pw-167369v1: 167369v1

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-167369v1/index.html

* Re: [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
  2026-05-27 11:29 [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker Arunpravin Paneer Selvam
                   ` (4 preceding siblings ...)
  2026-05-27 19:27 ` ✗ Xe.CI.FULL: failure " Patchwork
@ 2026-05-28 12:33 ` Arunpravin Paneer Selvam
  2026-05-29 17:41 ` Matthew Auld
  6 siblings, 0 replies; 12+ messages in thread
From: Arunpravin Paneer Selvam @ 2026-05-28 12:33 UTC (permalink / raw)
  To: matthew.auld, christian.koenig, dri-devel, intel-gfx, intel-xe,
	amd-gfx
  Cc: alexander.deucher

Hi Matthew,

We are targeting this series for inclusion in the next merge window.
Please let me know if there are any comments or concerns that need to be 
addressed.

Thanks,
Arun.

On 5/27/2026 4:59 PM, Arunpravin Paneer Selvam wrote:
> The current buddy allocator maintains separate clear_tree[] and
> dirty_tree[] rbtrees per order, preventing coalescing between cleared
> and dirty buddies. Under mixed workloads, this creates a merge barrier:
> adjacent buddies frequently end up split across trees, forcing reliance
> on __force_merge() during allocation.
>
> __force_merge() performs an O(N x max_order) scan under the VRAM manager
> lock, leading to allocation stalls and failures for large contiguous
> requests even when sufficient total free memory is available.
>
> Solution
>
> Replace the dual-tree design with:
> - A single free_tree[order] rbtree holding every free block, cleared or
>    dirty (since v3, cleared blocks stay in this tree instead of floating)
> - A lightweight out-of-band clear tracker (gpu_clear_tracker)
>
> Fully cleared free blocks are tracked outside the buddy trees using an
> augmented interval rbtree, enabling O(log E) lookup of the largest
> cleared extents.
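
To make the merge-on-insert behaviour concrete, here is a minimal userspace sketch. It is not the patch's code: a small sorted array stands in for the augmented rbtree, and struct tracker, struct extent and sketch_mark_clear() are hypothetical names. It assumes half-open ranges [start, end) and that the new range does not overlap an existing extent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_EXTENTS 16

struct extent {
	uint64_t start, end;	/* half-open range [start, end) */
};

struct tracker {
	struct extent ext[MAX_EXTENTS];	/* sorted by start, no two touching */
	size_t count;
};

/*
 * Record [start, start + size) as cleared and merge it with a neighbour
 * on either side when the ranges touch. Returns 0 on success, -1 if the
 * (hypothetical) fixed-size array is full.
 */
static int sketch_mark_clear(struct tracker *t, uint64_t start, uint64_t size)
{
	uint64_t end = start + size;
	size_t i = 0;
	int left, right;

	assert(t != NULL);
	if (!size)
		return 0;

	/* Index of the first extent whose start is at or above @start. */
	while (i < t->count && t->ext[i].start < start)
		i++;

	left = i > 0 && t->ext[i - 1].end == start;
	right = i < t->count && t->ext[i].start == end;

	if (left && right) {
		/* left + new + right collapse into one extent. */
		t->ext[i - 1].end = t->ext[i].end;
		memmove(&t->ext[i], &t->ext[i + 1],
			(t->count - i - 1) * sizeof(t->ext[0]));
		t->count--;
	} else if (left) {
		t->ext[i - 1].end = end;	/* grow left neighbour */
	} else if (right) {
		t->ext[i].start = start;	/* grow right neighbour */
	} else {
		if (t->count == MAX_EXTENTS)
			return -1;
		memmove(&t->ext[i + 1], &t->ext[i],
			(t->count - i) * sizeof(t->ext[0]));
		t->ext[i].start = start;
		t->ext[i].end = end;
		t->count++;
	}

	return 0;
}
```

Marking [0, 4K) and [8K, 12K) leaves two extents; marking [4K, 8K) afterwards bridges them into one. The array costs O(E) per update, which is the cost the rbtree in the patch avoids.
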
>
> Buddy coalescing is now unconditional in __gpu_buddy_free(), regardless
> of clear/dirty state. This removes the merge barrier and eliminates the
> need for __force_merge().
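
As an illustration of unconditional coalescing, the toy model below merges a freed block with its buddy whenever the buddy is free, without looking at any clear/dirty flag. It is only a sketch under simplifying assumptions: blocks are identified by (order, index) and the buddy is found with index ^ 1, whereas the patch walks parent pointers via __get_buddy(). struct toy_buddy and toy_free() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ORDER 3	/* a single root covering 8 chunks */

struct toy_buddy {
	/* is_free[o][i]: block i of order o currently sits on a free list */
	bool is_free[TOY_MAX_ORDER + 1][1 << TOY_MAX_ORDER];
};

/*
 * Free block @index of @order and merge upwards for as long as the buddy
 * is free. Clear/dirty state is deliberately not consulted here; in the
 * patch that state lives in the separate clear tracker.
 * Returns the order of the block that ends up on the free list.
 */
static unsigned int toy_free(struct toy_buddy *mm, unsigned int order,
			     unsigned int index)
{
	assert(order <= TOY_MAX_ORDER);

	while (order < TOY_MAX_ORDER) {
		unsigned int buddy = index ^ 1;

		if (!mm->is_free[order][buddy])
			break;

		/* Take the buddy off its free list and move up to the parent. */
		mm->is_free[order][buddy] = false;
		index >>= 1;
		order++;
	}

	mm->is_free[order][index] = true;
	return order;
}
```

Freeing chunks 0 and 1 yields one order-1 block; freeing the sibling blocks at orders 1 and 2 then merges all the way up to the order-3 root.
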
>
> Benefits
>
> - Correct high-order allocations after mixed clear/dirty workloads
> - Elimination of O(N x max_order) merge cost from the allocation path
> - O(log E) cleared-extent lookup replacing O(N) scans
> - Predictable allocation latency under fragmentation
> - Reduced complexity with a single tree per order
>
> Test:
> dEQP-VK.memory.allocation.basic.size_8KiB.reverse.count_4000
>
> The data below is from /sys/kernel/debug/dri/1/amdgpu_vram_mm:
>
> Base (dual-tree), before VKCTS test:
>    order- 6 free:   6 MiB,  blocks: 26
>    order- 5 free:   1 MiB,  blocks: 15
>    order- 4 free: 960 KiB,  blocks: 15
>    order- 3 free:   5 MiB,  blocks: 171
>    order- 2 free:   2 MiB,  blocks: 176
>    order- 1 free:   1 MiB,  blocks: 165
>    order- 0 free:  16 KiB,  blocks: 4
>
> Base (dual-tree), after VKCTS test:
>    order- 6 free: 768 KiB,  blocks: 3
>    order- 5 free: 499 MiB,  blocks: 3999
>    order- 4 free: 250 MiB,  blocks: 4001
>    order- 3 free: 129 MiB,  blocks: 4157
>    order- 2 free:  65 MiB,  blocks: 4161
>    order- 1 free:  63 MiB,  blocks: 8138
>    order- 0 free:  20 KiB,  blocks: 5
>
> Clear tracker, before VKCTS test:
>    order- 6 free:   4 MiB,  blocks: 19
>    order- 5 free:   2 MiB,  blocks: 18
>    order- 4 free: 704 KiB,  blocks: 11
>    order- 3 free:   5 MiB,  blocks: 168
>    order- 2 free:   2 MiB,  blocks: 174
>    order- 1 free:   1 MiB,  blocks: 167
>    order- 0 free:  32 KiB,  blocks: 8
>
> Clear tracker, after VKCTS test:
>    order- 6 free:   4 MiB,  blocks: 19
>    order- 5 free:   2 MiB,  blocks: 18
>    order- 4 free: 704 KiB,  blocks: 11
>    order- 3 free:   5 MiB,  blocks: 168
>    order- 2 free:   2 MiB,  blocks: 174
>    order- 1 free:   1 MiB,  blocks: 167
>    order- 0 free:  28 KiB,  blocks: 7
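
The per-order figures above can be cross-checked with a little arithmetic. The sketch below assumes a 4 KiB chunk size, which is not stated in the commit message but is consistent with every row (for example, order 6 with 3 blocks gives 768 KiB). order_free_bytes() is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Free bytes held at one order: each block spans (chunk_size << order)
 * bytes. The 4 KiB chunk size used by callers is an assumption.
 */
static uint64_t order_free_bytes(uint64_t chunk_size, unsigned int order,
				 uint64_t blocks)
{
	assert(order < 64);
	return (chunk_size << order) * blocks;
}
```

With that assumption, order_free_bytes(4096, 5, 3999) is 524156928 bytes, i.e. 499.875 MiB, which matches the "499 MiB" row if the debugfs output truncates. In the dual-tree run after the test, orders 1 to 5 together hold roughly 1 GiB of free memory in about 24000 unmerged blocks, while the clear-tracker run is essentially unchanged from its pre-test state.
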
>
> v2:
>   - Code-style cleanup and minor refactoring
>   - Renamed locals for clarity
>
> v3:
>   - Keep cleared blocks inside free_tree[] instead of floating them.
>   - Add subtree_has_dirty rbtree augment for O(log N) dirty-first walk.
>
> v4:
>   - Fixed checkpatch warnings.
>   - Optimized gpu_buddy_reset_clear() to a single post-order walk that
>     flips block headers and recomputes the rbtree augment in one pass.
>   - Propagate subtree_max_size top-down in insert_extent() so ancestors
>     are not left with stale values on no-rotation inserts. (sashiko)
>   - Drop the whole extent in gpu_clear_tracker_mark_dirty() when the
>     inside-split allocation fails, avoiding a stale clear claim. (sashiko)
>   - Make gpu_clear_tracker_find() alignment-aware and fall back to the
>     dirty tree on steered failure to avoid spurious -ENOSPC. (sashiko)
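
The alignment-aware check mentioned in the last v4 item can be sketched in isolation as follows. This mirrors the test gpu_clear_tracker_find() applies to each candidate extent further down in the diff, but extent_fits_aligned() itself is a hypothetical standalone helper, and the sketch ignores overflow when start is close to UINT64_MAX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the half-open extent [start, end) can hold a block of
 * @min_size bytes whose offset is a multiple of @min_size.
 * @min_size must be a non-zero power of two.
 */
static bool extent_fits_aligned(uint64_t start, uint64_t end,
				uint64_t min_size)
{
	uint64_t aligned_start;

	assert(min_size && !(min_size & (min_size - 1)));

	/* Round start up to the next multiple of min_size. */
	aligned_start = (start + min_size - 1) & ~(min_size - 1);

	return aligned_start <= end && end - aligned_start >= min_size;
}
```

An 8 KiB extent at [0x1000, 0x3000) is large enough for an 8 KiB request by size alone, yet no naturally aligned 8 KiB block fits inside it, so the check returns false. That is the case where a size-only lookup would steer the allocator to an unusable extent.
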
>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
> ---
>   drivers/gpu/buddy.c                | 1164 ++++++++++++++++++----------
>   drivers/gpu/drm/drm_buddy.c        |   12 +-
>   drivers/gpu/tests/gpu_buddy_test.c |   18 +-
>   include/linux/gpu_buddy.h          |   64 +-
>   4 files changed, 829 insertions(+), 429 deletions(-)
>
> diff --git a/drivers/gpu/buddy.c b/drivers/gpu/buddy.c
> index eb1457376307..dca66cc43959 100644
> --- a/drivers/gpu/buddy.c
> +++ b/drivers/gpu/buddy.c
> @@ -8,6 +8,7 @@
>   #include <linux/kmemleak.h>
>   #include <linux/module.h>
>   #include <linux/sizes.h>
> +#include <linux/slab.h>
>   
>   #include <linux/gpu_buddy.h>
>   
> @@ -35,6 +36,364 @@
>   
>   static struct kmem_cache *slab_blocks;
>   
> +static struct kmem_cache *slab_extents;
> +
> +/*
> + * Clear tracker
> + * -------------
> + *
> + * The clear tracker maintains an augmented interval rbtree of contiguous
> + * cleared (zeroed) address ranges, decoupled from the buddy free trees.
> + * Each node covers a maximal coalesced run; adjacent extents are merged
> + * on insertion so the tree always holds the smallest possible number of
> + * extents.  The augmentation field @subtree_max_size lets the allocator
> + * locate the largest cleared extent in O(log E).
> + */
> +
> +static u64 extent_size(struct gpu_clear_extent *clear_extent)
> +{
> +	return clear_extent->end - clear_extent->start;
> +}
> +
> +RB_DECLARE_CALLBACKS_MAX(static, gpu_clear_augment_cb,
> +			 struct gpu_clear_extent, rb,
> +			 u64, subtree_max_size,
> +			 extent_size)
> +
> +static struct gpu_clear_extent *extent_alloc(void)
> +{
> +	return kmem_cache_zalloc(slab_extents, GFP_KERNEL);
> +}
> +
> +static void extent_free(struct gpu_clear_extent *clear_extent)
> +{
> +	kmem_cache_free(slab_extents, clear_extent);
> +}
> +
> +/* Return the rightmost extent whose start is strictly below @offset. */
> +static struct gpu_clear_extent *
> +prev_extent(struct gpu_clear_tracker *clear_tracker, u64 offset)
> +{
> +	struct rb_node *rb = clear_tracker->root.rb_node;
> +	struct gpu_clear_extent *clear_extent = NULL;
> +
> +	while (rb) {
> +		struct gpu_clear_extent *tmp_extent =
> +			rb_entry(rb, struct gpu_clear_extent, rb);
> +
> +		if (tmp_extent->start < offset) {
> +			clear_extent = tmp_extent;
> +			rb = rb->rb_right;
> +		} else {
> +			rb = rb->rb_left;
> +		}
> +	}
> +
> +	return clear_extent;
> +}
> +
> +/* Return the leftmost extent whose start is at or above @offset. */
> +static struct gpu_clear_extent *
> +next_extent(struct gpu_clear_tracker *clear_tracker, u64 offset)
> +{
> +	struct rb_node *rb = clear_tracker->root.rb_node;
> +	struct gpu_clear_extent *clear_extent = NULL;
> +
> +	while (rb) {
> +		struct gpu_clear_extent *tmp_extent =
> +			rb_entry(rb, struct gpu_clear_extent, rb);
> +
> +		if (tmp_extent->start >= offset) {
> +			clear_extent = tmp_extent;
> +			rb = rb->rb_left;
> +		} else {
> +			rb = rb->rb_right;
> +		}
> +	}
> +
> +	return clear_extent;
> +}
> +
> +static void insert_extent(struct gpu_clear_tracker *clear_tracker,
> +			  struct gpu_clear_extent *clear_extent)
> +{
> +	struct rb_node **link = &clear_tracker->root.rb_node;
> +	struct rb_node *parent = NULL;
> +	u64 size = extent_size(clear_extent);
> +
> +	while (*link) {
> +		struct gpu_clear_extent *tmp_extent;
> +
> +		parent = *link;
> +		tmp_extent = rb_entry(parent, struct gpu_clear_extent, rb);
> +
> +		if (tmp_extent->subtree_max_size < size)
> +			tmp_extent->subtree_max_size = size;
> +
> +		if (clear_extent->start < tmp_extent->start)
> +			link = &parent->rb_left;
> +		else
> +			link = &parent->rb_right;
> +	}
> +
> +	clear_extent->subtree_max_size = size;
> +	rb_link_node(&clear_extent->rb, parent, link);
> +	rb_insert_augmented(&clear_extent->rb, &clear_tracker->root, &gpu_clear_augment_cb);
> +}
> +
> +static void remove_extent(struct gpu_clear_tracker *clear_tracker,
> +			  struct gpu_clear_extent *clear_extent)
> +{
> +	rb_erase_augmented(&clear_extent->rb, &clear_tracker->root, &gpu_clear_augment_cb);
> +	RB_CLEAR_NODE(&clear_extent->rb);
> +}
> +
> +static void gpu_clear_tracker_init(struct gpu_clear_tracker *clear_tracker)
> +{
> +	clear_tracker->root = RB_ROOT;
> +	clear_tracker->total_clear = 0;
> +}
> +
> +static void gpu_clear_tracker_fini(struct gpu_clear_tracker *clear_tracker)
> +{
> +	struct rb_node *rb;
> +
> +	while ((rb = rb_first(&clear_tracker->root))) {
> +		struct gpu_clear_extent *clear_extent =
> +			rb_entry(rb, struct gpu_clear_extent, rb);
> +
> +		remove_extent(clear_tracker, clear_extent);
> +		extent_free(clear_extent);
> +	}
> +
> +	clear_tracker->total_clear = 0;
> +}
> +
> +/*
> + * Mark the range [start, start + size) as cleared. Merge with the neighbour on
> + * each side if they are contiguous, so the tree never holds two adjacent ranges.
> + */
> +static void gpu_clear_tracker_mark_clear(struct gpu_clear_tracker *clear_tracker,
> +					 u64 start, u64 size)
> +{
> +	struct gpu_clear_extent *left, *right, *clear_extent;
> +	u64 end = start + size;
> +
> +	if (!size)
> +		return;
> +
> +	/* Find contiguous neighbours, if any. */
> +	left = prev_extent(clear_tracker, start);
> +	if (left && left->end != start)
> +		left = NULL;
> +
> +	right = next_extent(clear_tracker, end);
> +	if (right && right->start != end)
> +		right = NULL;
> +
> +	if (left && right) {
> +		/* Merge left + new + right into a single extent. */
> +		remove_extent(clear_tracker, left);
> +		remove_extent(clear_tracker, right);
> +		left->end = right->end;
> +		extent_free(right);
> +		insert_extent(clear_tracker, left);
> +	} else if (left) {
> +		/* Extend left neighbour rightwards. */
> +		remove_extent(clear_tracker, left);
> +		left->end = end;
> +		insert_extent(clear_tracker, left);
> +	} else if (right) {
> +		/* Extend right neighbour leftwards. */
> +		remove_extent(clear_tracker, right);
> +		right->start = start;
> +		insert_extent(clear_tracker, right);
> +	} else {
> +		/* Standalone extent. */
> +		clear_extent = extent_alloc();
> +		if (!clear_extent)
> +			return;
> +
> +		clear_extent->start = start;
> +		clear_extent->end   = end;
> +		insert_extent(clear_tracker, clear_extent);
> +	}
> +
> +	clear_tracker->total_clear += size;
> +}
> +
> +/*
> + * Mark the range [start, start + size) as dirty. Remove the range from every
> + * overlapping clear extent, splitting one extent in two if the dirty range
> + * falls strictly inside it.
> + */
> +static void gpu_clear_tracker_mark_dirty(struct gpu_clear_tracker *clear_tracker,
> +					 u64 start, u64 size)
> +{
> +	struct gpu_clear_extent *clear_extent, *next;
> +	u64 end = start + size;
> +
> +	if (!size)
> +		return;
> +
> +	clear_extent = prev_extent(clear_tracker, start + 1);
> +	if (!clear_extent)
> +		clear_extent = next_extent(clear_tracker, start);
> +
> +	while (clear_extent && clear_extent->start < end) {
> +		struct rb_node *next_node = rb_next(&clear_extent->rb);
> +		u64 extent_start = clear_extent->start;
> +		u64 extent_end = clear_extent->end;
> +
> +		if (next_node)
> +			next = rb_entry(next_node, struct gpu_clear_extent, rb);
> +		else
> +			next = NULL;
> +
> +		/* Skip a non-overlapping neighbour returned by prev_extent(). */
> +		if (extent_end <= start) {
> +			clear_extent = next;
> +			continue;
> +		}
> +
> +		if (extent_start < start && extent_end > end) {
> +			/* Dirty range falls strictly inside: split into left + right. */
> +			struct gpu_clear_extent *right = extent_alloc();
> +
> +			if (!right) {
> +				remove_extent(clear_tracker, clear_extent);
> +				extent_free(clear_extent);
> +
> +				clear_tracker->total_clear -=
> +					(extent_end - extent_start);
> +
> +				clear_extent = next;
> +				continue;
> +			}
> +
> +			remove_extent(clear_tracker, clear_extent);
> +
> +			clear_extent->end = start;
> +			right->start = end;
> +			right->end   = extent_end;
> +
> +			insert_extent(clear_tracker, clear_extent);
> +			insert_extent(clear_tracker, right);
> +
> +			clear_tracker->total_clear -= size;
> +		} else if (extent_start >= start && extent_end <= end) {
> +			/* Extent fully covered: drop it. */
> +			remove_extent(clear_tracker, clear_extent);
> +			extent_free(clear_extent);
> +
> +			clear_tracker->total_clear -= (extent_end - extent_start);
> +		} else if (extent_start < start) {
> +			/* Extent overlaps from the left: trim its right end. */
> +			remove_extent(clear_tracker, clear_extent);
> +			clear_extent->end = start;
> +			insert_extent(clear_tracker, clear_extent);
> +
> +			clear_tracker->total_clear -= (extent_end - start);
> +		} else {
> +			/* Extent overlaps from the right: trim its left end. */
> +			remove_extent(clear_tracker, clear_extent);
> +			clear_extent->start = end;
> +			insert_extent(clear_tracker, clear_extent);
> +
> +			clear_tracker->total_clear -= (end - extent_start);
> +		}
> +
> +		clear_extent = next;
> +	}
> +}
> +
> +/*
> + * Returns true if the range [start, start + size) lies entirely within
> + * a single clear extent in the tracker, i.e. the whole range is known
> + * to be cleared.
> + */
> +static bool gpu_clear_tracker_is_clear(struct gpu_clear_tracker *clear_tracker,
> +				       u64 start, u64 size)
> +{
> +	struct gpu_clear_extent *clear_extent;
> +	u64 end = start + size;
> +
> +	clear_extent = prev_extent(clear_tracker, start + 1);
> +	if (!clear_extent)
> +		return false;
> +
> +	return clear_extent->start <= start && clear_extent->end >= end;
> +}
> +
> +static struct rb_node *
> +clear_tracker_descend_right(struct rb_node *node, u64 min_size)
> +{
> +	while (node->rb_right) {
> +		struct gpu_clear_extent *tmp_extent;
> +
> +		tmp_extent = rb_entry(node->rb_right, struct gpu_clear_extent, rb);
> +
> +		if (tmp_extent->subtree_max_size < min_size)
> +			break;
> +		node = node->rb_right;
> +	}
> +
> +	return node;
> +}
> +
> +static struct gpu_clear_extent *
> +gpu_clear_tracker_find(struct gpu_clear_tracker *clear_tracker, u64 min_size)
> +{
> +	struct rb_node *rb = clear_tracker->root.rb_node;
> +	struct gpu_clear_extent *root_extent;
> +	struct rb_node *parent;
> +
> +	if (WARN_ON(!min_size || !is_power_of_2(min_size)))
> +		return NULL;
> +
> +	if (!rb)
> +		return NULL;
> +
> +	root_extent = rb_entry(rb, struct gpu_clear_extent, rb);
> +	if (root_extent->subtree_max_size < min_size)
> +		return NULL;
> +
> +	rb = clear_tracker_descend_right(rb, min_size);
> +
> +	while (rb) {
> +		struct gpu_clear_extent *clear_extent;
> +		u64 aligned_start;
> +
> +		clear_extent = rb_entry(rb, struct gpu_clear_extent, rb);
> +		aligned_start = ALIGN(clear_extent->start, min_size);
> +
> +		/* Check if a naturally aligned min_size block fits. */
> +		if (aligned_start <= clear_extent->end &&
> +		    clear_extent->end - aligned_start >= min_size)
> +			return clear_extent;
> +
> +		if (rb->rb_left) {
> +			struct gpu_clear_extent *tmp_extent;
> +
> +			tmp_extent = rb_entry(rb->rb_left, struct gpu_clear_extent, rb);
> +			if (tmp_extent->subtree_max_size >= min_size) {
> +				rb = clear_tracker_descend_right(rb->rb_left, min_size);
> +				continue;
> +			}
> +		}
> +
> +		/* Walk up until we exit a node via its right child. */
> +		parent = rb_parent(rb);
> +		while (parent && parent->rb_right != rb) {
> +			rb = parent;
> +			parent = rb_parent(rb);
> +		}
> +		rb = parent;
> +	}
> +
> +	return NULL;
> +}
> +
>   static unsigned int
>   gpu_buddy_block_state(struct gpu_buddy_block *block)
>   {
> @@ -67,10 +426,93 @@ static unsigned int gpu_buddy_block_offset_alignment(struct gpu_buddy_block *blo
>   	return __ffs64(offset);
>   }
>   
> -RB_DECLARE_CALLBACKS_MAX(static, gpu_buddy_augment_cb,
> -			 struct gpu_buddy_block, rb,
> -			 unsigned int, subtree_max_alignment,
> -			 gpu_buddy_block_offset_alignment);
> +static inline bool
> +gpu_buddy_block_is_dirty(struct gpu_buddy_block *block)
> +{
> +	return !gpu_buddy_block_is_clear(block);
> +}
> +
> +static inline void gpu_buddy_augment_compute(struct gpu_buddy_block *block)
> +{
> +	struct gpu_buddy_block *right;
> +	struct gpu_buddy_block *left;
> +	unsigned int max_align;
> +	bool has_dirty;
> +
> +	max_align = gpu_buddy_block_offset_alignment(block);
> +	has_dirty = gpu_buddy_block_is_dirty(block);
> +
> +	left = rb_entry_safe(block->rb.rb_left, struct gpu_buddy_block, rb);
> +	if (left) {
> +		if (left->subtree_max_alignment > max_align)
> +			max_align = left->subtree_max_alignment;
> +
> +		has_dirty |= left->subtree_has_dirty;
> +	}
> +
> +	right = rb_entry_safe(block->rb.rb_right, struct gpu_buddy_block, rb);
> +	if (right) {
> +		if (right->subtree_max_alignment > max_align)
> +			max_align = right->subtree_max_alignment;
> +
> +		has_dirty |= right->subtree_has_dirty;
> +	}
> +
> +	block->subtree_max_alignment = max_align;
> +	block->subtree_has_dirty = has_dirty;
> +}
> +
> +static void gpu_buddy_augment_propagate(struct rb_node *rb, struct rb_node *stop)
> +{
> +	while (rb != stop) {
> +		struct gpu_buddy_block *block;
> +		unsigned int old_align;
> +		bool old_has_dirty;
> +
> +		block = rb_entry(rb, struct gpu_buddy_block, rb);
> +		old_align = block->subtree_max_alignment;
> +		old_has_dirty = block->subtree_has_dirty;
> +
> +		gpu_buddy_augment_compute(block);
> +		if (block->subtree_max_alignment == old_align &&
> +		    block->subtree_has_dirty == old_has_dirty)
> +			break;
> +
> +		rb = rb_parent(&block->rb);
> +	}
> +}
> +
> +static void gpu_buddy_augment_copy(struct rb_node *rb_old, struct rb_node *rb_new)
> +{
> +	struct gpu_buddy_block *old;
> +	struct gpu_buddy_block *new;
> +
> +	old = rb_entry(rb_old, struct gpu_buddy_block, rb);
> +	new = rb_entry(rb_new, struct gpu_buddy_block, rb);
> +
> +	new->subtree_max_alignment = old->subtree_max_alignment;
> +	new->subtree_has_dirty = old->subtree_has_dirty;
> +}
> +
> +static void gpu_buddy_augment_rotate(struct rb_node *rb_old, struct rb_node *rb_new)
> +{
> +	struct gpu_buddy_block *old;
> +	struct gpu_buddy_block *new;
> +
> +	old = rb_entry(rb_old, struct gpu_buddy_block, rb);
> +	new = rb_entry(rb_new, struct gpu_buddy_block, rb);
> +
> +	new->subtree_max_alignment = old->subtree_max_alignment;
> +	new->subtree_has_dirty = old->subtree_has_dirty;
> +
> +	gpu_buddy_augment_compute(old);
> +}
> +
> +static const struct rb_augment_callbacks gpu_buddy_augment_cb = {
> +	.propagate = gpu_buddy_augment_propagate,
> +	.copy      = gpu_buddy_augment_copy,
> +	.rotate    = gpu_buddy_augment_rotate,
> +};
>   
>   static struct gpu_buddy_block *gpu_block_alloc(struct gpu_buddy *mm,
>   					       struct gpu_buddy_block *parent,
> @@ -101,13 +543,6 @@ static void gpu_block_free(struct gpu_buddy *mm,
>   	kmem_cache_free(slab_blocks, block);
>   }
>   
> -static enum gpu_buddy_free_tree
> -get_block_tree(struct gpu_buddy_block *block)
> -{
> -	return gpu_buddy_block_is_clear(block) ?
> -	       GPU_BUDDY_CLEAR_TREE : GPU_BUDDY_DIRTY_TREE;
> -}
> -
>   static struct gpu_buddy_block *
>   rbtree_get_free_block(const struct rb_node *node)
>   {
> @@ -120,24 +555,61 @@ rbtree_last_free_block(struct rb_root *root)
>   	return rbtree_get_free_block(rb_last(root));
>   }
>   
> -static bool rbtree_is_empty(struct rb_root *root)
> +/*
> + * Find the rightmost (highest-offset) free block in @root that is itself
> + * dirty, by descending the tree using the subtree_has_dirty augment to
> + * skip subtrees that contain only cleared blocks.  Returns NULL if no
> + * dirty block exists in the tree.
> + */
> +static struct gpu_buddy_block *
> +rbtree_last_dirty_free_block(struct rb_root *root)
>   {
> -	return RB_EMPTY_ROOT(root);
> +	struct gpu_buddy_block *block = NULL;
> +	struct rb_node *node = root->rb_node;
> +
> +	while (node) {
> +		struct gpu_buddy_block *right_block;
> +		struct gpu_buddy_block *node_block;
> +
> +		node_block = rbtree_get_free_block(node);
> +		right_block = rbtree_get_free_block(node->rb_right);
> +
> +		/*
> +		 * Prefer the rightmost subtree that contains a dirty block;
> +		 * fall back to the current node if it is itself dirty;
> +		 * otherwise descend left.
> +		 */
> +		if (right_block && right_block->subtree_has_dirty) {
> +			node = node->rb_right;
> +			continue;
> +		}
> +
> +		if (gpu_buddy_block_is_dirty(node_block)) {
> +			block = node_block;
> +			break;
> +		}
> +
> +		node = node->rb_left;
> +	}
> +
> +	return block;
>   }
>   
>   static void rbtree_insert(struct gpu_buddy *mm,
> -			  struct gpu_buddy_block *block,
> -			  enum gpu_buddy_free_tree tree)
> +			  struct gpu_buddy_block *block)
>   {
>   	struct rb_node **link, *parent = NULL;
> -	unsigned int block_alignment, order;
>   	struct gpu_buddy_block *node;
> +	unsigned int block_alignment;
>   	struct rb_root *root;
> +	unsigned int order;
> +	bool block_dirty;
>   
>   	order = gpu_buddy_block_order(block);
>   	block_alignment = gpu_buddy_block_offset_alignment(block);
> +	block_dirty = gpu_buddy_block_is_dirty(block);
>   
> -	root = &mm->free_trees[tree][order];
> +	root = &mm->free_tree[order];
>   	link = &root->rb_node;
>   
>   	while (*link) {
> @@ -147,10 +619,12 @@ static void rbtree_insert(struct gpu_buddy *mm,
>   		 * Manual augmentation update during insertion traversal. Required
>   		 * because rb_insert_augmented() only calls rotate callback during
>   		 * rotations. This ensures all ancestors on the insertion path have
> -		 * correct subtree_max_alignment values.
> +		 * correct subtree_max_alignment / subtree_has_dirty values.
>   		 */
>   		if (node->subtree_max_alignment < block_alignment)
>   			node->subtree_max_alignment = block_alignment;
> +		if (block_dirty)
> +			node->subtree_has_dirty = true;
>   
>   		if (gpu_buddy_block_offset(block) < gpu_buddy_block_offset(node))
>   			link = &parent->rb_left;
> @@ -159,6 +633,7 @@ static void rbtree_insert(struct gpu_buddy *mm,
>   	}
>   
>   	block->subtree_max_alignment = block_alignment;
> +	block->subtree_has_dirty = block_dirty;
>   	rb_link_node(&block->rb, parent, link);
>   	rb_insert_augmented(&block->rb, root, &gpu_buddy_augment_cb);
>   }
> @@ -167,26 +642,11 @@ static void rbtree_remove(struct gpu_buddy *mm,
>   			  struct gpu_buddy_block *block)
>   {
>   	unsigned int order = gpu_buddy_block_order(block);
> -	enum gpu_buddy_free_tree tree;
> -	struct rb_root *root;
> -
> -	tree = get_block_tree(block);
> -	root = &mm->free_trees[tree][order];
>   
> -	rb_erase_augmented(&block->rb, root, &gpu_buddy_augment_cb);
> +	rb_erase_augmented(&block->rb, &mm->free_tree[order], &gpu_buddy_augment_cb);
>   	RB_CLEAR_NODE(&block->rb);
>   }
>   
> -static void clear_reset(struct gpu_buddy_block *block)
> -{
> -	block->header &= ~GPU_BUDDY_HEADER_CLEAR;
> -}
> -
> -static void mark_cleared(struct gpu_buddy_block *block)
> -{
> -	block->header |= GPU_BUDDY_HEADER_CLEAR;
> -}
> -
>   static void mark_allocated(struct gpu_buddy *mm,
>   			   struct gpu_buddy_block *block)
>   {
> @@ -199,13 +659,17 @@ static void mark_allocated(struct gpu_buddy *mm,
>   static void mark_free(struct gpu_buddy *mm,
>   		      struct gpu_buddy_block *block)
>   {
> -	enum gpu_buddy_free_tree tree;
> -
>   	block->header &= ~GPU_BUDDY_HEADER_STATE;
>   	block->header |= GPU_BUDDY_FREE;
>   
> -	tree = get_block_tree(block);
> -	rbtree_insert(mm, block, tree);
> +	if (gpu_clear_tracker_is_clear(&mm->clear,
> +				       gpu_buddy_block_offset(block),
> +				       gpu_buddy_block_size(mm, block)))
> +		block->header |= GPU_BUDDY_HEADER_CLEAR;
> +	else
> +		block->header &= ~GPU_BUDDY_HEADER_CLEAR;
> +
> +	rbtree_insert(mm, block);
>   }
>   
>   static void mark_split(struct gpu_buddy *mm,
> @@ -243,36 +707,18 @@ __get_buddy(struct gpu_buddy_block *block)
>   }
>   
>   static unsigned int __gpu_buddy_free(struct gpu_buddy *mm,
> -				     struct gpu_buddy_block *block,
> -				     bool force_merge)
> +				     struct gpu_buddy_block *block)
>   {
>   	struct gpu_buddy_block *parent;
>   	unsigned int order;
>   
>   	while ((parent = block->parent)) {
> -		struct gpu_buddy_block *buddy;
> -
> -		buddy = __get_buddy(block);
> +		struct gpu_buddy_block *buddy = __get_buddy(block);
>   
>   		if (!gpu_buddy_block_is_free(buddy))
>   			break;
>   
> -		if (!force_merge) {
> -			/*
> -			 * Check the block and its buddy clear state and exit
> -			 * the loop if they both have the dissimilar state.
> -			 */
> -			if (gpu_buddy_block_is_clear(block) !=
> -			    gpu_buddy_block_is_clear(buddy))
> -				break;
> -
> -			if (gpu_buddy_block_is_clear(block))
> -				mark_cleared(parent);
> -		}
> -
>   		rbtree_remove(mm, buddy);
> -		if (force_merge && gpu_buddy_block_is_clear(buddy))
> -			mm->clear_avail -= gpu_buddy_block_size(mm, buddy);
>   
>   		gpu_block_free(mm, block);
>   		gpu_block_free(mm, buddy);
> @@ -286,66 +732,15 @@ static unsigned int __gpu_buddy_free(struct gpu_buddy *mm,
>   	return order;
>   }
>   
> -static int __force_merge(struct gpu_buddy *mm,
> -			 u64 start,
> -			 u64 end,
> -			 unsigned int min_order)
> +static void undo_partial_split(struct gpu_buddy *mm,
> +			       struct gpu_buddy_block *block)
>   {
> -	unsigned int tree, order;
> -	int i;
> +	struct gpu_buddy_block *buddy = __get_buddy(block);
>   
> -	if (!min_order)
> -		return -ENOMEM;
> -
> -	if (min_order > mm->max_order)
> -		return -EINVAL;
> -
> -	for_each_free_tree(tree) {
> -		for (i = min_order - 1; i >= 0; i--) {
> -			struct rb_node *iter = rb_last(&mm->free_trees[tree][i]);
> -
> -			while (iter) {
> -				struct gpu_buddy_block *block, *buddy;
> -				u64 block_start, block_end;
> -
> -				block = rbtree_get_free_block(iter);
> -				iter = rb_prev(iter);
> -
> -				if (!block || !block->parent)
> -					continue;
> -
> -				block_start = gpu_buddy_block_offset(block);
> -				block_end = block_start + gpu_buddy_block_size(mm, block) - 1;
> -
> -				if (!contains(start, end, block_start, block_end))
> -					continue;
> -
> -				buddy = __get_buddy(block);
> -				if (!gpu_buddy_block_is_free(buddy))
> -					continue;
> -
> -				gpu_buddy_assert(gpu_buddy_block_is_clear(block) !=
> -						 gpu_buddy_block_is_clear(buddy));
> -
> -				/*
> -				 * Advance to the next node when the current node is the buddy,
> -				 * as freeing the block will also remove its buddy from the tree.
> -				 */
> -				if (iter == &buddy->rb)
> -					iter = rb_prev(iter);
> -
> -				rbtree_remove(mm, block);
> -				if (gpu_buddy_block_is_clear(block))
> -					mm->clear_avail -= gpu_buddy_block_size(mm, block);
> -
> -				order = __gpu_buddy_free(mm, block, true);
> -				if (order >= min_order)
> -					return 0;
> -			}
> -		}
> -	}
> -
> -	return -ENOMEM;
> +	if (buddy &&
> +	    gpu_buddy_block_is_free(block) &&
> +	    gpu_buddy_block_is_free(buddy))
> +		__gpu_buddy_free(mm, block);
>   }
>   
>   /**
> @@ -362,7 +757,7 @@ static int __force_merge(struct gpu_buddy *mm,
>    */
>   int gpu_buddy_init(struct gpu_buddy *mm, u64 size, u64 chunk_size)
>   {
> -	unsigned int i, j, root_count = 0;
> +	unsigned int root_count = 0;
>   	u64 offset = 0;
>   
>   	if (size < chunk_size)
> @@ -384,22 +779,13 @@ int gpu_buddy_init(struct gpu_buddy *mm, u64 size, u64 chunk_size)
>   
>   	BUG_ON(mm->max_order > GPU_BUDDY_MAX_ORDER);
>   
> -	mm->free_trees = kmalloc_array(GPU_BUDDY_MAX_FREE_TREES,
> -				       sizeof(*mm->free_trees),
> -				       GFP_KERNEL);
> -	if (!mm->free_trees)
> +	mm->free_tree = kcalloc(mm->max_order + 1,
> +				sizeof(struct rb_root),
> +				GFP_KERNEL);
> +	if (!mm->free_tree)
>   		return -ENOMEM;
>   
> -	for_each_free_tree(i) {
> -		mm->free_trees[i] = kmalloc_array(mm->max_order + 1,
> -						  sizeof(struct rb_root),
> -						  GFP_KERNEL);
> -		if (!mm->free_trees[i])
> -			goto out_free_tree;
> -
> -		for (j = 0; j <= mm->max_order; ++j)
> -			mm->free_trees[i][j] = RB_ROOT;
> -	}
> +	gpu_clear_tracker_init(&mm->clear);
>   
>   	mm->n_roots = hweight64(size);
>   
> @@ -447,9 +833,8 @@ int gpu_buddy_init(struct gpu_buddy *mm, u64 size, u64 chunk_size)
>   		gpu_block_free(mm, mm->roots[root_count]);
>   	kfree(mm->roots);
>   out_free_tree:
> -	while (i--)
> -		kfree(mm->free_trees[i]);
> -	kfree(mm->free_trees);
> +	gpu_clear_tracker_fini(&mm->clear);
> +	kfree(mm->free_tree);
>   	return -ENOMEM;
>   }
>   EXPORT_SYMBOL(gpu_buddy_init);
> @@ -463,7 +848,7 @@ EXPORT_SYMBOL(gpu_buddy_init);
>    */
>   void gpu_buddy_fini(struct gpu_buddy *mm)
>   {
> -	u64 root_size, size, start;
> +	u64 root_size, size;
>   	unsigned int order;
>   	int i;
>   
> @@ -471,22 +856,17 @@ void gpu_buddy_fini(struct gpu_buddy *mm)
>   
>   	for (i = 0; i < mm->n_roots; ++i) {
>   		order = ilog2(size) - ilog2(mm->chunk_size);
> -		start = gpu_buddy_block_offset(mm->roots[i]);
> -		__force_merge(mm, start, start + size, order);
> +		root_size = mm->chunk_size << order;
>   
>   		gpu_buddy_assert(gpu_buddy_block_is_free(mm->roots[i]));
> -
>   		gpu_block_free(mm, mm->roots[i]);
> -
> -		root_size = mm->chunk_size << order;
>   		size -= root_size;
>   	}
>   
>   	gpu_buddy_assert(mm->avail == mm->size);
>   
> -	for_each_free_tree(i)
> -		kfree(mm->free_trees[i]);
> -	kfree(mm->free_trees);
> +	gpu_clear_tracker_fini(&mm->clear);
> +	kfree(mm->free_tree);
>   	kfree(mm->roots);
>   }
>   EXPORT_SYMBOL(gpu_buddy_fini);
> @@ -512,13 +892,6 @@ static int split_block(struct gpu_buddy *mm,
>   	}
>   
>   	mark_split(mm, block);
> -
> -	if (gpu_buddy_block_is_clear(block)) {
> -		mark_cleared(block->left);
> -		mark_cleared(block->right);
> -		clear_reset(block);
> -	}
> -
>   	mark_free(mm, block->left);
>   	mark_free(mm, block->right);
>   
> @@ -536,42 +909,33 @@ static int split_block(struct gpu_buddy *mm,
>    */
>   void gpu_buddy_reset_clear(struct gpu_buddy *mm, bool is_clear)
>   {
> -	enum gpu_buddy_free_tree src_tree, dst_tree;
> -	u64 root_size, size, start;
> -	unsigned int order;
> -	int i;
> +	unsigned int i;
>   
>   	gpu_buddy_driver_lock_held(mm);
> -	size = mm->size;
> -	for (i = 0; i < mm->n_roots; ++i) {
> -		order = ilog2(size) - ilog2(mm->chunk_size);
> -		start = gpu_buddy_block_offset(mm->roots[i]);
> -		__force_merge(mm, start, start + size, order);
> -
> -		root_size = mm->chunk_size << order;
> -		size -= root_size;
> -	}
>   
> -	src_tree = is_clear ? GPU_BUDDY_DIRTY_TREE : GPU_BUDDY_CLEAR_TREE;
> -	dst_tree = is_clear ? GPU_BUDDY_CLEAR_TREE : GPU_BUDDY_DIRTY_TREE;
> +	gpu_clear_tracker_fini(&mm->clear);
> +	gpu_clear_tracker_init(&mm->clear);
>   
>   	for (i = 0; i <= mm->max_order; ++i) {
> -		struct rb_root *root = &mm->free_trees[src_tree][i];
>   		struct gpu_buddy_block *block, *tmp;
>   
> -		rbtree_postorder_for_each_entry_safe(block, tmp, root, rb) {
> -			rbtree_remove(mm, block);
> +		rbtree_postorder_for_each_entry_safe(block, tmp,
> +						     &mm->free_tree[i], rb) {
>   			if (is_clear) {
> -				mark_cleared(block);
> -				mm->clear_avail += gpu_buddy_block_size(mm, block);
> -			} else {
> -				clear_reset(block);
> -				mm->clear_avail -= gpu_buddy_block_size(mm, block);
> +				if (!gpu_buddy_block_is_clear(block))
> +					block->header |= GPU_BUDDY_HEADER_CLEAR;
> +				gpu_clear_tracker_mark_clear(&mm->clear,
> +							     gpu_buddy_block_offset(block),
> +							     gpu_buddy_block_size(mm, block));
> +			} else if (gpu_buddy_block_is_clear(block)) {
> +				block->header &= ~GPU_BUDDY_HEADER_CLEAR;
>   			}
>   
> -			rbtree_insert(mm, block, dst_tree);
> +			gpu_buddy_augment_compute(block);
>   		}
>   	}
> +
> +	mm->clear_avail = mm->clear.total_clear;
>   }
>   EXPORT_SYMBOL(gpu_buddy_reset_clear);
>   
> @@ -584,13 +948,23 @@ EXPORT_SYMBOL(gpu_buddy_reset_clear);
>   void gpu_buddy_free_block(struct gpu_buddy *mm,
>   			  struct gpu_buddy_block *block)
>   {
> +	bool was_clear = gpu_buddy_block_is_clear(block);
> +	u64 size   = gpu_buddy_block_size(mm, block);
> +	u64 offset = gpu_buddy_block_offset(block);
> +
>   	gpu_buddy_driver_lock_held(mm);
> +
>   	BUG_ON(!gpu_buddy_block_is_allocated(block));
> -	mm->avail += gpu_buddy_block_size(mm, block);
> -	if (gpu_buddy_block_is_clear(block))
> -		mm->clear_avail += gpu_buddy_block_size(mm, block);
>   
> -	__gpu_buddy_free(mm, block, false);
> +	block->header &= ~GPU_BUDDY_HEADER_CLEAR;
> +	mm->avail += size;
> +
> +	if (was_clear) {
> +		gpu_clear_tracker_mark_clear(&mm->clear, offset, size);
> +		mm->clear_avail = mm->clear.total_clear;
> +	}
> +
> +	__gpu_buddy_free(mm, block);
>   }
>   EXPORT_SYMBOL(gpu_buddy_free_block);
>   
> @@ -604,10 +978,15 @@ static void __gpu_buddy_free_list(struct gpu_buddy *mm,
>   	gpu_buddy_assert(!(mark_dirty && mark_clear));
>   
>   	list_for_each_entry_safe(block, on, objects, link) {
> +		/*
> +		 * Propagate the caller's clear/dirty intent onto the block header
> +		 * before handing it to gpu_buddy_free_block(), which will then
> +		 * update the clear tracker accordingly.
> +		 */
>   		if (mark_clear)
> -			mark_cleared(block);
> +			block->header |= GPU_BUDDY_HEADER_CLEAR;
>   		else if (mark_dirty)
> -			clear_reset(block);
> +			block->header &= ~GPU_BUDDY_HEADER_CLEAR;
>   		gpu_buddy_free_block(mm, block);
>   		cond_resched();
>   	}
> @@ -643,23 +1022,14 @@ void gpu_buddy_free_list(struct gpu_buddy *mm,
>   }
>   EXPORT_SYMBOL(gpu_buddy_free_list);
>   
> -static bool block_incompatible(struct gpu_buddy_block *block, unsigned int flags)
> -{
> -	bool needs_clear = flags & GPU_BUDDY_CLEAR_ALLOCATION;
> -
> -	return needs_clear != gpu_buddy_block_is_clear(block);
> -}
> -
>   static struct gpu_buddy_block *
>   __alloc_range_bias(struct gpu_buddy *mm,
>   		   u64 start, u64 end,
>   		   unsigned int order,
> -		   unsigned long flags,
> -		   bool fallback)
> +		   unsigned long flags)
>   {
>   	u64 req_size = mm->chunk_size << order;
>   	struct gpu_buddy_block *block;
> -	struct gpu_buddy_block *buddy;
>   	LIST_HEAD(dfs);
>   	int err;
>   	int i;
> @@ -702,9 +1072,6 @@ __alloc_range_bias(struct gpu_buddy *mm,
>   				continue;
>   		}
>   
> -		if (!fallback && block_incompatible(block, flags))
> -			continue;
> -
>   		if (contains(start, end, block_start, block_end) &&
>   		    order == gpu_buddy_block_order(block)) {
>   			/*
> @@ -722,68 +1089,55 @@ __alloc_range_bias(struct gpu_buddy *mm,
>   				goto err_undo;
>   		}
>   
> -		list_add(&block->right->tmp_link, &dfs);
>   		list_add(&block->left->tmp_link, &dfs);
> +		list_add(&block->right->tmp_link, &dfs);
>   	} while (1);
>   
>   	return ERR_PTR(-ENOSPC);
>   
>   err_undo:
> -	/*
> -	 * We really don't want to leave around a bunch of split blocks, since
> -	 * bigger is better, so make sure we merge everything back before we
> -	 * free the allocated blocks.
> -	 */
> -	buddy = __get_buddy(block);
> -	if (buddy &&
> -	    (gpu_buddy_block_is_free(block) &&
> -	     gpu_buddy_block_is_free(buddy)))
> -		__gpu_buddy_free(mm, block, false);
> +	undo_partial_split(mm, block);
>   	return ERR_PTR(err);
>   }
>   
> -static struct gpu_buddy_block *
> -__gpu_buddy_alloc_range_bias(struct gpu_buddy *mm,
> -			     u64 start, u64 end,
> -			     unsigned int order,
> -			     unsigned long flags)
> -{
> -	struct gpu_buddy_block *block;
> -	bool fallback = false;
> -
> -	block = __alloc_range_bias(mm, start, end, order,
> -				   flags, fallback);
> -	if (IS_ERR(block))
> -		return __alloc_range_bias(mm, start, end, order,
> -					  flags, !fallback);
> -
> -	return block;
> -}
> -
>   static struct gpu_buddy_block *
>   get_maxblock(struct gpu_buddy *mm,
>   	     unsigned int order,
> -	     enum gpu_buddy_free_tree tree)
> +	     unsigned long flags)
>   {
> -	struct gpu_buddy_block *max_block = NULL, *block = NULL;
> -	struct rb_root *root;
> +	struct gpu_buddy_block *max_block;
> +	struct gpu_buddy_block *block;
> +	bool prefer_clear;
>   	unsigned int i;
>   
> +	max_block = NULL;
> +	prefer_clear = flags & GPU_BUDDY_CLEAR_ALLOCATION;
> +
>   	for (i = order; i <= mm->max_order; ++i) {
> -		root = &mm->free_trees[tree][i];
> -		block = rbtree_last_free_block(root);
> +		if (prefer_clear)
> +			block = rbtree_last_free_block(&mm->free_tree[i]);
> +		else
> +			block = rbtree_last_dirty_free_block(&mm->free_tree[i]);
> +
>   		if (!block)
>   			continue;
>   
> -		if (!max_block) {
> +		if (!max_block ||
> +		    gpu_buddy_block_offset(block) > gpu_buddy_block_offset(max_block))
>   			max_block = block;
> +	}
> +
> +	if (max_block || prefer_clear)
> +		return max_block;
> +
> +	for (i = order; i <= mm->max_order; ++i) {
> +		block = rbtree_last_free_block(&mm->free_tree[i]);
> +		if (!block)
>   			continue;
> -		}
>   
> -		if (gpu_buddy_block_offset(block) >
> -		    gpu_buddy_block_offset(max_block)) {
> +		if (!max_block ||
> +		    gpu_buddy_block_offset(block) > gpu_buddy_block_offset(max_block))
>   			max_block = block;
> -		}
>   	}
>   
>   	return max_block;
> @@ -795,45 +1149,37 @@ alloc_from_freetree(struct gpu_buddy *mm,
>   		    unsigned long flags)
>   {
>   	struct gpu_buddy_block *block = NULL;
> -	struct rb_root *root;
> -	enum gpu_buddy_free_tree tree;
>   	unsigned int tmp;
>   	int err;
>   
> -	tree = (flags & GPU_BUDDY_CLEAR_ALLOCATION) ?
> -		GPU_BUDDY_CLEAR_TREE : GPU_BUDDY_DIRTY_TREE;
> -
>   	if (flags & GPU_BUDDY_TOPDOWN_ALLOCATION) {
> -		block = get_maxblock(mm, order, tree);
> +		block = get_maxblock(mm, order, flags);
>   		if (block)
> -			/* Store the obtained block order */
>   			tmp = gpu_buddy_block_order(block);
> -	} else {
> +	} else if (!(flags & GPU_BUDDY_CLEAR_ALLOCATION)) {
>   		for (tmp = order; tmp <= mm->max_order; ++tmp) {
> -			/* Get RB tree root for this order and tree */
> -			root = &mm->free_trees[tree][tmp];
> -			block = rbtree_last_free_block(root);
> +			block = rbtree_last_dirty_free_block(&mm->free_tree[tmp]);
>   			if (block)
>   				break;
>   		}
> -	}
> -
> -	if (!block) {
> -		/* Try allocating from the other tree */
> -		tree = (tree == GPU_BUDDY_CLEAR_TREE) ?
> -			GPU_BUDDY_DIRTY_TREE : GPU_BUDDY_CLEAR_TREE;
> -
> +		if (!block) {
> +			for (tmp = order; tmp <= mm->max_order; ++tmp) {
> +				block = rbtree_last_free_block(&mm->free_tree[tmp]);
> +				if (block)
> +					break;
> +			}
> +		}
> +	} else {
>   		for (tmp = order; tmp <= mm->max_order; ++tmp) {
> -			root = &mm->free_trees[tree][tmp];
> -			block = rbtree_last_free_block(root);
> +			block = rbtree_last_free_block(&mm->free_tree[tmp]);
>   			if (block)
>   				break;
>   		}
> -
> -		if (!block)
> -			return ERR_PTR(-ENOSPC);
>   	}
>   
> +	if (!block)
> +		return ERR_PTR(-ENOSPC);
> +
>   	BUG_ON(!gpu_buddy_block_is_free(block));
>   
>   	while (tmp != order) {
> @@ -841,14 +1187,18 @@ alloc_from_freetree(struct gpu_buddy *mm,
>   		if (unlikely(err))
>   			goto err_undo;
>   
> -		block = block->right;
> +		if (!(flags & GPU_BUDDY_CLEAR_ALLOCATION) &&
> +		    gpu_buddy_block_is_clear(block->right))
> +			block = block->left;
> +		else
> +			block = block->right;
>   		tmp--;
>   	}
>   	return block;
>   
>   err_undo:
>   	if (tmp != order)
> -		__gpu_buddy_free(mm, block, false);
> +		__gpu_buddy_free(mm, block);
>   	return ERR_PTR(err);
>   }
>   
> @@ -869,12 +1219,11 @@ static bool gpu_buddy_subtree_can_satisfy(struct rb_node *node,
>   
>   static struct gpu_buddy_block *
>   gpu_buddy_find_block_aligned(struct gpu_buddy *mm,
> -			     enum gpu_buddy_free_tree tree,
>   			     unsigned int order,
>   			     unsigned int alignment,
>   			     unsigned long flags)
>   {
> -	struct rb_root *root = &mm->free_trees[tree][order];
> +	struct rb_root *root = &mm->free_tree[order];
>   	struct rb_node *rb = root->rb_node;
>   
>   	while (rb) {
> @@ -912,8 +1261,6 @@ gpu_buddy_offset_aligned_allocation(struct gpu_buddy *mm,
>   {
>   	struct gpu_buddy_block *block = NULL;
>   	unsigned int order, tmp, alignment;
> -	struct gpu_buddy_block *buddy;
> -	enum gpu_buddy_free_tree tree;
>   	unsigned long pages;
>   	int err;
>   
> @@ -921,19 +1268,8 @@ gpu_buddy_offset_aligned_allocation(struct gpu_buddy *mm,
>   	pages = size >> ilog2(mm->chunk_size);
>   	order = fls(pages) - 1;
>   
> -	tree = (flags & GPU_BUDDY_CLEAR_ALLOCATION) ?
> -		GPU_BUDDY_CLEAR_TREE : GPU_BUDDY_DIRTY_TREE;
> -
>   	for (tmp = order; tmp <= mm->max_order; ++tmp) {
> -		block = gpu_buddy_find_block_aligned(mm, tree, tmp,
> -						     alignment, flags);
> -		if (!block) {
> -			tree = (tree == GPU_BUDDY_CLEAR_TREE) ?
> -				GPU_BUDDY_DIRTY_TREE : GPU_BUDDY_CLEAR_TREE;
> -			block = gpu_buddy_find_block_aligned(mm, tree, tmp,
> -							     alignment, flags);
> -		}
> -
> +		block = gpu_buddy_find_block_aligned(mm, tmp, alignment, flags);
>   		if (block)
>   			break;
>   	}
> @@ -960,27 +1296,18 @@ gpu_buddy_offset_aligned_allocation(struct gpu_buddy *mm,
>   	return block;
>   
>   err_undo:
> -	/*
> -	 * We really don't want to leave around a bunch of split blocks, since
> -	 * bigger is better, so make sure we merge everything back before we
> -	 * free the allocated blocks.
> -	 */
> -	buddy = __get_buddy(block);
> -	if (buddy &&
> -	    (gpu_buddy_block_is_free(block) &&
> -	     gpu_buddy_block_is_free(buddy)))
> -		__gpu_buddy_free(mm, block, false);
> +	undo_partial_split(mm, block);
>   	return ERR_PTR(err);
>   }
>   
>   static int __alloc_range(struct gpu_buddy *mm,
>   			 struct list_head *dfs,
>   			 u64 start, u64 size,
> +			 unsigned long flags,
>   			 struct list_head *blocks,
>   			 u64 *total_allocated_on_err)
>   {
>   	struct gpu_buddy_block *block;
> -	struct gpu_buddy_block *buddy;
>   	u64 total_allocated = 0;
>   	LIST_HEAD(allocated);
>   	u64 end;
> @@ -1013,16 +1340,25 @@ static int __alloc_range(struct gpu_buddy *mm,
>   
>   		if (contains(start, end, block_start, block_end)) {
>   			if (gpu_buddy_block_is_free(block)) {
> +				u64 bsize = gpu_buddy_block_size(mm, block);
> +				u64 boff  = gpu_buddy_block_offset(block);
> +
>   				mark_allocated(mm, block);
> -				total_allocated += gpu_buddy_block_size(mm, block);
> -				mm->avail -= gpu_buddy_block_size(mm, block);
> -				if (gpu_buddy_block_is_clear(block))
> -					mm->clear_avail -= gpu_buddy_block_size(mm, block);
> +				total_allocated += bsize;
> +				mm->avail -= bsize;
> +
> +				block->header &= ~GPU_BUDDY_HEADER_CLEAR;
> +				if (gpu_clear_tracker_is_clear(&mm->clear,
> +							       boff, bsize)) {
> +					if (flags & GPU_BUDDY_CLEAR_ALLOCATION)
> +						block->header |= GPU_BUDDY_HEADER_CLEAR;
> +				}
> +				gpu_clear_tracker_mark_dirty(&mm->clear,
> +							     boff, bsize);
> +				mm->clear_avail = mm->clear.total_clear;
> +
>   				list_add_tail(&block->link, &allocated);
>   				continue;
> -			} else if (!mm->clear_avail) {
> -				err = -ENOSPC;
> -				goto err_free;
>   			}
>   		}
>   
> @@ -1046,16 +1382,7 @@ static int __alloc_range(struct gpu_buddy *mm,
>   	return 0;
>   
>   err_undo:
> -	/*
> -	 * We really don't want to leave around a bunch of split blocks, since
> -	 * bigger is better, so make sure we merge everything back before we
> -	 * free the allocated blocks.
> -	 */
> -	buddy = __get_buddy(block);
> -	if (buddy &&
> -	    (gpu_buddy_block_is_free(block) &&
> -	     gpu_buddy_block_is_free(buddy)))
> -		__gpu_buddy_free(mm, block, false);
> +	undo_partial_split(mm, block);
>   
>   err_free:
>   	if (err == -ENOSPC && total_allocated_on_err) {
> @@ -1071,6 +1398,7 @@ static int __alloc_range(struct gpu_buddy *mm,
>   static int __gpu_buddy_alloc_range(struct gpu_buddy *mm,
>   				   u64 start,
>   				   u64 size,
> +				   unsigned long flags,
>   				   u64 *total_allocated_on_err,
>   				   struct list_head *blocks)
>   {
> @@ -1080,20 +1408,23 @@ static int __gpu_buddy_alloc_range(struct gpu_buddy *mm,
>   	for (i = 0; i < mm->n_roots; ++i)
>   		list_add_tail(&mm->roots[i]->tmp_link, &dfs);
>   
> -	return __alloc_range(mm, &dfs, start, size,
> +	return __alloc_range(mm, &dfs, start, size, flags,
>   			     blocks, total_allocated_on_err);
>   }
>   
>   static int __alloc_contig_try_harder(struct gpu_buddy *mm,
>   				     u64 size,
>   				     u64 min_block_size,
> +				     unsigned long flags,
>   				     struct list_head *blocks)
>   {
>   	u64 rhs_offset, lhs_offset, lhs_size, filled;
>   	struct gpu_buddy_block *block;
> -	unsigned int tree, order;
>   	LIST_HEAD(blocks_lhs);
> +	struct rb_root *root;
> +	struct rb_node *iter;
>   	unsigned long pages;
> +	unsigned int order;
>   	u64 modify_size;
>   	int err;
>   
> @@ -1103,45 +1434,40 @@ static int __alloc_contig_try_harder(struct gpu_buddy *mm,
>   	if (order == 0)
>   		return -ENOSPC;
>   
> -	for_each_free_tree(tree) {
> -		struct rb_root *root;
> -		struct rb_node *iter;
> -
> -		root = &mm->free_trees[tree][order];
> -		if (rbtree_is_empty(root))
> -			continue;
> +	root = &mm->free_tree[order];
> +	if (RB_EMPTY_ROOT(root))
> +		return -ENOSPC;
>   
> -		iter = rb_last(root);
> -		while (iter) {
> -			block = rbtree_get_free_block(iter);
> -
> -			/* Allocate blocks traversing RHS */
> -			rhs_offset = gpu_buddy_block_offset(block);
> -			err =  __gpu_buddy_alloc_range(mm, rhs_offset, size,
> -						       &filled, blocks);
> -			if (!err || err != -ENOSPC)
> -				return err;
> -
> -			lhs_size = max((size - filled), min_block_size);
> -			if (!IS_ALIGNED(lhs_size, min_block_size))
> -				lhs_size = round_up(lhs_size, min_block_size);
> -
> -			/* Allocate blocks traversing LHS */
> -			lhs_offset = gpu_buddy_block_offset(block) - lhs_size;
> -			err =  __gpu_buddy_alloc_range(mm, lhs_offset, lhs_size,
> -						       NULL, &blocks_lhs);
> -			if (!err) {
> -				list_splice(&blocks_lhs, blocks);
> -				return 0;
> -			} else if (err != -ENOSPC) {
> -				gpu_buddy_free_list_internal(mm, blocks);
> -				return err;
> -			}
> -			/* Free blocks for the next iteration */
> +	iter = rb_last(root);
> +	while (iter) {
> +		block = rbtree_get_free_block(iter);
> +
> +		/* Allocate blocks traversing RHS */
> +		rhs_offset = gpu_buddy_block_offset(block);
> +		err =  __gpu_buddy_alloc_range(mm, rhs_offset, size,
> +					       flags, &filled, blocks);
> +		if (!err || err != -ENOSPC)
> +			return err;
> +
> +		lhs_size = max((size - filled), min_block_size);
> +		if (!IS_ALIGNED(lhs_size, min_block_size))
> +			lhs_size = round_up(lhs_size, min_block_size);
> +
> +		/* Allocate blocks traversing LHS */
> +		lhs_offset = gpu_buddy_block_offset(block) - lhs_size;
> +		err =  __gpu_buddy_alloc_range(mm, lhs_offset, lhs_size,
> +					       flags, NULL, &blocks_lhs);
> +		if (!err) {
> +			list_splice(&blocks_lhs, blocks);
> +			return 0;
> +		} else if (err != -ENOSPC) {
>   			gpu_buddy_free_list_internal(mm, blocks);
> -
> -			iter = rb_prev(iter);
> +			return err;
>   		}
> +		/* Free blocks for the next iteration */
> +		gpu_buddy_free_list_internal(mm, blocks);
> +
> +		iter = rb_prev(iter);
>   	}
>   
>   	return -ENOSPC;
> @@ -1175,6 +1501,7 @@ int gpu_buddy_block_trim(struct gpu_buddy *mm,
>   	struct gpu_buddy_block *block;
>   	u64 block_start, block_end;
>   	LIST_HEAD(dfs);
> +	bool was_clear;
>   	u64 new_start;
>   	int err;
>   
> @@ -1217,22 +1544,38 @@ int gpu_buddy_block_trim(struct gpu_buddy *mm,
>   	}
>   
>   	list_del(&block->link);
> +
> +	was_clear = gpu_buddy_block_is_clear(block);
> +	block->header &= ~GPU_BUDDY_HEADER_CLEAR;
> +
> +	if (was_clear) {
> +		gpu_clear_tracker_mark_clear(&mm->clear,
> +					     gpu_buddy_block_offset(block),
> +					     gpu_buddy_block_size(mm, block));
> +		mm->clear_avail = mm->clear.total_clear;
> +	}
> +
>   	mark_free(mm, block);
>   	mm->avail += gpu_buddy_block_size(mm, block);
> -	if (gpu_buddy_block_is_clear(block))
> -		mm->clear_avail += gpu_buddy_block_size(mm, block);
>   
>   	/* Prevent recursively freeing this node */
>   	parent = block->parent;
>   	block->parent = NULL;
>   
>   	list_add(&block->tmp_link, &dfs);
> -	err =  __alloc_range(mm, &dfs, new_start, new_size, blocks, NULL);
> +	err =  __alloc_range(mm, &dfs, new_start, new_size,
> +			     was_clear ? GPU_BUDDY_CLEAR_ALLOCATION : 0,
> +			     blocks, NULL);
>   	if (err) {
>   		mark_allocated(mm, block);
>   		mm->avail -= gpu_buddy_block_size(mm, block);
> -		if (gpu_buddy_block_is_clear(block))
> -			mm->clear_avail -= gpu_buddy_block_size(mm, block);
> +		if (was_clear) {
> +			gpu_clear_tracker_mark_dirty(&mm->clear,
> +						     gpu_buddy_block_offset(block),
> +						     gpu_buddy_block_size(mm, block));
> +			mm->clear_avail = mm->clear.total_clear;
> +			block->header |= GPU_BUDDY_HEADER_CLEAR;
> +		}
>   		list_add(&block->link, blocks);
>   	}
>   
> @@ -1241,6 +1584,21 @@ int gpu_buddy_block_trim(struct gpu_buddy *mm,
>   }
>   EXPORT_SYMBOL(gpu_buddy_block_trim);
>   
> +static bool clear_steer_window(struct gpu_buddy *mm, u64 min_sz,
> +			       u64 *start, u64 *end, unsigned long *flags)
> +{
> +	struct gpu_clear_extent *ext =
> +		gpu_clear_tracker_find(&mm->clear, min_sz);
> +
> +	if (!ext)
> +		return false;
> +
> +	*start  = ext->start;
> +	*end    = ext->end;
> +	*flags |= GPU_BUDDY_RANGE_ALLOCATION;
> +	return true;
> +}
> +
>   static struct gpu_buddy_block *
>   __gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
>   			 u64 start, u64 end,
> @@ -1248,18 +1606,32 @@ __gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
>   			 unsigned int order,
>   			 unsigned long flags)
>   {
> -	if (flags & GPU_BUDDY_RANGE_ALLOCATION)
> +	struct gpu_buddy_block *block;
> +	bool steered = false;
> +
> +	/* Steer cleared allocations to a cleared extent that fits the order */
> +	if (!(flags & GPU_BUDDY_RANGE_ALLOCATION) &&
> +	    (flags & GPU_BUDDY_CLEAR_ALLOCATION) && mm->clear_avail)
> +		steered = clear_steer_window(mm, mm->chunk_size << order,
> +					     &start, &end, &flags);
> +
> +	if (flags & GPU_BUDDY_RANGE_ALLOCATION) {
>   		/* Allocate traversing within the range */
> -		return  __gpu_buddy_alloc_range_bias(mm, start, end,
> -						     order, flags);
> -	else if (size < min_block_size)
> +		block = __alloc_range_bias(mm, start, end, order, flags);
> +		if (!IS_ERR(block) || !steered)
> +			return block;
> +
> +		flags &= ~GPU_BUDDY_RANGE_ALLOCATION;
> +	}
> +
> +	if (size < min_block_size)
>   		/* Allocate from an offset-aligned region without size rounding */
>   		return gpu_buddy_offset_aligned_allocation(mm, size,
>   							   min_block_size,
>   							   flags);
> -	else
> -		/* Allocate from freetree */
> -		return alloc_from_freetree(mm, order, flags);
> +
> +	/* Allocate from freetree */
> +	return alloc_from_freetree(mm, order, flags);
>   }
>   
>   /**
> @@ -1320,7 +1692,7 @@ int gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
>   		if (!IS_ALIGNED(start | end, min_block_size))
>   			return -EINVAL;
>   
> -		return __gpu_buddy_alloc_range(mm, start, size, NULL, blocks);
> +		return __gpu_buddy_alloc_range(mm, start, size, flags, NULL, blocks);
>   	}
>   
>   	original_size = size;
> @@ -1346,7 +1718,8 @@ int gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
>   		if ((flags & GPU_BUDDY_CONTIGUOUS_ALLOCATION) &&
>   		    !(flags & GPU_BUDDY_RANGE_ALLOCATION))
>   			return __alloc_contig_try_harder(mm, original_size,
> -							 original_min_size, blocks);
> +							 original_min_size,
> +							 flags, blocks);
>   
>   		return -EINVAL;
>   	}
> @@ -1361,8 +1734,6 @@ int gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
>   		BUG_ON(size >= min_block_size && order < min_order);
>   
>   		do {
> -			unsigned int fallback_order;
> -
>   			block = __gpu_buddy_alloc_blocks(mm, start,
>   							 end,
>   							 size,
> @@ -1372,48 +1743,46 @@ int gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
>   			if (!IS_ERR(block))
>   				break;
>   
> -			if (size < min_block_size) {
> -				fallback_order = order;
> -			} else if (order == min_order) {
> -				fallback_order = min_order;
> -			} else {
> +			if (size >= min_block_size && order > min_order) {
>   				order--;
>   				continue;
>   			}
>   
> -			/* Try allocation through force merge method */
> -			if (mm->clear_avail &&
> -			    !__force_merge(mm, start, end, fallback_order)) {
> -				block = __gpu_buddy_alloc_blocks(mm, start,
> -								 end,
> -								 size,
> -								 min_block_size,
> -								 fallback_order,
> -								 flags);
> -				if (!IS_ERR(block)) {
> -					order = fallback_order;
> -					break;
> -				}
> -			}
> -
>   			/*
>   			 * Try contiguous block allocation through
>   			 * try harder method.
>   			 */
>   			if (flags & GPU_BUDDY_CONTIGUOUS_ALLOCATION &&
> -			    !(flags & GPU_BUDDY_RANGE_ALLOCATION))
> -				return __alloc_contig_try_harder(mm,
> -								 original_size,
> -								 original_min_size,
> -								 blocks);
> +			    !(flags & GPU_BUDDY_RANGE_ALLOCATION)) {
> +				err = __alloc_contig_try_harder(mm,
> +								original_size,
> +								original_min_size,
> +								flags,
> +								blocks);
> +				if (!err)
> +					return 0;
> +				if (err != -ENOSPC)
> +					return err;
> +				goto err_free;
> +			}
>   			err = -ENOSPC;
>   			goto err_free;
>   		} while (1);
>   
>   		mark_allocated(mm, block);
>   		mm->avail -= gpu_buddy_block_size(mm, block);
> -		if (gpu_buddy_block_is_clear(block))
> -			mm->clear_avail -= gpu_buddy_block_size(mm, block);
> +
> +		block->header &= ~GPU_BUDDY_HEADER_CLEAR;
> +		if (flags & GPU_BUDDY_CLEAR_ALLOCATION &&
> +		    gpu_clear_tracker_is_clear(&mm->clear,
> +					       gpu_buddy_block_offset(block),
> +					       gpu_buddy_block_size(mm, block)))
> +			block->header |= GPU_BUDDY_HEADER_CLEAR;
> +
> +		gpu_clear_tracker_mark_dirty(&mm->clear,
> +					     gpu_buddy_block_offset(block),
> +					     gpu_buddy_block_size(mm, block));
> +		mm->clear_avail = mm->clear.total_clear;
>   		kmemleak_update_trace(block);
>   		list_add_tail(&block->link, &allocated);
>   
> @@ -1492,31 +1861,30 @@ void gpu_buddy_print(struct gpu_buddy *mm)
>   	for (order = mm->max_order; order >= 0; order--) {
>   		struct gpu_buddy_block *block, *tmp;
>   		struct rb_root *root;
> -		u64 count = 0, free;
> -		unsigned int tree;
> -
> -		for_each_free_tree(tree) {
> -			root = &mm->free_trees[tree][order];
> +		u64 count = 0, clear = 0, free;
>   
> -			rbtree_postorder_for_each_entry_safe(block, tmp, root, rb) {
> -				BUG_ON(!gpu_buddy_block_is_free(block));
> -				count++;
> -			}
> +		root = &mm->free_tree[order];
> +		rbtree_postorder_for_each_entry_safe(block, tmp, root, rb) {
> +			BUG_ON(!gpu_buddy_block_is_free(block));
> +			count++;
> +			if (gpu_buddy_block_is_clear(block))
> +				clear++;
>   		}
>   
>   		free = count * (mm->chunk_size << order);
>   		if (free < SZ_1M)
> -			pr_info("order-%2d free: %8llu KiB, blocks: %llu\n",
> -				order, free >> 10, count);
> +			pr_info("order-%2d free: %8llu KiB, blocks: %llu (clear: %llu)\n",
> +				order, free >> 10, count, clear);
>   		else
> -			pr_info("order-%2d free: %8llu MiB, blocks: %llu\n",
> -				order, free >> 20, count);
> +			pr_info("order-%2d free: %8llu MiB, blocks: %llu (clear: %llu)\n",
> +				order, free >> 20, count, clear);
>   	}
>   }
>   EXPORT_SYMBOL(gpu_buddy_print);
>   
>   static void gpu_buddy_module_exit(void)
>   {
> +	kmem_cache_destroy(slab_extents);
>   	kmem_cache_destroy(slab_blocks);
>   }
>   
> @@ -1526,6 +1894,12 @@ static int __init gpu_buddy_module_init(void)
>   	if (!slab_blocks)
>   		return -ENOMEM;
>   
> +	slab_extents = KMEM_CACHE(gpu_clear_extent, 0);
> +	if (!slab_extents) {
> +		kmem_cache_destroy(slab_blocks);
> +		return -ENOMEM;
> +	}
> +
>   	return 0;
>   }
>   
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> index faa025498de4..a89c392a155a 100644
> --- a/drivers/gpu/drm/drm_buddy.c
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -50,15 +50,11 @@ void drm_buddy_print(struct gpu_buddy *mm, struct drm_printer *p)
>   		struct gpu_buddy_block *block, *tmp;
>   		struct rb_root *root;
>   		u64 count = 0, free;
> -		unsigned int tree;
>   
> -		for_each_free_tree(tree) {
> -			root = &mm->free_trees[tree][order];
> -
> -			rbtree_postorder_for_each_entry_safe(block, tmp, root, rb) {
> -				BUG_ON(!gpu_buddy_block_is_free(block));
> -				count++;
> -			}
> +		root = &mm->free_tree[order];
> +		rbtree_postorder_for_each_entry_safe(block, tmp, root, rb) {
> +			BUG_ON(!gpu_buddy_block_is_free(block));
> +			count++;
>   		}
>   
>   		drm_printf(p, "order-%2d ", order);
> diff --git a/drivers/gpu/tests/gpu_buddy_test.c b/drivers/gpu/tests/gpu_buddy_test.c
> index 7df5c2ae83bb..e0d24a4542b2 100644
> --- a/drivers/gpu/tests/gpu_buddy_test.c
> +++ b/drivers/gpu/tests/gpu_buddy_test.c
> @@ -78,15 +78,11 @@ static void gpu_test_buddy_subtree_offset_alignment_stress(struct kunit *test)
>   		}
>   
>   		for (order = mm.max_order; order >= 0 && !root; order--) {
> -			for (tree = 0; tree < 2; tree++) {
> -				node = mm.free_trees[tree][order].rb_node;
> -				if (node) {
> -					root = container_of(node,
> -							    struct gpu_buddy_block,
> -							    rb);
> -					break;
> -				}
> -			}
> +			node = mm.free_tree[order].rb_node;
> +			if (node)
> +				root = container_of(node,
> +						    struct gpu_buddy_block,
> +						    rb);
>   		}
>   
>   		KUNIT_ASSERT_NOT_NULL(test, root);
> @@ -97,8 +93,8 @@ static void gpu_test_buddy_subtree_offset_alignment_stress(struct kunit *test)
>   		gpu_buddy_free_list(&mm, &allocated[i], 0);
>   
>   		for (order = 0; order <= mm.max_order; order++) {
> -			for (tree = 0; tree < 2; tree++) {
> -				node = mm.free_trees[tree][order].rb_node;
> +			{
> +				node = mm.free_tree[order].rb_node;
>   				if (!node)
>   					continue;
>   
> diff --git a/include/linux/gpu_buddy.h b/include/linux/gpu_buddy.h
> index 71941a039648..07da1aa4865b 100644
> --- a/include/linux/gpu_buddy.h
> +++ b/include/linux/gpu_buddy.h
> @@ -67,15 +67,6 @@
>    */
>   #define GPU_BUDDY_TRIM_DISABLE			BIT(5)
>   
> -enum gpu_buddy_free_tree {
> -	GPU_BUDDY_CLEAR_TREE = 0,
> -	GPU_BUDDY_DIRTY_TREE,
> -	GPU_BUDDY_MAX_FREE_TREES,
> -};
> -
> -#define for_each_free_tree(tree) \
> -	for ((tree) = 0; (tree) < GPU_BUDDY_MAX_FREE_TREES; (tree)++)
> -
>   /**
>    * struct gpu_buddy_block - Block within a buddy allocator
>    *
> @@ -103,6 +94,14 @@ struct gpu_buddy_block {
>   #define   GPU_BUDDY_ALLOCATED	   (1 << 10)
>   #define   GPU_BUDDY_FREE	   (2 << 10)
>   #define   GPU_BUDDY_SPLIT	   (3 << 10)
> +/*
> + * GPU_BUDDY_HEADER_CLEAR has two roles:
> + *  - FREE state:      set when the block's full range is cleared (tracker
> + *                     confirmed).  Cleared free blocks float in the buddy
> + *                     tree and are NOT inserted into free_tree[].
> + *  - ALLOCATED state: set when the block was served from cleared memory,
> + *                     informing the caller that no GPU clear pass is needed.
> + */
>   #define GPU_BUDDY_HEADER_CLEAR  GENMASK_ULL(9, 9)
>   /* Free to be used, if needed in the future */
>   #define GPU_BUDDY_HEADER_UNUSED GENMASK_ULL(8, 6)
> @@ -130,11 +129,44 @@ struct gpu_buddy_block {
>   /* private: */
>   	struct list_head tmp_link;
>   	unsigned int subtree_max_alignment;
> +	bool subtree_has_dirty;
>   };
>   
>   /* Order-zero must be at least SZ_4K */
>   #define GPU_BUDDY_MAX_ORDER (63 - 12)
>   
> +/**
> + * struct gpu_clear_extent - a contiguous cleared (zeroed) address range
> + *
> + * Tracks a single contiguous address range whose memory content is known
> + * to be zeroed.  Extents are non-overlapping and stored in an augmented
> + * red-black tree sorted by @start.  The augmented value @subtree_max_size
> + * allows O(log N) search for an extent of at least a given size.
> + */
> +struct gpu_clear_extent {
> +/* private: */
> +	struct rb_node	rb;
> +	u64		start;
> +	u64		end;
> +	u64		subtree_max_size;
> +};
> +
> +/**
> + * struct gpu_clear_tracker - tracks cleared (zeroed) address intervals
> + *
> + * Maintains a set of non-overlapping cleared extents as an augmented
> + * red-black tree.  The tracker is embedded inside struct gpu_buddy and
> + * replaces the former dual (clear/dirty) free-tree scheme.
> + *
> + * @total_clear: Total bytes of cleared memory currently tracked.
> + */
> +struct gpu_clear_tracker {
> +/* private: */
> +	struct rb_root	root;
> +/* public: */
> +	u64		total_clear;
> +};
> +
>   /**
>    * struct gpu_buddy - GPU binary buddy allocator
>    *
> @@ -154,18 +186,20 @@ struct gpu_buddy_block {
>    * @avail: Total free space currently available for allocation in bytes.
>    * @clear_avail: Free space available in the clear tree (zeroed memory) in bytes.
>    *               This is a subset of @avail.
> + * @clear: Tracker of cleared address ranges (decoupled from free_tree).
>    * @lock_dep_map: Annotates gpu_buddy API with a driver provided lock.
>    */
>   struct gpu_buddy {
>   /* private: */
> +	struct gpu_clear_tracker clear;
>   	/*
> -	 * Array of red-black trees for free block management.
> -	 * Indexed as free_trees[clear/dirty][order] where:
> -	 * - Index 0 (GPU_BUDDY_CLEAR_TREE): blocks with zeroed content
> -	 * - Index 1 (GPU_BUDDY_DIRTY_TREE): blocks with unknown content
> -	 * Each tree holds free blocks of the corresponding order.
> +	 * One RB-tree per order containing all free blocks (clear and
> +	 * dirty alike).  The augment field subtree_has_dirty lets dirty
> +	 * allocations skip subtrees with no dirty inventory in O(log N).
> +	 * Cleared free blocks coexist here but are also indexed by the
> +	 * @clear tracker for fast CLEAR_ALLOCATION lookups.
>   	 */
> -	struct rb_root **free_trees;
> +	struct rb_root *free_tree;
>   	/*
>   	 * Array of root blocks representing the top-level blocks of the
>   	 * binary tree(s). Multiple roots exist when the total size is not
>
> base-commit: 3c3c5fb9b36836d279ebe370189d68a0a3387362



* Re: [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
  2026-05-27 11:29 [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker Arunpravin Paneer Selvam
                   ` (5 preceding siblings ...)
  2026-05-28 12:33 ` [PATCH v4 1/2] " Arunpravin Paneer Selvam
@ 2026-05-29 17:41 ` Matthew Auld
  2026-06-01 10:51   ` Arunpravin Paneer Selvam
  6 siblings, 1 reply; 12+ messages in thread
From: Matthew Auld @ 2026-05-29 17:41 UTC (permalink / raw)
  To: Arunpravin Paneer Selvam, christian.koenig, dri-devel, intel-gfx,
	intel-xe, amd-gfx
  Cc: alexander.deucher

Hi,

On 27/05/2026 12:29, Arunpravin Paneer Selvam wrote:
> The current buddy allocator maintains separate clear_tree[] and
> dirty_tree[] rbtrees per order, preventing coalescing between cleared
> and dirty buddies. Under mixed workloads, this creates a merge barrier:
> adjacent buddies frequently end up split across trees, forcing reliance
> on __force_merge() during allocation.
> 
> __force_merge() performs an O(N x max_order) scan under the VRAM manager
> lock, leading to allocation stalls and failures for large contiguous
> requests even when sufficient total free memory is available.

So is this contig with non power-of-two sizes?

Do we know whether we could force_merge everything in one go, or
otherwise be more aggressive and merge more than is strictly needed at
the first sign of contention, instead of doing it piecemeal? The
downside would be losing more of the clear tracking when this happens,
in exchange for more re-merging.

Could we have another per-order list of all blocks that we failed to
merge during the free step? When doing the force merge step, we maybe
don't need to search blindly and can focus instead on the blocks tracked
in those lists? Maybe it doesn't need to be a list, but could be another
rb-tree?
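
To make the per-order idea a bit more concrete, here is a minimal
userspace sketch, not kernel code. Every toy_* name is hypothetical, the
fixed-size stacks stand in for whatever list or rb-tree would really be
used, and the XOR buddy computation assumes each root is aligned to its
own size (the allocator itself walks parent pointers instead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ORDER    8
#define TOY_MAX_DEFERRED 64

/* Hypothetical: one small stack per order of blocks whose merge was refused. */
struct toy_deferred {
	uint64_t offset[TOY_MAX_ORDER + 1][TOY_MAX_DEFERRED];
	size_t count[TOY_MAX_ORDER + 1];
};

/* Offset of the buddy of a block, assuming size-aligned roots. */
static uint64_t toy_buddy_offset(uint64_t offset, unsigned int order,
				 uint64_t chunk_size)
{
	return offset ^ (chunk_size << order);
}

/* Free path: remember a block that could not be merged with its buddy. */
static bool toy_defer_merge(struct toy_deferred *d, unsigned int order,
			    uint64_t offset)
{
	assert(d != NULL);

	if (order > TOY_MAX_ORDER || d->count[order] == TOY_MAX_DEFERRED)
		return false;

	d->offset[order][d->count[order]++] = offset;
	return true;
}

/* Force-merge path: fetch the next recorded candidate instead of scanning. */
static bool toy_next_candidate(struct toy_deferred *d, unsigned int order,
			       uint64_t *offset)
{
	assert(d != NULL && offset != NULL);

	if (order > TOY_MAX_ORDER || d->count[order] == 0)
		return false;

	*offset = d->offset[order][--d->count[order]];
	return true;
}
```

With something like this, the merge step would only visit blocks that
are known to have a refused merge, so its cost would scale with the
number of recorded candidates rather than with the number of free
blocks.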

We know the size of the total allocation, if we trigger force_merge, 
could we try to merge enough in one go for the entire allocation, 
instead of restarting the entire thing on the next iteration? Would that 
help at all?

But I guess these are more for the stalling side, and won't help much 
with the contig angle?

For the extent idea, is there any merit in doing this for all contig
blobs, and not just cleared stuff? Or does the workload you are seeing
only benefit users that want cleared memory? I am wondering whether this
would benefit all users that want contig. Hypothetically, we could keep
clear and dirty separate, as we do now, but with an improved
force_merge, and then have extent tracking for all contig blobs to
replace the try_harder stuff. When you do a contig alloc, the individual
clear/dirty state is still all there within the range, so you can skip
re-clearing in some cases. I guess the downside is an overall fuzzier
contig + clear/free path, but you would never get allocation failures
when there is sufficient contig space?
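
As a rough illustration of extent tracking for all free ranges, the
sketch below keeps non-overlapping [start, end) extents and coalesces
neighbours on insert. It is an assumption-laden toy: the toy_* names are
hypothetical, and it uses a sorted array with a linear search where the
patch uses an augmented rb-tree with subtree_max_size, so it shows the
bookkeeping but not the O(log N) lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_EXTENTS 32

/* Hypothetical free extent covering [start, end). */
struct toy_extent {
	uint64_t start;
	uint64_t end;
};

/* Extents are kept sorted by start and never touch or overlap. */
struct toy_extent_set {
	struct toy_extent ext[TOY_MAX_EXTENTS];
	size_t count;
};

/* Add [start, end), merging with every extent it touches or overlaps. */
static bool toy_extent_add(struct toy_extent_set *s, uint64_t start,
			   uint64_t end)
{
	size_t lo = 0, hi;

	assert(s != NULL && start < end);

	/* Skip extents that end strictly before the new range begins. */
	while (lo < s->count && s->ext[lo].end < start)
		lo++;

	/* Absorb every extent that touches or overlaps the new range. */
	hi = lo;
	while (hi < s->count && s->ext[hi].start <= end) {
		if (s->ext[hi].start < start)
			start = s->ext[hi].start;
		if (s->ext[hi].end > end)
			end = s->ext[hi].end;
		hi++;
	}

	/* A pure insert needs one free slot. */
	if (hi == lo && s->count == TOY_MAX_EXTENTS)
		return false;

	/* Replace ext[lo..hi) with the single merged extent. */
	memmove(&s->ext[lo + 1], &s->ext[hi],
		(s->count - hi) * sizeof(s->ext[0]));
	s->ext[lo].start = start;
	s->ext[lo].end = end;
	s->count = s->count - (hi - lo) + 1;
	return true;
}

/* Lowest-address extent of at least @size bytes, or NULL if none fits. */
static const struct toy_extent *
toy_extent_find(const struct toy_extent_set *s, uint64_t size)
{
	assert(s != NULL);

	for (size_t i = 0; i < s->count; i++)
		if (s->ext[i].end - s->ext[i].start >= size)
			return &s->ext[i];

	return NULL;
}
```

A contig request for N bytes then becomes a single lookup for an extent
of at least N bytes, regardless of how the buddy blocks inside that
range happen to be split.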

> 
> Solution
> 
> Replace the dual-tree design with:
> - A single free_tree[order] rbtree for dirty and mixed free blocks
>    (fully cleared free blocks float outside this tree)
> - A lightweight out-of-band clear tracker (gpu_clear_tracker)
> 
> Fully cleared free blocks are tracked outside the buddy trees using an
> augmented interval rbtree, enabling O(log E) lookup of the largest
> cleared extents.
> 
> Buddy coalescing is now unconditional in __gpu_buddy_free(), regardless
> of clear/dirty state. This removes the merge barrier and eliminates the
> need for __force_merge().
> 
> Benefits
> 
> - Correct high-order allocations after mixed clear/dirty workloads
> - Elimination of O(N x max_order) merge cost from the allocation path
> - O(log E) cleared-extent lookup replacing O(N) scans
> - Predictable allocation latency under fragmentation
> - Reduced complexity with a single tree per order

Since there is no separate tracking for dirty stuff, is the non-cleared 
alloc path a bit more "fuzzy" now, with it potentially stealing cleared 
memory, or is it the same behaviour still?

For drivers that don't use free tracking, is there some benefit? Are 
there any downsides there? I assume that clear tracker is always empty.


* Re: [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
  2026-05-29 17:41 ` Matthew Auld
@ 2026-06-01 10:51   ` Arunpravin Paneer Selvam
  2026-06-10  6:27     ` Arunpravin Paneer Selvam
  2026-06-10  9:19     ` Matthew Auld
  0 siblings, 2 replies; 12+ messages in thread
From: Arunpravin Paneer Selvam @ 2026-06-01 10:51 UTC (permalink / raw)
  To: Matthew Auld, christian.koenig, dri-devel, intel-gfx, intel-xe,
	amd-gfx
  Cc: alexander.deucher



On 5/29/2026 11:11 PM, Matthew Auld wrote:
> Hi,
>
> On 27/05/2026 12:29, Arunpravin Paneer Selvam wrote:
>> The current buddy allocator maintains separate clear_tree[] and
>> dirty_tree[] rbtrees per order, preventing coalescing between cleared
>> and dirty buddies. Under mixed workloads, this creates a merge barrier:
>> adjacent buddies frequently end up split across trees, forcing reliance
>> on __force_merge() during allocation.
>>
>> __force_merge() performs an O(N x max_order) scan under the VRAM manager
>> lock, leading to allocation stalls and failures for large contiguous
>> requests even when sufficient total free memory is available.
>
> So is this contig with non power-of-two sizes?
Both power-of-two and non-power-of-two contiguous requests are affected:
in either case, the required higher-order block can't form when its
lower-order buddies are separated by clear/dirty state across the dual
trees. But the core issue we are seeing is VRAM fragmentation caused by
a massive number of small allocations (e.g., thousands of 4 KiB–8 KiB
buffers) that end up split across the clear and dirty trees, preventing
buddy coalescing. This leads to allocation failures and OOM in later
workloads even when sufficient total free VRAM is available.
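
To illustrate the merge barrier, here is a toy userspace model, not the
kernel code. Eight minimum-order chunks are all free, and they are
coalesced bottom-up either only between buddies in the same clear/dirty
state (roughly what the dual trees allow) or unconditionally. All names
(toy_largest_free_order, blk_state, TOY_CHUNKS) are made up for this
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: 8 minimum-order chunks, so max order is 3. */
#define TOY_CHUNKS 8

enum blk_state { BLK_NONE, BLK_DIRTY, BLK_CLEAR, BLK_MIXED };

/*
 * Coalesce a fully free toy address space bottom-up and return the
 * highest order at which a whole free block could be formed.
 * merge_across_state == false models the clear/dirty merge barrier.
 */
static unsigned int toy_largest_free_order(const bool cleared[TOY_CHUNKS],
                                           bool merge_across_state)
{
    enum blk_state cur[TOY_CHUNKS];
    size_t n = TOY_CHUNKS;
    unsigned int order = 0, best = 0;

    assert((TOY_CHUNKS & (TOY_CHUNKS - 1)) == 0); /* power of two */

    for (size_t i = 0; i < n; i++)
        cur[i] = cleared[i] ? BLK_CLEAR : BLK_DIRTY;

    while (n > 1) {
        bool merged_any = false;

        /* Pair up buddies (2i, 2i+1) and try to form their parent. */
        for (size_t i = 0; i < n / 2; i++) {
            enum blk_state a = cur[2 * i], b = cur[2 * i + 1];
            enum blk_state out = BLK_NONE;

            if (a != BLK_NONE && b != BLK_NONE) {
                if (a == b)
                    out = a;
                else if (merge_across_state)
                    out = BLK_MIXED; /* parent is partially cleared */
            }
            cur[i] = out;
            if (out != BLK_NONE)
                merged_any = true;
        }
        if (!merged_any)
            break;
        n /= 2;
        best = ++order;
    }
    return best;
}
```

With an alternating clear/dirty pattern, the same-state rule never gets
past order 0, while unconditional merging reaches order 3, with the
merged block simply treated as mixed.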
>
> Do we know if we could force_merge everything in one go or somehow be 
> more aggressive and do more than needed now, at the first sign of 
> contention here, instead of doing it piecemeal? Downside would be 
> losing more of the clear tracking, when this happens, but more 
> re-merging.
>
> Could we have another per-order list, of all blocks that we failed to 
> merge, when we did the free step? When doing the force merge step, we 
> maybe don't need to search blindly and can focus instead on the stuff 
> tracked in those lists? Maybe it doesn't need to be a list, but could 
> be another rb-tree?
>
> We know the size of the total allocation, if we trigger force_merge, 
> could we try to merge enough in one go for the entire allocation, 
> instead of restarting the entire thing on the next iteration? Would 
> that help at all?
>
> But I guess these are more for the stalling side, and won't help much 
> with the contig angle?
The memory is highly fragmented into mostly 4 KiB chunks and small
scattered blocks across the dual trees, so although enough total free
memory exists, it is split into low-order fragments. The workload then
requests very large contiguous allocations (tens of GiB, e.g., ~64 GiB),
which fail with OOM because the allocator cannot form sufficiently large
high-order blocks from the fragmented space. We could go with more
aggressive merging or a merge-in-one-go approach, but this might waste
more cleared memory. I think fundamentally the buddy allocator should be
allowed to merge unconditionally: the single-tree approach with
unconditional coalescing would reduce fragmentation and benefit
contiguous allocations, while also addressing the stalling and latency
issues.
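
For a sense of scale, a small worked example, assuming a 4 KiB minimum
chunk size and a request served from a single buddy block (both are
assumptions for illustration; toy_order_for_size is a hypothetical
helper, not an allocator API):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Smallest order such that (chunk_size << order) >= size.
 * Sketch only: assumes the shifted value does not overflow 64 bits.
 */
static unsigned int toy_order_for_size(uint64_t size, uint64_t chunk_size)
{
    unsigned int order = 0;

    assert(size != 0 && chunk_size != 0);
    while ((chunk_size << order) < size)
        order++;
    return order;
}
```

Under those assumptions, toy_order_for_size(64ULL << 30, 4096) is 24,
i.e. 2^24 (about 16.7 million) 4 KiB chunks would all have to coalesce,
which cannot happen if any buddy pair along the way is held apart by
clear/dirty state.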
>
> For the extent idea, is there any merit in maybe doing this for all 
> contig blobs, and not just cleared stuff? Or is the workload you are 
> seeing only benefit users that want cleared stuff? Wondering if this 
> would benefit all users that want contig? Like if we hypothetically 
> kept clear and dirty separate, like we do now, but with an improved 
> force_merge, and then have extent tracking for all contig blobs and 
> replace the try_harder stuff? When you do a contig alloc, the 
> individual clear/dirty is still all there within the range, so you can 
> skip re-clearing in some cases. I guess downside is overall more fuzzy 
> contig + clear/free path, but I guess you would never get allocation 
> failures, when there is sufficient contig space?
Yes, extending extent tracking to all contig allocations has merit, but 
the core problem remains - with the dual-tree design, we still need 
force_merge to undo the clear/dirty split before those extents can form. 
In cases like heavy small-allocation workloads (thousands of 4 KiB 
buffers) running first, the memory ends up massively fragmented across 
both trees. When a very large contiguous allocation (e.g., ~64 GiB) 
comes in later, the allocator fails with OOM even though sufficient 
total free memory exists, because the extent tracker can't find a 
contiguous range that was never allowed to merge in the first place. I 
think the dirty/clear split is fundamentally the problem - allowing the 
buddy allocator to merge unconditionally removes this barrier, and the 
clear tracker can then be layered on top as an optimization without 
blocking coalescing.
>
>>
>> Solution
>>
>> Replace the dual-tree design with:
>> - A single free_tree[order] rbtree for dirty and mixed free blocks
>>    (fully cleared free blocks float outside this tree)
>> - A lightweight out-of-band clear tracker (gpu_clear_tracker)
>>
>> Fully cleared free blocks are tracked outside the buddy trees using an
>> augmented interval rbtree, enabling O(log E) lookup of the largest
>> cleared extents.
>>
>> Buddy coalescing is now unconditional in __gpu_buddy_free(), regardless
>> of clear/dirty state. This removes the merge barrier and eliminates the
>> need for __force_merge().
>>
>> Benefits
>>
>> - Correct high-order allocations after mixed clear/dirty workloads
>> - Elimination of O(N x max_order) merge cost from the allocation path
>> - O(log E) cleared-extent lookup replacing O(N) scans
>> - Predictable allocation latency under fragmentation
>> - Reduced complexity with a single tree per order
>
> Since there is no separate tracking for dirty stuff, is the 
> non-cleared alloc path a bit more "fuzzy" now, with it potentially 
> stealing cleared memory, or is it the same behaviour still?
Right, in v4, dirty and mixed (partially cleared) blocks are allocated
for the non-cleared alloc path, which can end up stealing cleared
memory. In v5, I plan to address this with a three-tier dirty allocation
fallback: dirty → mixed → clear, driven by rbtree augment bits
(subtree_has_dirty, subtree_has_mixed), with each pass O(log N). The
split-descent also applies the same preference at every level when
carving a higher-order block, so cleared memory is preserved as much as
possible and only used as a last resort.
Thoughts?
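
A minimal sketch of the tiered lookup idea, using a plain binary tree in
userspace rather than the kernel rbtree (where the bits would be kept up
to date by the augmented-rbtree callbacks). Only the two bit names come
from the plan above; the node layout, the toy_* helpers, and the
assumption that cleared blocks sit in the same tree are hypothetical and
may not match what v5 ends up doing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_state { TOY_DIRTY, TOY_MIXED, TOY_CLEAR };

struct toy_node {
    enum toy_state state;
    struct toy_node *left, *right;
    bool subtree_has_dirty; /* this node or a descendant is dirty */
    bool subtree_has_mixed; /* this node or a descendant is mixed */
};

/* Recompute both augment bits bottom-up for the whole (toy) tree. */
static void toy_update(struct toy_node *n)
{
    if (!n)
        return;
    toy_update(n->left);
    toy_update(n->right);
    n->subtree_has_dirty = n->state == TOY_DIRTY ||
        (n->left && n->left->subtree_has_dirty) ||
        (n->right && n->right->subtree_has_dirty);
    n->subtree_has_mixed = n->state == TOY_MIXED ||
        (n->left && n->left->subtree_has_mixed) ||
        (n->right && n->right->subtree_has_mixed);
}

static bool toy_has(const struct toy_node *n, enum toy_state want)
{
    if (!n)
        return false;
    return want == TOY_DIRTY ? n->subtree_has_dirty : n->subtree_has_mixed;
}

/* Descend only into subtrees whose bit promises a match: O(height). */
static struct toy_node *toy_find(struct toy_node *n, enum toy_state want)
{
    assert(want != TOY_CLEAR); /* no augment bit for cleared blocks */
    while (toy_has(n, want)) {
        if (n->state == want)
            return n;
        n = toy_has(n->left, want) ? n->left : n->right;
    }
    return NULL;
}

/* Non-cleared allocation: prefer dirty, then mixed, then cleared. */
static struct toy_node *toy_pick_for_dirty_alloc(struct toy_node *root)
{
    struct toy_node *n;

    n = toy_find(root, TOY_DIRTY);
    if (!n)
        n = toy_find(root, TOY_MIXED);
    /* Last resort: only cleared blocks remain (or the tree is empty). */
    return n ? n : root;
}
```

When neither bit is set at the root, both passes return immediately, so
a cleared block is only handed out once no dirty or mixed candidate is
left.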
>
> For drivers that don't use free tracking, is there some benefit? Are 
> there any downsides there? I assume that clear tracker is always empty.
Correct, for drivers that don't clear memory, the clear tracker is 
always empty and they simply allocate from the free_tree[]. Benefits:

- Single tree per order instead of dual trees (fewer rbtree operations)
- No force_merge path at all (unconditional coalescing at free time)
- Simpler code path overall

No real downsides - the clear tracker adds zero overhead when empty, and 
the augment bits would simply show all blocks as dirty, so the walk 
degenerates to a normal rbtree lookup with no extra cost.
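
To show why an empty tracker costs nothing on lookup, here is a rough
userspace sketch of a largest-extent query on a tree augmented with a
per-subtree maximum length. It uses a plain binary tree, and the struct
layout and toy_* names are assumptions for illustration, not the
gpu_clear_tracker implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_extent {
    uint64_t start, len;      /* cleared range [start, start + len) */
    uint64_t subtree_max_len; /* largest len in this subtree */
    struct toy_extent *left, *right;
};

/* Recompute subtree_max_len bottom-up; returns the subtree maximum. */
static uint64_t toy_extent_update(struct toy_extent *e)
{
    uint64_t l, r;

    if (!e)
        return 0;
    l = toy_extent_update(e->left);
    r = toy_extent_update(e->right);
    e->subtree_max_len = e->len;
    if (l > e->subtree_max_len)
        e->subtree_max_len = l;
    if (r > e->subtree_max_len)
        e->subtree_max_len = r;
    return e->subtree_max_len;
}

/*
 * Follow subtree_max_len down to an extent that holds the maximum.
 * One step per tree level; an empty tracker (NULL root) returns at once.
 */
static struct toy_extent *toy_largest_extent(struct toy_extent *e)
{
    while (e) {
        if (e->len == e->subtree_max_len)
            return e;
        if (e->left && e->left->subtree_max_len == e->subtree_max_len)
            e = e->left;
        else
            e = e->right;
        assert(e); /* augment invariant: some child holds the maximum */
    }
    return NULL;
}
```

On a balanced tree this is the O(log E) walk mentioned in the patch
description; with no cleared extents the loop body never runs.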

Regards,
Arun.




* Re: [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
  2026-06-01 10:51   ` Arunpravin Paneer Selvam
@ 2026-06-10  6:27     ` Arunpravin Paneer Selvam
  2026-06-10  9:19     ` Matthew Auld
  1 sibling, 0 replies; 12+ messages in thread
From: Arunpravin Paneer Selvam @ 2026-06-10  6:27 UTC (permalink / raw)
  To: Matthew Auld, christian.koenig, dri-devel, intel-gfx, intel-xe,
	amd-gfx
  Cc: alexander.deucher

Hi Matthew,

Ping?

Regards,
Arun.




* Re: [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
  2026-06-01 10:51   ` Arunpravin Paneer Selvam
  2026-06-10  6:27     ` Arunpravin Paneer Selvam
@ 2026-06-10  9:19     ` Matthew Auld
  2026-06-10 13:06       ` Arunpravin Paneer Selvam
  1 sibling, 1 reply; 12+ messages in thread
From: Matthew Auld @ 2026-06-10  9:19 UTC (permalink / raw)
  To: Arunpravin Paneer Selvam, christian.koenig, dri-devel, intel-gfx,
	intel-xe, amd-gfx
  Cc: alexander.deucher

On 01/06/2026 11:51, Arunpravin Paneer Selvam wrote:
> 
> 
> On 5/29/2026 11:11 PM, Matthew Auld wrote:
>> Hi,
>>
>> On 27/05/2026 12:29, Arunpravin Paneer Selvam wrote:
>>> The current buddy allocator maintains separate clear_tree[] and
>>> dirty_tree[] rbtrees per order, preventing coalescing between cleared
>>> and dirty buddies. Under mixed workloads, this creates a merge barrier:
>>> adjacent buddies frequently end up split across trees, forcing reliance
>>> on __force_merge() during allocation.
>>>
>>> __force_merge() performs an O(N x max_order) scan under the VRAM manager
>>> lock, leading to allocation stalls and failures for large contiguous
>>> requests even when sufficient total free memory is available.
>>
>> So is this contig with non power-of-two sizes?
> Both power-of-two and non-power-of-two contiguous requests are affected 
> - in either case, the required higher-order block can't form when its 
> lower-order buddies are separated by clear/dirty state across the dual 
> trees. But the core issue we are seeing is VRAM fragmentation caused by 
> massive small allocations (e.g., thousands of 4 KiB–8 KiB buffers) that 
> end up split across clear and dirty trees, preventing buddy coalescing. 
> This leads to allocation failures and OOM in later workloads even when 
> sufficient total free VRAM is available.
>>
>> Do we know if we could force_merge everything in one go or somehow be 
>> more aggressive and do more than needed now, at the first sign of 
>> contention here, instead of doing it piecemeal? Downside would be 
>> losing more of the clear tracking, when this happens, but more
>> re-merging.
>>
>> Could we have another per-order list, of all blocks that we failed to 
>> merge, when we did the free step? When doing the force merge step, we 
>> maybe don't need to search blindly and can focus instead on the stuff 
>> tracked in those lists? Maybe it doesn't need to be a list, but could 
>> be another rb-tree?
>>
>> We know the size of the total allocation, if we trigger force_merge, 
>> could we try to merge enough in one go for the entire allocation, 
>> instead of restarting the entire thing on the next iteration? Would 
>> that help at all?
>>
>> But I guess these are more for the stalling side, and won't help much 
>> with the contig angle?
> The memory is highly fragmented into mostly 4 KiB chunks and small 
> scattered blocks across the dual trees, so although total free memory 
> exists, it is split into low-order fragments. The workload then requests 
> very large contiguous allocations (tens of GBs, e.g., ~64 GiB), which 
> fail with OOM because the allocator cannot form sufficiently large
> high-order blocks from the fragmented space. We could go with more aggressive 
> merging or merge-in-one-go approaches, but this might waste more cleared 
> memory. I think fundamentally the buddy allocator should be allowed to 
> merge unconditionally - the single-tree approach with unconditional 
> coalescing would improve the fragmentation and benefit contiguous 
> allocations along with addressing the stalling and latency issues.
>>
>> For the extent idea, is there any merit in maybe doing this for all 
>> contig blobs, and not just cleared stuff? Or is the workload you are 
>> seeing only benefit users that want cleared stuff? Wondering if this 
>> would benefit all users that want contig? Like if we hypothetically 
>> kept clear and dirty separate, like we do now, but with an improved 
>> force_merge, and then have extent tracking for all contig blobs and 
>> replace the try_harder stuff? When you do a contig alloc, the 
>> individual clear/dirty is still all there within the range, so you can 
>> skip re-clearing in some cases. I guess downside is overall more fuzzy 
>> contig + clear/free path, but I guess you would never get allocation 
>> failures, when there is sufficient contig space?
> Yes, extending extent tracking to all contig allocations has merit, but 
> the core problem remains - with the dual-tree design, we still need 
> force_merge to undo the clear/dirty split before those extents can form. 
> In cases like heavy small-allocation workloads (thousands of 4 KiB 
> buffers) running first, the memory ends up massively fragmented across 
> both trees. When a very large contiguous allocation (e.g., ~64 GiB) 
> comes in later, the allocator fails with OOM even though sufficient 
> total free memory exists, because the extent tracker can't find a 
> contiguous range that was never allowed to merge in the first place. I 
> think the dirty/clear split is fundamentally the problem - allowing the 
> buddy allocator to merge unconditionally removes this barrier, and the 
> clear tracker can then be layered on top as an optimization without 
> blocking coalescing.
>>
>>>
>>> Solution
>>>
>>> Replace the dual-tree design with:
>>> - A single free_tree[order] rbtree for dirty and mixed free blocks
>>>    (fully cleared free blocks float outside this tree)
>>> - A lightweight out-of-band clear tracker (gpu_clear_tracker)
>>>
>>> Fully cleared free blocks are tracked outside the buddy trees using an
>>> augmented interval rbtree, enabling O(log E) lookup of the largest
>>> cleared extents.
>>>
>>> Buddy coalescing is now unconditional in __gpu_buddy_free(), regardless
>>> of clear/dirty state. This removes the merge barrier and eliminates the
>>> need for __force_merge().
>>>
>>> Benefits
>>>
>>> - Correct high-order allocations after mixed clear/dirty workloads
>>> - Elimination of O(N x max_order) merge cost from the allocation path
>>> - O(log E) cleared-extent lookup replacing O(N) scans
>>> - Predictable allocation latency under fragmentation
>>> - Reduced complexity with a single tree per order
>>
>> Since there is no separate tracking for dirty stuff, is the
>> non-cleared alloc path a bit more "fuzzy" now, with it potentially 
>> stealing cleared memory, or is it the same behaviour still?
> Right, on v4, the dirty and mixed (partially cleared) blocks are 
> allocated for the non-cleared alloc path, which can end up stealing 
> cleared memory. On v5, I plan to address this with a three-tier dirty 
> allocation fallback: dirty → mixed → clear, driven by rbtree augment 
> bits (subtree_has_dirty, subtree_has_mixed), each pass O(log N). The 
> split-descent also applies the same preference at every level when 
> carving a higher-order block, so cleared memory is preserved as much as 
> possible and only used as a last resort.
> Thoughts ?

No objections from me. Do you want me to still look at v4 in depth, or 
wait for v5? I have only really looked at this from a high level.

>>
>> For drivers that don't use free tracking, is there some benefit? Are 
>> there any downsides there? I assume that clear tracker is always empty.
> Correct, for drivers that don't clear memory, the clear tracker is 
> always empty and they simply allocate from the free_tree[]. Benefits:
> 
> Single tree per order instead of dual trees (fewer rbtree operations)
> No force_merge path at all (unconditional coalescing at free time)
> Simpler code path overall
> 
> No real downsides - the clear tracker adds zero overhead when empty, and 
> the augment bits would simply show all blocks as dirty, so the walk 
> degenerates to a normal rbtree lookup with no extra cost.
> 
> Regards,
> Arun.
> 
> 



* Re: [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker
  2026-06-10  9:19     ` Matthew Auld
@ 2026-06-10 13:06       ` Arunpravin Paneer Selvam
  0 siblings, 0 replies; 12+ messages in thread
From: Arunpravin Paneer Selvam @ 2026-06-10 13:06 UTC (permalink / raw)
  To: Matthew Auld, christian.koenig, dri-devel, intel-gfx, intel-xe,
	amd-gfx
  Cc: alexander.deucher



On 6/10/2026 2:49 PM, Matthew Auld wrote:
> On 01/06/2026 11:51, Arunpravin Paneer Selvam wrote:
>>
>>
>> On 5/29/2026 11:11 PM, Matthew Auld wrote:
>>> Hi,
>>>
>>> On 27/05/2026 12:29, Arunpravin Paneer Selvam wrote:
>>>> The current buddy allocator maintains separate clear_tree[] and
>>>> dirty_tree[] rbtrees per order, preventing coalescing between cleared
>>>> and dirty buddies. Under mixed workloads, this creates a merge 
>>>> barrier:
>>>> adjacent buddies frequently end up split across trees, forcing 
>>>> reliance
>>>> on __force_merge() during allocation.
>>>>
>>>> __force_merge() performs an O(N x max_order) scan under the VRAM 
>>>> manager
>>>> lock, leading to allocation stalls and failures for large contiguous
>>>> requests even when sufficient total free memory is available.
>>>
>>> So is this contig with non power-of-two sizes?
>> Both power-of-two and non-power-of-two contiguous requests are 
>> affected - in either case, the required higher-order block can't form 
>> when its lower-order buddies are separated by clear/dirty state 
>> across the dual trees. But the core issue we are seeing is VRAM 
>> fragmentation caused by massive small allocations (e.g., thousands of 
>> 4 KiB–8 KiB buffers) that end up split across clear and dirty trees, 
>> preventing buddy coalescing. This leads to allocation failures and 
>> OOM in later workloads even when sufficient total free VRAM is 
>> available.
>>>
>>> Do we know if we could force_merge everything in one go or somehow 
>>> be more aggressive and do more than needed now, at the first sign of 
>>> contention here, instead of doing it piecemeal? Downside would be 
>>> losing more of the clear tracking, when this happens, but more
>>> re-merging.
>>>
>>> Could we have another per-order list, of all blocks that we failed 
>>> to merge, when we did the free step? When doing the force merge 
>>> step, we maybe don't need to search blindly and can focus instead on 
>>> the stuff tracked in those lists? Maybe it doesn't need to be a 
>>> list, but could be another rb-tree?
>>>
>>> We know the size of the total allocation, if we trigger force_merge, 
>>> could we try to merge enough in one go for the entire allocation, 
>>> instead of restarting the entire thing on the next iteration? Would 
>>> that help at all?
>>>
>>> But I guess these are more for the stalling side, and won't help 
>>> much with the contig angle?
>> The memory is highly fragmented into mostly 4 KiB chunks and small 
>> scattered blocks across the dual trees, so although total free memory 
>> exists, it is split into low-order fragments. The workload then 
>> requests very large contiguous allocations (tens of GBs, e.g., ~64 
>> GiB), which fail with OOM because the allocator cannot form 
>> sufficiently large high-order blocks from the fragmented space. We 
>> could go with more aggressive merging or merge-in-one-go approaches, 
>> but this might waste more cleared memory. I think fundamentally the 
>> buddy allocator should be allowed to merge unconditionally - the 
>> single-tree approach with unconditional coalescing would improve the 
>> fragmentation and benefit contiguous allocations along with 
>> addressing the stalling and latency issues.
>>>
>>> For the extent idea, is there any merit in maybe doing this for all 
>>> contig blobs, and not just cleared stuff? Or is the workload you are 
>>> seeing only benefit users that want cleared stuff? Wondering if this 
>>> would benefit all users that want contig? Like if we hypothetically 
>>> kept clear and dirty separate, like we do now, but with an improved 
>>> force_merge, and then have extent tracking for all contig blobs and 
>>> replace the try_harder stuff? When you do a contig alloc, the 
>>> individual clear/dirty is still all there within the range, so you 
>>> can skip re-clearing in some cases. I guess downside is overall more 
>>> fuzzy contig + clear/free path, but I guess you would never get 
>>> allocation failures, when there is sufficient contig space?
>> Yes, extending extent tracking to all contig allocations has merit, 
>> but the core problem remains - with the dual-tree design, we still 
>> need force_merge to undo the clear/dirty split before those extents 
>> can form. In cases like heavy small-allocation workloads (thousands 
>> of 4 KiB buffers) running first, the memory ends up massively 
>> fragmented across both trees. When a very large contiguous allocation 
>> (e.g., ~64 GiB) comes in later, the allocator fails with OOM even 
>> though sufficient total free memory exists, because the extent 
>> tracker can't find a contiguous range that was never allowed to merge 
>> in the first place. I think the dirty/clear split is fundamentally 
>> the problem - allowing the buddy allocator to merge unconditionally 
>> removes this barrier, and the clear tracker can then be layered on 
>> top as an optimization without blocking coalescing.
>>>
>>>>
>>>> Solution
>>>>
>>>> Replace the dual-tree design with:
>>>> - A single free_tree[order] rbtree for dirty and mixed free blocks
>>>>    (fully cleared free blocks float outside this tree)
>>>> - A lightweight out-of-band clear tracker (gpu_clear_tracker)
>>>>
>>>> Fully cleared free blocks are tracked outside the buddy trees using an
>>>> augmented interval rbtree, enabling O(log E) lookup of the largest
>>>> cleared extents.
>>>>
>>>> Buddy coalescing is now unconditional in __gpu_buddy_free(), 
>>>> regardless
>>>> of clear/dirty state. This removes the merge barrier and eliminates 
>>>> the
>>>> need for __force_merge().
>>>>
>>>> Benefits
>>>>
>>>> - Correct high-order allocations after mixed clear/dirty workloads
>>>> - Elimination of O(N x max_order) merge cost from the allocation path
>>>> - O(log E) cleared-extent lookup replacing O(N) scans
>>>> - Predictable allocation latency under fragmentation
>>>> - Reduced complexity with a single tree per order
>>>
>>> Since there is no separate tracking for dirty stuff, is the
>>> non-cleared alloc path a bit more "fuzzy" now, with it potentially 
>>> stealing cleared memory, or is it the same behaviour still?
>> Right, on v4, the dirty and mixed (partially cleared) blocks are 
>> allocated for the non-cleared alloc path, which can end up stealing 
>> cleared memory. On v5, I plan to address this with a three-tier dirty 
>> allocation fallback: dirty → mixed → clear, driven by rbtree augment 
>> bits (subtree_has_dirty, subtree_has_mixed), each pass O(log N). The 
>> split-descent also applies the same preference at every level when 
>> carving a higher-order block, so cleared memory is preserved as much 
>> as possible and only used as a last resort.
>> Thoughts ?
>
> No objections from me. Do you want me to still look at v4 in depth, or 
> wait for v5? I only really looked at this from high level.
I will send v5. Please wait for it and review that version.

Thanks,
Arun.
>
>>>
>>> For drivers that don't use free tracking, is there some benefit? Are 
>>> there any downsides there? I assume that clear tracker is always empty.
>> Correct, for drivers that don't clear memory, the clear tracker is 
>> always empty and they simply allocate from the free_tree[]. Benefits:
>>
>> Single tree per order instead of dual trees (fewer rbtree operations)
>> No force_merge path at all (unconditional coalescing at free time)
>> Simpler code path overall
>>
>> No real downsides - the clear tracker adds zero overhead when empty, 
>> and the augment bits would simply show all blocks as dirty, so the 
>> walk degenerates to a normal rbtree lookup with no extra cost.
>>
>> Regards,
>> Arun.
>>
>>
>



Thread overview: 12+ messages
2026-05-27 11:29 [PATCH v4 1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker Arunpravin Paneer Selvam
2026-05-27 11:29 ` [PATCH v4 2/2] gpu/tests/buddy: add clear-tracker allocation latency benchmarks Arunpravin Paneer Selvam
2026-05-27 14:30 ` ✗ CI.checkpatch: warning for series starting with [v4,1/2] gpu/buddy: replace dual-tree/force_merge with decoupled clear tracker Patchwork
2026-05-27 14:32 ` ✓ CI.KUnit: success " Patchwork
2026-05-27 15:24 ` ✓ Xe.CI.BAT: " Patchwork
2026-05-27 19:27 ` ✗ Xe.CI.FULL: failure " Patchwork
2026-05-28 12:33 ` [PATCH v4 1/2] " Arunpravin Paneer Selvam
2026-05-29 17:41 ` Matthew Auld
2026-06-01 10:51   ` Arunpravin Paneer Selvam
2026-06-10  6:27     ` Arunpravin Paneer Selvam
2026-06-10  9:19     ` Matthew Auld
2026-06-10 13:06       ` Arunpravin Paneer Selvam
