public inbox for intel-xe@lists.freedesktop.org
 help / color / mirror / Atom feed
* [RFC PATCH V7 0/9] Add memory page offlining support
@ 2026-04-13 13:16 Tejas Upadhyay
  2026-04-13 13:16 ` [RFC PATCH V7 1/9] drm/xe: Link VRAM object with gpu buddy Tejas Upadhyay
                   ` (13 more replies)
  0 siblings, 14 replies; 21+ messages in thread
From: Tejas Upadhyay @ 2026-04-13 13:16 UTC (permalink / raw)
  To: intel-xe
  Cc: matthew.auld, matthew.brost, thomas.hellstrom,
	himal.prasad.ghimiray, Tejas Upadhyay

This functionality represents a significant step in making
the xe driver gracefully handle hardware memory degradation.
By integrating with the DRM Buddy allocator, the driver
can permanently "carve out" faulty memory so it isn't reused
by subsequent allocations.
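The carve-out idea can be sketched with a toy page allocator (illustrative only; the real driver reserves 4K blocks through the buddy allocator, and all names here are hypothetical):

```c
#include <stdbool.h>

#define PAGE_SZ		4096u
#define N_PAGES		16u

/* Toy VRAM model: one flag per 4K page; true = unusable or allocated. */
static bool page_used[N_PAGES];

/* Permanently reserve the 4K page containing a faulty address. */
static bool offline_page(unsigned long addr)
{
	unsigned long pfn = addr / PAGE_SZ;

	if (pfn >= N_PAGES)
		return false;		/* address outside managed VRAM */
	page_used[pfn] = true;		/* never handed out again */
	return true;
}

/* First-fit allocation of a single 4K page; returns address or -1. */
static long alloc_page_addr(void)
{
	for (unsigned long pfn = 0; pfn < N_PAGES; pfn++) {
		if (!page_used[pfn]) {
			page_used[pfn] = true;
			return (long)(pfn * PAGE_SZ);
		}
	}
	return -1;
}
```

Once a page is offlined, subsequent allocations simply skip it, which is the behavior the buddy-based reservation below provides at scale.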

This series adds memory page offlining support with the following:
1. drm/xe/svm: Use xe_vram_addr_to_region, avoid block->private usage
2. Link and track ttm BO's with physical addresses
3. Link LRC BO and its execution Queue
4. Extend BO purge to handle vram pages as well
5. Handle the generated physical address error by reserving the address's 4K page
6. Add supporting debugfs to automate injection of physical address errors
7. Add buddy block allocation dump for debugging buddy-related issues
8. Add configfs for vram bad page reservation policy
9. Sysfs entry to provide statistics of bad gpu vram pages for user info

v7:
- Improve debugfs warning messages
- Use scope_guard for locking (MattB)
- Adapt addition of queue member of LRC BO (MattB)
- Extend and use xe_ttm_bo_purge API for vram pages (MattB)
- Handle dma_buf_map requests for native and remote (MattB)
- Set block to NULL if the address is in a never-initialized block
V6:
- Add more specific tests to noncritical bo sections
- Handle smooth exit of user created exec queues
- Break code and make purge specific static API
V5:
- Sysfs "max_pages" addition
- Reset block->private NULL post purge
- Remove wedge; return -EIO so the system controller will initiate a reset
- Add debugfs tests to trigger different test scenarios manually and via igt
- Rename addr_to_tbo to addr_to_block and move under gpu/buddy.c
V4: API reworks, add configfs for policy reservation and apply config everywhere
V3: use res_to_mem_region to avoid use of block->private (MattA)
V2:
- some fixes and clean up on errors
- Added xe_vram_addr_to_region helper to avoid other use of block->private (MattB)

Debugfs allows testing the different scenarios:
echo 0 > /sys/kernel/debug/dri/bdf/invalid_addr_vram0
where 0 is one of the address types below to be tested:
enum mempage_offline_mode {
        MEMPAGE_OFFLINE_UNALLOCATED = 0,
        MEMPAGE_OFFLINE_USER_ALLOCATED = 1,
        MEMPAGE_OFFLINE_KERNEL_USER_GGTT_ALLOCATED = 2,
        MEMPAGE_OFFLINE_KERNEL_USER_PPGTT_ALLOCATED = 3,
        MEMPAGE_OFFLINE_KERNEL_CRITICAL_ALLOCATED = 4,
        MEMPAGE_OFFLINE_RESERVED = 5,
};
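A minimal sketch of how the value written to the debugfs file might be validated against this enum (hypothetical helper, not the driver's actual handler; the bounds mirror the enum above):

```c
#define MEMPAGE_OFFLINE_MIN	0	/* MEMPAGE_OFFLINE_UNALLOCATED */
#define MEMPAGE_OFFLINE_MAX	5	/* MEMPAGE_OFFLINE_RESERVED */

/* Return the validated mode, or -1 (-EINVAL in kernel code) if out of range. */
static long parse_offline_mode(long val)
{
	if (val < MEMPAGE_OFFLINE_MIN || val > MEMPAGE_OFFLINE_MAX)
		return -1;
	return val;
}
```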

IGT tests for testing this feature:
https://patchwork.freedesktop.org/patch/714751/

Results of above tests:
Using IGT_SRANDOM=1774610050 for randomisation
Opened device: /dev/dri/card0
Starting subtest: unallocated
Subtest unallocated: SUCCESS (1.834s)
Starting subtest: user-allocated
Subtest user-allocated: SUCCESS (1.832s)
Starting subtest: user-ggtt-allocated
Subtest user-ggtt-allocated: SUCCESS (1.871s)
Starting subtest: user-ppgtt-allocated
Subtest user-ppgtt-allocated: SUCCESS (1.843s)
Starting subtest: critical-allocated
Subtest critical-allocated: SUCCESS (1.824s)
Starting subtest: reserved
Subtest reserved: SUCCESS (0.032s)

Tejas Upadhyay (9):
  drm/xe: Link VRAM object with gpu buddy
  drm/gpu: Add gpu_buddy_addr_to_block helper
  drm/xe: Link LRC BO and its execution Queue
  drm/xe: Extend BO purge to handle vram pages as well
  drm/xe: Handle physical memory address error
  drm/xe/cri: Add debugfs to inject faulty vram address
  gpu/buddy: Add routine to dump allocated buddy blocks
  drm/xe/configfs: Add vram bad page reservation policy
  drm/xe/cri: Add sysfs interface for bad gpu vram pages

 drivers/gpu/buddy.c                        |  99 ++++++
 drivers/gpu/drm/xe/xe_bo.c                 |  16 +-
 drivers/gpu/drm/xe/xe_bo.h                 |   5 +-
 drivers/gpu/drm/xe/xe_bo_types.h           |   3 +
 drivers/gpu/drm/xe/xe_configfs.c           |  64 +++-
 drivers/gpu/drm/xe/xe_configfs.h           |   2 +
 drivers/gpu/drm/xe/xe_debugfs.c            | 171 ++++++++++
 drivers/gpu/drm/xe/xe_device.c             |  51 +++
 drivers/gpu/drm/xe/xe_device_sysfs.c       |   7 +
 drivers/gpu/drm/xe/xe_dma_buf.c            |   3 +
 drivers/gpu/drm/xe/xe_exec_queue.c         |  10 +-
 drivers/gpu/drm/xe/xe_pt.c                 |   3 +-
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c       | 361 +++++++++++++++++++++
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.h       |   2 +
 drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h |  32 ++
 include/linux/gpu_buddy.h                  |   3 +
 16 files changed, 821 insertions(+), 11 deletions(-)

-- 
2.52.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH V7 1/9] drm/xe: Link VRAM object with gpu buddy
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
@ 2026-04-13 13:16 ` Tejas Upadhyay
  2026-04-30  3:50   ` Matthew Brost
  2026-04-13 13:16 ` [RFC PATCH V7 2/9] drm/gpu: Add gpu_buddy_addr_to_block helper Tejas Upadhyay
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 21+ messages in thread
From: Tejas Upadhyay @ 2026-04-13 13:16 UTC (permalink / raw)
  To: intel-xe
  Cc: matthew.auld, matthew.brost, thomas.hellstrom,
	himal.prasad.ghimiray, Tejas Upadhyay

Set up a link from gpu buddy blocks to their owning TTM buffer
object. This functionality is critical for supporting the memory
page offline feature on CRI, where identified faulty pages must
be traced back to their originating buffer for safe removal.
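The linking pattern can be sketched as follows (simplified model of the block->private tagging; the struct names are illustrative, not the xe/gpu_buddy types):

```c
#include <stddef.h>

/* Each allocator block handed to a BO records a back-pointer to that
 * BO, cleared again when the resource is freed. */
struct toy_block { void *private; };
struct toy_bo { int id; };

static void link_blocks(struct toy_block *blocks, size_t n, struct toy_bo *bo)
{
	for (size_t i = 0; i < n; i++)
		blocks[i].private = bo;		/* mirrors block->private = tbo */
}

static void unlink_blocks(struct toy_block *blocks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		blocks[i].private = NULL;	/* mirrors the _del path */
}

static int demo(void)
{
	struct toy_block b[2] = { {NULL}, {NULL} };
	struct toy_bo bo = { 7 };

	link_blocks(b, 2, &bo);
	if (b[0].private != &bo || b[1].private != &bo)
		return 0;
	unlink_blocks(b, 2);
	return b[0].private == NULL && b[1].private == NULL;
}
```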

V2(MattB): Clear block->private in xe_ttm_vram_mgr_del as well

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
---
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index 5fd0d5506a7e..01a9b92772f8 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -54,6 +54,7 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
 	struct xe_ttm_vram_mgr *mgr = to_xe_ttm_vram_mgr(man);
 	struct xe_ttm_vram_mgr_resource *vres;
 	struct gpu_buddy *mm = &mgr->mm;
+	struct gpu_buddy_block *block;
 	u64 size, min_page_size;
 	unsigned long lpfn;
 	int err;
@@ -138,6 +139,8 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
 	}
 
 	mgr->visible_avail -= vres->used_visible_size;
+	list_for_each_entry(block, &vres->blocks, link)
+		block->private = tbo;
 	mutex_unlock(&mgr->lock);
 
 	if (!(vres->base.placement & TTM_PL_FLAG_CONTIGUOUS) &&
@@ -176,8 +179,11 @@ static void xe_ttm_vram_mgr_del(struct ttm_resource_manager *man,
 		to_xe_ttm_vram_mgr_resource(res);
 	struct xe_ttm_vram_mgr *mgr = to_xe_ttm_vram_mgr(man);
 	struct gpu_buddy *mm = &mgr->mm;
+	struct gpu_buddy_block *block;
 
 	mutex_lock(&mgr->lock);
+	list_for_each_entry(block, &vres->blocks, link)
+		block->private = NULL;
 	gpu_buddy_free_list(mm, &vres->blocks, 0);
 	mgr->visible_avail += vres->used_visible_size;
 	mutex_unlock(&mgr->lock);
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH V7 2/9] drm/gpu: Add gpu_buddy_addr_to_block helper
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
  2026-04-13 13:16 ` [RFC PATCH V7 1/9] drm/xe: Link VRAM object with gpu buddy Tejas Upadhyay
@ 2026-04-13 13:16 ` Tejas Upadhyay
  2026-04-13 13:28   ` Matthew Auld
  2026-04-13 17:30   ` Matthew Auld
  2026-04-13 13:16 ` [RFC PATCH V7 3/9] drm/xe: Link LRC BO and its execution Queue Tejas Upadhyay
                   ` (11 subsequent siblings)
  13 siblings, 2 replies; 21+ messages in thread
From: Tejas Upadhyay @ 2026-04-13 13:16 UTC (permalink / raw)
  To: intel-xe
  Cc: matthew.auld, matthew.brost, thomas.hellstrom,
	himal.prasad.ghimiray, Tejas Upadhyay

Add a helper whose primary purpose is to efficiently trace a specific
physical memory address back to its corresponding TTM buffer object.
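The lookup idea can be sketched on a simplified binary buddy tree (toy model of the depth-first walk, with illustrative types rather than the gpu_buddy implementation):

```c
#include <stddef.h>

/* An unsplit, allocated block covering the address is the owner;
 * a free block means the address has no owner. */
struct toy_block {
	unsigned long start, size;
	int split;			/* has children */
	int allocated;			/* leaf owned by a BO */
	struct toy_block *left, *right;
};

static struct toy_block *addr_to_block(struct toy_block *root,
				       unsigned long addr)
{
	struct toy_block *b = root;

	while (b) {
		if (addr < b->start || addr >= b->start + b->size)
			return NULL;		/* outside this subtree */
		if (!b->split)
			return b->allocated ? b : NULL;	/* free: no owner */
		/* descend into the child whose range covers addr */
		b = (addr < b->left->start + b->left->size) ?
			b->left : b->right;
	}
	return NULL;
}

static int demo(void)
{
	struct toy_block l = { 0, 4096, 0, 1, NULL, NULL };
	struct toy_block r = { 4096, 4096, 0, 0, NULL, NULL };
	struct toy_block root = { 0, 8192, 1, 0, &l, &r };

	return addr_to_block(&root, 100) == &l &&	/* allocated leaf */
	       addr_to_block(&root, 5000) == NULL &&	/* free block */
	       addr_to_block(&root, 9000) == NULL;	/* out of range */
}
```

The real helper walks all roots with a DFS list rather than recursing, but the covering/overlap test per block is the same idea.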

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
---
 drivers/gpu/buddy.c       | 56 +++++++++++++++++++++++++++++++++++++++
 include/linux/gpu_buddy.h |  2 ++
 2 files changed, 58 insertions(+)

diff --git a/drivers/gpu/buddy.c b/drivers/gpu/buddy.c
index 52686672e99f..2d26c2a0f971 100644
--- a/drivers/gpu/buddy.c
+++ b/drivers/gpu/buddy.c
@@ -589,6 +589,62 @@ void gpu_buddy_free_block(struct gpu_buddy *mm,
 }
 EXPORT_SYMBOL(gpu_buddy_free_block);
 
+/**
+ * gpu_buddy_addr_to_block - given physical address find a block
+ *
+ * @mm: GPU buddy manager
+ * @addr: Physical address
+ *
+ * Returns:
+ * gpu_buddy_block on success, NULL or error code on failure
+ */
+struct gpu_buddy_block *gpu_buddy_addr_to_block(struct gpu_buddy *mm, u64 addr)
+{
+	struct gpu_buddy_block *block;
+	LIST_HEAD(dfs);
+	u64 end;
+	int i;
+
+	end = addr + SZ_4K - 1;
+	for (i = 0; i < mm->n_roots; ++i)
+		list_add_tail(&mm->roots[i]->tmp_link, &dfs);
+
+	do {
+		u64 block_start;
+		u64 block_end;
+
+		block = list_first_entry_or_null(&dfs,
+						 struct gpu_buddy_block,
+						 tmp_link);
+		if (!block)
+			break;
+
+		list_del(&block->tmp_link);
+
+		block_start = gpu_buddy_block_offset(block);
+		block_end = block_start + gpu_buddy_block_size(mm, block) - 1;
+
+		if (!overlaps(addr, end, block_start, block_end))
+			continue;
+
+		if (contains(addr, end, block_start, block_end) &&
+		    !gpu_buddy_block_is_split(block)) {
+			if (gpu_buddy_block_is_free(block))
+				return NULL;
+			else if (gpu_buddy_block_is_allocated(block) && !mm->clear_avail)
+				return block;
+		}
+
+		if (gpu_buddy_block_is_split(block)) {
+			list_add(&block->right->tmp_link, &dfs);
+			list_add(&block->left->tmp_link, &dfs);
+		}
+	} while (1);
+
+	return ERR_PTR(-ENXIO);
+}
+EXPORT_SYMBOL(gpu_buddy_addr_to_block);
+
 static void __gpu_buddy_free_list(struct gpu_buddy *mm,
 				  struct list_head *objects,
 				  bool mark_clear,
diff --git a/include/linux/gpu_buddy.h b/include/linux/gpu_buddy.h
index 5fa917ba5450..957c69c560bc 100644
--- a/include/linux/gpu_buddy.h
+++ b/include/linux/gpu_buddy.h
@@ -231,6 +231,8 @@ void gpu_buddy_reset_clear(struct gpu_buddy *mm, bool is_clear);
 
 void gpu_buddy_free_block(struct gpu_buddy *mm, struct gpu_buddy_block *block);
 
+struct gpu_buddy_block *gpu_buddy_addr_to_block(struct gpu_buddy *mm, u64 addr);
+
 void gpu_buddy_free_list(struct gpu_buddy *mm,
 			 struct list_head *objects,
 			 unsigned int flags);
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH V7 3/9] drm/xe: Link LRC BO and its execution Queue
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
  2026-04-13 13:16 ` [RFC PATCH V7 1/9] drm/xe: Link VRAM object with gpu buddy Tejas Upadhyay
  2026-04-13 13:16 ` [RFC PATCH V7 2/9] drm/gpu: Add gpu_buddy_addr_to_block helper Tejas Upadhyay
@ 2026-04-13 13:16 ` Tejas Upadhyay
  2026-04-13 13:16 ` [RFC PATCH V7 4/9] drm/xe: Extend BO purge to handle vram pages as well Tejas Upadhyay
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Tejas Upadhyay @ 2026-04-13 13:16 UTC (permalink / raw)
  To: intel-xe
  Cc: matthew.auld, matthew.brost, thomas.hellstrom,
	himal.prasad.ghimiray, Tejas Upadhyay

To establish a link between an LRC BO (Logical Ring Context
Buffer Object) and its corresponding execution queue in the
drm/xe driver, store a back-pointer to the queue within the
BO's private data structure. This allows the driver to
identify and take corrective action on the specific queue if
the LRC BO encounters an error (e.g., memory corruption or
eviction issues).
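The back-pointer pattern can be sketched like this (hypothetical toy structures, not the xe driver's types; the real corrective action is banning/killing the queue):

```c
#include <stddef.h>

struct toy_queue { int banned; };
struct toy_bo { struct toy_queue *q; };	/* NULL for non-LRC BOs */

/* On a fault in `bo`, ban the queue that owns it (if any). */
static int handle_bo_fault(struct toy_bo *bo)
{
	if (!bo->q)
		return 0;		/* not an LRC BO */
	bo->q->banned = 1;		/* mirrors xe_exec_queue_kill() */
	return 1;
}

static int demo(void)
{
	struct toy_queue q = { 0 };
	struct toy_bo lrc_bo = { &q }, plain_bo = { NULL };

	return handle_bo_fault(&lrc_bo) == 1 && q.banned == 1 &&
	       handle_bo_fault(&plain_bo) == 0;
}
```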

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
---
 drivers/gpu/drm/xe/xe_bo_types.h   | 3 +++
 drivers/gpu/drm/xe/xe_exec_queue.c | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_bo_types.h b/drivers/gpu/drm/xe/xe_bo_types.h
index ff8317bfc1ae..24f5da89dbde 100644
--- a/drivers/gpu/drm/xe/xe_bo_types.h
+++ b/drivers/gpu/drm/xe/xe_bo_types.h
@@ -19,6 +19,7 @@
 
 struct xe_device;
 struct xe_vm;
+struct xe_exec_queue;
 
 #define XE_BO_MAX_PLACEMENTS	3
 
@@ -39,6 +40,8 @@ struct xe_bo {
 	u32 flags;
 	/** @vm: VM this BO is attached to, for extobj this will be NULL */
 	struct xe_vm *vm;
+	/** @q: Queue this BO is attached to, mostly for LRC BO, NULL otherwise */
+	struct xe_exec_queue *q;
 	/** @tile: Tile this BO is attached to (kernel BO only) */
 	struct xe_tile *tile;
 	/** @placements: valid placements for this BO */
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index b287d0e0e60a..b3b80893c387 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -386,6 +386,7 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
 				goto err_lrc;
 			}
 
+			lrc->bo->q = q;
 			xe_exec_queue_set_lrc(q, lrc, i);
 
 			if (__lrc)
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH V7 4/9] drm/xe: Extend BO purge to handle vram pages as well
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
                   ` (2 preceding siblings ...)
  2026-04-13 13:16 ` [RFC PATCH V7 3/9] drm/xe: Link LRC BO and its execution Queue Tejas Upadhyay
@ 2026-04-13 13:16 ` Tejas Upadhyay
  2026-04-13 13:16 ` [RFC PATCH V7 5/9] drm/xe: Handle physical memory address error Tejas Upadhyay
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Tejas Upadhyay @ 2026-04-13 13:16 UTC (permalink / raw)
  To: intel-xe
  Cc: matthew.auld, matthew.brost, thomas.hellstrom,
	himal.prasad.ghimiray, Tejas Upadhyay

A recent driver update introduced support for purgeable buffer
objects (BOs); extend that API to cover VRAM pages as well, to
better manage memory pressure and enable memory offlining.
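The purge check being extended can be sketched as follows (simplified model under assumed semantics, not the xe API: a BO is purged only when marked DONTNEED, and the backing store is then dropped whether it is system memory or VRAM):

```c
enum toy_madv { TOY_MADV_WILLNEED, TOY_MADV_DONTNEED };

struct toy_bo {
	enum toy_madv madv;
	int has_backing;	/* system pages or VRAM blocks */
};

static int toy_bo_purge(struct toy_bo *bo)
{
	if (bo->madv != TOY_MADV_DONTNEED)
		return 0;		/* nothing to do */
	bo->has_backing = 0;		/* release pages / VRAM blocks */
	return 1;
}

static int demo(void)
{
	struct toy_bo keep = { TOY_MADV_WILLNEED, 1 };
	struct toy_bo drop = { TOY_MADV_DONTNEED, 1 };

	return toy_bo_purge(&keep) == 0 && keep.has_backing == 1 &&
	       toy_bo_purge(&drop) == 1 && drop.has_backing == 0;
}
```

The patch itself removes the `!ttm_bo->ttm` early return so the same path also applies to VRAM-backed BOs.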

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c | 5 +----
 drivers/gpu/drm/xe/xe_bo.h | 1 +
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index a7c2dc7f224c..7baa326c9421 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -916,7 +916,7 @@ void xe_bo_set_purgeable_state(struct xe_bo *bo,
  *
  * Return: 0 on success, negative error code on failure
  */
-static int xe_ttm_bo_purge(struct ttm_buffer_object *ttm_bo, struct ttm_operation_ctx *ctx)
+int xe_ttm_bo_purge(struct ttm_buffer_object *ttm_bo, struct ttm_operation_ctx *ctx)
 {
 	struct xe_bo *bo = ttm_to_xe_bo(ttm_bo);
 	struct ttm_placement place = {};
@@ -924,9 +924,6 @@ static int xe_ttm_bo_purge(struct ttm_buffer_object *ttm_bo, struct ttm_operatio
 
 	xe_bo_assert_held(bo);
 
-	if (!ttm_bo->ttm)
-		return 0;
-
 	if (!xe_bo_madv_is_dontneed(bo))
 		return 0;
 
diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index 68dea7d25a6b..9f55b3589caf 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -500,6 +500,7 @@ struct xe_bo_shrink_flags {
 long xe_bo_shrink(struct ttm_operation_ctx *ctx, struct ttm_buffer_object *bo,
 		  const struct xe_bo_shrink_flags flags,
 		  unsigned long *scanned);
+int xe_ttm_bo_purge(struct ttm_buffer_object *ttm_bo, struct ttm_operation_ctx *ctx);
 
 /**
  * xe_bo_is_mem_type - Whether the bo currently resides in the given
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH V7 5/9] drm/xe: Handle physical memory address error
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
                   ` (3 preceding siblings ...)
  2026-04-13 13:16 ` [RFC PATCH V7 4/9] drm/xe: Extend BO purge to handle vram pages as well Tejas Upadhyay
@ 2026-04-13 13:16 ` Tejas Upadhyay
  2026-04-30 11:28   ` Matthew Auld
  2026-04-13 13:16 ` [RFC PATCH V7 6/9] drm/xe/cri: Add debugfs to inject faulty vram address Tejas Upadhyay
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 21+ messages in thread
From: Tejas Upadhyay @ 2026-04-13 13:16 UTC (permalink / raw)
  To: intel-xe
  Cc: matthew.auld, matthew.brost, thomas.hellstrom,
	himal.prasad.ghimiray, Tejas Upadhyay

This functionality represents a significant step in making
the xe driver gracefully handle hardware memory degradation.
By integrating with the DRM Buddy allocator, the driver
can permanently "carve out" faulty memory so it isn't reused
by subsequent allocations.

Buddy Block Reservation:
----------------------
When a memory address is reported as faulty, the driver instructs
the DRM Buddy allocator to reserve a block of the specific page
size (typically 4KB). This marks the memory as "dirty/used"
indefinitely.

Two-Stage Tracking:
-----------------
Offlined Pages:
Pages that have been successfully isolated and removed from the
available memory pool.

Queued Pages:
Addresses that have been flagged as faulty but are currently in
use by a process. These are tracked until the associated buffer
object (BO) is released or migrated, at which point they move
to the "offlined" state.

Sysfs Reporting:
--------------
The patch exposes these metrics through a standard interface,
allowing administrators to monitor VRAM health:
/sys/bus/pci/devices/<device_id>/vram_bad_pages

V6:
- Use scope_guard for locking(MattB)
- Adapt addition of queue member of LRC BO(MattB)
- Extend and use xe_ttm_bo_purge API for vram pages(MattB)
- Handle dma_buf_map requests for native and remote(MattB)
- Address if in never initialized block, set block to NULL
V5:
- Categorise and handle BOs accordingly
- Fix crash found with new debugfs tests
V4:
- Set block->private NULL post bo purge
- Filter out gsm address early on
- Rebase
V3:
-rename api, remove tile dependency and add status of reservation
V2:
- Fix mm->avail counter issue
- Remove unused code and handle clean up in case of error

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c                 |  11 +-
 drivers/gpu/drm/xe/xe_bo.h                 |   4 +-
 drivers/gpu/drm/xe/xe_dma_buf.c            |   3 +
 drivers/gpu/drm/xe/xe_exec_queue.c         |   9 +-
 drivers/gpu/drm/xe/xe_pt.c                 |   3 +-
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c       | 267 +++++++++++++++++++++
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.h       |   1 +
 drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h |  28 +++
 8 files changed, 320 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 7baa326c9421..d84849cca0aa 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -158,7 +158,16 @@ bool xe_bo_is_vm_bound(struct xe_bo *bo)
 	return !list_empty(&bo->ttm.base.gpuva.list);
 }
 
-static bool xe_bo_is_user(struct xe_bo *bo)
+/**
+ * xe_bo_is_user - check if BO is user created BO
+ * @bo: The BO
+ *
+ * Check if the BO is a user created BO. This requires the
+ * reservation lock for the BO to be held.
+ *
+ * Returns: boolean
+ */
+bool xe_bo_is_user(struct xe_bo *bo)
 {
 	return bo->flags & XE_BO_FLAG_USER;
 }
diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index 9f55b3589caf..073fae905073 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -277,7 +277,8 @@ static inline void xe_bo_unpin_map_no_vm(struct xe_bo *bo)
 {
 	if (likely(bo)) {
 		xe_bo_lock(bo, false);
-		xe_bo_unpin(bo);
+		if (!xe_bo_is_purged(bo))
+			xe_bo_unpin(bo);
 		xe_bo_unlock(bo);
 
 		xe_bo_put(bo);
@@ -501,6 +502,7 @@ long xe_bo_shrink(struct ttm_operation_ctx *ctx, struct ttm_buffer_object *bo,
 		  const struct xe_bo_shrink_flags flags,
 		  unsigned long *scanned);
 int xe_ttm_bo_purge(struct ttm_buffer_object *ttm_bo, struct ttm_operation_ctx *ctx);
+bool xe_bo_is_user(struct xe_bo *bo);
 
 /**
  * xe_bo_is_mem_type - Whether the bo currently resides in the given
diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index 7f9602b3363d..e36ea88292f5 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -104,6 +104,9 @@ static struct sg_table *xe_dma_buf_map(struct dma_buf_attachment *attach,
 	struct sg_table *sgt;
 	int r = 0;
 
+	if (xe_bo_is_purged(bo))
+		return ERR_PTR(-ENOENT);
+
 	if (!attach->peer2peer && !xe_bo_can_migrate(bo, XE_PL_TT))
 		return ERR_PTR(-EOPNOTSUPP);
 
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index b3b80893c387..40ffc598e0f8 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -385,7 +385,6 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
 				err = PTR_ERR(lrc);
 				goto err_lrc;
 			}
-
 			lrc->bo->q = q;
 			xe_exec_queue_set_lrc(q, lrc, i);
 
@@ -1552,8 +1551,12 @@ void xe_exec_queue_update_run_ticks(struct xe_exec_queue *q)
 	 * errors.
 	 */
 	lrc = q->lrc[0];
-	new_ts = xe_lrc_update_timestamp(lrc, &old_ts);
-	q->xef->run_ticks[q->class] += (new_ts - old_ts) * q->width;
+	xe_bo_lock(lrc->bo, false);
+	if (!xe_bo_is_purged(lrc->bo)) {
+		new_ts = xe_lrc_update_timestamp(lrc, &old_ts);
+		q->xef->run_ticks[q->class] += (new_ts - old_ts) * q->width;
+	}
+	xe_bo_unlock(lrc->bo);
 
 	drm_dev_exit(idx);
 }
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index 8e5f4f0dea3f..1764bae6e481 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -211,7 +211,8 @@ void xe_pt_destroy(struct xe_pt *pt, u32 flags, struct llist_head *deferred)
 		return;
 
 	XE_WARN_ON(!list_empty(&pt->bo->ttm.base.gpuva.list));
-	xe_bo_unpin(pt->bo);
+	if (!xe_bo_is_purged(pt->bo))
+		xe_bo_unpin(pt->bo);
 	xe_bo_put_deferred(pt->bo, deferred);
 
 	if (pt->level > 0 && pt->num_live) {
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index 01a9b92772f8..ac6f034852f7 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -13,7 +13,10 @@
 
 #include "xe_bo.h"
 #include "xe_device.h"
+#include "xe_exec_queue.h"
+#include "xe_lrc.h"
 #include "xe_res_cursor.h"
+#include "xe_ttm_stolen_mgr.h"
 #include "xe_ttm_vram_mgr.h"
 #include "xe_vram_types.h"
 
@@ -280,6 +283,24 @@ static const struct ttm_resource_manager_func xe_ttm_vram_mgr_func = {
 	.debug	= xe_ttm_vram_mgr_debug
 };
 
+static void xe_ttm_vram_free_bad_pages(struct drm_device *dev, struct xe_ttm_vram_mgr *mgr)
+{
+	struct xe_ttm_vram_offline_resource *pos, *n;
+
+	list_for_each_entry_safe(pos, n, &mgr->offlined_pages, offlined_link) {
+		--mgr->n_offlined_pages;
+		gpu_buddy_free_list(&mgr->mm, &pos->blocks, 0);
+		mgr->visible_avail += pos->used_visible_size;
+		list_del(&pos->offlined_link);
+		kfree(pos);
+	}
+	list_for_each_entry_safe(pos, n, &mgr->queued_pages, queued_link) {
+		list_del(&pos->queued_link);
+		mgr->n_queued_pages--;
+		kfree(pos);
+	}
+}
+
 static void xe_ttm_vram_mgr_fini(struct drm_device *dev, void *arg)
 {
 	struct xe_device *xe = to_xe_device(dev);
@@ -291,6 +312,10 @@ static void xe_ttm_vram_mgr_fini(struct drm_device *dev, void *arg)
 	if (ttm_resource_manager_evict_all(&xe->ttm, man))
 		return;
 
+	mutex_lock(&mgr->lock);
+	xe_ttm_vram_free_bad_pages(dev, mgr);
+	mutex_unlock(&mgr->lock);
+
 	WARN_ON_ONCE(mgr->visible_avail != mgr->visible_size);
 
 	gpu_buddy_fini(&mgr->mm);
@@ -319,6 +344,8 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
 	man->func = &xe_ttm_vram_mgr_func;
 	mgr->mem_type = mem_type;
 	mutex_init(&mgr->lock);
+	INIT_LIST_HEAD(&mgr->offlined_pages);
+	INIT_LIST_HEAD(&mgr->queued_pages);
 	mgr->default_page_size = default_page_size;
 	mgr->visible_size = io_size;
 	mgr->visible_avail = io_size;
@@ -474,3 +501,243 @@ u64 xe_ttm_vram_get_avail(struct ttm_resource_manager *man)
 
 	return avail;
 }
+
+static int xe_ttm_vram_purge_page(struct xe_device *xe, struct xe_bo *bo)
+{
+	struct ttm_operation_ctx ctx = {};
+	struct xe_vm *vm;
+	u32	flags;
+	int ret = 0;
+
+	xe_bo_lock(bo, false);
+	vm = bo->vm;
+	flags = bo->flags;
+	xe_bo_unlock(bo);
+	/*  Ban VM if BO is PPGTT */
+	if (flags & XE_BO_FLAG_PAGETABLE) {
+		down_write(&vm->lock);
+		xe_vm_kill(vm, true);
+		up_write(&vm->lock);
+	}
+
+	xe_bo_lock(bo, false);
+	/*  Ban exec queue if BO is lrc */
+	if (bo->q && xe_exec_queue_get_unless_zero(bo->q)) {
+		/* ban queue */
+		xe_exec_queue_kill(bo->q);
+		xe_exec_queue_put(bo->q);
+	}
+
+	xe_bo_set_purgeable_state(bo, XE_MADV_PURGEABLE_DONTNEED);
+	ttm_bo_unmap_virtual(&bo->ttm);   /* nuke CPU mmap + VRAM IO mappings */
+	if (xe_bo_is_pinned(bo))
+		xe_bo_unpin(bo);
+	ret = xe_ttm_bo_purge(&bo->ttm, &ctx);
+	xe_bo_unlock(bo);
+
+	return ret;
+}
+
+static int xe_ttm_vram_reserve_page_at_addr(struct xe_device *xe, unsigned long addr,
+					    struct xe_ttm_vram_mgr *vram_mgr, struct gpu_buddy *mm)
+{
+	struct xe_ttm_vram_offline_resource *nentry;
+	struct ttm_buffer_object *tbo = NULL;
+	struct gpu_buddy_block *block;
+	struct gpu_buddy_block *b, *m;
+	enum reserve_status {
+		pending = 0,
+		fail
+	};
+	u64 size = SZ_4K;
+	int ret = 0;
+
+	scoped_guard(mutex, &vram_mgr->lock) {
+		block = gpu_buddy_addr_to_block(mm, addr);
+		if (PTR_ERR(block) == -ENXIO)
+			/* VRAM region check passed earlier; safe to proceed */
+			block = NULL;
+
+		nentry = kzalloc_obj(*nentry);
+		if (!nentry)
+			return -ENOMEM;
+		INIT_LIST_HEAD(&nentry->blocks);
+		nentry->status = pending;
+		nentry->addr = addr;
+
+		if (block) {
+			struct xe_bo *pbo;
+
+			WARN_ON(!block->private);
+			tbo = block->private;
+			pbo = ttm_to_xe_bo(tbo);
+
+			/* Get reference safely - BO may have zero refcount */
+			if (!xe_bo_get_unless_zero(pbo)) {
+				kfree(nentry);
+				return -ENOENT;
+			}
+			/* Critical kernel BO? */
+			if ((pbo->ttm.type == ttm_bo_type_kernel &&
+			     !(pbo->flags & XE_BO_FLAG_PINNED_LATE_RESTORE)) ||
+			    (xe_bo_is_user(pbo) && xe_bo_is_pinned(pbo))) {
+				kfree(nentry);
+				xe_ttm_vram_free_bad_pages(&xe->drm, vram_mgr);
+				xe_bo_put(pbo);
+				drm_err(&xe->drm,
+					"%s: addr: 0x%lx is critical kernel bo, requesting SBR\n",
+					__func__, addr);
+				/* Hint System controller driver for reset with -EIO  */
+				return -EIO;
+			}
+			nentry->id = ++vram_mgr->n_queued_pages;
+			list_add(&nentry->queued_link, &vram_mgr->queued_pages);
+		}
+	}
+	if (block) {
+		struct xe_ttm_vram_offline_resource *pos, *n;
+		struct xe_bo *pbo = ttm_to_xe_bo(tbo);
+
+		/* Purge BO containing address - reference held from above */
+		ret = xe_ttm_vram_purge_page(xe, pbo);
+		xe_bo_put(pbo);
+		if (ret) {
+			nentry->status = fail;
+			return ret;
+		}
+
+		/* Reserve page at address addr*/
+		scoped_guard(mutex, &vram_mgr->lock) {
+			ret = gpu_buddy_alloc_blocks(mm, addr, addr + size,
+						     size, size, &nentry->blocks,
+						     GPU_BUDDY_RANGE_ALLOCATION);
+
+			if (ret) {
+				drm_warn(&xe->drm, "Could not reserve page at addr:0x%lx, ret:%d\n",
+					 addr, ret);
+				nentry->status = fail;
+				return ret;
+			}
+
+			list_for_each_entry_safe(b, m, &nentry->blocks, link)
+				b->private = NULL;
+
+			if ((addr + size) <= vram_mgr->visible_size) {
+				nentry->used_visible_size = size;
+			} else {
+				list_for_each_entry(b, &nentry->blocks, link) {
+					u64 start = gpu_buddy_block_offset(b);
+
+					if (start < vram_mgr->visible_size) {
+						u64 end = start + gpu_buddy_block_size(mm, b);
+
+						nentry->used_visible_size +=
+							min(end, vram_mgr->visible_size) - start;
+					}
+				}
+			}
+			vram_mgr->visible_avail -= nentry->used_visible_size;
+			list_for_each_entry_safe(pos, n, &vram_mgr->queued_pages, queued_link) {
+				if (pos->id == nentry->id) {
+					--vram_mgr->n_queued_pages;
+					list_del(&pos->queued_link);
+					break;
+				}
+			}
+			list_add(&nentry->offlined_link, &vram_mgr->offlined_pages);
+			/* TODO: FW Integration: Send command to FW for offlining page */
+			++vram_mgr->n_offlined_pages;
+			return ret;
+		}
+	} else {
+		scoped_guard(mutex, &vram_mgr->lock) {
+			ret = gpu_buddy_alloc_blocks(mm, addr, addr + size,
+						     size, size, &nentry->blocks,
+						     GPU_BUDDY_RANGE_ALLOCATION);
+			if (ret) {
+				drm_warn(&xe->drm, "Could not reserve page at addr:0x%lx, ret:%d\n",
+					 addr, ret);
+				nentry->status = fail;
+				return ret;
+			}
+
+			list_for_each_entry_safe(b, m, &nentry->blocks, link)
+				b->private = NULL;
+
+			if ((addr + size) <= vram_mgr->visible_size) {
+				nentry->used_visible_size = size;
+			} else {
+				struct gpu_buddy_block *block;
+
+				list_for_each_entry(block, &nentry->blocks, link) {
+					u64 start = gpu_buddy_block_offset(block);
+
+					if (start < vram_mgr->visible_size) {
+						u64 end = start + gpu_buddy_block_size(mm, block);
+
+						nentry->used_visible_size +=
+							min(end, vram_mgr->visible_size) - start;
+					}
+				}
+			}
+			vram_mgr->visible_avail -= nentry->used_visible_size;
+			nentry->id = ++vram_mgr->n_offlined_pages;
+			list_add(&nentry->offlined_link, &vram_mgr->offlined_pages);
+			/* TODO: FW Integration: Send command to FW for offlining page */
+		}
+	}
+	/* Success */
+	return ret;
+}
+
+static struct xe_vram_region *xe_ttm_vram_addr_to_region(struct xe_device *xe,
+							 resource_size_t addr)
+{
+	unsigned long stolen_base = xe_ttm_stolen_gpu_offset(xe);
+	struct xe_vram_region *vr;
+	struct xe_tile *tile;
+	int id;
+
+	/* Addr from stolen memory? */
+	if (addr + SZ_4K >= stolen_base)
+		return NULL;
+
+	for_each_tile(tile, xe, id) {
+		vr = tile->mem.vram;
+		if ((addr <= vr->dpa_base + vr->actual_physical_size) &&
+		    (addr + SZ_4K >= vr->dpa_base))
+			return vr;
+	}
+	return NULL;
+}
+
+/**
+ * xe_ttm_vram_handle_addr_fault - Handle vram physical address error flagged
+ * @xe: pointer to parent device
+ * @addr: physical faulty address
+ *
+ * Handle the physical faulty address error on the specific tile.
+ *
+ * Returns 0 for success, negative error code otherwise.
+ */
+int xe_ttm_vram_handle_addr_fault(struct xe_device *xe, unsigned long addr)
+{
+	struct xe_ttm_vram_mgr *vram_mgr;
+	struct xe_vram_region *vr;
+	struct gpu_buddy *mm;
+	int ret;
+
+	vr = xe_ttm_vram_addr_to_region(xe, addr);
+	if (!vr) {
+		drm_err(&xe->drm, "%s:%d addr:%lx error requesting SBR\n",
+			__func__, __LINE__, addr);
+		/* Hint System controller driver for reset with -EIO  */
+		return -EIO;
+	}
+	vram_mgr = &vr->ttm;
+	mm = &vram_mgr->mm;
+	/* Reserve page at address */
+	ret = xe_ttm_vram_reserve_page_at_addr(xe, addr, vram_mgr, mm);
+	return ret;
+}
+EXPORT_SYMBOL(xe_ttm_vram_handle_addr_fault);
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
index 87b7fae5edba..8ef06d9d44f7 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
@@ -31,6 +31,7 @@ u64 xe_ttm_vram_get_cpu_visible_size(struct ttm_resource_manager *man);
 void xe_ttm_vram_get_used(struct ttm_resource_manager *man,
 			  u64 *used, u64 *used_visible);
 
+int xe_ttm_vram_handle_addr_fault(struct xe_device *xe, unsigned long addr);
 static inline struct xe_ttm_vram_mgr_resource *
 to_xe_ttm_vram_mgr_resource(struct ttm_resource *res)
 {
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
index 9106da056b49..3ad7966798eb 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
@@ -19,6 +19,14 @@ struct xe_ttm_vram_mgr {
 	struct ttm_resource_manager manager;
 	/** @mm: DRM buddy allocator which manages the VRAM */
 	struct gpu_buddy mm;
+	/** @offlined_pages: List of offlined pages */
+	struct list_head offlined_pages;
+	/** @n_offlined_pages: Number of offlined pages */
+	u16 n_offlined_pages;
+	/** @queued_pages: List of queued pages */
+	struct list_head queued_pages;
+	/** @n_queued_pages: Number of queued pages */
+	u16 n_queued_pages;
 	/** @visible_size: Proped size of the CPU visible portion */
 	u64 visible_size;
 	/** @visible_avail: CPU visible portion still unallocated */
@@ -45,4 +53,24 @@ struct xe_ttm_vram_mgr_resource {
 	unsigned long flags;
 };
 
+/**
+ * struct xe_ttm_vram_offline_resource - Xe TTM VRAM offline resource
+ */
+struct xe_ttm_vram_offline_resource {
+	/** @offlined_link: Link to offlined pages */
+	struct list_head offlined_link;
+	/** @queued_link: Link to queued pages */
+	struct list_head queued_link;
+	/** @blocks: list of DRM buddy blocks */
+	struct list_head blocks;
+	/** @used_visible_size: How many CPU visible bytes this resource is using */
+	u64 used_visible_size;
+	/** @id: The id of an offline resource */
+	u16 id;
+	/** @addr: Address of faulty memory location reported by HW */
+	unsigned long addr;
+	/** @status: reservation status of resource */
+	bool status;
+};
+
 #endif
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH V7 6/9] drm/xe/cri: Add debugfs to inject faulty vram address
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
                   ` (4 preceding siblings ...)
  2026-04-13 13:16 ` [RFC PATCH V7 5/9] drm/xe: Handle physical memory address error Tejas Upadhyay
@ 2026-04-13 13:16 ` Tejas Upadhyay
  2026-04-13 13:16 ` [RFC PATCH V7 7/9] gpu/buddy: Add routine to dump allocated buddy blocks Tejas Upadhyay
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Tejas Upadhyay @ 2026-04-13 13:16 UTC (permalink / raw)
  To: intel-xe
  Cc: matthew.auld, matthew.brost, thomas.hellstrom,
	himal.prasad.ghimiray, Tejas Upadhyay

Add a debugfs interface to the drm/xe driver that allows manual
injection of faulty VRAM addresses, facilitating testing of the CRI
memory page offlining feature before it is fully functional. The
implementation creates a debugfs entry under
/sys/kernel/debug/dri/<bdf>/invalid_addr_vram0
which accepts the type of address to be tested.

For example,
echo 0 > /sys/kernel/debug/dri/bdf/invalid_addr_vram0
where 0 selects one of the following address types to be tested:
enum mempage_offline_mode {
        MEMPAGE_OFFLINE_UNALLOCATED = 0,
        MEMPAGE_OFFLINE_USER_ALLOCATED = 1,
        MEMPAGE_OFFLINE_KERNEL_USER_GGTT_ALLOCATED = 2,
        MEMPAGE_OFFLINE_KERNEL_USER_PPGTT_ALLOCATED = 3,
        MEMPAGE_OFFLINE_KERNEL_CRITICAL_ALLOCATED = 4,
        MEMPAGE_OFFLINE_RESERVED = 5
};
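
A sketch of how a test script might drive this interface (path and
node name are taken from the example above; the helper name itself is
hypothetical):

```shell
# Hypothetical wrapper around the debugfs node described above. Valid
# modes mirror enum mempage_offline_mode (0..5); anything else would be
# rejected by the driver with -EINVAL, so fail early in the script.
inject_offline_mode() {
    node="$1"   # e.g. /sys/kernel/debug/dri/<bdf>/invalid_addr_vram0
    mode="$2"
    case "$mode" in
        [0-5]) echo "$mode" > "$node" ;;
        *) echo "invalid offline mode: $mode" >&2; return 1 ;;
    esac
}
```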

v4:
- Use scope_guard around lock, adapt bo->q and enhance warn messages
v3:
- Add more specific noncritical bo tests
v2:
- Add mode based automated test vs manual address feed

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
---
 drivers/gpu/drm/xe/xe_debugfs.c            | 171 +++++++++++++++++++++
 drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h |   2 +
 2 files changed, 173 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
index c9d4484821af..b8168f3992e0 100644
--- a/drivers/gpu/drm/xe/xe_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_debugfs.c
@@ -14,6 +14,7 @@
 #include "regs/xe_pmt.h"
 #include "xe_bo.h"
 #include "xe_device.h"
+#include "xe_exec_queue_types.h"
 #include "xe_force_wake.h"
 #include "xe_gt.h"
 #include "xe_gt_debugfs.h"
@@ -21,6 +22,7 @@
 #include "xe_guc_ads.h"
 #include "xe_hw_engine.h"
 #include "xe_mmio.h"
+#include "xe_migrate.h"
 #include "xe_pm.h"
 #include "xe_psmi.h"
 #include "xe_pxp_debugfs.h"
@@ -29,6 +31,8 @@
 #include "xe_sriov_vf.h"
 #include "xe_step.h"
 #include "xe_tile_debugfs.h"
+#include "xe_ttm_stolen_mgr.h"
+#include "xe_ttm_vram_mgr.h"
 #include "xe_vsec.h"
 #include "xe_wa.h"
 
@@ -40,6 +44,14 @@
 
 DECLARE_FAULT_ATTR(gt_reset_failure);
 DECLARE_FAULT_ATTR(inject_csc_hw_error);
+enum mempage_offline_mode {
+	MEMPAGE_OFFLINE_UNALLOCATED = 0,
+	MEMPAGE_OFFLINE_USER_ALLOCATED = 1,
+	MEMPAGE_OFFLINE_KERNEL_USER_GGTT_ALLOCATED = 2,
+	MEMPAGE_OFFLINE_KERNEL_USER_PPGTT_ALLOCATED = 3,
+	MEMPAGE_OFFLINE_KERNEL_CRITICAL_ALLOCATED = 4,
+	MEMPAGE_OFFLINE_RESERVED = 5,
+};
 
 static void read_residency_counter(struct xe_device *xe, struct xe_mmio *mmio,
 				   u32 offset, const char *name, struct drm_printer *p)
@@ -544,6 +556,154 @@ static const struct file_operations disable_late_binding_fops = {
 	.write = disable_late_binding_set,
 };
 
+static ssize_t addr_fault_reporting_show(struct file *f, char __user *ubuf,
+					 size_t size, loff_t *pos)
+{
+	struct xe_device *xe = file_inode(f)->i_private;
+	char buf[32];
+	int len;
+
+	len = scnprintf(buf, sizeof(buf), "%llu\n", xe->mem.vram->ttm.offline_mode);
+
+	return simple_read_from_buffer(ubuf, size, pos, buf, len);
+}
+
+static int mempage_exec_offline(struct xe_device *xe, u64 mode)
+{
+	struct xe_tile *tile = xe_device_get_root_tile(xe);
+	struct xe_vram_region *vr = tile->mem.vram;
+	struct ttm_buffer_object *tbo = NULL;
+	struct xe_ttm_vram_mgr *vram_mgr;
+	struct gpu_buddy_block *block;
+	bool do_offline = false;
+	struct gpu_buddy *mm;
+	struct xe_bo *bo;
+	u64 addr = 0x0;
+	int ret = 0;
+
+	vram_mgr = &vr->ttm;
+	mm = &vram_mgr->mm;
+	addr = vr->dpa_base;
+	while (addr <= vr->dpa_base + vr->actual_physical_size) {
+		scoped_guard(mutex, &vram_mgr->lock) {
+			block = gpu_buddy_addr_to_block(mm, addr);
+			if (!block && mode == MEMPAGE_OFFLINE_UNALLOCATED)
+				do_offline = true;
+			if (block && PTR_ERR(block) != -ENXIO) {
+				if (!block->private) {
+					addr = addr + SZ_4K;
+					do_offline = false;
+					continue;
+				}
+				tbo = block->private;
+				bo = ttm_to_xe_bo(tbo);
+				if (bo->ttm.type == ttm_bo_type_device &&
+				    bo->flags & XE_BO_FLAG_USER &&
+				    bo->flags & XE_BO_FLAG_VRAM_MASK &&
+				    mode == MEMPAGE_OFFLINE_USER_ALLOCATED) {
+					do_offline = true;
+				} else if (bo->q &&
+					   mode == MEMPAGE_OFFLINE_KERNEL_USER_GGTT_ALLOCATED) {
+					/* lrc */
+					struct xe_vm *migrate_vm;
+
+					migrate_vm = xe_migrate_get_vm(tile->migrate);
+					if (migrate_vm != bo->q->vm)
+						do_offline = true;
+					xe_vm_put(migrate_vm);
+				} else if (bo->ttm.type == ttm_bo_type_kernel &&
+					   bo->flags & XE_BO_FLAG_FORCE_USER_VRAM &&
+					   bo->flags & XE_BO_FLAG_PAGETABLE &&
+					   mode == MEMPAGE_OFFLINE_KERNEL_USER_PPGTT_ALLOCATED) {
+					/* ppgtt */
+					do_offline = true;
+				} else if (bo->ttm.type == ttm_bo_type_kernel &&
+					   !(bo->flags & XE_BO_FLAG_FORCE_USER_VRAM) &&
+					   mode == MEMPAGE_OFFLINE_KERNEL_CRITICAL_ALLOCATED) {
+					do_offline = true;
+				}
+			}
+		}
+		if (do_offline) {
+			/* Report fault */
+			ret = xe_ttm_vram_handle_addr_fault(xe, addr);
+			if (ret) {
+				if ((ret == -EIO) &&
+				    mode == MEMPAGE_OFFLINE_KERNEL_USER_GGTT_ALLOCATED) {
+					addr = addr + SZ_4K;
+					if (do_offline)
+						do_offline = false;
+					continue;
+				}
+				break;
+			}
+			/* Verify addr + SZ_4K is allocated */
+			scoped_guard(mutex, &vram_mgr->lock) {
+				block = gpu_buddy_addr_to_block(mm, addr);
+				if (!block || PTR_ERR(block) == -ENXIO || block->private)
+					ret = -EBUSY;
+			}
+			break;
+		}
+		addr = addr + SZ_4K;
+		if (do_offline)
+			do_offline = false;
+	}
+	if (!do_offline)
+		drm_warn(&xe->drm, "no object matching offline mode found, ret:%d\n", ret);
+
+	return ret;
+}
+
+static ssize_t addr_fault_reporting_set(struct file *f, const char __user *ubuf,
+					size_t size, loff_t *pos)
+{
+	struct xe_device *xe = file_inode(f)->i_private;
+	int ret = 0;
+	u64 mode;
+
+	ret = kstrtou64_from_user(ubuf, size, 0, &mode);
+	if (ret)
+		return ret;
+
+	switch (mode) {
+	case MEMPAGE_OFFLINE_UNALLOCATED:
+	case MEMPAGE_OFFLINE_USER_ALLOCATED:
+	case MEMPAGE_OFFLINE_KERNEL_USER_GGTT_ALLOCATED:
+	case MEMPAGE_OFFLINE_KERNEL_USER_PPGTT_ALLOCATED:
+	case MEMPAGE_OFFLINE_KERNEL_CRITICAL_ALLOCATED:
+		ret = mempage_exec_offline(xe, mode);
+		break;
+	case MEMPAGE_OFFLINE_RESERVED: {
+		u64 stolen_base;
+
+		stolen_base = xe_ttm_stolen_gpu_offset(xe);
+		ret = xe_ttm_vram_handle_addr_fault(xe, stolen_base);
+		break;
+	}
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	xe->mem.vram->ttm.offline_mode = mode;
+	if (!ret || (ret == -EIO &&
+		     (mode == MEMPAGE_OFFLINE_KERNEL_CRITICAL_ALLOCATED ||
+		      mode == MEMPAGE_OFFLINE_RESERVED))) {
+		drm_info(&xe->drm, "offline mode %llu passed ret:%d\n", mode, ret);
+	} else {
+		drm_warn(&xe->drm, "offline mode %llu failed, ret:%d\n", mode, ret);
+		return ret;
+	}
+
+	return size;
+}
+
+static const struct file_operations addr_fault_reporting_fops = {
+	.owner = THIS_MODULE,
+	.read = addr_fault_reporting_show,
+	.write = addr_fault_reporting_set,
+};
+
 void xe_debugfs_register(struct xe_device *xe)
 {
 	struct ttm_device *bdev = &xe->ttm;
@@ -600,6 +760,17 @@ void xe_debugfs_register(struct xe_device *xe)
 	if (man)
 		ttm_resource_manager_create_debugfs(man, root, "stolen_mm");
 
+	if (xe->info.platform == XE_CRESCENTISLAND) {
+		man = ttm_manager_type(bdev, XE_PL_VRAM0);
+		if (man) {
+			char name[20];
+
+			snprintf(name, sizeof(name), "invalid_addr_vram%d", 0);
+			debugfs_create_file(name, 0600, root, xe,
+					    &addr_fault_reporting_fops);
+		}
+	}
+
 	for_each_tile(tile, xe, tile_id)
 		xe_tile_debugfs_register(tile);
 
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
index 3ad7966798eb..07ed88b47e04 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
@@ -37,6 +37,8 @@ struct xe_ttm_vram_mgr {
 	struct mutex lock;
 	/** @mem_type: The TTM memory type */
 	u32 mem_type;
+	/** @offline_mode: debugfs hook for setting page offline mode */
+	u64 offline_mode;
 };
 
 /**
-- 
2.52.0



* [RFC PATCH V7 7/9] gpu/buddy: Add routine to dump allocated buddy blocks
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
                   ` (5 preceding siblings ...)
  2026-04-13 13:16 ` [RFC PATCH V7 6/9] drm/xe/cri: Add debugfs to inject faulty vram address Tejas Upadhyay
@ 2026-04-13 13:16 ` Tejas Upadhyay
  2026-04-13 13:16 ` [RFC PATCH V7 8/9] drm/xe/configfs: Add vram bad page reservation policy Tejas Upadhyay
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Tejas Upadhyay @ 2026-04-13 13:16 UTC (permalink / raw)
  To: intel-xe
  Cc: matthew.auld, matthew.brost, thomas.hellstrom,
	himal.prasad.ghimiray, Tejas Upadhyay

To allow inspecting the allocated blocks under a specific VRAM
instance in the drm driver, a new API is introduced. While the
existing helpers only dump the free block lists, this addition
provides a comprehensive view of all currently resident VRAM
allocations.

The dump will look like:

[  +0.000003] xe 0000:03:00.0: [drm] 0x00000002f8000000-0x00000002f8800000: 8388608
[  +0.000005] xe 0000:03:00.0: [drm] 0x00000002f8800000-0x00000002f8840000: 262144
[  +0.000004] xe 0000:03:00.0: [drm] 0x00000002f8840000-0x00000002f8860000: 131072
[  +0.000004] xe 0000:03:00.0: [drm] 0x00000002f8860000-0x00000002f8870000: 65536
[  +0.000005] xe 0000:03:00.0: [drm] 0x00000002f9000000-0x00000002f9800000: 8388608
[  +0.000004] xe 0000:03:00.0: [drm] 0x00000002f9800000-0x00000002f9880000: 524288
[  +0.000005] xe 0000:03:00.0: [drm] 0x00000002f9880000-0x00000002f9884000: 16384
[  +0.000004] xe 0000:03:00.0: [drm] 0x00000002f9900000-0x00000002f9980000: 524288
[  +0.000005] xe 0000:03:00.0: [drm] 0x00000002f9980000-0x00000002f9988000: 32768
[  +0.000004] xe 0000:03:00.0: [drm] 0x00000002f9988000-0x00000002f998c000: 16384
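
For quick analysis the dump can be post-processed from dmesg; a
minimal sketch (the trailing size field is assumed from the sample
above) that totals the allocated bytes:

```shell
# Sum the sizes printed by gpu_buddy_dump_allocated_blocks(). Every
# dump line ends in "<start>-<end>: <size>", so accumulate the last
# awk field across all lines.
sum_allocated() {
    awk '{ total += $NF } END { print total }'
}
```

e.g. `dmesg | grep '\[drm\] 0x' | sum_allocated` (grep pattern is an
assumption based on the sample output).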

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
---
 drivers/gpu/buddy.c       | 43 +++++++++++++++++++++++++++++++++++++++
 include/linux/gpu_buddy.h |  1 +
 2 files changed, 44 insertions(+)

diff --git a/drivers/gpu/buddy.c b/drivers/gpu/buddy.c
index 2d26c2a0f971..3e3f0dbbbda6 100644
--- a/drivers/gpu/buddy.c
+++ b/drivers/gpu/buddy.c
@@ -10,6 +10,7 @@
 #include <linux/sizes.h>
 
 #include <linux/gpu_buddy.h>
+#include <drm/drm_print.h>
 
 /**
  * gpu_buddy_assert - assert a condition in the buddy allocator
@@ -1289,6 +1290,48 @@ int gpu_buddy_block_trim(struct gpu_buddy *mm,
 }
 EXPORT_SYMBOL(gpu_buddy_block_trim);
 
+/**
+ * gpu_buddy_dump_allocated_blocks - print all allocated blocks in gpu buddy
+ *
+ * @mm: GPU buddy manager to look into
+ *
+ * Walks the buddy manager depth-first and, for each block whose state
+ * is allocated, prints the block range and size.
+ *
+ * Returns:
+ * void
+ */
+void gpu_buddy_dump_allocated_blocks(struct gpu_buddy *mm)
+{
+	struct gpu_buddy_block *block;
+	LIST_HEAD(dfs);
+	int i;
+
+	for (i = 0; i < mm->n_roots; ++i)
+		list_add_tail(&mm->roots[i]->tmp_link, &dfs);
+
+	do {
+		block = list_first_entry_or_null(&dfs,
+						 struct gpu_buddy_block,
+						 tmp_link);
+		if (!block)
+			break;
+
+		list_del(&block->tmp_link);
+
+		if (gpu_buddy_block_is_allocated(block))
+			gpu_buddy_block_print(mm, block);
+
+		if (gpu_buddy_block_is_split(block)) {
+			list_add(&block->right->tmp_link, &dfs);
+			list_add(&block->left->tmp_link, &dfs);
+		}
+	} while (1);
+}
+EXPORT_SYMBOL(gpu_buddy_dump_allocated_blocks);
+
 static struct gpu_buddy_block *
 __gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
 			 u64 start, u64 end,
diff --git a/include/linux/gpu_buddy.h b/include/linux/gpu_buddy.h
index 957c69c560bc..0a09603fa8b6 100644
--- a/include/linux/gpu_buddy.h
+++ b/include/linux/gpu_buddy.h
@@ -226,6 +226,7 @@ int gpu_buddy_block_trim(struct gpu_buddy *mm,
 			 u64 *start,
 			 u64 new_size,
 			 struct list_head *blocks);
+void gpu_buddy_dump_allocated_blocks(struct gpu_buddy *mm);
 
 void gpu_buddy_reset_clear(struct gpu_buddy *mm, bool is_clear);
 
-- 
2.52.0



* [RFC PATCH V7 8/9] drm/xe/configfs: Add vram bad page reservation policy
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
                   ` (6 preceding siblings ...)
  2026-04-13 13:16 ` [RFC PATCH V7 7/9] gpu/buddy: Add routine to dump allocated buddy blocks Tejas Upadhyay
@ 2026-04-13 13:16 ` Tejas Upadhyay
  2026-04-13 13:16 ` [RFC PATCH V7 9/9] drm/xe/cri: Add sysfs interface for bad gpu vram pages Tejas Upadhyay
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Tejas Upadhyay @ 2026-04-13 13:16 UTC (permalink / raw)
  To: intel-xe
  Cc: matthew.auld, matthew.brost, thomas.hellstrom,
	himal.prasad.ghimiray, Tejas Upadhyay

The interface enables setting the policy for how bad pages are
handled in VRAM. This is crucial for maintaining system
stability in scenarios where VRAM degradation occurs.

By default the policy is "reserve"; it can be changed to
"logging" only.

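A sketch of how the policy might be toggled from a script, assuming
the configfs layout added by this patch (the helper name and the
XE_CONFIGFS override are hypothetical; the attribute takes
kstrtobool-style input):

```shell
# Hypothetical helper: switch bad page handling between "reserve" (1,
# the default) and logging only (0). XE_CONFIGFS can override the
# configfs mount point, e.g. for testing. Must run before the device
# is bound, otherwise the store returns -EBUSY.
set_bad_page_policy() {
    bdf="$1" val="$2"
    attr="${XE_CONFIGFS:-/sys/kernel/config/xe}/$bdf/bad_page_reservation"
    [ -w "$attr" ] || { echo "no configfs entry for $bdf" >&2; return 1; }
    echo "$val" > "$attr"
}
```
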
v2:
- Add CRI check and rebase

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
---
 drivers/gpu/drm/xe/xe_configfs.c     | 64 +++++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_configfs.h     |  2 +
 drivers/gpu/drm/xe/xe_device.c       | 44 +++++++++++++++++++
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 11 +++++
 4 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
index 32102600a148..4ac5274a4090 100644
--- a/drivers/gpu/drm/xe/xe_configfs.c
+++ b/drivers/gpu/drm/xe/xe_configfs.c
@@ -61,7 +61,8 @@
  *	    ├── survivability_mode
  *	    ├── gt_types_allowed
  *	    ├── engines_allowed
- *	    └── enable_psmi
+ *	    ├── enable_psmi
+ *	    └── bad_page_reservation
  *
  * After configuring the attributes as per next section, the device can be
  * probed with::
@@ -159,6 +160,16 @@
  *
  * This attribute can only be set before binding to the device.
  *
+ * Bad page reservation
+ * --------------------
+ *
+ * Disable VRAM bad page reservation and instead only report bad pages
+ * in dmesg. Example to disable it::
+ *
+ *      # echo 0 > /sys/kernel/config/xe/0000:03:00.0/bad_page_reservation
+ *
+ * This attribute can only be set before binding to the device.
+ *
  * Context restore BB
  * ------------------
  *
@@ -262,6 +273,7 @@ struct xe_config_group_device {
 		struct wa_bb ctx_restore_mid_bb[XE_ENGINE_CLASS_MAX];
 		bool survivability_mode;
 		bool enable_psmi;
+		bool bad_page_reservation;
 		struct {
 			unsigned int max_vfs;
 			bool admin_only_pf;
@@ -281,6 +293,7 @@ static const struct xe_config_device device_defaults = {
 	.engines_allowed = U64_MAX,
 	.survivability_mode = false,
 	.enable_psmi = false,
+	.bad_page_reservation = true,
 	.sriov = {
 		.max_vfs = XE_DEFAULT_MAX_VFS,
 		.admin_only_pf = XE_DEFAULT_ADMIN_ONLY_PF,
@@ -575,6 +588,32 @@ static ssize_t enable_psmi_store(struct config_item *item, const char *page, siz
 	return len;
 }
 
+static ssize_t bad_page_reservation_show(struct config_item *item, char *page)
+{
+	struct xe_config_device *dev = to_xe_config_device(item);
+
+	return sprintf(page, "%d\n", dev->bad_page_reservation);
+}
+
+static ssize_t bad_page_reservation_store(struct config_item *item, const char *page, size_t len)
+{
+	struct xe_config_group_device *dev = to_xe_config_group_device(item);
+	bool val;
+	int ret;
+
+	ret = kstrtobool(page, &val);
+	if (ret)
+		return ret;
+
+	guard(mutex)(&dev->lock);
+	if (is_bound(dev))
+		return -EBUSY;
+
+	dev->config.bad_page_reservation = val;
+
+	return len;
+}
+
 static bool wa_bb_read_advance(bool dereference, char **p,
 			       const char *append, size_t len,
 			       size_t *max_size)
@@ -813,6 +852,7 @@ static ssize_t ctx_restore_post_bb_store(struct config_item *item,
 CONFIGFS_ATTR(, ctx_restore_mid_bb);
 CONFIGFS_ATTR(, ctx_restore_post_bb);
 CONFIGFS_ATTR(, enable_psmi);
+CONFIGFS_ATTR(, bad_page_reservation);
 CONFIGFS_ATTR(, engines_allowed);
 CONFIGFS_ATTR(, gt_types_allowed);
 CONFIGFS_ATTR(, survivability_mode);
@@ -821,6 +861,7 @@ static struct configfs_attribute *xe_config_device_attrs[] = {
 	&attr_ctx_restore_mid_bb,
 	&attr_ctx_restore_post_bb,
 	&attr_enable_psmi,
+	&attr_bad_page_reservation,
 	&attr_engines_allowed,
 	&attr_gt_types_allowed,
 	&attr_survivability_mode,
@@ -1098,6 +1139,7 @@ static void dump_custom_dev_config(struct pci_dev *pdev,
 	PRI_CUSTOM_ATTR("%llx", gt_types_allowed);
 	PRI_CUSTOM_ATTR("%llx", engines_allowed);
 	PRI_CUSTOM_ATTR("%d", enable_psmi);
+	PRI_CUSTOM_ATTR("%d", bad_page_reservation);
 	PRI_CUSTOM_ATTR("%d", survivability_mode);
 	PRI_CUSTOM_ATTR("%u", sriov.admin_only_pf);
 
@@ -1225,6 +1267,26 @@ bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev)
 	return ret;
 }
 
+/**
+ * xe_configfs_get_bad_page_reservation - get configfs bad_page_reservation setting
+ * @pdev: pci device
+ *
+ * Return: bad_page_reservation setting in configfs
+ */
+bool xe_configfs_get_bad_page_reservation(struct pci_dev *pdev)
+{
+	struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
+	bool ret;
+
+	if (!dev)
+		return device_defaults.bad_page_reservation;
+
+	ret = dev->config.bad_page_reservation;
+	config_group_put(&dev->group);
+
+	return ret;
+}
+
 /**
  * xe_configfs_get_ctx_restore_mid_bb - get configfs ctx_restore_mid_bb setting
  * @pdev: pci device
diff --git a/drivers/gpu/drm/xe/xe_configfs.h b/drivers/gpu/drm/xe/xe_configfs.h
index 07d62bf0c152..c107d84b2c62 100644
--- a/drivers/gpu/drm/xe/xe_configfs.h
+++ b/drivers/gpu/drm/xe/xe_configfs.h
@@ -23,6 +23,7 @@ bool xe_configfs_primary_gt_allowed(struct pci_dev *pdev);
 bool xe_configfs_media_gt_allowed(struct pci_dev *pdev);
 u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev);
 bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev);
+bool xe_configfs_get_bad_page_reservation(struct pci_dev *pdev);
 u32 xe_configfs_get_ctx_restore_mid_bb(struct pci_dev *pdev,
 				       enum xe_engine_class class,
 				       const u32 **cs);
@@ -42,6 +43,7 @@ static inline bool xe_configfs_primary_gt_allowed(struct pci_dev *pdev) { return
 static inline bool xe_configfs_media_gt_allowed(struct pci_dev *pdev) { return true; }
 static inline u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev) { return U64_MAX; }
 static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
+static inline bool xe_configfs_get_bad_page_reservation(struct pci_dev *pdev) { return true; }
 static inline u32 xe_configfs_get_ctx_restore_mid_bb(struct pci_dev *pdev,
 						     enum xe_engine_class class,
 						     const u32 **cs) { return 0; }
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 4d4d7a35e089..f3ace86799fd 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -25,6 +25,7 @@
 #include "regs/xe_regs.h"
 #include "xe_bo.h"
 #include "xe_bo_evict.h"
+#include "xe_configfs.h"
 #include "xe_debugfs.h"
 #include "xe_defaults.h"
 #include "xe_devcoredump.h"
@@ -69,6 +70,7 @@
 #include "xe_tile.h"
 #include "xe_ttm_stolen_mgr.h"
 #include "xe_ttm_sys_mgr.h"
+#include "xe_ttm_vram_mgr.h"
 #include "xe_vm.h"
 #include "xe_vm_madvise.h"
 #include "xe_vram.h"
@@ -842,6 +844,44 @@ static void xe_device_wedged_fini(struct drm_device *drm, void *arg)
 		xe_pm_runtime_put(xe);
 }
 
+static int xe_device_process_bad_pages(struct xe_device *xe)
+{
+	unsigned long offlined[1] = {0x0};
+	unsigned long queued[1] = {0x3000};
+	int n_bad_pages = ARRAY_SIZE(offlined) + ARRAY_SIZE(queued);
+	unsigned long *bad_pages;
+	bool policy;
+	u8 i;
+
+	if (xe->info.platform != XE_CRESCENTISLAND)
+		return 0;
+
+	/* TODO: FW Integration: Query FW for offline/queued pages */
+
+	if (!n_bad_pages)
+		return 0;
+	bad_pages = kmalloc_array(n_bad_pages, sizeof(unsigned long), GFP_KERNEL);
+	if (!bad_pages)
+		return -ENOMEM;
+
+	for (int i = 0; i < ARRAY_SIZE(offlined); i++)
+		bad_pages[i] = offlined[i];
+	for (int i = 0; i < ARRAY_SIZE(queued); i++)
+		bad_pages[ARRAY_SIZE(offlined) + i] = queued[i];
+
+	/* Read policy from configfs */
+	policy = xe_configfs_get_bad_page_reservation(to_pci_dev(xe->drm.dev));
+	for (i = 0; i < n_bad_pages; i++) {
+		if (!policy)
+			drm_err(&xe->drm, "0x%lx is reported as corrupted address by HW\n",
+				bad_pages[i]);
+		else
+			xe_ttm_vram_handle_addr_fault(xe, bad_pages[i]);
+	}
+	kfree(bad_pages);
+	return 0;
+}
+
 int xe_device_probe(struct xe_device *xe)
 {
 	struct xe_tile *tile;
@@ -911,6 +951,10 @@ int xe_device_probe(struct xe_device *xe)
 	if (err)
 		return err;
 
+	err = xe_device_process_bad_pages(xe);
+	if (err)
+		return err;
+
 	/*
 	 * Now that GT is initialized (TTM in particular),
 	 * we can try to init display, and inherit the initial fb.
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index ac6f034852f7..6808368aa5a1 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -12,6 +12,7 @@
 #include <drm/ttm/ttm_range_manager.h>
 
 #include "xe_bo.h"
+#include "xe_configfs.h"
 #include "xe_device.h"
 #include "xe_exec_queue.h"
 #include "xe_lrc.h"
@@ -725,6 +726,7 @@ int xe_ttm_vram_handle_addr_fault(struct xe_device *xe, unsigned long addr)
 	struct xe_ttm_vram_mgr *vram_mgr;
 	struct xe_vram_region *vr;
 	struct gpu_buddy *mm;
+	bool policy;
 	int ret;
 
 	vr = xe_ttm_vram_addr_to_region(xe, addr);
@@ -736,6 +738,15 @@ int xe_ttm_vram_handle_addr_fault(struct xe_device *xe, unsigned long addr)
 	}
 	vram_mgr = &vr->ttm;
 	mm = &vram_mgr->mm;
+
+	policy = xe_configfs_get_bad_page_reservation(to_pci_dev(xe->drm.dev));
+	if (!policy) {
+		drm_err(&xe->drm, "0x%lx is reported as corrupted address by HW\n",
+			addr);
+		/* TODO: FW Integration: Report to FW to drop addr from SRAM queue */
+		return 0;
+	}
+
 	/* Reserve page at address */
 	ret = xe_ttm_vram_reserve_page_at_addr(xe, addr, vram_mgr, mm);
 	return ret;
-- 
2.52.0



* [RFC PATCH V7 9/9] drm/xe/cri: Add sysfs interface for bad gpu vram pages
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
                   ` (7 preceding siblings ...)
  2026-04-13 13:16 ` [RFC PATCH V7 8/9] drm/xe/configfs: Add vram bad page reservation policy Tejas Upadhyay
@ 2026-04-13 13:16 ` Tejas Upadhyay
  2026-04-13 16:36 ` ✗ CI.checkpatch: warning for Add memory page offlining support (rev7) Patchwork
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Tejas Upadhyay @ 2026-04-13 13:16 UTC (permalink / raw)
  To: intel-xe
  Cc: matthew.auld, matthew.brost, thomas.hellstrom,
	himal.prasad.ghimiray, Tejas Upadhyay

Starting with CRI, include a sysfs interface designed to expose
information about bad VRAM pages, i.e. those identified as having
hardware faults (e.g. ECC errors). This interface allows userspace
tools and administrators to monitor the health of the GPU's local
memory and track the status of page retirement. Details on bad GPU
VRAM pages can be found under /sys/bus/pci/devices/<bdf>/vram_bad_pages.

The format is: pfn : gpu page size : flags

flags:
R: reserved, this gpu page has been reserved.
P: pending for reserve, this gpu page is marked as bad and will be reserved in the next page_reserve window.
F: failed, this gpu page could not be reserved for some reason.

For example, reading with cat /sys/bus/pci/devices/<bdf>/vram_bad_pages gives:
max_pages : 10000
0x00000000 : 0x00001000 : R
0x00001234 : 0x00001000 : P
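
The format above lends itself to simple parsing; a minimal sketch
(field layout assumed from the example) that counts bad pages per
flag:

```shell
# Count bad pages by flag from the vram_bad_pages format. The
# max_pages header does not start with "0x" and is skipped; data
# lines split on " : " into pfn, gpu page size and flag.
count_bad_pages() {
    awk -F' : ' '/^0x/ { n[$3]++ } END { for (f in n) print f, n[f] }'
}
```

e.g. `cat /sys/bus/pci/devices/<bdf>/vram_bad_pages | count_bad_pages`.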

v2:
- Add max_pages info as per updated design doc
- Rebase

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c             |  7 ++
 drivers/gpu/drm/xe/xe_device_sysfs.c       |  7 ++
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c       | 77 ++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.h       |  1 +
 drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h |  2 +
 5 files changed, 94 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index f3ace86799fd..9b535c0802cb 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -848,6 +848,8 @@ static int xe_device_process_bad_pages(struct xe_device *xe)
 {
 	unsigned long offlined[1] = {0x0};
 	unsigned long queued[1] = {0x3000};
+	struct ttm_resource_manager *man;
+	struct xe_ttm_vram_mgr *mgr;
 	int n_bad_pages = ARRAY_SIZE(offlined) + ARRAY_SIZE(queued);
 	unsigned long *bad_pages;
 	bool policy;
@@ -857,6 +859,11 @@ static int xe_device_process_bad_pages(struct xe_device *xe)
 		return 0;
 
 	/* TODO: FW Integration: Query FW for offline/queued pages */
+	/* retrieve and fill max_pages from FW */
+	man = ttm_manager_type(&xe->ttm, XE_PL_VRAM0);
+	WARN_ON(!man);
+	mgr = to_xe_ttm_vram_mgr(man);
+	mgr->max_pages = 10000;
 
 	if (!n_bad_pages)
 		return 0;
diff --git a/drivers/gpu/drm/xe/xe_device_sysfs.c b/drivers/gpu/drm/xe/xe_device_sysfs.c
index a73e0e957cb0..47c5be4180fe 100644
--- a/drivers/gpu/drm/xe/xe_device_sysfs.c
+++ b/drivers/gpu/drm/xe/xe_device_sysfs.c
@@ -8,12 +8,14 @@
 #include <linux/pci.h>
 #include <linux/sysfs.h>
 
+#include "xe_configfs.h"
 #include "xe_device.h"
 #include "xe_device_sysfs.h"
 #include "xe_mmio.h"
 #include "xe_pcode_api.h"
 #include "xe_pcode.h"
 #include "xe_pm.h"
+#include "xe_ttm_vram_mgr.h"
 
 /**
  * DOC: Xe device sysfs
@@ -267,6 +269,7 @@ static const struct attribute_group auto_link_downgrade_attr_group = {
 int xe_device_sysfs_init(struct xe_device *xe)
 {
 	struct device *dev = xe->drm.dev;
+	bool policy;
 	int ret;
 
 	if (xe->d3cold.capable) {
@@ -285,5 +288,9 @@ int xe_device_sysfs_init(struct xe_device *xe)
 			return ret;
 	}
 
+	policy = xe_configfs_get_bad_page_reservation(to_pci_dev(dev));
+	if (xe->info.platform == XE_CRESCENTISLAND && policy)
+		xe_ttm_vram_sysfs_init(xe);
+
 	return 0;
 }
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index 6808368aa5a1..71e7103e4371 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -752,3 +752,80 @@ int xe_ttm_vram_handle_addr_fault(struct xe_device *xe, unsigned long addr)
 	return ret;
 }
 EXPORT_SYMBOL(xe_ttm_vram_handle_addr_fault);
+
+static void xe_ttm_vram_dump_bad_pages_info(char *buf, struct xe_ttm_vram_mgr *mgr)
+{
+	const unsigned int element_size = sizeof("0xabcdabcd : 0x12345678 : R\n") - 1;
+	const unsigned int maxpage_size = sizeof("max_pages: 10000\n") - 1;
+	struct xe_ttm_vram_offline_resource *pos, *n;
+	struct gpu_buddy_block *block;
+	ssize_t s = 0;
+
+	mutex_lock(&mgr->lock);
+	s += scnprintf(&buf[s], maxpage_size + 1, "max_pages: %d\n", mgr->max_pages);
+	list_for_each_entry_safe(pos, n, &mgr->offlined_pages, offlined_link) {
+		block = list_first_entry(&pos->blocks,
+					 struct gpu_buddy_block,
+					 link);
+		s += scnprintf(&buf[s], element_size + 1,
+			       "0x%08llx : 0x%08llx : %1s\n",
+			       gpu_buddy_block_offset(block) >> PAGE_SHIFT,
+			       gpu_buddy_block_size(&mgr->mm, block),
+			       "R");
+	}
+	list_for_each_entry_safe(pos, n, &mgr->queued_pages, queued_link) {
+		block = list_first_entry(&pos->blocks,
+					 struct gpu_buddy_block,
+					 link);
+		s += scnprintf(&buf[s], element_size + 1,
+			       "0x%08llx : 0x%08llx : %1s\n",
+			       gpu_buddy_block_offset(block) >> PAGE_SHIFT,
+			       gpu_buddy_block_size(&mgr->mm, block),
+			       pos->status ? "P" : "F");
+	}
+	mutex_unlock(&mgr->lock);
+}
+
+static ssize_t vram_bad_pages_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct xe_device *xe = pdev_to_xe_device(pdev);
+	struct ttm_resource_manager *man;
+	struct xe_ttm_vram_mgr *mgr;
+
+	man = ttm_manager_type(&xe->ttm, XE_PL_VRAM0);
+	if (man) {
+		mgr = to_xe_ttm_vram_mgr(man);
+		xe_ttm_vram_dump_bad_pages_info(buf, mgr);
+	}
+
+	return strlen(buf);
+}
+static DEVICE_ATTR_RO(vram_bad_pages);
+
+static void xe_ttm_vram_sysfs_fini(void *arg)
+{
+	struct xe_device *xe = arg;
+
+	device_remove_file(xe->drm.dev, &dev_attr_vram_bad_pages);
+}
+
+/**
+ * xe_ttm_vram_sysfs_init - Initialize vram sysfs component
+ * @xe: Xe device object
+ *
+ * It needs to be initialized after the main tile component is ready
+ *
+ * Returns: 0 on success, negative error code on error.
+ */
+int xe_ttm_vram_sysfs_init(struct xe_device *xe)
+{
+	int err;
+
+	err = device_create_file(xe->drm.dev, &dev_attr_vram_bad_pages);
+	if (err)
+		return err;
+
+	return devm_add_action_or_reset(xe->drm.dev, xe_ttm_vram_sysfs_fini, xe);
+}
+EXPORT_SYMBOL(xe_ttm_vram_sysfs_init);
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
index 8ef06d9d44f7..c33e1a8d9217 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
@@ -32,6 +32,7 @@ void xe_ttm_vram_get_used(struct ttm_resource_manager *man,
 			  u64 *used, u64 *used_visible);
 
 int xe_ttm_vram_handle_addr_fault(struct xe_device *xe, unsigned long addr);
+int xe_ttm_vram_sysfs_init(struct xe_device *xe);
 static inline struct xe_ttm_vram_mgr_resource *
 to_xe_ttm_vram_mgr_resource(struct ttm_resource *res)
 {
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
index 07ed88b47e04..b23796066a1a 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
@@ -39,6 +39,8 @@ struct xe_ttm_vram_mgr {
 	u32 mem_type;
 	/** @offline_mode: debugfs hook for setting page offline mode */
 	u64 offline_mode;
+	/** @max_pages: max pages that can be in offline queue retrieved from FW */
+	u16 max_pages;
 };
 
 /**
-- 
2.52.0



* Re: [RFC PATCH V7 2/9] drm/gpu: Add gpu_buddy_addr_to_block helper
  2026-04-13 13:16 ` [RFC PATCH V7 2/9] drm/gpu: Add gpu_buddy_addr_to_block helper Tejas Upadhyay
@ 2026-04-13 13:28   ` Matthew Auld
  2026-04-13 17:30   ` Matthew Auld
  1 sibling, 0 replies; 21+ messages in thread
From: Matthew Auld @ 2026-04-13 13:28 UTC (permalink / raw)
  To: Tejas Upadhyay, intel-xe
  Cc: matthew.brost, thomas.hellstrom, himal.prasad.ghimiray

On 13/04/2026 14:16, Tejas Upadhyay wrote:
> Add a helper whose primary purpose is to efficiently trace a specific
> physical memory address back to its corresponding TTM buffer object.
> 
> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

I think you missed the feedback from the prev revision?

> ---
>   drivers/gpu/buddy.c       | 56 +++++++++++++++++++++++++++++++++++++++
>   include/linux/gpu_buddy.h |  2 ++
>   2 files changed, 58 insertions(+)
> 
> diff --git a/drivers/gpu/buddy.c b/drivers/gpu/buddy.c
> index 52686672e99f..2d26c2a0f971 100644
> --- a/drivers/gpu/buddy.c
> +++ b/drivers/gpu/buddy.c
> @@ -589,6 +589,62 @@ void gpu_buddy_free_block(struct gpu_buddy *mm,
>   }
>   EXPORT_SYMBOL(gpu_buddy_free_block);
>   
> +/**
> + * gpu_buddy_addr_to_block - given a physical address, find its block
> + *
> + * @mm: GPU buddy manager
> + * @addr: Physical address
> + *
> + * Returns:
> + * pointer to the allocated block on success, NULL if the address falls in
> + * a free block, or ERR_PTR(-ENXIO) if no matching block is found
> + */
> +struct gpu_buddy_block *gpu_buddy_addr_to_block(struct gpu_buddy *mm, u64 addr)
> +{
> +	struct gpu_buddy_block *block;
> +	LIST_HEAD(dfs);
> +	u64 end;
> +	int i;
> +
> +	end = addr + SZ_4K - 1;
> +	for (i = 0; i < mm->n_roots; ++i)
> +		list_add_tail(&mm->roots[i]->tmp_link, &dfs);
> +
> +	do {
> +		u64 block_start;
> +		u64 block_end;
> +
> +		block = list_first_entry_or_null(&dfs,
> +						 struct gpu_buddy_block,
> +						 tmp_link);
> +		if (!block)
> +			break;
> +
> +		list_del(&block->tmp_link);
> +
> +		block_start = gpu_buddy_block_offset(block);
> +		block_end = block_start + gpu_buddy_block_size(mm, block) - 1;
> +
> +		if (!overlaps(addr, end, block_start, block_end))
> +			continue;
> +
> +		if (contains(addr, end, block_start, block_end) &&
> +		    !gpu_buddy_block_is_split(block)) {
> +			if (gpu_buddy_block_is_free(block))
> +				return NULL;
> +			else if (gpu_buddy_block_is_allocated(block) && !mm->clear_avail)
> +				return block;
> +		}
> +
> +		if (gpu_buddy_block_is_split(block)) {
> +			list_add(&block->right->tmp_link, &dfs);
> +			list_add(&block->left->tmp_link, &dfs);
> +		}
> +	} while (1);
> +
> +	return ERR_PTR(-ENXIO);
> +}
> +EXPORT_SYMBOL(gpu_buddy_addr_to_block);
> +
>   static void __gpu_buddy_free_list(struct gpu_buddy *mm,
>   				  struct list_head *objects,
>   				  bool mark_clear,
> diff --git a/include/linux/gpu_buddy.h b/include/linux/gpu_buddy.h
> index 5fa917ba5450..957c69c560bc 100644
> --- a/include/linux/gpu_buddy.h
> +++ b/include/linux/gpu_buddy.h
> @@ -231,6 +231,8 @@ void gpu_buddy_reset_clear(struct gpu_buddy *mm, bool is_clear);
>   
>   void gpu_buddy_free_block(struct gpu_buddy *mm, struct gpu_buddy_block *block);
>   
> +struct gpu_buddy_block *gpu_buddy_addr_to_block(struct gpu_buddy *mm, u64 addr);
> +
>   void gpu_buddy_free_list(struct gpu_buddy *mm,
>   			 struct list_head *objects,
>   			 unsigned int flags);


^ permalink raw reply	[flat|nested] 21+ messages in thread

* ✗ CI.checkpatch: warning for Add memory page offlining support (rev7)
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
                   ` (8 preceding siblings ...)
  2026-04-13 13:16 ` [RFC PATCH V7 9/9] drm/xe/cri: Add sysfs interface for bad gpu vram pages Tejas Upadhyay
@ 2026-04-13 16:36 ` Patchwork
  2026-04-13 16:37 ` ✓ CI.KUnit: success " Patchwork
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2026-04-13 16:36 UTC (permalink / raw)
  To: Upadhyay, Tejas; +Cc: intel-xe

== Series Details ==

Series: Add memory page offlining support (rev7)
URL   : https://patchwork.freedesktop.org/series/161473/
State : warning

== Summary ==

+ KERNEL=/kernel
+ git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt
Cloning into 'mt'...
warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/
+ git -C mt rev-list -n1 origin/master
1f57ba1afceae32108bd24770069f764d940a0e4
+ cd /kernel
+ git config --global --add safe.directory /kernel
+ git log -n1
commit ed54940d366e5169807d6c963280c7b17dced8bd
Author: Tejas Upadhyay <tejas.upadhyay@intel.com>
Date:   Mon Apr 13 18:46:30 2026 +0530

    drm/xe/cri: Add sysfs interface for bad gpu vram pages
    
    Starting with CRI, include a sysfs interface designed to expose
    information about bad VRAM pages, i.e. those identified as having
    hardware faults (e.g., ECC errors). This interface allows userspace
    tools and administrators to monitor the health of the GPU's local
    memory and track the status of page retirement. Details on bad gpu
    vram pages can be found under /sys/bus/pci/devices/bdf/vram_bad_pages.
    
    The format is: pfn : gpu page size : flags
    
    flags:
    R: reserved, this gpu page is reserved.
    P: pending reserve, this gpu page is marked as bad and will be reserved in the next page_reserve window.
    F: failed to reserve, this gpu page can't be reserved for some reason.
    
    For example, reading via cat /sys/bus/pci/devices/bdf/vram_bad_pages gives:
    max_pages : 10000
    0x00000000 : 0x00001000 : R
    0x00001234 : 0x00001000 : P
    
    v2:
    - Add max_pages info as per updated design doc
    - Rebase
    
    Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
+ /mt/dim checkpatch 3112208d6d7f657d0866921cceb5b3afe1688fb2 drm-intel
867fc1040060 drm/xe: Link VRAM object with gpu buddy
21c395b8f655 drm/gpu: Add gpu_buddy_addr_to_block helper
022cc6fdb455 drm/xe: Link LRC BO and its execution Queue
446c4c2bfe62 drm/xe: Extend BO purge to handle vram pages as well
e61b911ab398 drm/xe: Handle physical memory address error
-:13: ERROR:BAD_COMMIT_SEPARATOR: Invalid commit separator - some tools may have problems applying this
#13: 
----------------------

-:20: ERROR:BAD_COMMIT_SEPARATOR: Invalid commit separator - some tools may have problems applying this
#20: 
-----------------

-:32: ERROR:BAD_COMMIT_SEPARATOR: Invalid commit separator - some tools may have problems applying this
#32: 
--------------

total: 3 errors, 0 warnings, 0 checks, 412 lines checked
ae99092ef17c drm/xe/cri: Add debugfs to inject faulty vram address
13bb94ad7bc0 gpu/buddy: Add routine to dump allocated buddy blocks
-:13: WARNING:COMMIT_LOG_LONG_LINE: Prefer a maximum 75 chars per line (possible unwrapped commit description?)
#13: 
[  +0.000003] xe 0000:03:00.0: [drm] 0x00000002f8000000-0x00000002f8800000: 8388608

total: 0 errors, 1 warnings, 0 checks, 62 lines checked
5ca5a4c98f8b drm/xe/configfs: Add vram bad page reservation policy
ed54940d366e drm/xe/cri: Add sysfs interface for bad gpu vram pages
-:20: WARNING:COMMIT_LOG_LONG_LINE: Prefer a maximum 75 chars per line (possible unwrapped commit description?)
#20: 
P: pending for reserve, this gpu page is marked as bad, will be reserved in next window of page_reserve.

total: 0 errors, 1 warnings, 0 checks, 144 lines checked



^ permalink raw reply	[flat|nested] 21+ messages in thread

* ✓ CI.KUnit: success for Add memory page offlining support (rev7)
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
                   ` (9 preceding siblings ...)
  2026-04-13 16:36 ` ✗ CI.checkpatch: warning for Add memory page offlining support (rev7) Patchwork
@ 2026-04-13 16:37 ` Patchwork
  2026-04-13 17:43 ` ✓ Xe.CI.BAT: " Patchwork
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2026-04-13 16:37 UTC (permalink / raw)
  To: Upadhyay, Tejas; +Cc: intel-xe

== Series Details ==

Series: Add memory page offlining support (rev7)
URL   : https://patchwork.freedesktop.org/series/161473/
State : success

== Summary ==

+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
[16:36:30] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[16:36:35] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=25
[16:37:12] Starting KUnit Kernel (1/1)...
[16:37:12] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[16:37:12] ================== guc_buf (11 subtests) ===================
[16:37:12] [PASSED] test_smallest
[16:37:12] [PASSED] test_largest
[16:37:12] [PASSED] test_granular
[16:37:12] [PASSED] test_unique
[16:37:12] [PASSED] test_overlap
[16:37:12] [PASSED] test_reusable
[16:37:12] [PASSED] test_too_big
[16:37:12] [PASSED] test_flush
[16:37:12] [PASSED] test_lookup
[16:37:12] [PASSED] test_data
[16:37:12] [PASSED] test_class
[16:37:12] ===================== [PASSED] guc_buf =====================
[16:37:12] =================== guc_dbm (7 subtests) ===================
[16:37:12] [PASSED] test_empty
[16:37:12] [PASSED] test_default
[16:37:12] ======================== test_size  ========================
[16:37:12] [PASSED] 4
[16:37:12] [PASSED] 8
[16:37:12] [PASSED] 32
[16:37:12] [PASSED] 256
[16:37:12] ==================== [PASSED] test_size ====================
[16:37:12] ======================= test_reuse  ========================
[16:37:12] [PASSED] 4
[16:37:12] [PASSED] 8
[16:37:12] [PASSED] 32
[16:37:12] [PASSED] 256
[16:37:12] =================== [PASSED] test_reuse ====================
[16:37:12] =================== test_range_overlap  ====================
[16:37:12] [PASSED] 4
[16:37:12] [PASSED] 8
[16:37:12] [PASSED] 32
[16:37:12] [PASSED] 256
[16:37:12] =============== [PASSED] test_range_overlap ================
[16:37:12] =================== test_range_compact  ====================
[16:37:12] [PASSED] 4
[16:37:12] [PASSED] 8
[16:37:12] [PASSED] 32
[16:37:12] [PASSED] 256
[16:37:12] =============== [PASSED] test_range_compact ================
[16:37:12] ==================== test_range_spare  =====================
[16:37:12] [PASSED] 4
[16:37:12] [PASSED] 8
[16:37:12] [PASSED] 32
[16:37:12] [PASSED] 256
[16:37:12] ================ [PASSED] test_range_spare =================
[16:37:12] ===================== [PASSED] guc_dbm =====================
[16:37:12] =================== guc_idm (6 subtests) ===================
[16:37:12] [PASSED] bad_init
[16:37:12] [PASSED] no_init
[16:37:12] [PASSED] init_fini
[16:37:12] [PASSED] check_used
[16:37:12] [PASSED] check_quota
[16:37:12] [PASSED] check_all
[16:37:12] ===================== [PASSED] guc_idm =====================
[16:37:12] ================== no_relay (3 subtests) ===================
[16:37:12] [PASSED] xe_drops_guc2pf_if_not_ready
[16:37:12] [PASSED] xe_drops_guc2vf_if_not_ready
[16:37:12] [PASSED] xe_rejects_send_if_not_ready
[16:37:12] ==================== [PASSED] no_relay =====================
[16:37:12] ================== pf_relay (14 subtests) ==================
[16:37:12] [PASSED] pf_rejects_guc2pf_too_short
[16:37:12] [PASSED] pf_rejects_guc2pf_too_long
[16:37:12] [PASSED] pf_rejects_guc2pf_no_payload
[16:37:12] [PASSED] pf_fails_no_payload
[16:37:12] [PASSED] pf_fails_bad_origin
[16:37:12] [PASSED] pf_fails_bad_type
[16:37:12] [PASSED] pf_txn_reports_error
[16:37:12] [PASSED] pf_txn_sends_pf2guc
[16:37:12] [PASSED] pf_sends_pf2guc
[16:37:12] [SKIPPED] pf_loopback_nop
[16:37:12] [SKIPPED] pf_loopback_echo
[16:37:12] [SKIPPED] pf_loopback_fail
[16:37:12] [SKIPPED] pf_loopback_busy
[16:37:12] [SKIPPED] pf_loopback_retry
[16:37:12] ==================== [PASSED] pf_relay =====================
[16:37:12] ================== vf_relay (3 subtests) ===================
[16:37:12] [PASSED] vf_rejects_guc2vf_too_short
[16:37:12] [PASSED] vf_rejects_guc2vf_too_long
[16:37:12] [PASSED] vf_rejects_guc2vf_no_payload
[16:37:12] ==================== [PASSED] vf_relay =====================
[16:37:12] ================ pf_gt_config (9 subtests) =================
[16:37:12] [PASSED] fair_contexts_1vf
[16:37:12] [PASSED] fair_doorbells_1vf
[16:37:12] [PASSED] fair_ggtt_1vf
[16:37:12] ====================== fair_vram_1vf  ======================
[16:37:12] [PASSED] 3.50 GiB
[16:37:12] [PASSED] 11.5 GiB
[16:37:12] [PASSED] 15.5 GiB
[16:37:12] [PASSED] 31.5 GiB
[16:37:12] [PASSED] 63.5 GiB
[16:37:12] [PASSED] 1.91 GiB
[16:37:12] ================== [PASSED] fair_vram_1vf ==================
[16:37:12] ================ fair_vram_1vf_admin_only  =================
[16:37:12] [PASSED] 3.50 GiB
[16:37:12] [PASSED] 11.5 GiB
[16:37:12] [PASSED] 15.5 GiB
[16:37:12] [PASSED] 31.5 GiB
[16:37:12] [PASSED] 63.5 GiB
[16:37:12] [PASSED] 1.91 GiB
[16:37:12] ============ [PASSED] fair_vram_1vf_admin_only =============
[16:37:12] ====================== fair_contexts  ======================
[16:37:12] [PASSED] 1 VF
[16:37:12] [PASSED] 2 VFs
[16:37:12] [PASSED] 3 VFs
[16:37:12] [PASSED] 4 VFs
[16:37:12] [PASSED] 5 VFs
[16:37:12] [PASSED] 6 VFs
[16:37:12] [PASSED] 7 VFs
[16:37:12] [PASSED] 8 VFs
[16:37:12] [PASSED] 9 VFs
[16:37:12] [PASSED] 10 VFs
[16:37:12] [PASSED] 11 VFs
[16:37:12] [PASSED] 12 VFs
[16:37:12] [PASSED] 13 VFs
[16:37:12] [PASSED] 14 VFs
[16:37:12] [PASSED] 15 VFs
[16:37:12] [PASSED] 16 VFs
[16:37:12] [PASSED] 17 VFs
[16:37:12] [PASSED] 18 VFs
[16:37:12] [PASSED] 19 VFs
[16:37:12] [PASSED] 20 VFs
[16:37:12] [PASSED] 21 VFs
[16:37:12] [PASSED] 22 VFs
[16:37:12] [PASSED] 23 VFs
[16:37:12] [PASSED] 24 VFs
[16:37:12] [PASSED] 25 VFs
[16:37:12] [PASSED] 26 VFs
[16:37:12] [PASSED] 27 VFs
[16:37:12] [PASSED] 28 VFs
[16:37:12] [PASSED] 29 VFs
[16:37:12] [PASSED] 30 VFs
[16:37:12] [PASSED] 31 VFs
[16:37:12] [PASSED] 32 VFs
[16:37:12] [PASSED] 33 VFs
[16:37:12] [PASSED] 34 VFs
[16:37:12] [PASSED] 35 VFs
[16:37:12] [PASSED] 36 VFs
[16:37:12] [PASSED] 37 VFs
[16:37:12] [PASSED] 38 VFs
[16:37:12] [PASSED] 39 VFs
[16:37:12] [PASSED] 40 VFs
[16:37:12] [PASSED] 41 VFs
[16:37:12] [PASSED] 42 VFs
[16:37:12] [PASSED] 43 VFs
[16:37:12] [PASSED] 44 VFs
[16:37:12] [PASSED] 45 VFs
[16:37:12] [PASSED] 46 VFs
[16:37:12] [PASSED] 47 VFs
[16:37:12] [PASSED] 48 VFs
[16:37:12] [PASSED] 49 VFs
[16:37:12] [PASSED] 50 VFs
[16:37:12] [PASSED] 51 VFs
[16:37:12] [PASSED] 52 VFs
[16:37:12] [PASSED] 53 VFs
[16:37:12] [PASSED] 54 VFs
[16:37:12] [PASSED] 55 VFs
[16:37:12] [PASSED] 56 VFs
[16:37:12] [PASSED] 57 VFs
[16:37:12] [PASSED] 58 VFs
[16:37:12] [PASSED] 59 VFs
[16:37:12] [PASSED] 60 VFs
[16:37:12] [PASSED] 61 VFs
[16:37:12] [PASSED] 62 VFs
[16:37:12] [PASSED] 63 VFs
[16:37:12] ================== [PASSED] fair_contexts ==================
[16:37:12] ===================== fair_doorbells  ======================
[16:37:12] [PASSED] 1 VF
[16:37:12] [PASSED] 2 VFs
[16:37:12] [PASSED] 3 VFs
[16:37:12] [PASSED] 4 VFs
[16:37:12] [PASSED] 5 VFs
[16:37:12] [PASSED] 6 VFs
[16:37:12] [PASSED] 7 VFs
[16:37:12] [PASSED] 8 VFs
[16:37:12] [PASSED] 9 VFs
[16:37:12] [PASSED] 10 VFs
[16:37:12] [PASSED] 11 VFs
[16:37:12] [PASSED] 12 VFs
[16:37:12] [PASSED] 13 VFs
[16:37:12] [PASSED] 14 VFs
[16:37:12] [PASSED] 15 VFs
[16:37:12] [PASSED] 16 VFs
[16:37:12] [PASSED] 17 VFs
[16:37:12] [PASSED] 18 VFs
[16:37:12] [PASSED] 19 VFs
[16:37:12] [PASSED] 20 VFs
[16:37:12] [PASSED] 21 VFs
[16:37:12] [PASSED] 22 VFs
[16:37:12] [PASSED] 23 VFs
[16:37:12] [PASSED] 24 VFs
[16:37:12] [PASSED] 25 VFs
[16:37:12] [PASSED] 26 VFs
[16:37:12] [PASSED] 27 VFs
[16:37:12] [PASSED] 28 VFs
[16:37:12] [PASSED] 29 VFs
[16:37:12] [PASSED] 30 VFs
[16:37:12] [PASSED] 31 VFs
[16:37:12] [PASSED] 32 VFs
[16:37:12] [PASSED] 33 VFs
[16:37:12] [PASSED] 34 VFs
[16:37:12] [PASSED] 35 VFs
[16:37:12] [PASSED] 36 VFs
[16:37:12] [PASSED] 37 VFs
[16:37:12] [PASSED] 38 VFs
[16:37:12] [PASSED] 39 VFs
[16:37:12] [PASSED] 40 VFs
[16:37:12] [PASSED] 41 VFs
[16:37:12] [PASSED] 42 VFs
[16:37:12] [PASSED] 43 VFs
[16:37:12] [PASSED] 44 VFs
[16:37:12] [PASSED] 45 VFs
[16:37:12] [PASSED] 46 VFs
[16:37:12] [PASSED] 47 VFs
[16:37:12] [PASSED] 48 VFs
[16:37:12] [PASSED] 49 VFs
[16:37:12] [PASSED] 50 VFs
[16:37:12] [PASSED] 51 VFs
[16:37:12] [PASSED] 52 VFs
[16:37:12] [PASSED] 53 VFs
[16:37:12] [PASSED] 54 VFs
[16:37:12] [PASSED] 55 VFs
[16:37:12] [PASSED] 56 VFs
[16:37:12] [PASSED] 57 VFs
[16:37:12] [PASSED] 58 VFs
[16:37:12] [PASSED] 59 VFs
[16:37:12] [PASSED] 60 VFs
[16:37:12] [PASSED] 61 VFs
[16:37:12] [PASSED] 62 VFs
[16:37:12] [PASSED] 63 VFs
[16:37:12] ================= [PASSED] fair_doorbells ==================
[16:37:12] ======================== fair_ggtt  ========================
[16:37:12] [PASSED] 1 VF
[16:37:12] [PASSED] 2 VFs
[16:37:12] [PASSED] 3 VFs
[16:37:12] [PASSED] 4 VFs
[16:37:13] [PASSED] 5 VFs
[16:37:13] [PASSED] 6 VFs
[16:37:13] [PASSED] 7 VFs
[16:37:13] [PASSED] 8 VFs
[16:37:13] [PASSED] 9 VFs
[16:37:13] [PASSED] 10 VFs
[16:37:13] [PASSED] 11 VFs
[16:37:13] [PASSED] 12 VFs
[16:37:13] [PASSED] 13 VFs
[16:37:13] [PASSED] 14 VFs
[16:37:13] [PASSED] 15 VFs
[16:37:13] [PASSED] 16 VFs
[16:37:13] [PASSED] 17 VFs
[16:37:13] [PASSED] 18 VFs
[16:37:13] [PASSED] 19 VFs
[16:37:13] [PASSED] 20 VFs
[16:37:13] [PASSED] 21 VFs
[16:37:13] [PASSED] 22 VFs
[16:37:13] [PASSED] 23 VFs
[16:37:13] [PASSED] 24 VFs
[16:37:13] [PASSED] 25 VFs
[16:37:13] [PASSED] 26 VFs
[16:37:13] [PASSED] 27 VFs
[16:37:13] [PASSED] 28 VFs
[16:37:13] [PASSED] 29 VFs
[16:37:13] [PASSED] 30 VFs
[16:37:13] [PASSED] 31 VFs
[16:37:13] [PASSED] 32 VFs
[16:37:13] [PASSED] 33 VFs
[16:37:13] [PASSED] 34 VFs
[16:37:13] [PASSED] 35 VFs
[16:37:13] [PASSED] 36 VFs
[16:37:13] [PASSED] 37 VFs
[16:37:13] [PASSED] 38 VFs
[16:37:13] [PASSED] 39 VFs
[16:37:13] [PASSED] 40 VFs
[16:37:13] [PASSED] 41 VFs
[16:37:13] [PASSED] 42 VFs
[16:37:13] [PASSED] 43 VFs
[16:37:13] [PASSED] 44 VFs
[16:37:13] [PASSED] 45 VFs
[16:37:13] [PASSED] 46 VFs
[16:37:13] [PASSED] 47 VFs
[16:37:13] [PASSED] 48 VFs
[16:37:13] [PASSED] 49 VFs
[16:37:13] [PASSED] 50 VFs
[16:37:13] [PASSED] 51 VFs
[16:37:13] [PASSED] 52 VFs
[16:37:13] [PASSED] 53 VFs
[16:37:13] [PASSED] 54 VFs
[16:37:13] [PASSED] 55 VFs
[16:37:13] [PASSED] 56 VFs
[16:37:13] [PASSED] 57 VFs
[16:37:13] [PASSED] 58 VFs
[16:37:13] [PASSED] 59 VFs
[16:37:13] [PASSED] 60 VFs
[16:37:13] [PASSED] 61 VFs
[16:37:13] [PASSED] 62 VFs
[16:37:13] [PASSED] 63 VFs
[16:37:13] ==================== [PASSED] fair_ggtt ====================
[16:37:13] ======================== fair_vram  ========================
[16:37:13] [PASSED] 1 VF
[16:37:13] [PASSED] 2 VFs
[16:37:13] [PASSED] 3 VFs
[16:37:13] [PASSED] 4 VFs
[16:37:13] [PASSED] 5 VFs
[16:37:13] [PASSED] 6 VFs
[16:37:13] [PASSED] 7 VFs
[16:37:13] [PASSED] 8 VFs
[16:37:13] [PASSED] 9 VFs
[16:37:13] [PASSED] 10 VFs
[16:37:13] [PASSED] 11 VFs
[16:37:13] [PASSED] 12 VFs
[16:37:13] [PASSED] 13 VFs
[16:37:13] [PASSED] 14 VFs
[16:37:13] [PASSED] 15 VFs
[16:37:13] [PASSED] 16 VFs
[16:37:13] [PASSED] 17 VFs
[16:37:13] [PASSED] 18 VFs
[16:37:13] [PASSED] 19 VFs
[16:37:13] [PASSED] 20 VFs
[16:37:13] [PASSED] 21 VFs
[16:37:13] [PASSED] 22 VFs
[16:37:13] [PASSED] 23 VFs
[16:37:13] [PASSED] 24 VFs
[16:37:13] [PASSED] 25 VFs
[16:37:13] [PASSED] 26 VFs
[16:37:13] [PASSED] 27 VFs
[16:37:13] [PASSED] 28 VFs
[16:37:13] [PASSED] 29 VFs
[16:37:13] [PASSED] 30 VFs
[16:37:13] [PASSED] 31 VFs
[16:37:13] [PASSED] 32 VFs
[16:37:13] [PASSED] 33 VFs
[16:37:13] [PASSED] 34 VFs
[16:37:13] [PASSED] 35 VFs
[16:37:13] [PASSED] 36 VFs
[16:37:13] [PASSED] 37 VFs
[16:37:13] [PASSED] 38 VFs
[16:37:13] [PASSED] 39 VFs
[16:37:13] [PASSED] 40 VFs
[16:37:13] [PASSED] 41 VFs
[16:37:13] [PASSED] 42 VFs
[16:37:13] [PASSED] 43 VFs
[16:37:13] [PASSED] 44 VFs
[16:37:13] [PASSED] 45 VFs
[16:37:13] [PASSED] 46 VFs
[16:37:13] [PASSED] 47 VFs
[16:37:13] [PASSED] 48 VFs
[16:37:13] [PASSED] 49 VFs
[16:37:13] [PASSED] 50 VFs
[16:37:13] [PASSED] 51 VFs
[16:37:13] [PASSED] 52 VFs
[16:37:13] [PASSED] 53 VFs
[16:37:13] [PASSED] 54 VFs
[16:37:13] [PASSED] 55 VFs
[16:37:13] [PASSED] 56 VFs
[16:37:13] [PASSED] 57 VFs
[16:37:13] [PASSED] 58 VFs
[16:37:13] [PASSED] 59 VFs
[16:37:13] [PASSED] 60 VFs
[16:37:13] [PASSED] 61 VFs
[16:37:13] [PASSED] 62 VFs
[16:37:13] [PASSED] 63 VFs
[16:37:13] ==================== [PASSED] fair_vram ====================
[16:37:13] ================== [PASSED] pf_gt_config ===================
[16:37:13] ===================== lmtt (1 subtest) =====================
[16:37:13] ======================== test_ops  =========================
[16:37:13] [PASSED] 2-level
[16:37:13] [PASSED] multi-level
[16:37:13] ==================== [PASSED] test_ops =====================
[16:37:13] ====================== [PASSED] lmtt =======================
[16:37:13] ================= pf_service (11 subtests) =================
[16:37:13] [PASSED] pf_negotiate_any
[16:37:13] [PASSED] pf_negotiate_base_match
[16:37:13] [PASSED] pf_negotiate_base_newer
[16:37:13] [PASSED] pf_negotiate_base_next
[16:37:13] [SKIPPED] pf_negotiate_base_older
[16:37:13] [PASSED] pf_negotiate_base_prev
[16:37:13] [PASSED] pf_negotiate_latest_match
[16:37:13] [PASSED] pf_negotiate_latest_newer
[16:37:13] [PASSED] pf_negotiate_latest_next
[16:37:13] [SKIPPED] pf_negotiate_latest_older
[16:37:13] [SKIPPED] pf_negotiate_latest_prev
[16:37:13] =================== [PASSED] pf_service ====================
[16:37:13] ================= xe_guc_g2g (2 subtests) ==================
[16:37:13] ============== xe_live_guc_g2g_kunit_default  ==============
[16:37:13] ========= [SKIPPED] xe_live_guc_g2g_kunit_default ==========
[16:37:13] ============== xe_live_guc_g2g_kunit_allmem  ===============
[16:37:13] ========== [SKIPPED] xe_live_guc_g2g_kunit_allmem ==========
[16:37:13] =================== [SKIPPED] xe_guc_g2g ===================
[16:37:13] =================== xe_mocs (2 subtests) ===================
[16:37:13] ================ xe_live_mocs_kernel_kunit  ================
[16:37:13] =========== [SKIPPED] xe_live_mocs_kernel_kunit ============
[16:37:13] ================ xe_live_mocs_reset_kunit  =================
[16:37:13] ============ [SKIPPED] xe_live_mocs_reset_kunit ============
[16:37:13] ==================== [SKIPPED] xe_mocs =====================
[16:37:13] ================= xe_migrate (2 subtests) ==================
[16:37:13] ================= xe_migrate_sanity_kunit  =================
[16:37:13] ============ [SKIPPED] xe_migrate_sanity_kunit =============
[16:37:13] ================== xe_validate_ccs_kunit  ==================
[16:37:13] ============= [SKIPPED] xe_validate_ccs_kunit ==============
[16:37:13] =================== [SKIPPED] xe_migrate ===================
[16:37:13] ================== xe_dma_buf (1 subtest) ==================
[16:37:13] ==================== xe_dma_buf_kunit  =====================
[16:37:13] ================ [SKIPPED] xe_dma_buf_kunit ================
[16:37:13] =================== [SKIPPED] xe_dma_buf ===================
[16:37:13] ================= xe_bo_shrink (1 subtest) =================
[16:37:13] =================== xe_bo_shrink_kunit  ====================
[16:37:13] =============== [SKIPPED] xe_bo_shrink_kunit ===============
[16:37:13] ================== [SKIPPED] xe_bo_shrink ==================
[16:37:13] ==================== xe_bo (2 subtests) ====================
[16:37:13] ================== xe_ccs_migrate_kunit  ===================
[16:37:13] ============== [SKIPPED] xe_ccs_migrate_kunit ==============
[16:37:13] ==================== xe_bo_evict_kunit  ====================
[16:37:13] =============== [SKIPPED] xe_bo_evict_kunit ================
[16:37:13] ===================== [SKIPPED] xe_bo ======================
[16:37:13] ==================== args (13 subtests) ====================
[16:37:13] [PASSED] count_args_test
[16:37:13] [PASSED] call_args_example
[16:37:13] [PASSED] call_args_test
[16:37:13] [PASSED] drop_first_arg_example
[16:37:13] [PASSED] drop_first_arg_test
[16:37:13] [PASSED] first_arg_example
[16:37:13] [PASSED] first_arg_test
[16:37:13] [PASSED] last_arg_example
[16:37:13] [PASSED] last_arg_test
[16:37:13] [PASSED] pick_arg_example
[16:37:13] [PASSED] if_args_example
[16:37:13] [PASSED] if_args_test
[16:37:13] [PASSED] sep_comma_example
[16:37:13] ====================== [PASSED] args =======================
[16:37:13] =================== xe_pci (3 subtests) ====================
[16:37:13] ==================== check_graphics_ip  ====================
[16:37:13] [PASSED] 12.00 Xe_LP
[16:37:13] [PASSED] 12.10 Xe_LP+
[16:37:13] [PASSED] 12.55 Xe_HPG
[16:37:13] [PASSED] 12.60 Xe_HPC
[16:37:13] [PASSED] 12.70 Xe_LPG
[16:37:13] [PASSED] 12.71 Xe_LPG
[16:37:13] [PASSED] 12.74 Xe_LPG+
[16:37:13] [PASSED] 20.01 Xe2_HPG
[16:37:13] [PASSED] 20.02 Xe2_HPG
[16:37:13] [PASSED] 20.04 Xe2_LPG
[16:37:13] [PASSED] 30.00 Xe3_LPG
[16:37:13] [PASSED] 30.01 Xe3_LPG
[16:37:13] [PASSED] 30.03 Xe3_LPG
[16:37:13] [PASSED] 30.04 Xe3_LPG
[16:37:13] [PASSED] 30.05 Xe3_LPG
[16:37:13] [PASSED] 35.10 Xe3p_LPG
[16:37:13] [PASSED] 35.11 Xe3p_XPC
[16:37:13] ================ [PASSED] check_graphics_ip ================
[16:37:13] ===================== check_media_ip  ======================
[16:37:13] [PASSED] 12.00 Xe_M
[16:37:13] [PASSED] 12.55 Xe_HPM
[16:37:13] [PASSED] 13.00 Xe_LPM+
[16:37:13] [PASSED] 13.01 Xe2_HPM
[16:37:13] [PASSED] 20.00 Xe2_LPM
[16:37:13] [PASSED] 30.00 Xe3_LPM
[16:37:13] [PASSED] 30.02 Xe3_LPM
[16:37:13] [PASSED] 35.00 Xe3p_LPM
[16:37:13] [PASSED] 35.03 Xe3p_HPM
[16:37:13] ================= [PASSED] check_media_ip ==================
[16:37:13] =================== check_platform_desc  ===================
[16:37:13] [PASSED] 0x9A60 (TIGERLAKE)
[16:37:13] [PASSED] 0x9A68 (TIGERLAKE)
[16:37:13] [PASSED] 0x9A70 (TIGERLAKE)
[16:37:13] [PASSED] 0x9A40 (TIGERLAKE)
[16:37:13] [PASSED] 0x9A49 (TIGERLAKE)
[16:37:13] [PASSED] 0x9A59 (TIGERLAKE)
[16:37:13] [PASSED] 0x9A78 (TIGERLAKE)
[16:37:13] [PASSED] 0x9AC0 (TIGERLAKE)
[16:37:13] [PASSED] 0x9AC9 (TIGERLAKE)
[16:37:13] [PASSED] 0x9AD9 (TIGERLAKE)
[16:37:13] [PASSED] 0x9AF8 (TIGERLAKE)
[16:37:13] [PASSED] 0x4C80 (ROCKETLAKE)
[16:37:13] [PASSED] 0x4C8A (ROCKETLAKE)
[16:37:13] [PASSED] 0x4C8B (ROCKETLAKE)
[16:37:13] [PASSED] 0x4C8C (ROCKETLAKE)
[16:37:13] [PASSED] 0x4C90 (ROCKETLAKE)
[16:37:13] [PASSED] 0x4C9A (ROCKETLAKE)
[16:37:13] [PASSED] 0x4680 (ALDERLAKE_S)
[16:37:13] [PASSED] 0x4682 (ALDERLAKE_S)
[16:37:13] [PASSED] 0x4688 (ALDERLAKE_S)
[16:37:13] [PASSED] 0x468A (ALDERLAKE_S)
[16:37:13] [PASSED] 0x468B (ALDERLAKE_S)
[16:37:13] [PASSED] 0x4690 (ALDERLAKE_S)
[16:37:13] [PASSED] 0x4692 (ALDERLAKE_S)
[16:37:13] [PASSED] 0x4693 (ALDERLAKE_S)
[16:37:13] [PASSED] 0x46A0 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46A1 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46A2 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46A3 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46A6 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46A8 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46AA (ALDERLAKE_P)
[16:37:13] [PASSED] 0x462A (ALDERLAKE_P)
[16:37:13] [PASSED] 0x4626 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x4628 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46B0 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46B1 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46B2 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46B3 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46C0 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46C1 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46C2 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46C3 (ALDERLAKE_P)
[16:37:13] [PASSED] 0x46D0 (ALDERLAKE_N)
[16:37:13] [PASSED] 0x46D1 (ALDERLAKE_N)
[16:37:13] [PASSED] 0x46D2 (ALDERLAKE_N)
[16:37:13] [PASSED] 0x46D3 (ALDERLAKE_N)
[16:37:13] [PASSED] 0x46D4 (ALDERLAKE_N)
[16:37:13] [PASSED] 0xA721 (ALDERLAKE_P)
[16:37:13] [PASSED] 0xA7A1 (ALDERLAKE_P)
[16:37:13] [PASSED] 0xA7A9 (ALDERLAKE_P)
[16:37:13] [PASSED] 0xA7AC (ALDERLAKE_P)
[16:37:13] [PASSED] 0xA7AD (ALDERLAKE_P)
[16:37:13] [PASSED] 0xA720 (ALDERLAKE_P)
[16:37:13] [PASSED] 0xA7A0 (ALDERLAKE_P)
[16:37:13] [PASSED] 0xA7A8 (ALDERLAKE_P)
[16:37:13] [PASSED] 0xA7AA (ALDERLAKE_P)
[16:37:13] [PASSED] 0xA7AB (ALDERLAKE_P)
[16:37:13] [PASSED] 0xA780 (ALDERLAKE_S)
[16:37:13] [PASSED] 0xA781 (ALDERLAKE_S)
[16:37:13] [PASSED] 0xA782 (ALDERLAKE_S)
[16:37:13] [PASSED] 0xA783 (ALDERLAKE_S)
[16:37:13] [PASSED] 0xA788 (ALDERLAKE_S)
[16:37:13] [PASSED] 0xA789 (ALDERLAKE_S)
[16:37:13] [PASSED] 0xA78A (ALDERLAKE_S)
[16:37:13] [PASSED] 0xA78B (ALDERLAKE_S)
[16:37:13] [PASSED] 0x4905 (DG1)
[16:37:13] [PASSED] 0x4906 (DG1)
[16:37:13] [PASSED] 0x4907 (DG1)
[16:37:13] [PASSED] 0x4908 (DG1)
[16:37:13] [PASSED] 0x4909 (DG1)
[16:37:13] [PASSED] 0x56C0 (DG2)
[16:37:13] [PASSED] 0x56C2 (DG2)
[16:37:13] [PASSED] 0x56C1 (DG2)
[16:37:13] [PASSED] 0x7D51 (METEORLAKE)
[16:37:13] [PASSED] 0x7DD1 (METEORLAKE)
[16:37:13] [PASSED] 0x7D41 (METEORLAKE)
[16:37:13] [PASSED] 0x7D67 (METEORLAKE)
[16:37:13] [PASSED] 0xB640 (METEORLAKE)
[16:37:13] [PASSED] 0x56A0 (DG2)
[16:37:13] [PASSED] 0x56A1 (DG2)
[16:37:13] [PASSED] 0x56A2 (DG2)
[16:37:13] [PASSED] 0x56BE (DG2)
[16:37:13] [PASSED] 0x56BF (DG2)
[16:37:13] [PASSED] 0x5690 (DG2)
[16:37:13] [PASSED] 0x5691 (DG2)
[16:37:13] [PASSED] 0x5692 (DG2)
[16:37:13] [PASSED] 0x56A5 (DG2)
[16:37:13] [PASSED] 0x56A6 (DG2)
[16:37:13] [PASSED] 0x56B0 (DG2)
[16:37:13] [PASSED] 0x56B1 (DG2)
[16:37:13] [PASSED] 0x56BA (DG2)
[16:37:13] [PASSED] 0x56BB (DG2)
[16:37:13] [PASSED] 0x56BC (DG2)
[16:37:13] [PASSED] 0x56BD (DG2)
[16:37:13] [PASSED] 0x5693 (DG2)
[16:37:13] [PASSED] 0x5694 (DG2)
[16:37:13] [PASSED] 0x5695 (DG2)
[16:37:13] [PASSED] 0x56A3 (DG2)
[16:37:13] [PASSED] 0x56A4 (DG2)
[16:37:13] [PASSED] 0x56B2 (DG2)
[16:37:13] [PASSED] 0x56B3 (DG2)
[16:37:13] [PASSED] 0x5696 (DG2)
[16:37:13] [PASSED] 0x5697 (DG2)
[16:37:13] [PASSED] 0xB69 (PVC)
[16:37:13] [PASSED] 0xB6E (PVC)
[16:37:13] [PASSED] 0xBD4 (PVC)
[16:37:13] [PASSED] 0xBD5 (PVC)
[16:37:13] [PASSED] 0xBD6 (PVC)
[16:37:13] [PASSED] 0xBD7 (PVC)
[16:37:13] [PASSED] 0xBD8 (PVC)
[16:37:13] [PASSED] 0xBD9 (PVC)
[16:37:13] [PASSED] 0xBDA (PVC)
[16:37:13] [PASSED] 0xBDB (PVC)
[16:37:13] [PASSED] 0xBE0 (PVC)
[16:37:13] [PASSED] 0xBE1 (PVC)
[16:37:13] [PASSED] 0xBE5 (PVC)
[16:37:13] [PASSED] 0x7D40 (METEORLAKE)
[16:37:13] [PASSED] 0x7D45 (METEORLAKE)
[16:37:13] [PASSED] 0x7D55 (METEORLAKE)
[16:37:13] [PASSED] 0x7D60 (METEORLAKE)
[16:37:13] [PASSED] 0x7DD5 (METEORLAKE)
[16:37:13] [PASSED] 0x6420 (LUNARLAKE)
[16:37:13] [PASSED] 0x64A0 (LUNARLAKE)
[16:37:13] [PASSED] 0x64B0 (LUNARLAKE)
[16:37:13] [PASSED] 0xE202 (BATTLEMAGE)
[16:37:13] [PASSED] 0xE209 (BATTLEMAGE)
[16:37:13] [PASSED] 0xE20B (BATTLEMAGE)
[16:37:13] [PASSED] 0xE20C (BATTLEMAGE)
[16:37:13] [PASSED] 0xE20D (BATTLEMAGE)
[16:37:13] [PASSED] 0xE210 (BATTLEMAGE)
[16:37:13] [PASSED] 0xE211 (BATTLEMAGE)
[16:37:13] [PASSED] 0xE212 (BATTLEMAGE)
[16:37:13] [PASSED] 0xE216 (BATTLEMAGE)
[16:37:13] [PASSED] 0xE220 (BATTLEMAGE)
[16:37:13] [PASSED] 0xE221 (BATTLEMAGE)
[16:37:13] [PASSED] 0xE222 (BATTLEMAGE)
[16:37:13] [PASSED] 0xE223 (BATTLEMAGE)
[16:37:13] [PASSED] 0xB080 (PANTHERLAKE)
[16:37:13] [PASSED] 0xB081 (PANTHERLAKE)
[16:37:13] [PASSED] 0xB082 (PANTHERLAKE)
[16:37:13] [PASSED] 0xB083 (PANTHERLAKE)
[16:37:13] [PASSED] 0xB084 (PANTHERLAKE)
[16:37:13] [PASSED] 0xB085 (PANTHERLAKE)
[16:37:13] [PASSED] 0xB086 (PANTHERLAKE)
[16:37:13] [PASSED] 0xB087 (PANTHERLAKE)
[16:37:13] [PASSED] 0xB08F (PANTHERLAKE)
[16:37:13] [PASSED] 0xB090 (PANTHERLAKE)
[16:37:13] [PASSED] 0xB0A0 (PANTHERLAKE)
[16:37:13] [PASSED] 0xB0B0 (PANTHERLAKE)
[16:37:13] [PASSED] 0xFD80 (PANTHERLAKE)
[16:37:13] [PASSED] 0xFD81 (PANTHERLAKE)
[16:37:13] [PASSED] 0xD740 (NOVALAKE_S)
[16:37:13] [PASSED] 0xD741 (NOVALAKE_S)
[16:37:13] [PASSED] 0xD742 (NOVALAKE_S)
[16:37:13] [PASSED] 0xD743 (NOVALAKE_S)
[16:37:13] [PASSED] 0xD744 (NOVALAKE_S)
[16:37:13] [PASSED] 0xD745 (NOVALAKE_S)
[16:37:13] [PASSED] 0x674C (CRESCENTISLAND)
[16:37:13] [PASSED] 0xD750 (NOVALAKE_P)
[16:37:13] [PASSED] 0xD751 (NOVALAKE_P)
[16:37:13] [PASSED] 0xD752 (NOVALAKE_P)
[16:37:13] [PASSED] 0xD753 (NOVALAKE_P)
[16:37:13] [PASSED] 0xD754 (NOVALAKE_P)
[16:37:13] [PASSED] 0xD755 (NOVALAKE_P)
[16:37:13] [PASSED] 0xD756 (NOVALAKE_P)
[16:37:13] [PASSED] 0xD757 (NOVALAKE_P)
[16:37:13] [PASSED] 0xD75F (NOVALAKE_P)
[16:37:13] =============== [PASSED] check_platform_desc ===============
[16:37:13] ===================== [PASSED] xe_pci ======================
[16:37:13] =================== xe_rtp (2 subtests) ====================
[16:37:13] =============== xe_rtp_process_to_sr_tests  ================
[16:37:13] [PASSED] coalesce-same-reg
[16:37:13] [PASSED] no-match-no-add
[16:37:13] [PASSED] match-or
[16:37:13] [PASSED] match-or-xfail
[16:37:13] [PASSED] no-match-no-add-multiple-rules
[16:37:13] [PASSED] two-regs-two-entries
[16:37:13] [PASSED] clr-one-set-other
[16:37:13] [PASSED] set-field
[16:37:13] [PASSED] conflict-duplicate
[16:37:13] [PASSED] conflict-not-disjoint
[16:37:13] [PASSED] conflict-reg-type
[16:37:13] =========== [PASSED] xe_rtp_process_to_sr_tests ============
[16:37:13] ================== xe_rtp_process_tests  ===================
[16:37:13] [PASSED] active1
[16:37:13] [PASSED] active2
[16:37:13] [PASSED] active-inactive
[16:37:13] [PASSED] inactive-active
[16:37:13] [PASSED] inactive-1st_or_active-inactive
[16:37:13] [PASSED] inactive-2nd_or_active-inactive
[16:37:13] [PASSED] inactive-last_or_active-inactive
[16:37:13] [PASSED] inactive-no_or_active-inactive
[16:37:13] ============== [PASSED] xe_rtp_process_tests ===============
[16:37:13] ===================== [PASSED] xe_rtp ======================
[16:37:13] ==================== xe_wa (1 subtest) =====================
[16:37:13] ======================== xe_wa_gt  =========================
[16:37:13] [PASSED] TIGERLAKE B0
[16:37:13] [PASSED] DG1 A0
[16:37:13] [PASSED] DG1 B0
[16:37:13] [PASSED] ALDERLAKE_S A0
[16:37:13] [PASSED] ALDERLAKE_S B0
[16:37:13] [PASSED] ALDERLAKE_S C0
[16:37:13] [PASSED] ALDERLAKE_S D0
[16:37:13] [PASSED] ALDERLAKE_P A0
[16:37:13] [PASSED] ALDERLAKE_P B0
[16:37:13] [PASSED] ALDERLAKE_P C0
[16:37:13] [PASSED] ALDERLAKE_S RPLS D0
[16:37:13] [PASSED] ALDERLAKE_P RPLU E0
[16:37:13] [PASSED] DG2 G10 C0
[16:37:13] [PASSED] DG2 G11 B1
[16:37:13] [PASSED] DG2 G12 A1
[16:37:13] [PASSED] METEORLAKE 12.70(Xe_LPG) A0 13.00(Xe_LPM+) A0
[16:37:13] [PASSED] METEORLAKE 12.71(Xe_LPG) A0 13.00(Xe_LPM+) A0
[16:37:13] [PASSED] METEORLAKE 12.74(Xe_LPG+) A0 13.00(Xe_LPM+) A0
[16:37:13] [PASSED] LUNARLAKE 20.04(Xe2_LPG) A0 20.00(Xe2_LPM) A0
[16:37:13] [PASSED] LUNARLAKE 20.04(Xe2_LPG) B0 20.00(Xe2_LPM) A0
[16:37:13] [PASSED] BATTLEMAGE 20.01(Xe2_HPG) A0 13.01(Xe2_HPM) A1
[16:37:13] [PASSED] PANTHERLAKE 30.00(Xe3_LPG) A0 30.00(Xe3_LPM) A0
[16:37:13] ==================== [PASSED] xe_wa_gt =====================
[16:37:13] ====================== [PASSED] xe_wa ======================
[16:37:13] ============================================================
[16:37:13] Testing complete. Ran 597 tests: passed: 579, skipped: 18
[16:37:13] Elapsed time: 42.800s total, 4.585s configuring, 37.597s building, 0.574s running

+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/tests/.kunitconfig
[16:37:13] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[16:37:14] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=25
[16:37:43] Starting KUnit Kernel (1/1)...
[16:37:43] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[16:37:43] ============ drm_test_pick_cmdline (2 subtests) ============
[16:37:43] [PASSED] drm_test_pick_cmdline_res_1920_1080_60
[16:37:43] =============== drm_test_pick_cmdline_named  ===============
[16:37:43] [PASSED] NTSC
[16:37:43] [PASSED] NTSC-J
[16:37:43] [PASSED] PAL
[16:37:43] [PASSED] PAL-M
[16:37:43] =========== [PASSED] drm_test_pick_cmdline_named ===========
[16:37:43] ============== [PASSED] drm_test_pick_cmdline ==============
[16:37:43] == drm_test_atomic_get_connector_for_encoder (1 subtest) ===
[16:37:43] [PASSED] drm_test_drm_atomic_get_connector_for_encoder
[16:37:43] ==== [PASSED] drm_test_atomic_get_connector_for_encoder ====
[16:37:43] =========== drm_validate_clone_mode (2 subtests) ===========
[16:37:43] ============== drm_test_check_in_clone_mode  ===============
[16:37:43] [PASSED] in_clone_mode
[16:37:43] [PASSED] not_in_clone_mode
[16:37:43] ========== [PASSED] drm_test_check_in_clone_mode ===========
[16:37:43] =============== drm_test_check_valid_clones  ===============
[16:37:43] [PASSED] not_in_clone_mode
[16:37:43] [PASSED] valid_clone
[16:37:43] [PASSED] invalid_clone
[16:37:43] =========== [PASSED] drm_test_check_valid_clones ===========
[16:37:43] ============= [PASSED] drm_validate_clone_mode =============
[16:37:43] ============= drm_validate_modeset (1 subtest) =============
[16:37:43] [PASSED] drm_test_check_connector_changed_modeset
[16:37:43] ============== [PASSED] drm_validate_modeset ===============
[16:37:43] ====== drm_test_bridge_get_current_state (2 subtests) ======
[16:37:43] [PASSED] drm_test_drm_bridge_get_current_state_atomic
[16:37:43] [PASSED] drm_test_drm_bridge_get_current_state_legacy
[16:37:43] ======== [PASSED] drm_test_bridge_get_current_state ========
[16:37:43] ====== drm_test_bridge_helper_reset_crtc (3 subtests) ======
[16:37:43] [PASSED] drm_test_drm_bridge_helper_reset_crtc_atomic
[16:37:43] [PASSED] drm_test_drm_bridge_helper_reset_crtc_atomic_disabled
[16:37:43] [PASSED] drm_test_drm_bridge_helper_reset_crtc_legacy
[16:37:43] ======== [PASSED] drm_test_bridge_helper_reset_crtc ========
[16:37:43] ============== drm_bridge_alloc (2 subtests) ===============
[16:37:43] [PASSED] drm_test_drm_bridge_alloc_basic
[16:37:43] [PASSED] drm_test_drm_bridge_alloc_get_put
[16:37:43] ================ [PASSED] drm_bridge_alloc =================
[16:37:43] ============= drm_cmdline_parser (40 subtests) =============
[16:37:43] [PASSED] drm_test_cmdline_force_d_only
[16:37:43] [PASSED] drm_test_cmdline_force_D_only_dvi
[16:37:43] [PASSED] drm_test_cmdline_force_D_only_hdmi
[16:37:43] [PASSED] drm_test_cmdline_force_D_only_not_digital
[16:37:43] [PASSED] drm_test_cmdline_force_e_only
[16:37:43] [PASSED] drm_test_cmdline_res
[16:37:43] [PASSED] drm_test_cmdline_res_vesa
[16:37:43] [PASSED] drm_test_cmdline_res_vesa_rblank
[16:37:43] [PASSED] drm_test_cmdline_res_rblank
[16:37:43] [PASSED] drm_test_cmdline_res_bpp
[16:37:43] [PASSED] drm_test_cmdline_res_refresh
[16:37:43] [PASSED] drm_test_cmdline_res_bpp_refresh
[16:37:43] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced
[16:37:43] [PASSED] drm_test_cmdline_res_bpp_refresh_margins
[16:37:43] [PASSED] drm_test_cmdline_res_bpp_refresh_force_off
[16:37:43] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on
[16:37:43] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_analog
[16:37:43] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_digital
[16:37:43] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced_margins_force_on
[16:37:43] [PASSED] drm_test_cmdline_res_margins_force_on
[16:37:43] [PASSED] drm_test_cmdline_res_vesa_margins
[16:37:43] [PASSED] drm_test_cmdline_name
[16:37:43] [PASSED] drm_test_cmdline_name_bpp
[16:37:43] [PASSED] drm_test_cmdline_name_option
[16:37:43] [PASSED] drm_test_cmdline_name_bpp_option
[16:37:43] [PASSED] drm_test_cmdline_rotate_0
[16:37:43] [PASSED] drm_test_cmdline_rotate_90
[16:37:43] [PASSED] drm_test_cmdline_rotate_180
[16:37:43] [PASSED] drm_test_cmdline_rotate_270
[16:37:43] [PASSED] drm_test_cmdline_hmirror
[16:37:43] [PASSED] drm_test_cmdline_vmirror
[16:37:43] [PASSED] drm_test_cmdline_margin_options
[16:37:43] [PASSED] drm_test_cmdline_multiple_options
[16:37:43] [PASSED] drm_test_cmdline_bpp_extra_and_option
[16:37:43] [PASSED] drm_test_cmdline_extra_and_option
[16:37:43] [PASSED] drm_test_cmdline_freestanding_options
[16:37:43] [PASSED] drm_test_cmdline_freestanding_force_e_and_options
[16:37:43] [PASSED] drm_test_cmdline_panel_orientation
[16:37:43] ================ drm_test_cmdline_invalid  =================
[16:37:43] [PASSED] margin_only
[16:37:43] [PASSED] interlace_only
[16:37:43] [PASSED] res_missing_x
[16:37:43] [PASSED] res_missing_y
[16:37:43] [PASSED] res_bad_y
[16:37:43] [PASSED] res_missing_y_bpp
[16:37:43] [PASSED] res_bad_bpp
[16:37:43] [PASSED] res_bad_refresh
[16:37:43] [PASSED] res_bpp_refresh_force_on_off
[16:37:43] [PASSED] res_invalid_mode
[16:37:43] [PASSED] res_bpp_wrong_place_mode
[16:37:43] [PASSED] name_bpp_refresh
[16:37:43] [PASSED] name_refresh
[16:37:43] [PASSED] name_refresh_wrong_mode
[16:37:43] [PASSED] name_refresh_invalid_mode
[16:37:43] [PASSED] rotate_multiple
[16:37:43] [PASSED] rotate_invalid_val
[16:37:43] [PASSED] rotate_truncated
[16:37:43] [PASSED] invalid_option
[16:37:43] [PASSED] invalid_tv_option
[16:37:43] [PASSED] truncated_tv_option
[16:37:43] ============ [PASSED] drm_test_cmdline_invalid =============
[16:37:43] =============== drm_test_cmdline_tv_options  ===============
[16:37:43] [PASSED] NTSC
[16:37:43] [PASSED] NTSC_443
[16:37:43] [PASSED] NTSC_J
[16:37:43] [PASSED] PAL
[16:37:43] [PASSED] PAL_M
[16:37:43] [PASSED] PAL_N
[16:37:43] [PASSED] SECAM
[16:37:43] [PASSED] MONO_525
[16:37:43] [PASSED] MONO_625
[16:37:43] =========== [PASSED] drm_test_cmdline_tv_options ===========
[16:37:43] =============== [PASSED] drm_cmdline_parser ================
[16:37:43] ========== drmm_connector_hdmi_init (20 subtests) ==========
[16:37:43] [PASSED] drm_test_connector_hdmi_init_valid
[16:37:43] [PASSED] drm_test_connector_hdmi_init_bpc_8
[16:37:43] [PASSED] drm_test_connector_hdmi_init_bpc_10
[16:37:43] [PASSED] drm_test_connector_hdmi_init_bpc_12
[16:37:43] [PASSED] drm_test_connector_hdmi_init_bpc_invalid
[16:37:43] [PASSED] drm_test_connector_hdmi_init_bpc_null
[16:37:43] [PASSED] drm_test_connector_hdmi_init_formats_empty
[16:37:43] [PASSED] drm_test_connector_hdmi_init_formats_no_rgb
[16:37:43] === drm_test_connector_hdmi_init_formats_yuv420_allowed  ===
[16:37:43] [PASSED] supported_formats=0x9 yuv420_allowed=1
[16:37:43] [PASSED] supported_formats=0x9 yuv420_allowed=0
[16:37:43] [PASSED] supported_formats=0x5 yuv420_allowed=1
[16:37:43] [PASSED] supported_formats=0x5 yuv420_allowed=0
[16:37:43] === [PASSED] drm_test_connector_hdmi_init_formats_yuv420_allowed ===
[16:37:43] [PASSED] drm_test_connector_hdmi_init_null_ddc
[16:37:43] [PASSED] drm_test_connector_hdmi_init_null_product
[16:37:43] [PASSED] drm_test_connector_hdmi_init_null_vendor
[16:37:43] [PASSED] drm_test_connector_hdmi_init_product_length_exact
[16:37:43] [PASSED] drm_test_connector_hdmi_init_product_length_too_long
[16:37:43] [PASSED] drm_test_connector_hdmi_init_product_valid
[16:37:43] [PASSED] drm_test_connector_hdmi_init_vendor_length_exact
[16:37:43] [PASSED] drm_test_connector_hdmi_init_vendor_length_too_long
[16:37:43] [PASSED] drm_test_connector_hdmi_init_vendor_valid
[16:37:43] ========= drm_test_connector_hdmi_init_type_valid  =========
[16:37:43] [PASSED] HDMI-A
[16:37:43] [PASSED] HDMI-B
[16:37:43] ===== [PASSED] drm_test_connector_hdmi_init_type_valid =====
[16:37:43] ======== drm_test_connector_hdmi_init_type_invalid  ========
[16:37:43] [PASSED] Unknown
[16:37:43] [PASSED] VGA
[16:37:43] [PASSED] DVI-I
[16:37:43] [PASSED] DVI-D
[16:37:43] [PASSED] DVI-A
[16:37:43] [PASSED] Composite
[16:37:43] [PASSED] SVIDEO
[16:37:43] [PASSED] LVDS
[16:37:43] [PASSED] Component
[16:37:43] [PASSED] DIN
[16:37:43] [PASSED] DP
[16:37:43] [PASSED] TV
[16:37:43] [PASSED] eDP
[16:37:43] [PASSED] Virtual
[16:37:43] [PASSED] DSI
[16:37:43] [PASSED] DPI
[16:37:43] [PASSED] Writeback
[16:37:43] [PASSED] SPI
[16:37:43] [PASSED] USB
[16:37:43] ==== [PASSED] drm_test_connector_hdmi_init_type_invalid ====
[16:37:43] ============ [PASSED] drmm_connector_hdmi_init =============
[16:37:43] ============= drmm_connector_init (3 subtests) =============
[16:37:43] [PASSED] drm_test_drmm_connector_init
[16:37:43] [PASSED] drm_test_drmm_connector_init_null_ddc
[16:37:43] ========= drm_test_drmm_connector_init_type_valid  =========
[16:37:43] [PASSED] Unknown
[16:37:43] [PASSED] VGA
[16:37:43] [PASSED] DVI-I
[16:37:43] [PASSED] DVI-D
[16:37:43] [PASSED] DVI-A
[16:37:43] [PASSED] Composite
[16:37:43] [PASSED] SVIDEO
[16:37:43] [PASSED] LVDS
[16:37:43] [PASSED] Component
[16:37:43] [PASSED] DIN
[16:37:43] [PASSED] DP
[16:37:43] [PASSED] HDMI-A
[16:37:43] [PASSED] HDMI-B
[16:37:43] [PASSED] TV
[16:37:43] [PASSED] eDP
[16:37:43] [PASSED] Virtual
[16:37:43] [PASSED] DSI
[16:37:43] [PASSED] DPI
[16:37:43] [PASSED] Writeback
[16:37:43] [PASSED] SPI
[16:37:43] [PASSED] USB
[16:37:43] ===== [PASSED] drm_test_drmm_connector_init_type_valid =====
[16:37:43] =============== [PASSED] drmm_connector_init ===============
[16:37:43] ========= drm_connector_dynamic_init (6 subtests) ==========
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_init
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_init_null_ddc
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_init_not_added
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_init_properties
[16:37:43] ===== drm_test_drm_connector_dynamic_init_type_valid  ======
[16:37:43] [PASSED] Unknown
[16:37:43] [PASSED] VGA
[16:37:43] [PASSED] DVI-I
[16:37:43] [PASSED] DVI-D
[16:37:43] [PASSED] DVI-A
[16:37:43] [PASSED] Composite
[16:37:43] [PASSED] SVIDEO
[16:37:43] [PASSED] LVDS
[16:37:43] [PASSED] Component
[16:37:43] [PASSED] DIN
[16:37:43] [PASSED] DP
[16:37:43] [PASSED] HDMI-A
[16:37:43] [PASSED] HDMI-B
[16:37:43] [PASSED] TV
[16:37:43] [PASSED] eDP
[16:37:43] [PASSED] Virtual
[16:37:43] [PASSED] DSI
[16:37:43] [PASSED] DPI
[16:37:43] [PASSED] Writeback
[16:37:43] [PASSED] SPI
[16:37:43] [PASSED] USB
[16:37:43] = [PASSED] drm_test_drm_connector_dynamic_init_type_valid ==
[16:37:43] ======== drm_test_drm_connector_dynamic_init_name  =========
[16:37:43] [PASSED] Unknown
[16:37:43] [PASSED] VGA
[16:37:43] [PASSED] DVI-I
[16:37:43] [PASSED] DVI-D
[16:37:43] [PASSED] DVI-A
[16:37:43] [PASSED] Composite
[16:37:43] [PASSED] SVIDEO
[16:37:43] [PASSED] LVDS
[16:37:43] [PASSED] Component
[16:37:43] [PASSED] DIN
[16:37:43] [PASSED] DP
[16:37:43] [PASSED] HDMI-A
[16:37:43] [PASSED] HDMI-B
[16:37:43] [PASSED] TV
[16:37:43] [PASSED] eDP
[16:37:43] [PASSED] Virtual
[16:37:43] [PASSED] DSI
[16:37:43] [PASSED] DPI
[16:37:43] [PASSED] Writeback
[16:37:43] [PASSED] SPI
[16:37:43] [PASSED] USB
[16:37:43] ==== [PASSED] drm_test_drm_connector_dynamic_init_name =====
[16:37:43] =========== [PASSED] drm_connector_dynamic_init ============
[16:37:43] ==== drm_connector_dynamic_register_early (4 subtests) =====
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_register_early_on_list
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_register_early_defer
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_register_early_no_init
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_register_early_no_mode_object
[16:37:43] ====== [PASSED] drm_connector_dynamic_register_early =======
[16:37:43] ======= drm_connector_dynamic_register (7 subtests) ========
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_register_on_list
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_register_no_defer
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_register_no_init
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_register_mode_object
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_register_sysfs
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_register_sysfs_name
[16:37:43] [PASSED] drm_test_drm_connector_dynamic_register_debugfs
[16:37:43] ========= [PASSED] drm_connector_dynamic_register ==========
[16:37:43] = drm_connector_attach_broadcast_rgb_property (2 subtests) =
[16:37:43] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property
[16:37:43] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property_hdmi_connector
[16:37:43] === [PASSED] drm_connector_attach_broadcast_rgb_property ===
[16:37:43] ========== drm_get_tv_mode_from_name (2 subtests) ==========
[16:37:43] ========== drm_test_get_tv_mode_from_name_valid  ===========
[16:37:43] [PASSED] NTSC
[16:37:43] [PASSED] NTSC-443
[16:37:43] [PASSED] NTSC-J
[16:37:43] [PASSED] PAL
[16:37:43] [PASSED] PAL-M
[16:37:43] [PASSED] PAL-N
[16:37:43] [PASSED] SECAM
[16:37:43] [PASSED] Mono
[16:37:43] ====== [PASSED] drm_test_get_tv_mode_from_name_valid =======
[16:37:43] [PASSED] drm_test_get_tv_mode_from_name_truncated
[16:37:43] ============ [PASSED] drm_get_tv_mode_from_name ============
[16:37:43] = drm_test_connector_hdmi_compute_mode_clock (12 subtests) =
[16:37:43] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb
[16:37:43] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc
[16:37:43] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc_vic_1
[16:37:43] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc
[16:37:43] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc_vic_1
[16:37:43] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_double
[16:37:43] = drm_test_connector_hdmi_compute_mode_clock_yuv420_valid  =
[16:37:43] [PASSED] VIC 96
[16:37:43] [PASSED] VIC 97
[16:37:43] [PASSED] VIC 101
[16:37:43] [PASSED] VIC 102
[16:37:43] [PASSED] VIC 106
[16:37:43] [PASSED] VIC 107
[16:37:43] === [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_valid ===
[16:37:43] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_10_bpc
[16:37:43] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_12_bpc
[16:37:43] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_8_bpc
[16:37:43] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_10_bpc
[16:37:43] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_12_bpc
[16:37:43] === [PASSED] drm_test_connector_hdmi_compute_mode_clock ====
[16:37:43] == drm_hdmi_connector_get_broadcast_rgb_name (2 subtests) ==
[16:37:43] === drm_test_drm_hdmi_connector_get_broadcast_rgb_name  ====
[16:37:43] [PASSED] Automatic
[16:37:43] [PASSED] Full
[16:37:43] [PASSED] Limited 16:235
[16:37:43] === [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name ===
[16:37:43] [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name_invalid
[16:37:43] ==== [PASSED] drm_hdmi_connector_get_broadcast_rgb_name ====
[16:37:43] == drm_hdmi_connector_get_output_format_name (2 subtests) ==
[16:37:43] === drm_test_drm_hdmi_connector_get_output_format_name  ====
[16:37:43] [PASSED] RGB
[16:37:43] [PASSED] YUV 4:2:0
[16:37:43] [PASSED] YUV 4:2:2
[16:37:43] [PASSED] YUV 4:4:4
[16:37:43] === [PASSED] drm_test_drm_hdmi_connector_get_output_format_name ===
[16:37:43] [PASSED] drm_test_drm_hdmi_connector_get_output_format_name_invalid
[16:37:43] ==== [PASSED] drm_hdmi_connector_get_output_format_name ====
[16:37:43] ============= drm_damage_helper (21 subtests) ==============
[16:37:43] [PASSED] drm_test_damage_iter_no_damage
[16:37:43] [PASSED] drm_test_damage_iter_no_damage_fractional_src
[16:37:43] [PASSED] drm_test_damage_iter_no_damage_src_moved
[16:37:43] [PASSED] drm_test_damage_iter_no_damage_fractional_src_moved
[16:37:43] [PASSED] drm_test_damage_iter_no_damage_not_visible
[16:37:43] [PASSED] drm_test_damage_iter_no_damage_no_crtc
[16:37:43] [PASSED] drm_test_damage_iter_no_damage_no_fb
[16:37:43] [PASSED] drm_test_damage_iter_simple_damage
[16:37:43] [PASSED] drm_test_damage_iter_single_damage
[16:37:43] [PASSED] drm_test_damage_iter_single_damage_intersect_src
[16:37:43] [PASSED] drm_test_damage_iter_single_damage_outside_src
[16:37:43] [PASSED] drm_test_damage_iter_single_damage_fractional_src
[16:37:43] [PASSED] drm_test_damage_iter_single_damage_intersect_fractional_src
[16:37:43] [PASSED] drm_test_damage_iter_single_damage_outside_fractional_src
[16:37:43] [PASSED] drm_test_damage_iter_single_damage_src_moved
[16:37:43] [PASSED] drm_test_damage_iter_single_damage_fractional_src_moved
[16:37:43] [PASSED] drm_test_damage_iter_damage
[16:37:43] [PASSED] drm_test_damage_iter_damage_one_intersect
[16:37:43] [PASSED] drm_test_damage_iter_damage_one_outside
[16:37:43] [PASSED] drm_test_damage_iter_damage_src_moved
[16:37:43] [PASSED] drm_test_damage_iter_damage_not_visible
[16:37:43] ================ [PASSED] drm_damage_helper ================
[16:37:43] ============== drm_dp_mst_helper (3 subtests) ==============
[16:37:43] ============== drm_test_dp_mst_calc_pbn_mode  ==============
[16:37:43] [PASSED] Clock 154000 BPP 30 DSC disabled
[16:37:43] [PASSED] Clock 234000 BPP 30 DSC disabled
[16:37:43] [PASSED] Clock 297000 BPP 24 DSC disabled
[16:37:43] [PASSED] Clock 332880 BPP 24 DSC enabled
[16:37:43] [PASSED] Clock 324540 BPP 24 DSC enabled
[16:37:43] ========== [PASSED] drm_test_dp_mst_calc_pbn_mode ==========
[16:37:43] ============== drm_test_dp_mst_calc_pbn_div  ===============
[16:37:43] [PASSED] Link rate 2000000 lane count 4
[16:37:43] [PASSED] Link rate 2000000 lane count 2
[16:37:43] [PASSED] Link rate 2000000 lane count 1
[16:37:43] [PASSED] Link rate 1350000 lane count 4
[16:37:43] [PASSED] Link rate 1350000 lane count 2
[16:37:43] [PASSED] Link rate 1350000 lane count 1
[16:37:43] [PASSED] Link rate 1000000 lane count 4
[16:37:43] [PASSED] Link rate 1000000 lane count 2
[16:37:43] [PASSED] Link rate 1000000 lane count 1
[16:37:43] [PASSED] Link rate 810000 lane count 4
[16:37:43] [PASSED] Link rate 810000 lane count 2
[16:37:43] [PASSED] Link rate 810000 lane count 1
[16:37:43] [PASSED] Link rate 540000 lane count 4
[16:37:43] [PASSED] Link rate 540000 lane count 2
[16:37:43] [PASSED] Link rate 540000 lane count 1
[16:37:43] [PASSED] Link rate 270000 lane count 4
[16:37:43] [PASSED] Link rate 270000 lane count 2
[16:37:43] [PASSED] Link rate 270000 lane count 1
[16:37:43] [PASSED] Link rate 162000 lane count 4
[16:37:43] [PASSED] Link rate 162000 lane count 2
[16:37:43] [PASSED] Link rate 162000 lane count 1
[16:37:43] ========== [PASSED] drm_test_dp_mst_calc_pbn_div ===========
[16:37:43] ========= drm_test_dp_mst_sideband_msg_req_decode  =========
[16:37:43] [PASSED] DP_ENUM_PATH_RESOURCES with port number
[16:37:43] [PASSED] DP_POWER_UP_PHY with port number
[16:37:43] [PASSED] DP_POWER_DOWN_PHY with port number
[16:37:43] [PASSED] DP_ALLOCATE_PAYLOAD with SDP stream sinks
[16:37:43] [PASSED] DP_ALLOCATE_PAYLOAD with port number
[16:37:43] [PASSED] DP_ALLOCATE_PAYLOAD with VCPI
[16:37:43] [PASSED] DP_ALLOCATE_PAYLOAD with PBN
[16:37:43] [PASSED] DP_QUERY_PAYLOAD with port number
[16:37:43] [PASSED] DP_QUERY_PAYLOAD with VCPI
[16:37:43] [PASSED] DP_REMOTE_DPCD_READ with port number
[16:37:43] [PASSED] DP_REMOTE_DPCD_READ with DPCD address
[16:37:43] [PASSED] DP_REMOTE_DPCD_READ with max number of bytes
[16:37:43] [PASSED] DP_REMOTE_DPCD_WRITE with port number
[16:37:43] [PASSED] DP_REMOTE_DPCD_WRITE with DPCD address
[16:37:43] [PASSED] DP_REMOTE_DPCD_WRITE with data array
[16:37:43] [PASSED] DP_REMOTE_I2C_READ with port number
[16:37:43] [PASSED] DP_REMOTE_I2C_READ with I2C device ID
[16:37:43] [PASSED] DP_REMOTE_I2C_READ with transactions array
[16:37:43] [PASSED] DP_REMOTE_I2C_WRITE with port number
[16:37:43] [PASSED] DP_REMOTE_I2C_WRITE with I2C device ID
[16:37:43] [PASSED] DP_REMOTE_I2C_WRITE with data array
[16:37:43] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream ID
[16:37:43] [PASSED] DP_QUERY_STREAM_ENC_STATUS with client ID
[16:37:43] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream event
[16:37:43] [PASSED] DP_QUERY_STREAM_ENC_STATUS with valid stream event
[16:37:43] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream behavior
[16:37:43] [PASSED] DP_QUERY_STREAM_ENC_STATUS with a valid stream behavior
[16:37:43] ===== [PASSED] drm_test_dp_mst_sideband_msg_req_decode =====
[16:37:43] ================ [PASSED] drm_dp_mst_helper ================
[16:37:43] ================== drm_exec (7 subtests) ===================
[16:37:43] [PASSED] sanitycheck
[16:37:43] [PASSED] test_lock
[16:37:43] [PASSED] test_lock_unlock
[16:37:43] [PASSED] test_duplicates
[16:37:43] [PASSED] test_prepare
[16:37:43] [PASSED] test_prepare_array
[16:37:43] [PASSED] test_multiple_loops
[16:37:43] ==================== [PASSED] drm_exec =====================
[16:37:43] =========== drm_format_helper_test (17 subtests) ===========
[16:37:43] ============== drm_test_fb_xrgb8888_to_gray8  ==============
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ========== [PASSED] drm_test_fb_xrgb8888_to_gray8 ==========
[16:37:43] ============= drm_test_fb_xrgb8888_to_rgb332  ==============
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb332 ==========
[16:37:43] ============= drm_test_fb_xrgb8888_to_rgb565  ==============
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb565 ==========
[16:37:43] ============ drm_test_fb_xrgb8888_to_xrgb1555  =============
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ======== [PASSED] drm_test_fb_xrgb8888_to_xrgb1555 =========
[16:37:43] ============ drm_test_fb_xrgb8888_to_argb1555  =============
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ======== [PASSED] drm_test_fb_xrgb8888_to_argb1555 =========
[16:37:43] ============ drm_test_fb_xrgb8888_to_rgba5551  =============
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ======== [PASSED] drm_test_fb_xrgb8888_to_rgba5551 =========
[16:37:43] ============= drm_test_fb_xrgb8888_to_rgb888  ==============
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb888 ==========
[16:37:43] ============= drm_test_fb_xrgb8888_to_bgr888  ==============
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ========= [PASSED] drm_test_fb_xrgb8888_to_bgr888 ==========
[16:37:43] ============ drm_test_fb_xrgb8888_to_argb8888  =============
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ======== [PASSED] drm_test_fb_xrgb8888_to_argb8888 =========
[16:37:43] =========== drm_test_fb_xrgb8888_to_xrgb2101010  ===========
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ======= [PASSED] drm_test_fb_xrgb8888_to_xrgb2101010 =======
[16:37:43] =========== drm_test_fb_xrgb8888_to_argb2101010  ===========
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ======= [PASSED] drm_test_fb_xrgb8888_to_argb2101010 =======
[16:37:43] ============== drm_test_fb_xrgb8888_to_mono  ===============
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ========== [PASSED] drm_test_fb_xrgb8888_to_mono ===========
[16:37:43] ==================== drm_test_fb_swab  =====================
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ================ [PASSED] drm_test_fb_swab =================
[16:37:43] ============ drm_test_fb_xrgb8888_to_xbgr8888  =============
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ======== [PASSED] drm_test_fb_xrgb8888_to_xbgr8888 =========
[16:37:43] ============ drm_test_fb_xrgb8888_to_abgr8888  =============
[16:37:43] [PASSED] single_pixel_source_buffer
[16:37:43] [PASSED] single_pixel_clip_rectangle
[16:37:43] [PASSED] well_known_colors
[16:37:43] [PASSED] destination_pitch
[16:37:43] ======== [PASSED] drm_test_fb_xrgb8888_to_abgr8888 =========
[16:37:43] ================= drm_test_fb_clip_offset  =================
[16:37:43] [PASSED] pass through
[16:37:43] [PASSED] horizontal offset
[16:37:43] [PASSED] vertical offset
[16:37:43] [PASSED] horizontal and vertical offset
[16:37:43] [PASSED] horizontal offset (custom pitch)
[16:37:43] [PASSED] vertical offset (custom pitch)
[16:37:43] [PASSED] horizontal and vertical offset (custom pitch)
[16:37:43] ============= [PASSED] drm_test_fb_clip_offset =============
[16:37:43] =================== drm_test_fb_memcpy  ====================
[16:37:43] [PASSED] single_pixel_source_buffer: XR24 little-endian (0x34325258)
[16:37:43] [PASSED] single_pixel_source_buffer: XRA8 little-endian (0x38415258)
[16:37:43] [PASSED] single_pixel_source_buffer: YU24 little-endian (0x34325559)
[16:37:43] [PASSED] single_pixel_clip_rectangle: XB24 little-endian (0x34324258)
[16:37:43] [PASSED] single_pixel_clip_rectangle: XRA8 little-endian (0x38415258)
[16:37:43] [PASSED] single_pixel_clip_rectangle: YU24 little-endian (0x34325559)
[16:37:43] [PASSED] well_known_colors: XB24 little-endian (0x34324258)
[16:37:43] [PASSED] well_known_colors: XRA8 little-endian (0x38415258)
[16:37:43] [PASSED] well_known_colors: YU24 little-endian (0x34325559)
[16:37:43] [PASSED] destination_pitch: XB24 little-endian (0x34324258)
[16:37:43] [PASSED] destination_pitch: XRA8 little-endian (0x38415258)
[16:37:43] [PASSED] destination_pitch: YU24 little-endian (0x34325559)
[16:37:43] =============== [PASSED] drm_test_fb_memcpy ================
[16:37:43] ============= [PASSED] drm_format_helper_test ==============
[16:37:43] ================= drm_format (18 subtests) =================
[16:37:43] [PASSED] drm_test_format_block_width_invalid
[16:37:43] [PASSED] drm_test_format_block_width_one_plane
[16:37:43] [PASSED] drm_test_format_block_width_two_plane
[16:37:43] [PASSED] drm_test_format_block_width_three_plane
[16:37:43] [PASSED] drm_test_format_block_width_tiled
[16:37:43] [PASSED] drm_test_format_block_height_invalid
[16:37:43] [PASSED] drm_test_format_block_height_one_plane
[16:37:43] [PASSED] drm_test_format_block_height_two_plane
[16:37:43] [PASSED] drm_test_format_block_height_three_plane
[16:37:43] [PASSED] drm_test_format_block_height_tiled
[16:37:43] [PASSED] drm_test_format_min_pitch_invalid
[16:37:43] [PASSED] drm_test_format_min_pitch_one_plane_8bpp
[16:37:43] [PASSED] drm_test_format_min_pitch_one_plane_16bpp
[16:37:43] [PASSED] drm_test_format_min_pitch_one_plane_24bpp
[16:37:43] [PASSED] drm_test_format_min_pitch_one_plane_32bpp
[16:37:43] [PASSED] drm_test_format_min_pitch_two_plane
[16:37:43] [PASSED] drm_test_format_min_pitch_three_plane_8bpp
[16:37:43] [PASSED] drm_test_format_min_pitch_tiled
[16:37:43] =================== [PASSED] drm_format ====================
[16:37:43] ============== drm_framebuffer (10 subtests) ===============
[16:37:43] ========== drm_test_framebuffer_check_src_coords  ==========
[16:37:43] [PASSED] Success: source fits into fb
[16:37:43] [PASSED] Fail: overflowing fb with x-axis coordinate
[16:37:43] [PASSED] Fail: overflowing fb with y-axis coordinate
[16:37:43] [PASSED] Fail: overflowing fb with source width
[16:37:43] [PASSED] Fail: overflowing fb with source height
[16:37:43] ====== [PASSED] drm_test_framebuffer_check_src_coords ======
[16:37:43] [PASSED] drm_test_framebuffer_cleanup
[16:37:43] =============== drm_test_framebuffer_create  ===============
[16:37:43] [PASSED] ABGR8888 normal sizes
[16:37:43] [PASSED] ABGR8888 max sizes
[16:37:43] [PASSED] ABGR8888 pitch greater than min required
[16:37:43] [PASSED] ABGR8888 pitch less than min required
[16:37:43] [PASSED] ABGR8888 Invalid width
[16:37:43] [PASSED] ABGR8888 Invalid buffer handle
[16:37:43] [PASSED] No pixel format
[16:37:43] [PASSED] ABGR8888 Width 0
[16:37:43] [PASSED] ABGR8888 Height 0
[16:37:43] [PASSED] ABGR8888 Out of bound height * pitch combination
[16:37:43] [PASSED] ABGR8888 Large buffer offset
[16:37:43] [PASSED] ABGR8888 Buffer offset for inexistent plane
[16:37:43] [PASSED] ABGR8888 Invalid flag
[16:37:43] [PASSED] ABGR8888 Set DRM_MODE_FB_MODIFIERS without modifiers
[16:37:43] [PASSED] ABGR8888 Valid buffer modifier
[16:37:43] [PASSED] ABGR8888 Invalid buffer modifier(DRM_FORMAT_MOD_SAMSUNG_64_32_TILE)
[16:37:43] [PASSED] ABGR8888 Extra pitches without DRM_MODE_FB_MODIFIERS
[16:37:43] [PASSED] ABGR8888 Extra pitches with DRM_MODE_FB_MODIFIERS
[16:37:43] [PASSED] NV12 Normal sizes
[16:37:43] [PASSED] NV12 Max sizes
[16:37:43] [PASSED] NV12 Invalid pitch
[16:37:43] [PASSED] NV12 Invalid modifier/missing DRM_MODE_FB_MODIFIERS flag
[16:37:43] [PASSED] NV12 different  modifier per-plane
[16:37:43] [PASSED] NV12 with DRM_FORMAT_MOD_SAMSUNG_64_32_TILE
[16:37:43] [PASSED] NV12 Valid modifiers without DRM_MODE_FB_MODIFIERS
[16:37:43] [PASSED] NV12 Modifier for inexistent plane
[16:37:43] [PASSED] NV12 Handle for inexistent plane
[16:37:43] [PASSED] NV12 Handle for inexistent plane without DRM_MODE_FB_MODIFIERS
[16:37:43] [PASSED] YVU420 DRM_MODE_FB_MODIFIERS set without modifier
[16:37:43] [PASSED] YVU420 Normal sizes
[16:37:43] [PASSED] YVU420 Max sizes
[16:37:43] [PASSED] YVU420 Invalid pitch
[16:37:43] [PASSED] YVU420 Different pitches
[16:37:43] [PASSED] YVU420 Different buffer offsets/pitches
[16:37:43] [PASSED] YVU420 Modifier set just for plane 0, without DRM_MODE_FB_MODIFIERS
[16:37:43] [PASSED] YVU420 Modifier set just for planes 0, 1, without DRM_MODE_FB_MODIFIERS
[16:37:43] [PASSED] YVU420 Modifier set just for plane 0, 1, with DRM_MODE_FB_MODIFIERS
[16:37:43] [PASSED] YVU420 Valid modifier
[16:37:43] [PASSED] YVU420 Different modifiers per plane
[16:37:43] [PASSED] YVU420 Modifier for inexistent plane
[16:37:43] [PASSED] YUV420_10BIT Invalid modifier(DRM_FORMAT_MOD_LINEAR)
[16:37:43] [PASSED] X0L2 Normal sizes
[16:37:43] [PASSED] X0L2 Max sizes
[16:37:43] [PASSED] X0L2 Invalid pitch
[16:37:43] [PASSED] X0L2 Pitch greater than minimum required
[16:37:43] [PASSED] X0L2 Handle for inexistent plane
[16:37:43] [PASSED] X0L2 Offset for inexistent plane, without DRM_MODE_FB_MODIFIERS set
[16:37:43] [PASSED] X0L2 Modifier without DRM_MODE_FB_MODIFIERS set
[16:37:43] [PASSED] X0L2 Valid modifier
[16:37:43] [PASSED] X0L2 Modifier for inexistent plane
[16:37:43] =========== [PASSED] drm_test_framebuffer_create ===========
[16:37:43] [PASSED] drm_test_framebuffer_free
[16:37:43] [PASSED] drm_test_framebuffer_init
[16:37:43] [PASSED] drm_test_framebuffer_init_bad_format
[16:37:43] [PASSED] drm_test_framebuffer_init_dev_mismatch
[16:37:43] [PASSED] drm_test_framebuffer_lookup
[16:37:43] [PASSED] drm_test_framebuffer_lookup_inexistent
[16:37:43] [PASSED] drm_test_framebuffer_modifiers_not_supported
[16:37:43] ================= [PASSED] drm_framebuffer =================
[16:37:43] ================ drm_gem_shmem (8 subtests) ================
[16:37:43] [PASSED] drm_gem_shmem_test_obj_create
[16:37:43] [PASSED] drm_gem_shmem_test_obj_create_private
[16:37:43] [PASSED] drm_gem_shmem_test_pin_pages
[16:37:43] [PASSED] drm_gem_shmem_test_vmap
[16:37:43] [PASSED] drm_gem_shmem_test_get_sg_table
[16:37:43] [PASSED] drm_gem_shmem_test_get_pages_sgt
[16:37:43] [PASSED] drm_gem_shmem_test_madvise
[16:37:43] [PASSED] drm_gem_shmem_test_purge
[16:37:43] ================== [PASSED] drm_gem_shmem ==================
[16:37:43] === drm_atomic_helper_connector_hdmi_check (27 subtests) ===
[16:37:43] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode
[16:37:43] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode_vic_1
[16:37:43] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode
[16:37:43] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode_vic_1
[16:37:43] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode
[16:37:43] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode_vic_1
[16:37:43] ====== drm_test_check_broadcast_rgb_cea_mode_yuv420  =======
[16:37:43] [PASSED] Automatic
[16:37:43] [PASSED] Full
[16:37:43] [PASSED] Limited 16:235
[16:37:43] == [PASSED] drm_test_check_broadcast_rgb_cea_mode_yuv420 ===
[16:37:43] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_changed
[16:37:43] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_not_changed
[16:37:43] [PASSED] drm_test_check_disable_connector
[16:37:43] [PASSED] drm_test_check_hdmi_funcs_reject_rate
[16:37:43] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_rgb
[16:37:43] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_yuv420
[16:37:43] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv422
[16:37:43] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv420
[16:37:43] [PASSED] drm_test_check_driver_unsupported_fallback_yuv420
[16:37:43] [PASSED] drm_test_check_output_bpc_crtc_mode_changed
[16:37:43] [PASSED] drm_test_check_output_bpc_crtc_mode_not_changed
[16:37:43] [PASSED] drm_test_check_output_bpc_dvi
[16:37:43] [PASSED] drm_test_check_output_bpc_format_vic_1
[16:37:43] [PASSED] drm_test_check_output_bpc_format_display_8bpc_only
[16:37:43] [PASSED] drm_test_check_output_bpc_format_display_rgb_only
[16:37:43] [PASSED] drm_test_check_output_bpc_format_driver_8bpc_only
[16:37:43] [PASSED] drm_test_check_output_bpc_format_driver_rgb_only
[16:37:43] [PASSED] drm_test_check_tmds_char_rate_rgb_8bpc
[16:37:43] [PASSED] drm_test_check_tmds_char_rate_rgb_10bpc
[16:37:43] [PASSED] drm_test_check_tmds_char_rate_rgb_12bpc
[16:37:43] ===== [PASSED] drm_atomic_helper_connector_hdmi_check ======
[16:37:43] === drm_atomic_helper_connector_hdmi_reset (6 subtests) ====
[16:37:43] [PASSED] drm_test_check_broadcast_rgb_value
[16:37:43] [PASSED] drm_test_check_bpc_8_value
[16:37:43] [PASSED] drm_test_check_bpc_10_value
[16:37:43] [PASSED] drm_test_check_bpc_12_value
[16:37:43] [PASSED] drm_test_check_format_value
[16:37:43] [PASSED] drm_test_check_tmds_char_value
[16:37:43] ===== [PASSED] drm_atomic_helper_connector_hdmi_reset ======
[16:37:43] = drm_atomic_helper_connector_hdmi_mode_valid (4 subtests) =
[16:37:43] [PASSED] drm_test_check_mode_valid
[16:37:43] [PASSED] drm_test_check_mode_valid_reject
[16:37:43] [PASSED] drm_test_check_mode_valid_reject_rate
[16:37:43] [PASSED] drm_test_check_mode_valid_reject_max_clock
[16:37:43] === [PASSED] drm_atomic_helper_connector_hdmi_mode_valid ===
[16:37:43] = drm_atomic_helper_connector_hdmi_infoframes (5 subtests) =
[16:37:43] [PASSED] drm_test_check_infoframes
[16:37:43] [PASSED] drm_test_check_reject_avi_infoframe
[16:37:43] [PASSED] drm_test_check_reject_hdr_infoframe_bpc_8
[16:37:43] [PASSED] drm_test_check_reject_hdr_infoframe_bpc_10
[16:37:43] [PASSED] drm_test_check_reject_audio_infoframe
[16:37:43] === [PASSED] drm_atomic_helper_connector_hdmi_infoframes ===
[16:37:43] ================= drm_managed (2 subtests) =================
[16:37:43] [PASSED] drm_test_managed_release_action
[16:37:43] [PASSED] drm_test_managed_run_action
[16:37:43] =================== [PASSED] drm_managed ===================
[16:37:43] =================== drm_mm (6 subtests) ====================
[16:37:43] [PASSED] drm_test_mm_init
[16:37:43] [PASSED] drm_test_mm_debug
[16:37:43] [PASSED] drm_test_mm_align32
[16:37:43] [PASSED] drm_test_mm_align64
[16:37:43] [PASSED] drm_test_mm_lowest
[16:37:43] [PASSED] drm_test_mm_highest
[16:37:43] ===================== [PASSED] drm_mm ======================
[16:37:43] ============= drm_modes_analog_tv (5 subtests) =============
[16:37:43] [PASSED] drm_test_modes_analog_tv_mono_576i
[16:37:43] [PASSED] drm_test_modes_analog_tv_ntsc_480i
[16:37:43] [PASSED] drm_test_modes_analog_tv_ntsc_480i_inlined
[16:37:43] [PASSED] drm_test_modes_analog_tv_pal_576i
[16:37:43] [PASSED] drm_test_modes_analog_tv_pal_576i_inlined
[16:37:43] =============== [PASSED] drm_modes_analog_tv ===============
[16:37:43] ============== drm_plane_helper (2 subtests) ===============
[16:37:43] =============== drm_test_check_plane_state  ================
[16:37:43] [PASSED] clipping_simple
[16:37:43] [PASSED] clipping_rotate_reflect
[16:37:43] [PASSED] positioning_simple
[16:37:43] [PASSED] upscaling
[16:37:43] [PASSED] downscaling
[16:37:43] [PASSED] rounding1
[16:37:43] [PASSED] rounding2
[16:37:43] [PASSED] rounding3
[16:37:43] [PASSED] rounding4
[16:37:43] =========== [PASSED] drm_test_check_plane_state ============
[16:37:43] =========== drm_test_check_invalid_plane_state  ============
[16:37:43] [PASSED] positioning_invalid
[16:37:43] [PASSED] upscaling_invalid
[16:37:43] [PASSED] downscaling_invalid
[16:37:43] ======= [PASSED] drm_test_check_invalid_plane_state ========
[16:37:43] ================ [PASSED] drm_plane_helper =================
[16:37:43] ====== drm_connector_helper_tv_get_modes (1 subtest) =======
[16:37:43] ====== drm_test_connector_helper_tv_get_modes_check  =======
[16:37:43] [PASSED] None
[16:37:43] [PASSED] PAL
[16:37:43] [PASSED] NTSC
[16:37:43] [PASSED] Both, NTSC Default
[16:37:43] [PASSED] Both, PAL Default
[16:37:43] [PASSED] Both, NTSC Default, with PAL on command-line
[16:37:43] [PASSED] Both, PAL Default, with NTSC on command-line
[16:37:43] == [PASSED] drm_test_connector_helper_tv_get_modes_check ===
[16:37:43] ======== [PASSED] drm_connector_helper_tv_get_modes ========
[16:37:43] ================== drm_rect (9 subtests) ===================
[16:37:43] [PASSED] drm_test_rect_clip_scaled_div_by_zero
[16:37:43] [PASSED] drm_test_rect_clip_scaled_not_clipped
[16:37:43] [PASSED] drm_test_rect_clip_scaled_clipped
[16:37:43] [PASSED] drm_test_rect_clip_scaled_signed_vs_unsigned
[16:37:43] ================= drm_test_rect_intersect  =================
[16:37:43] [PASSED] top-left x bottom-right: 2x2+1+1 x 2x2+0+0
[16:37:43] [PASSED] top-right x bottom-left: 2x2+0+0 x 2x2+1-1
[16:37:43] [PASSED] bottom-left x top-right: 2x2+1-1 x 2x2+0+0
[16:37:43] [PASSED] bottom-right x top-left: 2x2+0+0 x 2x2+1+1
[16:37:43] [PASSED] right x left: 2x1+0+0 x 3x1+1+0
[16:37:43] [PASSED] left x right: 3x1+1+0 x 2x1+0+0
[16:37:43] [PASSED] up x bottom: 1x2+0+0 x 1x3+0-1
[16:37:43] [PASSED] bottom x up: 1x3+0-1 x 1x2+0+0
[16:37:43] [PASSED] touching corner: 1x1+0+0 x 2x2+1+1
[16:37:43] [PASSED] touching side: 1x1+0+0 x 1x1+1+0
[16:37:43] [PASSED] equal rects: 2x2+0+0 x 2x2+0+0
[16:37:43] [PASSED] inside another: 2x2+0+0 x 1x1+1+1
[16:37:43] [PASSED] far away: 1x1+0+0 x 1x1+3+6
[16:37:43] [PASSED] points intersecting: 0x0+5+10 x 0x0+5+10
[16:37:43] [PASSED] points not intersecting: 0x0+0+0 x 0x0+5+10
[16:37:43] ============= [PASSED] drm_test_rect_intersect =============
[16:37:43] ================ drm_test_rect_calc_hscale  ================
[16:37:43] [PASSED] normal use
[16:37:43] [PASSED] out of max range
[16:37:43] [PASSED] out of min range
[16:37:43] [PASSED] zero dst
[16:37:43] [PASSED] negative src
[16:37:43] [PASSED] negative dst
[16:37:43] ============ [PASSED] drm_test_rect_calc_hscale ============
[16:37:43] ================ drm_test_rect_calc_vscale  ================
[16:37:43] [PASSED] normal use
[16:37:43] [PASSED] out of max range
[16:37:43] [PASSED] out of min range
[16:37:43] [PASSED] zero dst
[16:37:43] [PASSED] negative src
[16:37:43] [PASSED] negative dst
[16:37:43] ============ [PASSED] drm_test_rect_calc_vscale ============
[16:37:43] ================== drm_test_rect_rotate  ===================
[16:37:43] [PASSED] reflect-x
[16:37:43] [PASSED] reflect-y
[16:37:43] [PASSED] rotate-0
[16:37:43] [PASSED] rotate-90
[16:37:43] [PASSED] rotate-180
[16:37:43] [PASSED] rotate-270
[16:37:43] ============== [PASSED] drm_test_rect_rotate ===============
[16:37:43] ================ drm_test_rect_rotate_inv  =================
[16:37:43] [PASSED] reflect-x
[16:37:43] [PASSED] reflect-y
[16:37:43] [PASSED] rotate-0
[16:37:43] [PASSED] rotate-90
[16:37:43] [PASSED] rotate-180
[16:37:43] [PASSED] rotate-270
[16:37:43] ============ [PASSED] drm_test_rect_rotate_inv =============
[16:37:43] ==================== [PASSED] drm_rect =====================
[16:37:43] ============ drm_sysfb_modeset_test (1 subtest) ============
[16:37:43] ============ drm_test_sysfb_build_fourcc_list  =============
[16:37:43] [PASSED] no native formats
[16:37:43] [PASSED] XRGB8888 as native format
[16:37:43] [PASSED] remove duplicates
[16:37:43] [PASSED] convert alpha formats
[16:37:43] [PASSED] random formats
[16:37:43] ======== [PASSED] drm_test_sysfb_build_fourcc_list =========
[16:37:43] ============= [PASSED] drm_sysfb_modeset_test ==============
[16:37:43] ================== drm_fixp (2 subtests) ===================
[16:37:43] [PASSED] drm_test_int2fixp
[16:37:43] [PASSED] drm_test_sm2fixp
[16:37:43] ==================== [PASSED] drm_fixp =====================
[16:37:43] ============================================================
[16:37:43] Testing complete. Ran 621 tests: passed: 621
[16:37:43] Elapsed time: 30.542s total, 1.647s configuring, 28.677s building, 0.167s running

+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/ttm/tests/.kunitconfig
[16:37:43] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[16:37:45] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=25
[16:37:54] Starting KUnit Kernel (1/1)...
[16:37:54] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[16:37:55] ================= ttm_device (5 subtests) ==================
[16:37:55] [PASSED] ttm_device_init_basic
[16:37:55] [PASSED] ttm_device_init_multiple
[16:37:55] [PASSED] ttm_device_fini_basic
[16:37:55] [PASSED] ttm_device_init_no_vma_man
[16:37:55] ================== ttm_device_init_pools  ==================
[16:37:55] [PASSED] No DMA allocations, no DMA32 required
[16:37:55] [PASSED] DMA allocations, DMA32 required
[16:37:55] [PASSED] No DMA allocations, DMA32 required
[16:37:55] [PASSED] DMA allocations, no DMA32 required
[16:37:55] ============== [PASSED] ttm_device_init_pools ==============
[16:37:55] =================== [PASSED] ttm_device ====================
[16:37:55] ================== ttm_pool (8 subtests) ===================
[16:37:55] ================== ttm_pool_alloc_basic  ===================
[16:37:55] [PASSED] One page
[16:37:55] [PASSED] More than one page
[16:37:55] [PASSED] Above the allocation limit
[16:37:55] [PASSED] One page, with coherent DMA mappings enabled
[16:37:55] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[16:37:55] ============== [PASSED] ttm_pool_alloc_basic ===============
[16:37:55] ============== ttm_pool_alloc_basic_dma_addr  ==============
[16:37:55] [PASSED] One page
[16:37:55] [PASSED] More than one page
[16:37:55] [PASSED] Above the allocation limit
[16:37:55] [PASSED] One page, with coherent DMA mappings enabled
[16:37:55] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[16:37:55] ========== [PASSED] ttm_pool_alloc_basic_dma_addr ==========
[16:37:55] [PASSED] ttm_pool_alloc_order_caching_match
[16:37:55] [PASSED] ttm_pool_alloc_caching_mismatch
[16:37:55] [PASSED] ttm_pool_alloc_order_mismatch
[16:37:55] [PASSED] ttm_pool_free_dma_alloc
[16:37:55] [PASSED] ttm_pool_free_no_dma_alloc
[16:37:55] [PASSED] ttm_pool_fini_basic
[16:37:55] ==================== [PASSED] ttm_pool =====================
[16:37:55] ================ ttm_resource (8 subtests) =================
[16:37:55] ================= ttm_resource_init_basic  =================
[16:37:55] [PASSED] Init resource in TTM_PL_SYSTEM
[16:37:55] [PASSED] Init resource in TTM_PL_VRAM
[16:37:55] [PASSED] Init resource in a private placement
[16:37:55] [PASSED] Init resource in TTM_PL_SYSTEM, set placement flags
[16:37:55] ============= [PASSED] ttm_resource_init_basic =============
[16:37:55] [PASSED] ttm_resource_init_pinned
[16:37:55] [PASSED] ttm_resource_fini_basic
[16:37:55] [PASSED] ttm_resource_manager_init_basic
[16:37:55] [PASSED] ttm_resource_manager_usage_basic
[16:37:55] [PASSED] ttm_resource_manager_set_used_basic
[16:37:55] [PASSED] ttm_sys_man_alloc_basic
[16:37:55] [PASSED] ttm_sys_man_free_basic
[16:37:55] ================== [PASSED] ttm_resource ===================
[16:37:55] =================== ttm_tt (15 subtests) ===================
[16:37:55] ==================== ttm_tt_init_basic  ====================
[16:37:55] [PASSED] Page-aligned size
[16:37:55] [PASSED] Extra pages requested
[16:37:55] ================ [PASSED] ttm_tt_init_basic ================
[16:37:55] [PASSED] ttm_tt_init_misaligned
[16:37:55] [PASSED] ttm_tt_fini_basic
[16:37:55] [PASSED] ttm_tt_fini_sg
[16:37:55] [PASSED] ttm_tt_fini_shmem
[16:37:55] [PASSED] ttm_tt_create_basic
[16:37:55] [PASSED] ttm_tt_create_invalid_bo_type
[16:37:55] [PASSED] ttm_tt_create_ttm_exists
[16:37:55] [PASSED] ttm_tt_create_failed
[16:37:55] [PASSED] ttm_tt_destroy_basic
[16:37:55] [PASSED] ttm_tt_populate_null_ttm
[16:37:55] [PASSED] ttm_tt_populate_populated_ttm
[16:37:55] [PASSED] ttm_tt_unpopulate_basic
[16:37:55] [PASSED] ttm_tt_unpopulate_empty_ttm
[16:37:55] [PASSED] ttm_tt_swapin_basic
[16:37:55] ===================== [PASSED] ttm_tt ======================
[16:37:55] =================== ttm_bo (14 subtests) ===================
[16:37:55] =========== ttm_bo_reserve_optimistic_no_ticket  ===========
[16:37:55] [PASSED] Cannot be interrupted and sleeps
[16:37:55] [PASSED] Cannot be interrupted, locks straight away
[16:37:55] [PASSED] Can be interrupted, sleeps
[16:37:55] ======= [PASSED] ttm_bo_reserve_optimistic_no_ticket =======
[16:37:55] [PASSED] ttm_bo_reserve_locked_no_sleep
[16:37:55] [PASSED] ttm_bo_reserve_no_wait_ticket
[16:37:55] [PASSED] ttm_bo_reserve_double_resv
[16:37:55] [PASSED] ttm_bo_reserve_interrupted
[16:37:55] [PASSED] ttm_bo_reserve_deadlock
[16:37:55] [PASSED] ttm_bo_unreserve_basic
[16:37:55] [PASSED] ttm_bo_unreserve_pinned
[16:37:55] [PASSED] ttm_bo_unreserve_bulk
[16:37:55] [PASSED] ttm_bo_fini_basic
[16:37:55] [PASSED] ttm_bo_fini_shared_resv
[16:37:55] [PASSED] ttm_bo_pin_basic
[16:37:55] [PASSED] ttm_bo_pin_unpin_resource
[16:37:55] [PASSED] ttm_bo_multiple_pin_one_unpin
[16:37:55] ===================== [PASSED] ttm_bo ======================
[16:37:55] ============== ttm_bo_validate (22 subtests) ===============
[16:37:55] ============== ttm_bo_init_reserved_sys_man  ===============
[16:37:55] [PASSED] Buffer object for userspace
[16:37:55] [PASSED] Kernel buffer object
[16:37:55] [PASSED] Shared buffer object
[16:37:55] ========== [PASSED] ttm_bo_init_reserved_sys_man ===========
[16:37:55] ============== ttm_bo_init_reserved_mock_man  ==============
[16:37:55] [PASSED] Buffer object for userspace
[16:37:55] [PASSED] Kernel buffer object
[16:37:55] [PASSED] Shared buffer object
[16:37:55] ========== [PASSED] ttm_bo_init_reserved_mock_man ==========
[16:37:55] [PASSED] ttm_bo_init_reserved_resv
[16:37:55] ================== ttm_bo_validate_basic  ==================
[16:37:55] [PASSED] Buffer object for userspace
[16:37:55] [PASSED] Kernel buffer object
[16:37:55] [PASSED] Shared buffer object
[16:37:55] ============== [PASSED] ttm_bo_validate_basic ==============
[16:37:55] [PASSED] ttm_bo_validate_invalid_placement
[16:37:55] ============= ttm_bo_validate_same_placement  ==============
[16:37:55] [PASSED] System manager
[16:37:55] [PASSED] VRAM manager
[16:37:55] ========= [PASSED] ttm_bo_validate_same_placement ==========
[16:37:55] [PASSED] ttm_bo_validate_failed_alloc
[16:37:55] [PASSED] ttm_bo_validate_pinned
[16:37:55] [PASSED] ttm_bo_validate_busy_placement
[16:37:55] ================ ttm_bo_validate_multihop  =================
[16:37:55] [PASSED] Buffer object for userspace
[16:37:55] [PASSED] Kernel buffer object
[16:37:55] [PASSED] Shared buffer object
[16:37:55] ============ [PASSED] ttm_bo_validate_multihop =============
[16:37:55] ========== ttm_bo_validate_no_placement_signaled  ==========
[16:37:55] [PASSED] Buffer object in system domain, no page vector
[16:37:55] [PASSED] Buffer object in system domain with an existing page vector
[16:37:55] ====== [PASSED] ttm_bo_validate_no_placement_signaled ======
[16:37:55] ======== ttm_bo_validate_no_placement_not_signaled  ========
[16:37:55] [PASSED] Buffer object for userspace
[16:37:55] [PASSED] Kernel buffer object
[16:37:55] [PASSED] Shared buffer object
[16:37:55] ==== [PASSED] ttm_bo_validate_no_placement_not_signaled ====
[16:37:55] [PASSED] ttm_bo_validate_move_fence_signaled
[16:37:55] ========= ttm_bo_validate_move_fence_not_signaled  =========
[16:37:55] [PASSED] Waits for GPU
[16:37:55] [PASSED] Tries to lock straight away
[16:37:55] ===== [PASSED] ttm_bo_validate_move_fence_not_signaled =====
[16:37:55] [PASSED] ttm_bo_validate_swapout
[16:37:55] [PASSED] ttm_bo_validate_happy_evict
[16:37:55] [PASSED] ttm_bo_validate_all_pinned_evict
[16:37:55] [PASSED] ttm_bo_validate_allowed_only_evict
[16:37:55] [PASSED] ttm_bo_validate_deleted_evict
[16:37:55] [PASSED] ttm_bo_validate_busy_domain_evict
[16:37:55] [PASSED] ttm_bo_validate_evict_gutting
[16:37:55] [PASSED] ttm_bo_validate_recrusive_evict
[16:37:55] ================= [PASSED] ttm_bo_validate =================
[16:37:55] ============================================================
[16:37:55] Testing complete. Ran 102 tests: passed: 102
[16:37:55] Elapsed time: 11.208s total, 1.672s configuring, 9.270s building, 0.226s running

+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V7 2/9] drm/gpu: Add gpu_buddy_addr_to_block helper
  2026-04-13 13:16 ` [RFC PATCH V7 2/9] drm/gpu: Add gpu_buddy_addr_to_block helper Tejas Upadhyay
  2026-04-13 13:28   ` Matthew Auld
@ 2026-04-13 17:30   ` Matthew Auld
  2026-04-14  5:36     ` Upadhyay, Tejas
  1 sibling, 1 reply; 21+ messages in thread
From: Matthew Auld @ 2026-04-13 17:30 UTC (permalink / raw)
  To: Tejas Upadhyay, intel-xe
  Cc: matthew.brost, thomas.hellstrom, himal.prasad.ghimiray

On 13/04/2026 14:16, Tejas Upadhyay wrote:
> Add a helper whose primary purpose is to efficiently trace a specific
> physical memory address back to its corresponding TTM buffer object.
> 
> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> ---
>   drivers/gpu/buddy.c       | 56 +++++++++++++++++++++++++++++++++++++++
>   include/linux/gpu_buddy.h |  2 ++
>   2 files changed, 58 insertions(+)
> 
> diff --git a/drivers/gpu/buddy.c b/drivers/gpu/buddy.c
> index 52686672e99f..2d26c2a0f971 100644
> --- a/drivers/gpu/buddy.c
> +++ b/drivers/gpu/buddy.c
> @@ -589,6 +589,62 @@ void gpu_buddy_free_block(struct gpu_buddy *mm,
>   }
>   EXPORT_SYMBOL(gpu_buddy_free_block);
>   
> +/**
> + * gpu_buddy_addr_to_block - given physical address find a block
> + *
> + * @mm: GPU buddy manager
> + * @addr: Physical address
> + *
> + * Returns:
> + * gpu_buddy_block on success, NULL or error code on failure
> + */
> +struct gpu_buddy_block *gpu_buddy_addr_to_block(struct gpu_buddy *mm, u64 addr)
> +{
> +	struct gpu_buddy_block *block;
> +	LIST_HEAD(dfs);
> +	u64 end;
> +	int i;
> +
> +	end = addr + SZ_4K - 1;
> +	for (i = 0; i < mm->n_roots; ++i)
> +		list_add_tail(&mm->roots[i]->tmp_link, &dfs);
> +
> +	do {
> +		u64 block_start;
> +		u64 block_end;
> +
> +		block = list_first_entry_or_null(&dfs,
> +						 struct gpu_buddy_block,
> +						 tmp_link);
> +		if (!block)
> +			break;
> +
> +		list_del(&block->tmp_link);
> +
> +		block_start = gpu_buddy_block_offset(block);
> +		block_end = block_start + gpu_buddy_block_size(mm, block) - 1;
> +
> +		if (!overlaps(addr, end, block_start, block_end))
> +			continue;
> +
> +		if (contains(addr, end, block_start, block_end) &&

Oops, this looks like the bug you were hitting. This is copy-pasta from 
range_alloc, where we want to allocate every node that fully contains 
the [addr, end], recursively splitting until that happens. But here we 
actually just want the first non-split overlapping node, since we can't 
split anything.

I think just re-write this whole thing as:

if (!overlaps(addr, end, block_start, block_end))
     continue;

if (gpu_buddy_block_is_allocated(block))
    return block;
else if (gpu_buddy_block_is_free(block))
    return NULL;

list_add(&block->right->tmp_link, &dfs);
list_add(&block->left->tmp_link, &dfs);

And this should fix your issue and simplify this code a bit.
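For illustration only, the suggested traversal can be modelled outside the
kernel as a standalone sketch. The types and the `addr_to_block` name below
are hypothetical simplifications (an explicit stack stands in for the
kernel's tmp_link list, and a flat NULL stands in for both the free-block
and -ENXIO cases); this is not the actual gpu_buddy API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the gpu_buddy block state. */
struct block {
	uint64_t start, size;       /* offset and size in bytes */
	int split, allocated;       /* mutually exclusive with "free" */
	struct block *left, *right; /* children, valid only when split */
};

/*
 * Iterative DFS matching the suggested rewrite: return the first
 * allocated, non-split block overlapping [addr, addr + 4K), or NULL
 * when the range lands in a free block or outside the managed range.
 */
static struct block *addr_to_block(struct block *root, uint64_t addr)
{
	struct block *stack[64]; /* small explicit stack for the sketch */
	uint64_t end = addr + 4096 - 1;
	int top = 0;

	stack[top++] = root;
	while (top) {
		struct block *b = stack[--top];
		uint64_t b_end = b->start + b->size - 1;

		if (end < b->start || addr > b_end)
			continue; /* no overlap, try the next node */
		if (b->allocated)
			return b;
		if (!b->split)
			return NULL; /* free leaf covers the address */
		/* push right then left so the left child is visited first */
		stack[top++] = b->right;
		stack[top++] = b->left;
	}
	return NULL; /* address outside all roots */
}
```

The key difference from the range_alloc-derived version in the patch is
that the walk stops at the first non-split node overlapping the address
instead of requiring full containment, since a lookup can never split.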

> +		    !gpu_buddy_block_is_split(block)) {
> +			if (gpu_buddy_block_is_free(block))
> +				return NULL;
> +			else if (gpu_buddy_block_is_allocated(block) && !mm->clear_avail)
> +				return block;
> +		}
> +
> +		if (gpu_buddy_block_is_split(block)) {
> +			list_add(&block->right->tmp_link, &dfs);
> +			list_add(&block->left->tmp_link, &dfs);
> +		}
> +	} while (1);
> +
> +	return ERR_PTR(-ENXIO);
> +}
> +EXPORT_SYMBOL(gpu_buddy_addr_to_block);
> +
>   static void __gpu_buddy_free_list(struct gpu_buddy *mm,
>   				  struct list_head *objects,
>   				  bool mark_clear,
> diff --git a/include/linux/gpu_buddy.h b/include/linux/gpu_buddy.h
> index 5fa917ba5450..957c69c560bc 100644
> --- a/include/linux/gpu_buddy.h
> +++ b/include/linux/gpu_buddy.h
> @@ -231,6 +231,8 @@ void gpu_buddy_reset_clear(struct gpu_buddy *mm, bool is_clear);
>   
>   void gpu_buddy_free_block(struct gpu_buddy *mm, struct gpu_buddy_block *block);
>   
> +struct gpu_buddy_block *gpu_buddy_addr_to_block(struct gpu_buddy *mm, u64 addr);
> +
>   void gpu_buddy_free_list(struct gpu_buddy *mm,
>   			 struct list_head *objects,
>   			 unsigned int flags);


^ permalink raw reply	[flat|nested] 21+ messages in thread

* ✓ Xe.CI.BAT: success for Add memory page offlining support (rev7)
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
                   ` (10 preceding siblings ...)
  2026-04-13 16:37 ` ✓ CI.KUnit: success " Patchwork
@ 2026-04-13 17:43 ` Patchwork
  2026-04-13 20:12 ` ✗ Xe.CI.FULL: failure " Patchwork
  2026-04-15 15:10 ` [RFC PATCH V7 0/9] Add memory page offlining support Upadhyay, Tejas
  13 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2026-04-13 17:43 UTC (permalink / raw)
  To: Upadhyay, Tejas; +Cc: intel-xe

[-- Attachment #1: Type: text/plain, Size: 954 bytes --]

== Series Details ==

Series: Add memory page offlining support (rev7)
URL   : https://patchwork.freedesktop.org/series/161473/
State : success

== Summary ==

CI Bug Log - changes from xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54_BAT -> xe-pw-161473v7_BAT
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Participating hosts (11 -> 11)
------------------------------

  No changes in participating hosts


Changes
-------

  No changes found


Build changes
-------------

  * Linux: xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54 -> xe-pw-161473v7

  IGT_8854: 93abaf0170728f69bc27577e5b405f7a2a01b6fd @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54: 380e49c900a45f7c7c206b48862f946bcbda4c54
  xe-pw-161473v7: 161473v7

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/index.html

[-- Attachment #2: Type: text/html, Size: 1502 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* ✗ Xe.CI.FULL: failure for Add memory page offlining support (rev7)
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
                   ` (11 preceding siblings ...)
  2026-04-13 17:43 ` ✓ Xe.CI.BAT: " Patchwork
@ 2026-04-13 20:12 ` Patchwork
  2026-04-15 15:10 ` [RFC PATCH V7 0/9] Add memory page offlining support Upadhyay, Tejas
  13 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2026-04-13 20:12 UTC (permalink / raw)
  To: Upadhyay, Tejas; +Cc: intel-xe

[-- Attachment #1: Type: text/plain, Size: 32154 bytes --]

== Series Details ==

Series: Add memory page offlining support (rev7)
URL   : https://patchwork.freedesktop.org/series/161473/
State : failure

== Summary ==

CI Bug Log - changes from xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54_FULL -> xe-pw-161473v7_FULL
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with xe-pw-161473v7_FULL absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in xe-pw-161473v7_FULL, please notify your bug team (I915-ci-infra@lists.freedesktop.org) to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (2 -> 2)
------------------------------

  No changes in participating hosts

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in xe-pw-161473v7_FULL:

### IGT changes ###

#### Possible regressions ####

  * igt@kms_pm_rpm@system-suspend-idle:
    - shard-lnl:          [PASS][1] -> [DMESG-WARN][2]
   [1]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-lnl-8/igt@kms_pm_rpm@system-suspend-idle.html
   [2]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-lnl-8/igt@kms_pm_rpm@system-suspend-idle.html

  
Known issues
------------

  Here are the changes found in xe-pw-161473v7_FULL that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_big_fb@linear-32bpp-rotate-270:
    - shard-bmg:          NOTRUN -> [SKIP][3] ([Intel XE#2327]) +2 other tests skip
   [3]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@kms_big_fb@linear-32bpp-rotate-270.html

  * igt@kms_big_fb@yf-tiled-32bpp-rotate-0:
    - shard-bmg:          NOTRUN -> [SKIP][4] ([Intel XE#1124]) +8 other tests skip
   [4]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_big_fb@yf-tiled-32bpp-rotate-0.html

  * igt@kms_big_fb@yf-tiled-addfb-size-offset-overflow:
    - shard-bmg:          NOTRUN -> [SKIP][5] ([Intel XE#607] / [Intel XE#7361])
   [5]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_big_fb@yf-tiled-addfb-size-offset-overflow.html

  * igt@kms_bw@connected-linear-tiling-2-displays-2560x1440p:
    - shard-bmg:          NOTRUN -> [SKIP][6] ([Intel XE#7679])
   [6]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@kms_bw@connected-linear-tiling-2-displays-2560x1440p.html

  * igt@kms_bw@connected-linear-tiling-4-displays-1920x1080p:
    - shard-bmg:          NOTRUN -> [SKIP][7] ([Intel XE#7621])
   [7]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_bw@connected-linear-tiling-4-displays-1920x1080p.html

  * igt@kms_bw@linear-tiling-3-displays-1920x1080p:
    - shard-bmg:          NOTRUN -> [SKIP][8] ([Intel XE#367] / [Intel XE#7354]) +1 other test skip
   [8]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_bw@linear-tiling-3-displays-1920x1080p.html

  * igt@kms_ccs@crc-primary-suspend-y-tiled-gen12-rc-ccs:
    - shard-bmg:          NOTRUN -> [SKIP][9] ([Intel XE#3432])
   [9]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@kms_ccs@crc-primary-suspend-y-tiled-gen12-rc-ccs.html

  * igt@kms_ccs@crc-sprite-planes-basic-4-tiled-dg2-rc-ccs-cc:
    - shard-bmg:          NOTRUN -> [SKIP][10] ([Intel XE#2887]) +14 other tests skip
   [10]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_ccs@crc-sprite-planes-basic-4-tiled-dg2-rc-ccs-cc.html

  * igt@kms_cdclk@plane-scaling:
    - shard-bmg:          NOTRUN -> [SKIP][11] ([Intel XE#2724] / [Intel XE#7449])
   [11]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@kms_cdclk@plane-scaling.html

  * igt@kms_chamelium_color@degamma:
    - shard-bmg:          NOTRUN -> [SKIP][12] ([Intel XE#2325] / [Intel XE#7358])
   [12]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_chamelium_color@degamma.html

  * igt@kms_chamelium_edid@dp-edid-resolution-list:
    - shard-bmg:          NOTRUN -> [SKIP][13] ([Intel XE#2252]) +5 other tests skip
   [13]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_chamelium_edid@dp-edid-resolution-list.html

  * igt@kms_content_protection@dp-mst-type-1-suspend-resume:
    - shard-bmg:          NOTRUN -> [SKIP][14] ([Intel XE#6974])
   [14]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_content_protection@dp-mst-type-1-suspend-resume.html

  * igt@kms_content_protection@mei-interface:
    - shard-bmg:          NOTRUN -> [SKIP][15] ([Intel XE#7642])
   [15]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_content_protection@mei-interface.html

  * igt@kms_content_protection@uevent-hdcp14:
    - shard-bmg:          NOTRUN -> [FAIL][16] ([Intel XE#6707] / [Intel XE#7439]) +1 other test fail
   [16]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@kms_content_protection@uevent-hdcp14.html

  * igt@kms_cursor_crc@cursor-offscreen-512x512:
    - shard-bmg:          NOTRUN -> [SKIP][17] ([Intel XE#2321] / [Intel XE#7355]) +2 other tests skip
   [17]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_cursor_crc@cursor-offscreen-512x512.html

  * igt@kms_cursor_crc@cursor-onscreen-256x85:
    - shard-bmg:          NOTRUN -> [SKIP][18] ([Intel XE#2320]) +3 other tests skip
   [18]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@kms_cursor_crc@cursor-onscreen-256x85.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic:
    - shard-bmg:          NOTRUN -> [FAIL][19] ([Intel XE#7571])
   [19]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_cursor_legacy@flip-vs-cursor-atomic.html

  * igt@kms_dp_link_training@non-uhbr-mst:
    - shard-bmg:          NOTRUN -> [SKIP][20] ([Intel XE#4354] / [Intel XE#5882])
   [20]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_dp_link_training@non-uhbr-mst.html

  * igt@kms_dp_link_training@uhbr-mst:
    - shard-bmg:          NOTRUN -> [SKIP][21] ([Intel XE#4354] / [Intel XE#7386])
   [21]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_dp_link_training@uhbr-mst.html

  * igt@kms_dp_link_training@uhbr-sst:
    - shard-bmg:          NOTRUN -> [SKIP][22] ([Intel XE#4354] / [Intel XE#5870])
   [22]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@kms_dp_link_training@uhbr-sst.html

  * igt@kms_dsc@dsc-fractional-bpp-with-bpc:
    - shard-bmg:          NOTRUN -> [SKIP][23] ([Intel XE#2244])
   [23]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@kms_dsc@dsc-fractional-bpp-with-bpc.html

  * igt@kms_fbc_dirty_rect@fbc-dirty-rectangle-dirtyfb-tests:
    - shard-bmg:          NOTRUN -> [SKIP][24] ([Intel XE#4422] / [Intel XE#7442])
   [24]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_fbc_dirty_rect@fbc-dirty-rectangle-dirtyfb-tests.html

  * igt@kms_feature_discovery@psr1:
    - shard-bmg:          NOTRUN -> [SKIP][25] ([Intel XE#2374] / [Intel XE#6127])
   [25]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_feature_discovery@psr1.html

  * igt@kms_flip_scaled_crc@flip-32bpp-yftileccs-to-64bpp-yftile-upscaling:
    - shard-bmg:          NOTRUN -> [SKIP][26] ([Intel XE#7178] / [Intel XE#7351]) +1 other test skip
   [26]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@kms_flip_scaled_crc@flip-32bpp-yftileccs-to-64bpp-yftile-upscaling.html

  * igt@kms_flip_scaled_crc@flip-64bpp-4tile-to-32bpp-4tiledg2rcccs-upscaling:
    - shard-bmg:          NOTRUN -> [SKIP][27] ([Intel XE#7178] / [Intel XE#7349])
   [27]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@kms_flip_scaled_crc@flip-64bpp-4tile-to-32bpp-4tiledg2rcccs-upscaling.html

  * igt@kms_flip_scaled_crc@flip-nv12-linear-to-nv12-linear-reflect-x:
    - shard-bmg:          NOTRUN -> [SKIP][28] ([Intel XE#7179])
   [28]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_flip_scaled_crc@flip-nv12-linear-to-nv12-linear-reflect-x.html

  * igt@kms_frontbuffer_tracking@fbc-argb161616f-draw-render:
    - shard-bmg:          NOTRUN -> [SKIP][29] ([Intel XE#7061] / [Intel XE#7356]) +2 other tests skip
   [29]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@kms_frontbuffer_tracking@fbc-argb161616f-draw-render.html

  * igt@kms_frontbuffer_tracking@fbc-tiling-linear:
    - shard-bmg:          NOTRUN -> [SKIP][30] ([Intel XE#4141]) +10 other tests skip
   [30]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@kms_frontbuffer_tracking@fbc-tiling-linear.html

  * igt@kms_frontbuffer_tracking@fbcdrrs-2p-scndscrn-cur-indfb-draw-mmap-wc:
    - shard-bmg:          NOTRUN -> [SKIP][31] ([Intel XE#2311]) +29 other tests skip
   [31]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@kms_frontbuffer_tracking@fbcdrrs-2p-scndscrn-cur-indfb-draw-mmap-wc.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-indfb-plflip-blt:
    - shard-bmg:          NOTRUN -> [SKIP][32] ([Intel XE#2313]) +24 other tests skip
   [32]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-indfb-plflip-blt.html

  * igt@kms_joiner@invalid-modeset-force-ultra-joiner:
    - shard-bmg:          NOTRUN -> [SKIP][33] ([Intel XE#6911] / [Intel XE#7466])
   [33]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_joiner@invalid-modeset-force-ultra-joiner.html

  * igt@kms_panel_fitting@legacy:
    - shard-bmg:          NOTRUN -> [SKIP][34] ([Intel XE#2486])
   [34]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@kms_panel_fitting@legacy.html

  * igt@kms_plane@pixel-format-4-tiled-bmg-ccs-modifier@pipe-b-plane-5:
    - shard-bmg:          NOTRUN -> [SKIP][35] ([Intel XE#7130]) +1 other test skip
   [35]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@kms_plane@pixel-format-4-tiled-bmg-ccs-modifier@pipe-b-plane-5.html

  * igt@kms_plane@pixel-format-y-tiled-modifier:
    - shard-bmg:          NOTRUN -> [SKIP][36] ([Intel XE#7283]) +2 other tests skip
   [36]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@kms_plane@pixel-format-y-tiled-modifier.html

  * igt@kms_plane_lowres@tiling-4:
    - shard-bmg:          NOTRUN -> [INCOMPLETE][37] ([Intel XE#5681])
   [37]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@kms_plane_lowres@tiling-4.html

  * igt@kms_plane_lowres@tiling-4@pipe-d-dp-2:
    - shard-bmg:          NOTRUN -> [DMESG-FAIL][38] ([Intel XE#1727] / [Intel XE#6819])
   [38]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@kms_plane_lowres@tiling-4@pipe-d-dp-2.html

  * igt@kms_plane_lowres@tiling-y:
    - shard-bmg:          NOTRUN -> [SKIP][39] ([Intel XE#2393])
   [39]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_plane_lowres@tiling-y.html

  * igt@kms_plane_multiple@2x-tiling-y:
    - shard-bmg:          NOTRUN -> [SKIP][40] ([Intel XE#5021] / [Intel XE#7377])
   [40]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_plane_multiple@2x-tiling-y.html

  * igt@kms_pm_backlight@basic-brightness:
    - shard-bmg:          NOTRUN -> [SKIP][41] ([Intel XE#7376] / [Intel XE#870])
   [41]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_pm_backlight@basic-brightness.html

  * igt@kms_pm_dc@dc6-psr:
    - shard-bmg:          NOTRUN -> [SKIP][42] ([Intel XE#2392] / [Intel XE#6927])
   [42]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@kms_pm_dc@dc6-psr.html

  * igt@kms_psr2_sf@psr2-plane-move-sf-dmg-area:
    - shard-bmg:          NOTRUN -> [SKIP][43] ([Intel XE#1489]) +8 other tests skip
   [43]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_psr2_sf@psr2-plane-move-sf-dmg-area.html

  * igt@kms_psr@fbc-psr-suspend:
    - shard-bmg:          NOTRUN -> [SKIP][44] ([Intel XE#2234] / [Intel XE#2850]) +11 other tests skip
   [44]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@kms_psr@fbc-psr-suspend.html

  * igt@kms_rotation_crc@primary-y-tiled-reflect-x-90:
    - shard-bmg:          NOTRUN -> [SKIP][45] ([Intel XE#3904] / [Intel XE#7342]) +1 other test skip
   [45]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@kms_rotation_crc@primary-y-tiled-reflect-x-90.html

  * igt@kms_sharpness_filter@invalid-filter-with-scaler:
    - shard-bmg:          NOTRUN -> [SKIP][46] ([Intel XE#6503])
   [46]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_sharpness_filter@invalid-filter-with-scaler.html

  * igt@kms_tiled_display@basic-test-pattern-with-chamelium:
    - shard-bmg:          NOTRUN -> [SKIP][47] ([Intel XE#2426] / [Intel XE#5848])
   [47]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_tiled_display@basic-test-pattern-with-chamelium.html

  * igt@kms_vrr@seamless-rr-switch-virtual:
    - shard-bmg:          NOTRUN -> [SKIP][48] ([Intel XE#1499])
   [48]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@kms_vrr@seamless-rr-switch-virtual.html

  * igt@xe_compute@eu-busy-10s:
    - shard-bmg:          NOTRUN -> [SKIP][49] ([Intel XE#6599])
   [49]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@xe_compute@eu-busy-10s.html

  * igt@xe_configfs@engines-allowed-invalid:
    - shard-bmg:          [PASS][50] -> [DMESG-WARN][51] ([Intel XE#7725])
   [50]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-10/igt@xe_configfs@engines-allowed-invalid.html
   [51]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-6/igt@xe_configfs@engines-allowed-invalid.html

  * igt@xe_eudebug@basic-vm-bind-ufence:
    - shard-bmg:          NOTRUN -> [SKIP][52] ([Intel XE#7636]) +10 other tests skip
   [52]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@xe_eudebug@basic-vm-bind-ufence.html

  * igt@xe_evict@evict-small-multi-queue:
    - shard-bmg:          NOTRUN -> [SKIP][53] ([Intel XE#7140])
   [53]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@xe_evict@evict-small-multi-queue.html

  * igt@xe_exec_basic@multigpu-no-exec-null:
    - shard-bmg:          NOTRUN -> [SKIP][54] ([Intel XE#2322] / [Intel XE#7372]) +6 other tests skip
   [54]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@xe_exec_basic@multigpu-no-exec-null.html

  * igt@xe_exec_fault_mode@many-multi-queue-rebind-prefetch:
    - shard-bmg:          NOTRUN -> [SKIP][55] ([Intel XE#7136]) +11 other tests skip
   [55]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@xe_exec_fault_mode@many-multi-queue-rebind-prefetch.html

  * igt@xe_exec_multi_queue@many-execs-basic-smem:
    - shard-bmg:          NOTRUN -> [SKIP][56] ([Intel XE#6874]) +29 other tests skip
   [56]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@xe_exec_multi_queue@many-execs-basic-smem.html

  * igt@xe_exec_threads@threads-multi-queue-cm-fd-rebind:
    - shard-bmg:          NOTRUN -> [SKIP][57] ([Intel XE#7138]) +7 other tests skip
   [57]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@xe_exec_threads@threads-multi-queue-cm-fd-rebind.html

  * igt@xe_multigpu_svm@mgpu-pagefault-basic:
    - shard-bmg:          NOTRUN -> [SKIP][58] ([Intel XE#6964]) +2 other tests skip
   [58]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@xe_multigpu_svm@mgpu-pagefault-basic.html

  * igt@xe_oa@oa-tlb-invalidate:
    - shard-bmg:          NOTRUN -> [SKIP][59] ([Intel XE#2248] / [Intel XE#7325] / [Intel XE#7393])
   [59]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@xe_oa@oa-tlb-invalidate.html

  * igt@xe_pat@pat-sw-hw-suspend:
    - shard-bmg:          NOTRUN -> [SKIP][60] ([Intel XE#7590])
   [60]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@xe_pat@pat-sw-hw-suspend.html

  * igt@xe_prefetch_fault@prefetch-fault:
    - shard-bmg:          NOTRUN -> [SKIP][61] ([Intel XE#7599])
   [61]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@xe_prefetch_fault@prefetch-fault.html

  * igt@xe_pxp@pxp-optout:
    - shard-bmg:          NOTRUN -> [SKIP][62] ([Intel XE#4733] / [Intel XE#7417]) +1 other test skip
   [62]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@xe_pxp@pxp-optout.html

  * igt@xe_query@multigpu-query-engines:
    - shard-bmg:          NOTRUN -> [SKIP][63] ([Intel XE#944])
   [63]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@xe_query@multigpu-query-engines.html

  
#### Possible fixes ####

  * igt@kms_hdr@invalid-hdr:
    - shard-bmg:          [SKIP][64] ([Intel XE#1503]) -> [PASS][65]
   [64]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-9/igt@kms_hdr@invalid-hdr.html
   [65]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-1/igt@kms_hdr@invalid-hdr.html

  * igt@kms_vrr@cmrr@pipe-a-edp-1:
    - shard-lnl:          [FAIL][66] ([Intel XE#4459]) -> [PASS][67] +1 other test pass
   [66]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-lnl-5/igt@kms_vrr@cmrr@pipe-a-edp-1.html
   [67]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-lnl-4/igt@kms_vrr@cmrr@pipe-a-edp-1.html

  
#### Warnings ####

  * igt@xe_module_load@load:
    - shard-bmg:          ([PASS][68], [PASS][69], [PASS][70], [PASS][71], [PASS][72], [PASS][73], [PASS][74], [PASS][75], [PASS][76], [PASS][77], [PASS][78], [PASS][79], [PASS][80], [PASS][81], [PASS][82], [PASS][83], [PASS][84], [PASS][85], [PASS][86], [PASS][87], [DMESG-WARN][88], [DMESG-WARN][89], [PASS][90], [DMESG-WARN][91], [DMESG-WARN][92]) ([Intel XE#7725]) -> ([PASS][93], [PASS][94], [PASS][95], [PASS][96], [PASS][97], [PASS][98], [PASS][99], [PASS][100], [PASS][101], [PASS][102], [PASS][103], [PASS][104], [PASS][105], [PASS][106], [PASS][107], [PASS][108], [PASS][109], [SKIP][110], [DMESG-WARN][111], [PASS][112], [DMESG-WARN][113], [DMESG-WARN][114], [PASS][115], [PASS][116], [PASS][117], [PASS][118]) ([Intel XE#2457] / [Intel XE#7405] / [Intel XE#7725])
   [68]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-5/igt@xe_module_load@load.html
   [69]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-8/igt@xe_module_load@load.html
   [70]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-8/igt@xe_module_load@load.html
   [71]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-3/igt@xe_module_load@load.html
   [72]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-3/igt@xe_module_load@load.html
   [73]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-5/igt@xe_module_load@load.html
   [74]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-7/igt@xe_module_load@load.html
   [75]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-2/igt@xe_module_load@load.html
   [76]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-9/igt@xe_module_load@load.html
   [77]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-7/igt@xe_module_load@load.html
   [78]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-9/igt@xe_module_load@load.html
   [79]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-2/igt@xe_module_load@load.html
   [80]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-7/igt@xe_module_load@load.html
   [81]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-1/igt@xe_module_load@load.html
   [82]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-1/igt@xe_module_load@load.html
   [83]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-3/igt@xe_module_load@load.html
   [84]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-10/igt@xe_module_load@load.html
   [85]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-10/igt@xe_module_load@load.html
   [86]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-1/igt@xe_module_load@load.html
   [87]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-2/igt@xe_module_load@load.html
   [88]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-6/igt@xe_module_load@load.html
   [89]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-6/igt@xe_module_load@load.html
   [90]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-6/igt@xe_module_load@load.html
   [91]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-6/igt@xe_module_load@load.html
   [92]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54/shard-bmg-6/igt@xe_module_load@load.html
   [93]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@xe_module_load@load.html
   [94]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@xe_module_load@load.html
   [95]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@xe_module_load@load.html
   [96]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-10/igt@xe_module_load@load.html
   [97]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-8/igt@xe_module_load@load.html
   [98]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-8/igt@xe_module_load@load.html
   [99]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@xe_module_load@load.html
   [100]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-9/igt@xe_module_load@load.html
   [101]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-3/igt@xe_module_load@load.html
   [102]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-9/igt@xe_module_load@load.html
   [103]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-2/igt@xe_module_load@load.html
   [104]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-2/igt@xe_module_load@load.html
   [105]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-7/igt@xe_module_load@load.html
   [106]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-1/igt@xe_module_load@load.html
   [107]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-1/igt@xe_module_load@load.html
   [108]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-10/igt@xe_module_load@load.html
   [109]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-10/igt@xe_module_load@load.html
   [110]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@xe_module_load@load.html
   [111]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-6/igt@xe_module_load@load.html
   [112]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-6/igt@xe_module_load@load.html
   [113]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-6/igt@xe_module_load@load.html
   [114]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-6/igt@xe_module_load@load.html
   [115]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@xe_module_load@load.html
   [116]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-5/igt@xe_module_load@load.html
   [117]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-9/igt@xe_module_load@load.html
   [118]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/shard-bmg-8/igt@xe_module_load@load.html

  
  [Intel XE#1124]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1124
  [Intel XE#1489]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1489
  [Intel XE#1499]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1499
  [Intel XE#1503]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1503
  [Intel XE#1727]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1727
  [Intel XE#2234]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2234
  [Intel XE#2244]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2244
  [Intel XE#2248]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2248
  [Intel XE#2252]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2252
  [Intel XE#2311]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2311
  [Intel XE#2313]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2313
  [Intel XE#2320]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2320
  [Intel XE#2321]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2321
  [Intel XE#2322]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2322
  [Intel XE#2325]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2325
  [Intel XE#2327]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2327
  [Intel XE#2374]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2374
  [Intel XE#2392]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2392
  [Intel XE#2393]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2393
  [Intel XE#2426]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2426
  [Intel XE#2457]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2457
  [Intel XE#2486]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2486
  [Intel XE#2724]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2724
  [Intel XE#2850]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2850
  [Intel XE#2887]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2887
  [Intel XE#3432]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3432
  [Intel XE#367]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/367
  [Intel XE#3904]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3904
  [Intel XE#4141]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4141
  [Intel XE#4354]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4354
  [Intel XE#4422]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4422
  [Intel XE#4459]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4459
  [Intel XE#4733]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4733
  [Intel XE#5021]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5021
  [Intel XE#5681]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5681
  [Intel XE#5848]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5848
  [Intel XE#5870]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5870
  [Intel XE#5882]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5882
  [Intel XE#607]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/607
  [Intel XE#6127]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6127
  [Intel XE#6503]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6503
  [Intel XE#6599]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6599
  [Intel XE#6707]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6707
  [Intel XE#6819]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6819
  [Intel XE#6874]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6874
  [Intel XE#6911]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6911
  [Intel XE#6927]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6927
  [Intel XE#6964]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6964
  [Intel XE#6974]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6974
  [Intel XE#7061]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7061
  [Intel XE#7130]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7130
  [Intel XE#7136]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7136
  [Intel XE#7138]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7138
  [Intel XE#7140]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7140
  [Intel XE#7178]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7178
  [Intel XE#7179]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7179
  [Intel XE#7283]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7283
  [Intel XE#7325]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7325
  [Intel XE#7342]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7342
  [Intel XE#7349]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7349
  [Intel XE#7351]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7351
  [Intel XE#7354]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7354
  [Intel XE#7355]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7355
  [Intel XE#7356]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7356
  [Intel XE#7358]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7358
  [Intel XE#7361]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7361
  [Intel XE#7372]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7372
  [Intel XE#7376]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7376
  [Intel XE#7377]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7377
  [Intel XE#7386]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7386
  [Intel XE#7393]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7393
  [Intel XE#7405]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7405
  [Intel XE#7417]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7417
  [Intel XE#7439]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7439
  [Intel XE#7442]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7442
  [Intel XE#7449]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7449
  [Intel XE#7466]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7466
  [Intel XE#7571]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7571
  [Intel XE#7590]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7590
  [Intel XE#7599]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7599
  [Intel XE#7621]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7621
  [Intel XE#7636]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7636
  [Intel XE#7642]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7642
  [Intel XE#7679]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7679
  [Intel XE#7725]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/7725
  [Intel XE#870]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/870
  [Intel XE#944]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/944


Build changes
-------------

  * Linux: xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54 -> xe-pw-161473v7

  IGT_8854: 93abaf0170728f69bc27577e5b405f7a2a01b6fd @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  xe-4893-380e49c900a45f7c7c206b48862f946bcbda4c54: 380e49c900a45f7c7c206b48862f946bcbda4c54
  xe-pw-161473v7: 161473v7

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-161473v7/index.html



* RE: [RFC PATCH V7 2/9] drm/gpu: Add gpu_buddy_addr_to_block helper
  2026-04-13 17:30   ` Matthew Auld
@ 2026-04-14  5:36     ` Upadhyay, Tejas
  0 siblings, 0 replies; 21+ messages in thread
From: Upadhyay, Tejas @ 2026-04-14  5:36 UTC (permalink / raw)
  To: Auld, Matthew, intel-xe@lists.freedesktop.org
  Cc: Brost, Matthew, thomas.hellstrom@linux.intel.com,
	Ghimiray, Himal Prasad



> -----Original Message-----
> From: Auld, Matthew <matthew.auld@intel.com>
> Sent: 13 April 2026 23:00
> To: Upadhyay, Tejas <tejas.upadhyay@intel.com>; intel-
> xe@lists.freedesktop.org
> Cc: Brost, Matthew <matthew.brost@intel.com>;
> thomas.hellstrom@linux.intel.com; Ghimiray, Himal Prasad
> <himal.prasad.ghimiray@intel.com>
> Subject: Re: [RFC PATCH V7 2/9] drm/gpu: Add gpu_buddy_addr_to_block
> helper
> 
> On 13/04/2026 14:16, Tejas Upadhyay wrote:
> > Add a helper whose primary purpose is to efficiently trace a specific
> > physical memory address back to its corresponding TTM buffer object.
> >
> > Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> > ---
> >   drivers/gpu/buddy.c       | 56 +++++++++++++++++++++++++++++++++++++++
> >   include/linux/gpu_buddy.h |  2 ++
> >   2 files changed, 58 insertions(+)
> >
> > diff --git a/drivers/gpu/buddy.c b/drivers/gpu/buddy.c
> > index 52686672e99f..2d26c2a0f971 100644
> > --- a/drivers/gpu/buddy.c
> > +++ b/drivers/gpu/buddy.c
> > @@ -589,6 +589,62 @@ void gpu_buddy_free_block(struct gpu_buddy *mm,
> >   }
> >   EXPORT_SYMBOL(gpu_buddy_free_block);
> >
> > +/**
> > + * gpu_buddy_addr_to_block - given physical address find a block
> > + *
> > + * @mm: GPU buddy manager
> > + * @addr: Physical address
> > + *
> > + * Returns:
> > + * gpu_buddy_block on success, NULL or error code on failure
> > + */
> > +struct gpu_buddy_block *gpu_buddy_addr_to_block(struct gpu_buddy *mm, u64 addr)
> > +{
> > +	struct gpu_buddy_block *block;
> > +	LIST_HEAD(dfs);
> > +	u64 end;
> > +	int i;
> > +
> > +	end = addr + SZ_4K - 1;
> > +	for (i = 0; i < mm->n_roots; ++i)
> > +		list_add_tail(&mm->roots[i]->tmp_link, &dfs);
> > +
> > +	do {
> > +		u64 block_start;
> > +		u64 block_end;
> > +
> > +		block = list_first_entry_or_null(&dfs,
> > +						 struct gpu_buddy_block,
> > +						 tmp_link);
> > +		if (!block)
> > +			break;
> > +
> > +		list_del(&block->tmp_link);
> > +
> > +		block_start = gpu_buddy_block_offset(block);
> > +		block_end = block_start + gpu_buddy_block_size(mm, block) - 1;
> > +
> > +		if (!overlaps(addr, end, block_start, block_end))
> > +			continue;
> > +
> > +		if (contains(addr, end, block_start, block_end) &&
> 
> Oops, this looks like the bug you were hitting. This is copy-pasta from
> range_alloc, where we want to allocate every node that fully contains the
> [addr, end], recursively splitting until that happens. But here we actually just
> want the first non-split overlapping node, since we can't split anything.
> 
> I think just re-write this whole thing as:
> 
> if (!overlaps(addr, end, block_start, block_end))
>      continue;
> 
> if (gpu_buddy_block_is_allocated(block))
>     return block;
> else if (gpu_buddy_block_is_free(block))
>     return NULL;

Yes, this works and eliminates the bug we saw yesterday. I had a doubt about what happens when an address mapped to a 4K range ends in the next split block, but it makes sense now: the probe resolves to either the first or the next split block depending on where the address falls, and never considers both adjacent split blocks at the same time. Clear.

Tejas
> 
> list_add(&block->right->tmp_link, &dfs);
> list_add(&block->left->tmp_link, &dfs);
> 
> And this should fix your issue and simplify the code a bit.
> 
> > +		    !gpu_buddy_block_is_split(block)) {
> > +			if (gpu_buddy_block_is_free(block))
> > +				return NULL;
> > +			else if (gpu_buddy_block_is_allocated(block) && !mm->clear_avail)
> > +				return block;
> > +		}
> > +
> > +		if (gpu_buddy_block_is_split(block)) {
> > +			list_add(&block->right->tmp_link, &dfs);
> > +			list_add(&block->left->tmp_link, &dfs);
> > +		}
> > +	} while (1);
> > +
> > +	return ERR_PTR(-ENXIO);
> > +}
> > +EXPORT_SYMBOL(gpu_buddy_addr_to_block);
> > +
> >   static void __gpu_buddy_free_list(struct gpu_buddy *mm,
> >   				  struct list_head *objects,
> >   				  bool mark_clear,
> > diff --git a/include/linux/gpu_buddy.h b/include/linux/gpu_buddy.h
> > index 5fa917ba5450..957c69c560bc 100644
> > --- a/include/linux/gpu_buddy.h
> > +++ b/include/linux/gpu_buddy.h
> > @@ -231,6 +231,8 @@ void gpu_buddy_reset_clear(struct gpu_buddy *mm,
> > bool is_clear);
> >
> >   void gpu_buddy_free_block(struct gpu_buddy *mm, struct gpu_buddy_block *block);
> >
> > +struct gpu_buddy_block *gpu_buddy_addr_to_block(struct gpu_buddy *mm, u64 addr);
> > +
> >   void gpu_buddy_free_list(struct gpu_buddy *mm,
> >   			 struct list_head *objects,
> >   			 unsigned int flags);


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [RFC PATCH V7 0/9] Add memory page offlining support
  2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
                   ` (12 preceding siblings ...)
  2026-04-13 20:12 ` ✗ Xe.CI.FULL: failure " Patchwork
@ 2026-04-15 15:10 ` Upadhyay, Tejas
  13 siblings, 0 replies; 21+ messages in thread
From: Upadhyay, Tejas @ 2026-04-15 15:10 UTC (permalink / raw)
  To: intel-xe@lists.freedesktop.org
  Cc: Auld, Matthew, Brost, Matthew, thomas.hellstrom@linux.intel.com,
	Ghimiray, Himal Prasad

Please hold off on reviewing this; I am respinning it with more updates, since there were changes after the RAS integration sync-up (in xe-internal). I am also adding lockdep annotations in gpu_buddy. Sending a new version today or tomorrow.

Tejas

> -----Original Message-----
> From: Upadhyay, Tejas <tejas.upadhyay@intel.com>
> Sent: 13 April 2026 18:46
> To: intel-xe@lists.freedesktop.org
> Cc: Auld, Matthew <matthew.auld@intel.com>; Brost, Matthew
> <matthew.brost@intel.com>; thomas.hellstrom@linux.intel.com; Ghimiray,
> Himal Prasad <himal.prasad.ghimiray@intel.com>; Upadhyay, Tejas
> <tejas.upadhyay@intel.com>
> Subject: [RFC PATCH V7 0/9] Add memory page offlining support
> 
> This functionality represents a significant step in making the xe driver
> gracefully handle hardware memory degradation.
> By integrating with the DRM Buddy allocator, the driver can permanently
> "carve out" faulty memory so it isn't reused by subsequent allocations.
> 
> This series adds memory page offlining support with the following:
> 1. drm/xe/svm: Use xe_vram_addr_to_region, avoid block->private usage
> 2. Link and track ttm BO's with physical addresses
> 3. Link LRC BO and its execution Queue
> 4. Extend BO purge to handle vram pages as well
> 5. Handle the generated physical address error by reserving the address's 4K page
> 6. Add supporting debugfs to automate injection of physical address error
> 7. Add buddy block allocation dump for debugging buddy related issues
> 8. Add configfs for vram bad page reservation policy
> 9. Sysfs entry to provide statistics of bad gpu vram pages for user info
> 
> v7:
> - Improve debugfs warning messages
> - Use scope_guard for locking(MattB)
> - Adapt addition of queue member of LRC BO(MattB)
> - Extend and use xe_ttm_bo_purge API for vram pages(MattB)
> - Handle dma_buf_map requests for native and remote(MattB)
> - Address if in never initialized block, set block to NULL
> V6:
> - Add more specific tests to noncritical bo sections
> - Handle smooth exit of user created exec queues
> - Break code and make purge specific static API
> V5:
> - Sysfs "max_pages" addition
> - Reset block->private NULL post purge
> - Remove wedge, return -EIO to system controller will initiate reset
> - Add debugfs tests to trigger different test scenarios manually and via igt
> - Rename addr_to_tbo to addr_to_block and move under gpu/buddy.c
> V4: API reworks, add configfs for policy reservation and apply config
> everywhere
> V3: use res_to_mem_region to avoid use of block->private (MattA)
> V2:
> - some fixes and clean up on errors
> - Added xe_vram_addr_to_region helper to avoid other use of block->private
> (MattB)
> 
> Debugfs tests the different scenarios via:
>   echo 0 > /sys/kernel/debug/dri/bdf/invalid_addr_vram0
> where 0 is one of the address types below to be tested:
> 
> enum mempage_offline_mode {
>         MEMPAGE_OFFLINE_UNALLOCATED = 0,
>         MEMPAGE_OFFLINE_USER_ALLOCATED = 1,
>         MEMPAGE_OFFLINE_KERNEL_USER_GGTT_ALLOCATED = 2,
>         MEMPAGE_OFFLINE_KERNEL_USER_PPGTT_ALLOCATED = 3,
>         MEMPAGE_OFFLINE_KERNEL_CRITICAL_ALLOCATED = 4,
>         MEMPAGE_OFFLINE_RESERVED = 5,
> };
> 
> IGT tests for testing this feature:
> https://patchwork.freedesktop.org/patch/714751/
> 
> Results of above tests:
> Using IGT_SRANDOM=1774610050 for randomisation
> Opened device: /dev/dri/card0
> Starting subtest: unallocated
> Subtest unallocated: SUCCESS (1.834s)
> Starting subtest: user-allocated
> Subtest user-allocated: SUCCESS (1.832s)
> Starting subtest: user-ggtt-allocated
> Subtest user-ggtt-allocated: SUCCESS (1.871s)
> Starting subtest: user-ppgtt-allocated
> Subtest user-ppgtt-allocated: SUCCESS (1.843s)
> Starting subtest: critical-allocated
> Subtest critical-allocated: SUCCESS (1.824s)
> Starting subtest: reserved
> Subtest reserved: SUCCESS (0.032s)
> 
> Tejas Upadhyay (9):
>   drm/xe: Link VRAM object with gpu buddy
>   drm/gpu: Add gpu_buddy_addr_to_block helper
>   drm/xe: Link LRC BO and its execution Queue
>   drm/xe: Extend BO purge to handle vram pages as well
>   drm/xe: Handle physical memory address error
>   drm/xe/cri: Add debugfs to inject faulty vram address
>   gpu/buddy: Add routine to dump allocated buddy blocks
>   drm/xe/configfs: Add vram bad page reservation policy
>   drm/xe/cri: Add sysfs interface for bad gpu vram pages
> 
>  drivers/gpu/buddy.c                        |  99 ++++++
>  drivers/gpu/drm/xe/xe_bo.c                 |  16 +-
>  drivers/gpu/drm/xe/xe_bo.h                 |   5 +-
>  drivers/gpu/drm/xe/xe_bo_types.h           |   3 +
>  drivers/gpu/drm/xe/xe_configfs.c           |  64 +++-
>  drivers/gpu/drm/xe/xe_configfs.h           |   2 +
>  drivers/gpu/drm/xe/xe_debugfs.c            | 171 ++++++++++
>  drivers/gpu/drm/xe/xe_device.c             |  51 +++
>  drivers/gpu/drm/xe/xe_device_sysfs.c       |   7 +
>  drivers/gpu/drm/xe/xe_dma_buf.c            |   3 +
>  drivers/gpu/drm/xe/xe_exec_queue.c         |  10 +-
>  drivers/gpu/drm/xe/xe_pt.c                 |   3 +-
>  drivers/gpu/drm/xe/xe_ttm_vram_mgr.c       | 361 +++++++++++++++++++++
>  drivers/gpu/drm/xe/xe_ttm_vram_mgr.h       |   2 +
>  drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h |  32 ++
>  include/linux/gpu_buddy.h                  |   3 +
>  16 files changed, 821 insertions(+), 11 deletions(-)
> 
> --
> 2.52.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V7 1/9] drm/xe: Link VRAM object with gpu buddy
  2026-04-13 13:16 ` [RFC PATCH V7 1/9] drm/xe: Link VRAM object with gpu buddy Tejas Upadhyay
@ 2026-04-30  3:50   ` Matthew Brost
  0 siblings, 0 replies; 21+ messages in thread
From: Matthew Brost @ 2026-04-30  3:50 UTC (permalink / raw)
  To: Tejas Upadhyay
  Cc: intel-xe, matthew.auld, thomas.hellstrom, himal.prasad.ghimiray

On Mon, Apr 13, 2026 at 06:46:22PM +0530, Tejas Upadhyay wrote:
> Setup to link TTM buffer object inside gpu buddy. This functionality
> is critical for supporting the memory page offline feature on CRI,
> where identified faulty pages must be traced back to their
> originating buffer for safe removal.
> 
> V2(MattB): Clear block->private in xe_ttm_vram_mgr_del as well
> 
> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> index 5fd0d5506a7e..01a9b92772f8 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> @@ -54,6 +54,7 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
>  	struct xe_ttm_vram_mgr *mgr = to_xe_ttm_vram_mgr(man);
>  	struct xe_ttm_vram_mgr_resource *vres;
>  	struct gpu_buddy *mm = &mgr->mm;
> +	struct gpu_buddy_block *block;
>  	u64 size, min_page_size;
>  	unsigned long lpfn;
>  	int err;
> @@ -138,6 +139,8 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
>  	}
>  
>  	mgr->visible_avail -= vres->used_visible_size;
> +	list_for_each_entry(block, &vres->blocks, link)
> +		block->private = tbo;
>  	mutex_unlock(&mgr->lock);
>  
>  	if (!(vres->base.placement & TTM_PL_FLAG_CONTIGUOUS) &&
> @@ -176,8 +179,11 @@ static void xe_ttm_vram_mgr_del(struct ttm_resource_manager *man,
>  		to_xe_ttm_vram_mgr_resource(res);
>  	struct xe_ttm_vram_mgr *mgr = to_xe_ttm_vram_mgr(man);
>  	struct gpu_buddy *mm = &mgr->mm;
> +	struct gpu_buddy_block *block;
>  
>  	mutex_lock(&mgr->lock);
> +	list_for_each_entry(block, &vres->blocks, link)
> +		block->private = NULL;
>  	gpu_buddy_free_list(mm, &vres->blocks, 0);
>  	mgr->visible_avail += vres->used_visible_size;
>  	mutex_unlock(&mgr->lock);
> -- 
> 2.52.0
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V7 5/9] drm/xe: Handle physical memory address error
  2026-04-13 13:16 ` [RFC PATCH V7 5/9] drm/xe: Handle physical memory address error Tejas Upadhyay
@ 2026-04-30 11:28   ` Matthew Auld
  2026-05-04 10:52     ` Upadhyay, Tejas
  0 siblings, 1 reply; 21+ messages in thread
From: Matthew Auld @ 2026-04-30 11:28 UTC (permalink / raw)
  To: Tejas Upadhyay, intel-xe
  Cc: matthew.brost, thomas.hellstrom, himal.prasad.ghimiray

On 13/04/2026 14:16, Tejas Upadhyay wrote:
> This functionality represents a significant step in making
> the xe driver gracefully handle hardware memory degradation.
> By integrating with the DRM Buddy allocator, the driver
> can permanently "carve out" faulty memory so it isn't reused
> by subsequent allocations.
> 
> Buddy Block Reservation:
> ----------------------
> When a memory address is reported as faulty, the driver instructs
> the DRM Buddy allocator to reserve a block of the specific page
> size (typically 4KB). This marks the memory as "dirty/used"
> indefinitely.
> 
> Two-Stage Tracking:
> -----------------
> Offlined Pages:
> Pages that have been successfully isolated and removed from the
> available memory pool.
> 
> Queued Pages:
> Addresses that have been flagged as faulty but are currently in
> use by a process. These are tracked until the associated buffer
> object (BO) is released or migrated, at which point they move
> to the "offlined" state.
> 
> Sysfs Reporting:
> --------------
> The patch exposes these metrics through a standard interface,
> allowing administrators to monitor VRAM health:
> /sys/bus/pci/devices/<device_id>/vram_bad_pages
> 
> V6:
> - Use scope_guard for locking(MattB)
> - Adapt addition of queue member of LRC BO(MattB)
> - Extend and use xe_ttm_bo_purge API for vram pages(MattB)
> - Handle dma_buf_map requests for native and remote(MattB)
> - Address if in never initialized block, set block to NULL
> V5:
> - Categorise and handle BOs accordingly
> - Fix crash found with new debugfs tests
> V4:
> - Set block->private NULL post bo purge
> - Filter out gsm address early on
> - Rebase
> V3:
> -rename api, remove tile dependency and add status of reservation
> V2:
> - Fix mm->avail counter issue
> - Remove unused code and handle clean up in case of error
> 
> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_bo.c                 |  11 +-
>   drivers/gpu/drm/xe/xe_bo.h                 |   4 +-
>   drivers/gpu/drm/xe/xe_dma_buf.c            |   3 +
>   drivers/gpu/drm/xe/xe_exec_queue.c         |   9 +-
>   drivers/gpu/drm/xe/xe_pt.c                 |   3 +-
>   drivers/gpu/drm/xe/xe_ttm_vram_mgr.c       | 267 +++++++++++++++++++++
>   drivers/gpu/drm/xe/xe_ttm_vram_mgr.h       |   1 +
>   drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h |  28 +++
>   8 files changed, 320 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index 7baa326c9421..d84849cca0aa 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -158,7 +158,16 @@ bool xe_bo_is_vm_bound(struct xe_bo *bo)
>   	return !list_empty(&bo->ttm.base.gpuva.list);
>   }
>   
> -static bool xe_bo_is_user(struct xe_bo *bo)
> +/**
> + * xe_bo_is_user - check if BO is user created BO
> + * @bo: The BO
> + *
> > + * Check if BO is a user-created BO. This requires the
> + * reservation lock for the BO to be held.
> + *
> + * Returns: boolean
> + */
> +bool xe_bo_is_user(struct xe_bo *bo)
>   {
>   	return bo->flags & XE_BO_FLAG_USER;
>   }
> diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
> index 9f55b3589caf..073fae905073 100644
> --- a/drivers/gpu/drm/xe/xe_bo.h
> +++ b/drivers/gpu/drm/xe/xe_bo.h
> @@ -277,7 +277,8 @@ static inline void xe_bo_unpin_map_no_vm(struct xe_bo *bo)
>   {
>   	if (likely(bo)) {
>   		xe_bo_lock(bo, false);
> -		xe_bo_unpin(bo);
> +		if (!xe_bo_is_purged(bo))
> +			xe_bo_unpin(bo);
>   		xe_bo_unlock(bo);
>   
>   		xe_bo_put(bo);
> @@ -501,6 +502,7 @@ long xe_bo_shrink(struct ttm_operation_ctx *ctx, struct ttm_buffer_object *bo,
>   		  const struct xe_bo_shrink_flags flags,
>   		  unsigned long *scanned);
>   int xe_ttm_bo_purge(struct ttm_buffer_object *ttm_bo, struct ttm_operation_ctx *ctx);
> +bool xe_bo_is_user(struct xe_bo *bo);
>   
>   /**
>    * xe_bo_is_mem_type - Whether the bo currently resides in the given
> diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
> index 7f9602b3363d..e36ea88292f5 100644
> --- a/drivers/gpu/drm/xe/xe_dma_buf.c
> +++ b/drivers/gpu/drm/xe/xe_dma_buf.c
> @@ -104,6 +104,9 @@ static struct sg_table *xe_dma_buf_map(struct dma_buf_attachment *attach,
>   	struct sg_table *sgt;
>   	int r = 0;
>   
> +	if (xe_bo_is_purged(bo))
> +		return ERR_PTR(-ENOENT);

Is it not the case that we can already purge something that gets 
exported via dma-buf? Is this actually new for this patch?

> +
>   	if (!attach->peer2peer && !xe_bo_can_migrate(bo, XE_PL_TT))
>   		return ERR_PTR(-EOPNOTSUPP);
>   
> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
> index b3b80893c387..40ffc598e0f8 100644
> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> @@ -385,7 +385,6 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
>   				err = PTR_ERR(lrc);
>   				goto err_lrc;
>   			}
> -
>   			lrc->bo->q = q;
>   			xe_exec_queue_set_lrc(q, lrc, i);
>   
> @@ -1552,8 +1551,12 @@ void xe_exec_queue_update_run_ticks(struct xe_exec_queue *q)
>   	 * errors.
>   	 */
>   	lrc = q->lrc[0];
> -	new_ts = xe_lrc_update_timestamp(lrc, &old_ts);
> -	q->xef->run_ticks[q->class] += (new_ts - old_ts) * q->width;
> +	xe_bo_lock(lrc->bo, false);
> +	if (!xe_bo_is_purged(lrc->bo)) {
> +		new_ts = xe_lrc_update_timestamp(lrc, &old_ts);
> +		q->xef->run_ticks[q->class] += (new_ts - old_ts) * q->width;
> +	}
> +	xe_bo_unlock(lrc->bo);
>   
>   	drm_dev_exit(idx);
>   }
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index 8e5f4f0dea3f..1764bae6e481 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -211,7 +211,8 @@ void xe_pt_destroy(struct xe_pt *pt, u32 flags, struct llist_head *deferred)
>   		return;
>   
>   	XE_WARN_ON(!list_empty(&pt->bo->ttm.base.gpuva.list));
> -	xe_bo_unpin(pt->bo);
> +	if (!xe_bo_is_purged(pt->bo))
> +		xe_bo_unpin(pt->bo);
>   	xe_bo_put_deferred(pt->bo, deferred);
>   
>   	if (pt->level > 0 && pt->num_live) {
> diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> index 01a9b92772f8..ac6f034852f7 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> @@ -13,7 +13,10 @@
>   
>   #include "xe_bo.h"
>   #include "xe_device.h"
> +#include "xe_exec_queue.h"
> +#include "xe_lrc.h"
>   #include "xe_res_cursor.h"
> +#include "xe_ttm_stolen_mgr.h"
>   #include "xe_ttm_vram_mgr.h"
>   #include "xe_vram_types.h"
>   
> @@ -280,6 +283,24 @@ static const struct ttm_resource_manager_func xe_ttm_vram_mgr_func = {
>   	.debug	= xe_ttm_vram_mgr_debug
>   };
>   
> +static void xe_ttm_vram_free_bad_pages(struct drm_device *dev, struct xe_ttm_vram_mgr *mgr)
> +{
> +	struct xe_ttm_vram_offline_resource *pos, *n;
> +
> +	list_for_each_entry_safe(pos, n, &mgr->offlined_pages, offlined_link) {
> +		--mgr->n_offlined_pages;
> +		gpu_buddy_free_list(&mgr->mm, &pos->blocks, 0);
> +		mgr->visible_avail += pos->used_visible_size;
> +		list_del(&pos->offlined_link);
> +		kfree(pos);
> +	}
> +	list_for_each_entry_safe(pos, n, &mgr->queued_pages, queued_link) {
> +		list_del(&pos->queued_link);
> +		mgr->n_queued_pages--;
> +		kfree(pos);
> +	}
> +}
> +
>   static void xe_ttm_vram_mgr_fini(struct drm_device *dev, void *arg)
>   {
>   	struct xe_device *xe = to_xe_device(dev);
> @@ -291,6 +312,10 @@ static void xe_ttm_vram_mgr_fini(struct drm_device *dev, void *arg)
>   	if (ttm_resource_manager_evict_all(&xe->ttm, man))
>   		return;
>   
> +	mutex_lock(&mgr->lock);
> +	xe_ttm_vram_free_bad_pages(dev, mgr);
> +	mutex_unlock(&mgr->lock);
> +
>   	WARN_ON_ONCE(mgr->visible_avail != mgr->visible_size);
>   
>   	gpu_buddy_fini(&mgr->mm);
> @@ -319,6 +344,8 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
>   	man->func = &xe_ttm_vram_mgr_func;
>   	mgr->mem_type = mem_type;
>   	mutex_init(&mgr->lock);
> +	INIT_LIST_HEAD(&mgr->offlined_pages);
> +	INIT_LIST_HEAD(&mgr->queued_pages);
>   	mgr->default_page_size = default_page_size;
>   	mgr->visible_size = io_size;
>   	mgr->visible_avail = io_size;
> @@ -474,3 +501,243 @@ u64 xe_ttm_vram_get_avail(struct ttm_resource_manager *man)
>   
>   	return avail;
>   }
> +
> +static int xe_ttm_vram_purge_page(struct xe_device *xe, struct xe_bo *bo)
> +{
> +	struct ttm_operation_ctx ctx = {};
> +	struct xe_vm *vm;
> +	u32	flags;
> +	int ret = 0;
> +
> +	xe_bo_lock(bo, false);
> +	vm = bo->vm;
> +	flags = bo->flags;
> +	xe_bo_unlock(bo);
> +	/*  Ban VM if BO is PPGTT */
> +	if (flags & XE_BO_FLAG_PAGETABLE) {
> +		down_write(&vm->lock);
> +		xe_vm_kill(vm, true);
> +		up_write(&vm->lock);
> +	}
> +
> +	xe_bo_lock(bo, false);
> +	/*  Ban exec queue if BO is lrc */
> +	if (bo->q && xe_exec_queue_get_unless_zero(bo->q)) {
> +		/* ban queue */
> +		xe_exec_queue_kill(bo->q);
> +		xe_exec_queue_put(bo->q);
> +	}
> +
> +	xe_bo_set_purgeable_state(bo, XE_MADV_PURGEABLE_DONTNEED);
> +	ttm_bo_unmap_virtual(&bo->ttm);   /* nuke CPU mmap + VRAM IO mappings */

I think ttm or xe must already do this somewhere, when we do the purge 
below?

> +	if (xe_bo_is_pinned(bo))
> +		xe_bo_unpin(bo);
> +	ret = xe_ttm_bo_purge(&bo->ttm, &ctx);
> +	xe_bo_unlock(bo);
> +
> +	return ret;
> +}
> +
> +static int xe_ttm_vram_reserve_page_at_addr(struct xe_device *xe, unsigned long addr,
> +					    struct xe_ttm_vram_mgr *vram_mgr, struct gpu_buddy *mm)
> +{
> +	struct xe_ttm_vram_offline_resource *nentry;
> +	struct ttm_buffer_object *tbo = NULL;
> +	struct gpu_buddy_block *block;
> +	struct gpu_buddy_block *b, *m;
> +	enum reserve_status {
> +		pending = 0,
> +		fail
> +	};
> +	u64 size = SZ_4K;
> +	int ret = 0;
> +
> +	scoped_guard(mutex, &vram_mgr->lock) {
> +		block = gpu_buddy_addr_to_block(mm, addr);
> +		if (PTR_ERR(block) == -ENXIO)

Maybe make this a programmer error and also chuck a warn? The only way 
this fires is when the addr is bogus, so something is badly wrong.

> +			/* VRAM region check passed earlier; safe to proceed */
> +			block = NULL;
> +
> +		nentry = kzalloc_obj(*nentry);
> +		if (!nentry)
> +			return -ENOMEM;
> +		INIT_LIST_HEAD(&nentry->blocks);
> +		nentry->status = pending;
> +		nentry->addr = addr;
> +
> +		if (block) {
> +			struct xe_bo *pbo;
> +
> +			WARN_ON(!block->private);
> +			tbo = block->private;
> +			pbo = ttm_to_xe_bo(tbo);
> +
> +			/* Get reference safely - BO may have zero refcount */
> +			if (!xe_bo_get_unless_zero(pbo)) {
> +				kfree(nentry);
> +				return -ENOENT;
> +			}
> +			/* Critical kernel BO? */
> +			if ((pbo->ttm.type == ttm_bo_type_kernel &&
> +			     !(pbo->flags & XE_BO_FLAG_PINNED_LATE_RESTORE)) ||
> +			    (xe_bo_is_user(pbo) && xe_bo_is_pinned(pbo))) {
> +				kfree(nentry);
> +				xe_ttm_vram_free_bad_pages(&xe->drm, vram_mgr);
> +				xe_bo_put(pbo);
> +				drm_err(&xe->drm,
> +					"%s: addr: 0x%lx is critical kernel bo, requesting SBR\n",
> +					__func__, addr);
> +				/* Hint System controller driver for reset with -EIO  */
> +				return -EIO;
> +			}
> +			nentry->id = ++vram_mgr->n_queued_pages;
> +			list_add(&nentry->queued_link, &vram_mgr->queued_pages);
> +		}
> +	}
> +	if (block) {
> +		struct xe_ttm_vram_offline_resource *pos, *n;
> +		struct xe_bo *pbo = ttm_to_xe_bo(tbo);
> +
> +		/* Purge BO containing address - reference held from above */
> +		ret = xe_ttm_vram_purge_page(xe, pbo);
> +		xe_bo_put(pbo);
> +		if (ret) {
> +			nentry->status = fail;
> +			return ret;
> +		}
> +
> +		/* Reserve page at address addr*/
> +		scoped_guard(mutex, &vram_mgr->lock) {
> +			ret = gpu_buddy_alloc_blocks(mm, addr, addr + size,
> +						     size, size, &nentry->blocks,
> +						     GPU_BUDDY_RANGE_ALLOCATION);

I guess technically this is racy, with purging the pages and someone 
else re-allocating the pages before you can grab mgr->lock? But we think 
that is unlikely, and in the worst case we just wedge the device if we 
error out here?

> +
> +			if (ret) {
> +				drm_warn(&xe->drm, "Could not reserve page at addr:0x%lx, ret:%d\n",
> +					 addr, ret);
> +				nentry->status = fail;
> +				return ret;
> +			}
> +
> +			list_for_each_entry_safe(b, m, &nentry->blocks, link)

No need for the _safe, since we don't modify the list.

> +				b->private = NULL;
> +
> +			if ((addr + size) <= vram_mgr->visible_size) {
> +				nentry->used_visible_size = size;
> +			} else {
> +				list_for_each_entry(b, &nentry->blocks, link) {
> +					u64 start = gpu_buddy_block_offset(b);
> +
> +					if (start < vram_mgr->visible_size) {
> +						u64 end = start + gpu_buddy_block_size(mm, b);
> +
> +						nentry->used_visible_size +=
> +							min(end, vram_mgr->visible_size) - start;
> +					}
> +				}
> +			}
> +			vram_mgr->visible_avail -= nentry->used_visible_size;
> +			list_for_each_entry_safe(pos, n, &vram_mgr->queued_pages, queued_link) {
> +				if (pos->id == nentry->id) {
> +					--vram_mgr->n_queued_pages;
> > +					list_del(&pos->queued_link);
> > +					break;
> > +				}
> +			}
> +			list_add(&nentry->offlined_link, &vram_mgr->offlined_pages);
> +			/* TODO: FW Integration: Send command to FW for offlining page */
> +			++vram_mgr->n_offlined_pages;
> +			return ret;
> +		}
> +	} else {
> +		scoped_guard(mutex, &vram_mgr->lock) {
> +			ret = gpu_buddy_alloc_blocks(mm, addr, addr + size,
> +						     size, size, &nentry->blocks,
> +						     GPU_BUDDY_RANGE_ALLOCATION);
> +			if (ret) {
> +				drm_warn(&xe->drm, "Could not reserve page at addr:0x%lx, ret:%d\n",
> +					 addr, ret);
> +				nentry->status = fail;
> +				return ret;
> +			}
> +
> +			list_for_each_entry_safe(b, m, &nentry->blocks, link)

Same here.

> +				b->private = NULL;
> +
> +			if ((addr + size) <= vram_mgr->visible_size) {
> +				nentry->used_visible_size = size;
> +			} else {
> +				struct gpu_buddy_block *block;
> +
> +				list_for_each_entry(block, &nentry->blocks, link) {
> +					u64 start = gpu_buddy_block_offset(block);
> +
> +					if (start < vram_mgr->visible_size) {
> +						u64 end = start + gpu_buddy_block_size(mm, block);
> +
> +						nentry->used_visible_size +=
> +							min(end, vram_mgr->visible_size) - start;
> +					}
> +				}
> +			}

Could some of this be moved into a helper or refactored a bit? This and 
above look very similar, also this is not that different from what we 
already do for normal allocation flow?

Would it make sense to pull this out into something like:

int xe_ttm_vram_buddy_alloc(mgr, size, end, flags, priv, blocks, 
*visible_used)
{
     alloc_blocks(&blocks, ...);

     list_for_each_entry(b, blocks, link)
        b->private = priv;

    if (end <= vram_mgr->visible_size)
       *visible_used = size;
    else {
       ....
    }

    vram_mgr->visible_avail -= *visible_used;
   ...
}

void xe_ttm_vram_free(mgr, blocks, *visible_used)
{
    list_for_each_entry(b, blocks, link)
        b->private = NULL;
    gpu_buddy_free_list(mm, blocks, 0);
    vram_mgr->visible_avail += *visible_used;
}

And then convert mgr_del/new to use that, and then also re-use here? I 
think that would make this easier to follow, plus less duplication.

> +			vram_mgr->visible_avail -= nentry->used_visible_size;
> +			nentry->id = ++vram_mgr->n_offlined_pages;
> +			list_add(&nentry->offlined_link, &vram_mgr->offlined_pages);
> +			/* TODO: FW Integration: Send command to FW for offlining page */
> +		}
> +	}
> +	/* Success */
> +	return ret;
> +}
> +
> +static struct xe_vram_region *xe_ttm_vram_addr_to_region(struct xe_device *xe,
> +							 resource_size_t addr)
> +{
> +	unsigned long stolen_base = xe_ttm_stolen_gpu_offset(xe);
> +	struct xe_vram_region *vr;
> +	struct xe_tile *tile;
> +	int id;
> +
> +	/* Addr from stolen memory? */
> +	if (addr + SZ_4K >= stolen_base)
> +		return NULL;
> +
> +	for_each_tile(tile, xe, id) {
> +		vr = tile->mem.vram;
> +		if ((addr <= vr->dpa_base + vr->actual_physical_size) &&
> +		    (addr + SZ_4K >= vr->dpa_base))
> +			return vr;
> +	}
> +	return NULL;
> +}
> +
> +/**
> + * xe_ttm_vram_handle_addr_fault - Handle vram physical address error flagged
> + * @xe: pointer to parent device
> + * @addr: physical faulty address
> + *
> + * Handle the physical faulty address error on a specific tile.
> + *
> + * Returns 0 for success, negative error code otherwise.
> + */
> +int xe_ttm_vram_handle_addr_fault(struct xe_device *xe, unsigned long addr)
> +{
> +	struct xe_ttm_vram_mgr *vram_mgr;
> +	struct xe_vram_region *vr;
> +	struct gpu_buddy *mm;
> +	int ret;
> +
> +	vr = xe_ttm_vram_addr_to_region(xe, addr);
> +	if (!vr) {
> +		drm_err(&xe->drm, "%s:%d addr:%lx error requesting SBR\n",
> +			__func__, __LINE__, addr);
> +		/* Hint System controller driver for reset with -EIO  */
> +		return -EIO;
> +	}
> +	vram_mgr = &vr->ttm;
> +	mm = &vram_mgr->mm;
> +	/* Reserve page at address */
> +	ret = xe_ttm_vram_reserve_page_at_addr(xe, addr, vram_mgr, mm);
> +	return ret;

Nit: return xe_ttm_vram_reserve_page_at_addr()

> +}
> +EXPORT_SYMBOL(xe_ttm_vram_handle_addr_fault);
> diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
> index 87b7fae5edba..8ef06d9d44f7 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
> +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
> @@ -31,6 +31,7 @@ u64 xe_ttm_vram_get_cpu_visible_size(struct ttm_resource_manager *man);
>   void xe_ttm_vram_get_used(struct ttm_resource_manager *man,
>   			  u64 *used, u64 *used_visible);
>   
> +int xe_ttm_vram_handle_addr_fault(struct xe_device *xe, unsigned long addr);
>   static inline struct xe_ttm_vram_mgr_resource *
>   to_xe_ttm_vram_mgr_resource(struct ttm_resource *res)
>   {
> diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
> index 9106da056b49..3ad7966798eb 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
> +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
> @@ -19,6 +19,14 @@ struct xe_ttm_vram_mgr {
>   	struct ttm_resource_manager manager;
>   	/** @mm: DRM buddy allocator which manages the VRAM */
>   	struct gpu_buddy mm;
> +	/** @offlined_pages: List of offlined pages */
> +	struct list_head offlined_pages;
> +	/** @n_offlined_pages: Number of offlined pages */
> +	u16 n_offlined_pages;
> +	/** @queued_pages: List of queued pages */
> +	struct list_head queued_pages;
> +	/** @n_queued_pages: Number of queued pages */
> +	u16 n_queued_pages;
>   	/** @visible_size: Proped size of the CPU visible portion */
>   	u64 visible_size;
>   	/** @visible_avail: CPU visible portion still unallocated */
> @@ -45,4 +53,24 @@ struct xe_ttm_vram_mgr_resource {
>   	unsigned long flags;
>   };
>   
> +/**
> + * struct xe_ttm_vram_offline_resource - Xe TTM VRAM offline resource
> + */
> +struct xe_ttm_vram_offline_resource {
> +	/** @offlined_link: Link to offlined pages */
> +	struct list_head offlined_link;
> +	/** @queued_link: Link to queued pages */
> +	struct list_head queued_link;
> +	/** @blocks: list of DRM buddy blocks */
> +	struct list_head blocks;
> +	/** @used_visible_size: How many CPU visible bytes this resource is using */
> +	u64 used_visible_size;
> +	/** @id: The id of an offline resource */
> +	u16 id;
> +	/** @addr: Address of faulty memory location reported by HW */
> +	unsigned long addr;
> +	/** @status: reservation status of resource */
> +	bool status;
> +};
> +
>   #endif


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [RFC PATCH V7 5/9] drm/xe: Handle physical memory address error
  2026-04-30 11:28   ` Matthew Auld
@ 2026-05-04 10:52     ` Upadhyay, Tejas
  0 siblings, 0 replies; 21+ messages in thread
From: Upadhyay, Tejas @ 2026-05-04 10:52 UTC (permalink / raw)
  To: Auld, Matthew, intel-xe@lists.freedesktop.org
  Cc: Brost, Matthew, thomas.hellstrom@linux.intel.com,
	Ghimiray, Himal Prasad



> -----Original Message-----
> From: Auld, Matthew <matthew.auld@intel.com>
> Sent: 30 April 2026 16:59
> To: Upadhyay, Tejas <tejas.upadhyay@intel.com>; intel-
> xe@lists.freedesktop.org
> Cc: Brost, Matthew <matthew.brost@intel.com>;
> thomas.hellstrom@linux.intel.com; Ghimiray, Himal Prasad
> <himal.prasad.ghimiray@intel.com>
> Subject: Re: [RFC PATCH V7 5/9] drm/xe: Handle physical memory address
> error
> 
> On 13/04/2026 14:16, Tejas Upadhyay wrote:
> > This functionality represents a significant step in making the xe
> > driver gracefully handle hardware memory degradation.
> > By integrating with the DRM Buddy allocator, the driver can
> > permanently "carve out" faulty memory so it isn't reused by subsequent
> > allocations.
> >
> > Buddy Block Reservation:
> > ----------------------
> > When a memory address is reported as faulty, the driver instructs the
> > DRM Buddy allocator to reserve a block of the specific page size
> > (typically 4KB). This marks the memory as "dirty/used"
> > indefinitely.
> >
> > Two-Stage Tracking:
> > -----------------
> > Offlined Pages:
> > Pages that have been successfully isolated and removed from the
> > available memory pool.
> >
> > Queued Pages:
> > Addresses that have been flagged as faulty but are currently in use by
> > a process. These are tracked until the associated buffer object (BO)
> > is released or migrated, at which point they move to the "offlined"
> > state.
> >
> > Sysfs Reporting:
> > --------------
> > The patch exposes these metrics through a standard interface, allowing
> > administrators to monitor VRAM health:
> > /sys/bus/pci/devices/<device_id>/vram_bad_pages
> >
> > V6:
> > - Use scope_guard for locking(MattB)
> > - Adapt addition of queue member of LRC BO(MattB)
> > - Extend and use xe_ttm_bo_purge API for vram pages(MattB)
> > - Handle dma_buf_map requests for native and remote(MattB)
> > - Address if in never initialized block, set block to NULL
> > V5:
> > - Categorise and handle BOs accordingly
> > - Fix crash found with new debugfs tests
> > V4:
> > - Set block->private NULL post bo purge
> > - Filter out gsm address early on
> > - Rebase
> > V3:
> > -rename api, remove tile dependency and add status of reservation
> > V2:
> > - Fix mm->avail counter issue
> > - Remove unused code and handle clean up in case of error
> >
> > Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_bo.c                 |  11 +-
> >   drivers/gpu/drm/xe/xe_bo.h                 |   4 +-
> >   drivers/gpu/drm/xe/xe_dma_buf.c            |   3 +
> >   drivers/gpu/drm/xe/xe_exec_queue.c         |   9 +-
> >   drivers/gpu/drm/xe/xe_pt.c                 |   3 +-
> >   drivers/gpu/drm/xe/xe_ttm_vram_mgr.c       | 267
> +++++++++++++++++++++
> >   drivers/gpu/drm/xe/xe_ttm_vram_mgr.h       |   1 +
> >   drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h |  28 +++
> >   8 files changed, 320 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> > index 7baa326c9421..d84849cca0aa 100644
> > --- a/drivers/gpu/drm/xe/xe_bo.c
> > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > @@ -158,7 +158,16 @@ bool xe_bo_is_vm_bound(struct xe_bo *bo)
> >   	return !list_empty(&bo->ttm.base.gpuva.list);
> >   }
> >
> > -static bool xe_bo_is_user(struct xe_bo *bo)
> > +/**
> > + * xe_bo_is_user - check if BO is user created BO
> > + * @bo: The BO
> > + *
> > + * Check if  BO is user created BO. This requires the
> > + * reservation lock for the BO to be held.
> > + *
> > + * Returns: boolean
> > + */
> > +bool xe_bo_is_user(struct xe_bo *bo)
> >   {
> >   	return bo->flags & XE_BO_FLAG_USER;
> >   }
> > diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
> > index 9f55b3589caf..073fae905073 100644
> > --- a/drivers/gpu/drm/xe/xe_bo.h
> > +++ b/drivers/gpu/drm/xe/xe_bo.h
> > @@ -277,7 +277,8 @@ static inline void xe_bo_unpin_map_no_vm(struct
> xe_bo *bo)
> >   {
> >   	if (likely(bo)) {
> >   		xe_bo_lock(bo, false);
> > -		xe_bo_unpin(bo);
> > +		if (!xe_bo_is_purged(bo))
> > +			xe_bo_unpin(bo);
> >   		xe_bo_unlock(bo);
> >
> >   		xe_bo_put(bo);
> > @@ -501,6 +502,7 @@ long xe_bo_shrink(struct ttm_operation_ctx *ctx,
> struct ttm_buffer_object *bo,
> >   		  const struct xe_bo_shrink_flags flags,
> >   		  unsigned long *scanned);
> >   int xe_ttm_bo_purge(struct ttm_buffer_object *ttm_bo, struct
> > ttm_operation_ctx *ctx);
> > +bool xe_bo_is_user(struct xe_bo *bo);
> >
> >   /**
> >    * xe_bo_is_mem_type - Whether the bo currently resides in the given
> > diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c
> > b/drivers/gpu/drm/xe/xe_dma_buf.c index 7f9602b3363d..e36ea88292f5
> > 100644
> > --- a/drivers/gpu/drm/xe/xe_dma_buf.c
> > +++ b/drivers/gpu/drm/xe/xe_dma_buf.c
> > @@ -104,6 +104,9 @@ static struct sg_table *xe_dma_buf_map(struct
> dma_buf_attachment *attach,
> >   	struct sg_table *sgt;
> >   	int r = 0;
> >
> > +	if (xe_bo_is_purged(bo))
> > +		return ERR_PTR(-ENOENT);
> 
> Is it not the case that we can already purge something that gets exported via
> dma-buf? Is this actually new for this patch?

We purge, but only once it is unpinned, I think. What we are trying to do here is unpin a pinned BO and then go through the purge step. Since it is pinned, a remapping may be attempted, which is what we are trying to block here. But I realize there is a problem: there could also be multiple imports. How about handling the purge this way:

/* Invalidate importer mappings - importers unmap their sg_tables */
dma_buf_invalidate_mappings(bo->ttm.base.dma_buf);
/* Move BO to system RAM instead of purging - preserves data */
xe_bo_migrate(bo, XE_PL_TT);
/* Now the VRAM page is free and can be offlined */

Here, migrate takes care of the pinning and the data is also preserved in system RAM; the VRAM page can then go offline.

> 
> > +
> >   	if (!attach->peer2peer && !xe_bo_can_migrate(bo, XE_PL_TT))
> >   		return ERR_PTR(-EOPNOTSUPP);
> >
> > diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c
> > b/drivers/gpu/drm/xe/xe_exec_queue.c
> > index b3b80893c387..40ffc598e0f8 100644
> > --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> > +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> > @@ -385,7 +385,6 @@ static int __xe_exec_queue_init(struct
> xe_exec_queue *q, u32 exec_queue_flags)
> >   				err = PTR_ERR(lrc);
> >   				goto err_lrc;
> >   			}
> > -
> >   			lrc->bo->q = q;
> >   			xe_exec_queue_set_lrc(q, lrc, i);
> >
> > @@ -1552,8 +1551,12 @@ void xe_exec_queue_update_run_ticks(struct
> xe_exec_queue *q)
> >   	 * errors.
> >   	 */
> >   	lrc = q->lrc[0];
> > -	new_ts = xe_lrc_update_timestamp(lrc, &old_ts);
> > -	q->xef->run_ticks[q->class] += (new_ts - old_ts) * q->width;
> > +	xe_bo_lock(lrc->bo, false);
> > +	if (!xe_bo_is_purged(lrc->bo)) {
> > +		new_ts = xe_lrc_update_timestamp(lrc, &old_ts);
> > +		q->xef->run_ticks[q->class] += (new_ts - old_ts) * q->width;
> > +	}
> > +	xe_bo_unlock(lrc->bo);
> >
> >   	drm_dev_exit(idx);
> >   }
> > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > index 8e5f4f0dea3f..1764bae6e481 100644
> > --- a/drivers/gpu/drm/xe/xe_pt.c
> > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > @@ -211,7 +211,8 @@ void xe_pt_destroy(struct xe_pt *pt, u32 flags,
> struct llist_head *deferred)
> >   		return;
> >
> >   	XE_WARN_ON(!list_empty(&pt->bo->ttm.base.gpuva.list));
> > -	xe_bo_unpin(pt->bo);
> > +	if (!xe_bo_is_purged(pt->bo))
> > +		xe_bo_unpin(pt->bo);
> >   	xe_bo_put_deferred(pt->bo, deferred);
> >
> >   	if (pt->level > 0 && pt->num_live) { diff --git
> > a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> > b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> > index 01a9b92772f8..ac6f034852f7 100644
> > --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> > +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> > @@ -13,7 +13,10 @@
> >
> >   #include "xe_bo.h"
> >   #include "xe_device.h"
> > +#include "xe_exec_queue.h"
> > +#include "xe_lrc.h"
> >   #include "xe_res_cursor.h"
> > +#include "xe_ttm_stolen_mgr.h"
> >   #include "xe_ttm_vram_mgr.h"
> >   #include "xe_vram_types.h"
> >
> > @@ -280,6 +283,24 @@ static const struct ttm_resource_manager_func
> xe_ttm_vram_mgr_func = {
> >   	.debug	= xe_ttm_vram_mgr_debug
> >   };
> >
> > +static void xe_ttm_vram_free_bad_pages(struct drm_device *dev, struct
> > +xe_ttm_vram_mgr *mgr) {
> > +	struct xe_ttm_vram_offline_resource *pos, *n;
> > +
> > +	list_for_each_entry_safe(pos, n, &mgr->offlined_pages, offlined_link)
> {
> > +		--mgr->n_offlined_pages;
> > +		gpu_buddy_free_list(&mgr->mm, &pos->blocks, 0);
> > +		mgr->visible_avail += pos->used_visible_size;
> > +		list_del(&pos->offlined_link);
> > +		kfree(pos);
> > +	}
> > +	list_for_each_entry_safe(pos, n, &mgr->queued_pages, queued_link)
> {
> > +		list_del(&pos->queued_link);
> > +		mgr->n_queued_pages--;
> > +		kfree(pos);
> > +	}
> > +}
> > +
> >   static void xe_ttm_vram_mgr_fini(struct drm_device *dev, void *arg)
> >   {
> >   	struct xe_device *xe = to_xe_device(dev); @@ -291,6 +312,10 @@
> > static void xe_ttm_vram_mgr_fini(struct drm_device *dev, void *arg)
> >   	if (ttm_resource_manager_evict_all(&xe->ttm, man))
> >   		return;
> >
> > +	mutex_lock(&mgr->lock);
> > +	xe_ttm_vram_free_bad_pages(dev, mgr);
> > +	mutex_unlock(&mgr->lock);
> > +
> >   	WARN_ON_ONCE(mgr->visible_avail != mgr->visible_size);
> >
> >   	gpu_buddy_fini(&mgr->mm);
> > @@ -319,6 +344,8 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe,
> struct xe_ttm_vram_mgr *mgr,
> >   	man->func = &xe_ttm_vram_mgr_func;
> >   	mgr->mem_type = mem_type;
> >   	mutex_init(&mgr->lock);
> > +	INIT_LIST_HEAD(&mgr->offlined_pages);
> > +	INIT_LIST_HEAD(&mgr->queued_pages);
> >   	mgr->default_page_size = default_page_size;
> >   	mgr->visible_size = io_size;
> >   	mgr->visible_avail = io_size;
> > @@ -474,3 +501,243 @@ u64 xe_ttm_vram_get_avail(struct
> > ttm_resource_manager *man)
> >
> >   	return avail;
> >   }
> > +
> > +static int xe_ttm_vram_purge_page(struct xe_device *xe, struct xe_bo
> > +*bo) {
> > +	struct ttm_operation_ctx ctx = {};
> > +	struct xe_vm *vm;
> > +	u32	flags;
> > +	int ret = 0;
> > +
> > +	xe_bo_lock(bo, false);
> > +	vm = bo->vm;
> > +	flags = bo->flags;
> > +	xe_bo_unlock(bo);
> > +	/*  Ban VM if BO is PPGTT */
> > +	if (flags & XE_BO_FLAG_PAGETABLE) {
> > +		down_write(&vm->lock);
> > +		xe_vm_kill(vm, true);
> > +		up_write(&vm->lock);
> > +	}
> > +
> > +	xe_bo_lock(bo, false);
> > +	/*  Ban exec queue if BO is lrc */
> > +	if (bo->q && xe_exec_queue_get_unless_zero(bo->q)) {
> > +		/* ban queue */
> > +		xe_exec_queue_kill(bo->q);
> > +		xe_exec_queue_put(bo->q);
> > +	}
> > +
> > +	xe_bo_set_purgeable_state(bo, XE_MADV_PURGEABLE_DONTNEED);
> > +	ttm_bo_unmap_virtual(&bo->ttm);   /* nuke CPU mmap + VRAM IO
> mappings */
> 
> I think ttm or xe must already do this somewhere, when we do the purge
> below?

ttm_bo_pipeline_gutting() (called via ttm_bo_validate() with an empty placement) does not call ttm_bo_unmap_virtual(). xe_bo_move_notify() only does xe_bo_vunmap() (the kernel vmap) plus dma_buf_invalidate_mappings(), neither of which tears down the userspace CPU mmap. So this explicit call is needed to zap the page table entries of any mmap()'d region pointing at the corrupt VRAM page.

> 
> > +	if (xe_bo_is_pinned(bo))
> > +		xe_bo_unpin(bo);
> > +	ret = xe_ttm_bo_purge(&bo->ttm, &ctx);
> > +	xe_bo_unlock(bo);
> > +
> > +	return ret;
> > +}
> > +
> > +static int xe_ttm_vram_reserve_page_at_addr(struct xe_device *xe,
> unsigned long addr,
> > +					    struct xe_ttm_vram_mgr
> *vram_mgr, struct gpu_buddy *mm) {
> > +	struct xe_ttm_vram_offline_resource *nentry;
> > +	struct ttm_buffer_object *tbo = NULL;
> > +	struct gpu_buddy_block *block;
> > +	struct gpu_buddy_block *b, *m;
> > +	enum reserve_status {
> > +		pending = 0,
> > +		fail
> > +	};
> > +	u64 size = SZ_4K;
> > +	int ret = 0;
> > +
> > +	scoped_guard(mutex, &vram_mgr->lock) {
> > +		block = gpu_buddy_addr_to_block(mm, addr);
> > +		if (PTR_ERR(block) == -ENXIO)
> 
> Maybe make this a programmer error and also chuck a warn? Only way
> this fires is when the addr is bogus, so something is badly wrong.

Agreed, I will simply put a WARN_ON; if we hit this, we are in a bad state anyway.

> 
> > +			/* VRAM region check passed earlier; safe to proceed
> */
> > +			block = NULL;
> > +
> > +		nentry = kzalloc_obj(*nentry);
> > +		if (!nentry)
> > +			return -ENOMEM;
> > +		INIT_LIST_HEAD(&nentry->blocks);
> > +		nentry->status = pending;
> > +		nentry->addr = addr;
> > +
> > +		if (block) {
> > +			struct xe_bo *pbo;
> > +
> > +			WARN_ON(!block->private);
> > +			tbo = block->private;
> > +			pbo = ttm_to_xe_bo(tbo);
> > +
> > +			/* Get reference safely - BO may have zero refcount */
> > +			if (!xe_bo_get_unless_zero(pbo)) {
> > +				kfree(nentry);
> > +				return -ENOENT;
> > +			}
> > +			/* Critical kernel BO? */
> > +			if ((pbo->ttm.type == ttm_bo_type_kernel &&
> > +			     !(pbo->flags &
> XE_BO_FLAG_PINNED_LATE_RESTORE)) ||
> > +			    (xe_bo_is_user(pbo) && xe_bo_is_pinned(pbo))) {
> > +				kfree(nentry);
> > +				xe_ttm_vram_free_bad_pages(&xe->drm,
> vram_mgr);
> > +				xe_bo_put(pbo);
> > +				drm_err(&xe->drm,
> > +					"%s: addr: 0x%lx is critical kernel bo,
> requesting SBR\n",
> > +					__func__, addr);
> > +				/* Hint System controller driver for reset with -
> EIO  */
> > +				return -EIO;
> > +			}
> > +			nentry->id = ++vram_mgr->n_queued_pages;
> > +			list_add(&nentry->queued_link, &vram_mgr-
> >queued_pages);
> > +		}
> > +	}
> > +	if (block) {
> > +		struct xe_ttm_vram_offline_resource *pos, *n;
> > +		struct xe_bo *pbo = ttm_to_xe_bo(tbo);
> > +
> > +		/* Purge BO containing address - reference held from above */
> > +		ret = xe_ttm_vram_purge_page(xe, pbo);
> > +		xe_bo_put(pbo);
> > +		if (ret) {
> > +			nentry->status = fail;
> > +			return ret;
> > +		}
> > +
> > +		/* Reserve page at address addr*/
> > +		scoped_guard(mutex, &vram_mgr->lock) {
> > +			ret = gpu_buddy_alloc_blocks(mm, addr, addr + size,
> > +						     size, size, &nentry->blocks,
> > +
> GPU_BUDDY_RANGE_ALLOCATION);
> 
> I guess technically this is racy, with purging the pages and someone else re-
> allocating the pages before you can grab mgr->lock? But we think that is
> unlikely, and in the worst case we just wedge the device if we error out here?

Yes, this is highly unlikely. And the return value is passed to the caller, i.e. the xe system controller driver; I believe it currently initiates a reset only when -EIO is returned. We could reset in this case as well, but I believe a number of different error codes are possible from gpu_buddy_alloc_blocks().

> 
> > +
> > +			if (ret) {
> > +				drm_warn(&xe->drm, "Could not reserve
> page at addr:0x%lx, ret:%d\n",
> > +					 addr, ret);
> > +				nentry->status = fail;
> > +				return ret;
> > +			}
> > +
> > +			list_for_each_entry_safe(b, m, &nentry->blocks, link)
> 
> No need for the _safe, since we don't modify the list.

Sure

> 
> > +				b->private = NULL;
> > +
> > +			if ((addr + size) <= vram_mgr->visible_size) {
> > +				nentry->used_visible_size = size;
> > +			} else {
> > +				list_for_each_entry(b, &nentry->blocks, link) {
> > +					u64 start =
> gpu_buddy_block_offset(b);
> > +
> > +					if (start < vram_mgr->visible_size) {
> > +						u64 end = start +
> gpu_buddy_block_size(mm, b);
> > +
> > +						nentry->used_visible_size +=
> > +							min(end, vram_mgr-
> >visible_size) - start;
> > +					}
> > +				}
> > +			}
> > +			vram_mgr->visible_avail -= nentry->used_visible_size;
> > +			list_for_each_entry_safe(pos, n, &vram_mgr-
> >queued_pages, queued_link) {
> > +				if (pos->id == nentry->id) {
> > +					--vram_mgr->n_queued_pages;
> > +				list_del(&pos->queued_link);
> > +				break;
> > +				}
> > +			}
> > +			list_add(&nentry->offlined_link, &vram_mgr-
> >offlined_pages);
> > +			/* TODO: FW Integration: Send command to FW for
> offlining page */
> > +			++vram_mgr->n_offlined_pages;
> > +			return ret;
> > +		}
> > +	} else {
> > +		scoped_guard(mutex, &vram_mgr->lock) {
> > +			ret = gpu_buddy_alloc_blocks(mm, addr, addr + size,
> > +						     size, size, &nentry->blocks,
> > +
> GPU_BUDDY_RANGE_ALLOCATION);
> > +			if (ret) {
> > +				drm_warn(&xe->drm, "Could not reserve
> page at addr:0x%lx, ret:%d\n",
> > +					 addr, ret);
> > +				nentry->status = fail;
> > +				return ret;
> > +			}
> > +
> > +			list_for_each_entry_safe(b, m, &nentry->blocks, link)
> 
> Same here.

Sure

> 
> > +				b->private = NULL;
> > +
> > +			if ((addr + size) <= vram_mgr->visible_size) {
> > +				nentry->used_visible_size = size;
> > +			} else {
> > +				struct gpu_buddy_block *block;
> > +
> > +				list_for_each_entry(block, &nentry->blocks,
> link) {
> > +					u64 start =
> gpu_buddy_block_offset(block);
> > +
> > +					if (start < vram_mgr->visible_size) {
> > +						u64 end = start +
> gpu_buddy_block_size(mm, block);
> > +
> > +						nentry->used_visible_size +=
> > +							min(end, vram_mgr-
> >visible_size) - start;
> > +					}
> > +				}
> > +			}
> 
> Could some of this be moved into a helper or refactored a bit? This and above
> look very similar, also this is not that different from what we already do for
> normal allocation flow?
> 
> Would it make sense to pull this out into something like:
> 
> int xe_ttm_vram_buddy_alloc(mgr, size, end, flags, priv, blocks,
> *visible_used)
> {
>      alloc_blocks(&blocks, ...);
> 
>      list_for_each_entry(b, blocks, link)
>         b->private = priv;
> 
>     if (end <= vram_mgr->visible_size)
>        *visibled_used = size;
>     else {
>        ....
>     }
> 
>     vram_mgr->visible_avail -= *visible_used;
>    ...
> }
> 
> void xe_ttm_vram_free(mgr, blocks, *visible_used) {
>     list_for_each_entry(b, blocks, link)
>         b->private = NULL;
>     gpu_buddy_free_list(mm, blocks, 0);
>     vram_mgr->visible_avail += *visible_used; }
> 
> And then convert mgr_del/new to use that, and then also re-use here? I think
> that would make this easier to follow, plus less duplication.

Sure, let me look into this.

> 
> > +			vram_mgr->visible_avail -= nentry->used_visible_size;
> > +			nentry->id = ++vram_mgr->n_offlined_pages;
> > +			list_add(&nentry->offlined_link, &vram_mgr-
> >offlined_pages);
> > +			/* TODO: FW Integration: Send command to FW for
> offlining page */
> > +		}
> > +	}
> > +	/* Success */
> > +	return ret;
> > +}
> > +
> > +static struct xe_vram_region *xe_ttm_vram_addr_to_region(struct
> xe_device *xe,
> > +							 resource_size_t addr)
> > +{
> > +	unsigned long stolen_base = xe_ttm_stolen_gpu_offset(xe);
> > +	struct xe_vram_region *vr;
> > +	struct xe_tile *tile;
> > +	int id;
> > +
> > +	/* Addr from stolen memory? */
> > +	if (addr + SZ_4K >= stolen_base)
> > +		return NULL;
> > +
> > +	for_each_tile(tile, xe, id) {
> > +		vr = tile->mem.vram;
> > +		if ((addr <= vr->dpa_base + vr->actual_physical_size) &&
> > +		    (addr + SZ_4K >= vr->dpa_base))
> > +			return vr;
> > +	}
> > +	return NULL;
> > +}
> > +
> > +/**
> > + * xe_ttm_vram_handle_addr_fault - Handle vram physical address error
> > +flaged
> > + * @xe: pointer to parent device
> > + * @addr: physical faulty address
> > + *
> > + * Handle the physcial faulty address error on specific tile.
> > + *
> > + * Returns 0 for success, negative error code otherwise.
> > + */
> > +int xe_ttm_vram_handle_addr_fault(struct xe_device *xe, unsigned long
> > +addr) {
> > +	struct xe_ttm_vram_mgr *vram_mgr;
> > +	struct xe_vram_region *vr;
> > +	struct gpu_buddy *mm;
> > +	int ret;
> > +
> > +	vr = xe_ttm_vram_addr_to_region(xe, addr);
> > +	if (!vr) {
> > +		drm_err(&xe->drm, "%s:%d addr:%lx error requesting SBR\n",
> > +			__func__, __LINE__, addr);
> > +		/* Hint System controller driver for reset with -EIO  */
> > +		return -EIO;
> > +	}
> > +	vram_mgr = &vr->ttm;
> > +	mm = &vram_mgr->mm;
> > +	/* Reserve page at address */
> > +	ret = xe_ttm_vram_reserve_page_at_addr(xe, addr, vram_mgr, mm);
> > +	return ret;
> 
> Nit: return xe_ttm_vram_reserve_page_at_addr()

Yep

Tejas
> 
> > +}
> > +EXPORT_SYMBOL(xe_ttm_vram_handle_addr_fault);
> > diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
> > b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
> > index 87b7fae5edba..8ef06d9d44f7 100644
> > --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
> > +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
> > @@ -31,6 +31,7 @@ u64 xe_ttm_vram_get_cpu_visible_size(struct
> ttm_resource_manager *man);
> >   void xe_ttm_vram_get_used(struct ttm_resource_manager *man,
> >   			  u64 *used, u64 *used_visible);
> >
> > +int xe_ttm_vram_handle_addr_fault(struct xe_device *xe, unsigned long
> > +addr);
> >   static inline struct xe_ttm_vram_mgr_resource *
> >   to_xe_ttm_vram_mgr_resource(struct ttm_resource *res)
> >   {
> > diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
> > b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
> > index 9106da056b49..3ad7966798eb 100644
> > --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
> > +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
> > @@ -19,6 +19,14 @@ struct xe_ttm_vram_mgr {
> >   	struct ttm_resource_manager manager;
> >   	/** @mm: DRM buddy allocator which manages the VRAM */
> >   	struct gpu_buddy mm;
> > +	/** @offlined_pages: List of offlined pages */
> > +	struct list_head offlined_pages;
> > +	/** @n_offlined_pages: Number of offlined pages */
> > +	u16 n_offlined_pages;
> > +	/** @queued_pages: List of queued pages */
> > +	struct list_head queued_pages;
> > +	/** @n_queued_pages: Number of queued pages */
> > +	u16 n_queued_pages;
> >   	/** @visible_size: Proped size of the CPU visible portion */
> >   	u64 visible_size;
> >   	/** @visible_avail: CPU visible portion still unallocated */ @@
> > -45,4 +53,24 @@ struct xe_ttm_vram_mgr_resource {
> >   	unsigned long flags;
> >   };
> >
> > +/**
> > + * struct xe_ttm_vram_offline_resource - Xe TTM VRAM offline
> > +resource  */ struct xe_ttm_vram_offline_resource {
> > +	/** @offlined_link: Link to offlined pages */
> > +	struct list_head offlined_link;
> > +	/** @queued_link: Link to queued pages */
> > +	struct list_head queued_link;
> > +	/** @blocks: list of DRM buddy blocks */
> > +	struct list_head blocks;
> > +	/** @used_visible_size: How many CPU visible bytes this resource is
> using */
> > +	u64 used_visible_size;
> > +	/** @id: The id of an offline resource */
> > +	u16 id;
> > +	/** @addr: Address of faulty memory location reported by HW */
> > +	unsigned long addr;
> > +	/** @status: reservation status of resource */
> > +	bool status;
> > +};
> > +
> >   #endif



end of thread, other threads:[~2026-05-04 10:52 UTC | newest]

Thread overview: 21+ messages
2026-04-13 13:16 [RFC PATCH V7 0/9] Add memory page offlining support Tejas Upadhyay
2026-04-13 13:16 ` [RFC PATCH V7 1/9] drm/xe: Link VRAM object with gpu buddy Tejas Upadhyay
2026-04-30  3:50   ` Matthew Brost
2026-04-13 13:16 ` [RFC PATCH V7 2/9] drm/gpu: Add gpu_buddy_addr_to_block helper Tejas Upadhyay
2026-04-13 13:28   ` Matthew Auld
2026-04-13 17:30   ` Matthew Auld
2026-04-14  5:36     ` Upadhyay, Tejas
2026-04-13 13:16 ` [RFC PATCH V7 3/9] drm/xe: Link LRC BO and its execution Queue Tejas Upadhyay
2026-04-13 13:16 ` [RFC PATCH V7 4/9] drm/xe: Extend BO purge to handle vram pages as well Tejas Upadhyay
2026-04-13 13:16 ` [RFC PATCH V7 5/9] drm/xe: Handle physical memory address error Tejas Upadhyay
2026-04-30 11:28   ` Matthew Auld
2026-05-04 10:52     ` Upadhyay, Tejas
2026-04-13 13:16 ` [RFC PATCH V7 6/9] drm/xe/cri: Add debugfs to inject faulty vram address Tejas Upadhyay
2026-04-13 13:16 ` [RFC PATCH V7 7/9] gpu/buddy: Add routine to dump allocated buddy blocks Tejas Upadhyay
2026-04-13 13:16 ` [RFC PATCH V7 8/9] drm/xe/configfs: Add vram bad page reservation policy Tejas Upadhyay
2026-04-13 13:16 ` [RFC PATCH V7 9/9] drm/xe/cri: Add sysfs interface for bad gpu vram pages Tejas Upadhyay
2026-04-13 16:36 ` ✗ CI.checkpatch: warning for Add memory page offlining support (rev7) Patchwork
2026-04-13 16:37 ` ✓ CI.KUnit: success " Patchwork
2026-04-13 17:43 ` ✓ Xe.CI.BAT: " Patchwork
2026-04-13 20:12 ` ✗ Xe.CI.FULL: failure " Patchwork
2026-04-15 15:10 ` [RFC PATCH V7 0/9] Add memory page offlining support Upadhyay, Tejas
