* [PATCH 00/11] Page Reclamation Support for Xe3p Platforms
@ 2025-11-18 9:05 Brian Nguyen
2025-11-18 9:05 ` [PATCH 01/11] [DO NOT REVIEW] drm/xe: Do not forward invalid TLB invalidation seqnos to upper layers Brian Nguyen
` (13 more replies)
0 siblings, 14 replies; 51+ messages in thread
From: Brian Nguyen @ 2025-11-18 9:05 UTC (permalink / raw)
To: intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers,
Brian Nguyen
This series introduces hardware-assisted page reclamation support on Xe3p
platforms, integrating with the KMD's existing TLB invalidation workflow and
adding the ability to perform selective Private Physical Cache (PPC) flushing
rather than always forcing the default full PPC flush.
Currently, as of Xe2, Xe TLB invalidations trigger a full Private Physical
Cache flush to guarantee correctness for non-coherent memory. New HW (Xe3p
and beyond) exposes a page reclamation feature, which we selectively enable
on platforms with a flag in device info.
The driver can provide a “Page Reclaim List” (PRL) that tracks the physical
pages corresponding to an unmap/unbind operation, letting hardware perform
selective cache-line eviction. If reclamation succeeds, we skip the full PPC
flush entirely; otherwise we fall back to the current behavior of a full PPC
flush with the TLB invalidation.
This series partially depends on the "Context based TLB invalidations"
patch series by Matthew Brost, in particular the "drm/xe: Do not forward
invalid TLB invalidation seqnos to upper layers" patch.
Context based TLB invalidations Patch Series:
https://patchwork.freedesktop.org/series/156874/
Thanks,
Brian
Brian Nguyen (9):
drm/xe: Reset tlb fence timeout on invalid seqno received
drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush
drm/xe/guc: Add page reclamation interface to GuC
drm/xe: Create page reclaim list on unbind
drm/xe: Suballocate BO for page reclaim
drm/xe: Prep page reclaim in tlb inval job
drm/xe: Append page reclamation action to tlb inval
drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim
drm/xe: Add debugfs support for page reclamation
Matthew Brost (1):
[DO NOT REVIEW] drm/xe: Do not forward invalid TLB invalidation seqnos
to upper layers
Oak Zeng (1):
drm/xe: Add page reclamation info to device info
drivers/gpu/drm/xe/Makefile | 1 +
drivers/gpu/drm/xe/abi/guc_actions_abi.h | 2 +
drivers/gpu/drm/xe/regs/xe_gt_regs.h | 11 ++
drivers/gpu/drm/xe/regs/xe_gtt_defs.h | 1 +
drivers/gpu/drm/xe/xe_configfs.c | 11 +-
drivers/gpu/drm/xe/xe_debugfs.c | 47 +++++++++
drivers/gpu/drm/xe/xe_device.c | 10 ++
drivers/gpu/drm/xe/xe_device.h | 2 +
drivers/gpu/drm/xe/xe_device_types.h | 9 ++
drivers/gpu/drm/xe/xe_guc_ct.c | 4 +
drivers/gpu/drm/xe/xe_guc_fwif.h | 1 +
drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 41 ++++++--
drivers/gpu/drm/xe/xe_page_reclaim.c | 128 +++++++++++++++++++++++
drivers/gpu/drm/xe/xe_page_reclaim.h | 56 ++++++++++
drivers/gpu/drm/xe/xe_pat.c | 9 +-
drivers/gpu/drm/xe/xe_pci.c | 1 +
drivers/gpu/drm/xe/xe_pci_types.h | 1 +
drivers/gpu/drm/xe/xe_pt.c | 116 ++++++++++++++++++++
drivers/gpu/drm/xe/xe_pt_types.h | 5 +
drivers/gpu/drm/xe/xe_tile.c | 5 +
drivers/gpu/drm/xe/xe_tlb_inval.c | 68 +++++++++++-
drivers/gpu/drm/xe/xe_tlb_inval.h | 9 +-
drivers/gpu/drm/xe/xe_tlb_inval_job.c | 31 +++++-
drivers/gpu/drm/xe/xe_tlb_inval_job.h | 4 +
drivers/gpu/drm/xe/xe_tlb_inval_types.h | 12 ++-
drivers/gpu/drm/xe/xe_vm.c | 4 +-
26 files changed, 553 insertions(+), 36 deletions(-)
create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.c
create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.h
--
2.51.2
^ permalink raw reply [flat|nested] 51+ messages in thread
* [PATCH 01/11] [DO NOT REVIEW] drm/xe: Do not forward invalid TLB invalidation seqnos to upper layers
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
@ 2025-11-18 9:05 ` Brian Nguyen
2025-11-18 9:05 ` [PATCH 02/11] drm/xe: Reset tlb fence timeout on invalid seqno received Brian Nguyen
` (12 subsequent siblings)
13 siblings, 0 replies; 51+ messages in thread
From: Brian Nguyen @ 2025-11-18 9:05 UTC (permalink / raw)
To: intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers,
Brian Nguyen
From: Matthew Brost <matthew.brost@intel.com>
This is a dependency of the page reclamation patch series, taken from
another patch series by Matthew Brost that is currently under review:
https://patchwork.freedesktop.org/series/156874/
Page reclamation reuses the same idea of an invalid seqno to indicate
the initial H2G actions of an in-progress TLB invalidation.
Review and comments for this patch should go to the original patch
series.
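The sentinel trick in this patch can be modeled in isolation: seqnos always
stay below TLB_INVALIDATION_SEQNO_MAX, so MAX itself is free to mark the
intermediate G2H messages of a multi-step invalidation. A userspace sketch
of the filtered done handler (last_forwarded stands in for the upper layers):

```c
#include <assert.h>
#include <stdint.h>

#define TLB_INVALIDATION_SEQNO_MAX	0x100000
#define TLB_INVALIDATION_SEQNO_INVALID	TLB_INVALIDATION_SEQNO_MAX

static int last_forwarded = -1;	/* what the upper layers last saw */

static void done_handler(uint32_t seqno)
{
	/* Only real seqnos are forwarded; the sentinel is swallowed. */
	if (seqno != TLB_INVALIDATION_SEQNO_INVALID)
		last_forwarded = (int)seqno;
}
```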
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
---
drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 3 ++-
drivers/gpu/drm/xe/xe_tlb_inval_types.h | 1 +
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
index a80175c7c478..f1fd2dd90742 100644
--- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
+++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
@@ -236,7 +236,8 @@ int xe_guc_tlb_inval_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
if (unlikely(len != 1))
return -EPROTO;
- xe_tlb_inval_done_handler(&gt->tlb_inval, msg[0]);
+ if (msg[0] != TLB_INVALIDATION_SEQNO_INVALID)
+ xe_tlb_inval_done_handler(&gt->tlb_inval, msg[0]);
return 0;
}
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_types.h b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
index 8f8b060e9005..7a6967ce3b76 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
+++ b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
@@ -80,6 +80,7 @@ struct xe_tlb_inval {
const struct xe_tlb_inval_ops *ops;
/** @tlb_inval.seqno: TLB invalidation seqno, protected by CT lock */
#define TLB_INVALIDATION_SEQNO_MAX 0x100000
+#define TLB_INVALIDATION_SEQNO_INVALID TLB_INVALIDATION_SEQNO_MAX
int seqno;
/** @tlb_invalidation.seqno_lock: protects @tlb_invalidation.seqno */
struct mutex seqno_lock;
--
2.51.2
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH 02/11] drm/xe: Reset tlb fence timeout on invalid seqno received
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
2025-11-18 9:05 ` [PATCH 01/11] [DO NOT REVIEW] drm/xe: Do not forward invalid TLB invalidation seqnos to upper layers Brian Nguyen
@ 2025-11-18 9:05 ` Brian Nguyen
2025-11-21 17:23 ` Lin, Shuicheng
2025-11-22 18:25 ` Matthew Brost
2025-11-18 9:05 ` [PATCH 03/11] drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush Brian Nguyen
` (11 subsequent siblings)
13 siblings, 2 replies; 51+ messages in thread
From: Brian Nguyen @ 2025-11-18 9:05 UTC (permalink / raw)
To: intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers,
Brian Nguyen
TLB_INVALIDATION_SEQNO_INVALID is now used to indicate an in-progress
multi-step TLB invalidation, so reset the TDR to ensure it won't
prematurely trigger while G2H actions are still ongoing.
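The TDR reset semantics can be sketched with plain integer timestamps
instead of delayed work (purely illustrative; the driver uses
mod_delayed_work() under a spinlock, as the diff below shows):

```c
#include <assert.h>

/*
 * Each invalid-seqno G2H pushes the fence timeout deadline out by a
 * full timeout period, so the TDR cannot fire while intermediate G2H
 * actions of a multi-step invalidation are still arriving.
 */
#define TDR_TIMEOUT 100

static long deadline;

static void reset_timeout(long now)
{
	deadline = now + TDR_TIMEOUT;	/* models mod_delayed_work() */
}

static int tdr_fires(long now)
{
	return now >= deadline;
}
```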
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
---
drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 2 ++
drivers/gpu/drm/xe/xe_tlb_inval.c | 16 ++++++++++++++++
drivers/gpu/drm/xe/xe_tlb_inval.h | 1 +
3 files changed, 19 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
index f1fd2dd90742..cd126c53faab 100644
--- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
+++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
@@ -238,6 +238,8 @@ int xe_guc_tlb_inval_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
if (msg[0] != TLB_INVALIDATION_SEQNO_INVALID)
xe_tlb_inval_done_handler(&gt->tlb_inval, msg[0]);
+ else
+ xe_tlb_inval_reset_timeout(&gt->tlb_inval);
return 0;
}
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
index 918a59e686ea..50f05d6b5672 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval.c
+++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
@@ -199,6 +199,22 @@ void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval)
mutex_unlock(&tlb_inval->seqno_lock);
}
+/**
+ * xe_tlb_inval_reset_timeout() - Reset TLB inval fence timeout
+ * @tlb_inval: TLB invalidation client
+ *
+ * Reset the TLB invalidation timeout timer.
+ */
+void xe_tlb_inval_reset_timeout(struct xe_tlb_inval *tlb_inval)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&tlb_inval->pending_lock, flags);
+ mod_delayed_work(system_wq, &tlb_inval->fence_tdr,
+ tlb_inval->ops->timeout_delay(tlb_inval));
+ spin_unlock_irqrestore(&tlb_inval->pending_lock, flags);
+}
+
static bool xe_tlb_inval_seqno_past(struct xe_tlb_inval *tlb_inval, int seqno)
{
int seqno_recv = READ_ONCE(tlb_inval->seqno_recv);
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.h b/drivers/gpu/drm/xe/xe_tlb_inval.h
index 05614915463a..9dbddc310eb9 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval.h
+++ b/drivers/gpu/drm/xe/xe_tlb_inval.h
@@ -17,6 +17,7 @@ struct xe_vm;
int xe_gt_tlb_inval_init_early(struct xe_gt *gt);
void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval);
+void xe_tlb_inval_reset_timeout(struct xe_tlb_inval *tlb_inval);
int xe_tlb_inval_all(struct xe_tlb_inval *tlb_inval,
struct xe_tlb_inval_fence *fence);
int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval);
--
2.51.2
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH 03/11] drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
2025-11-18 9:05 ` [PATCH 01/11] [DO, NOT, REVIEW] drm/xe: Do not forward invalid TLB invalidation seqnos to upper layers Brian Nguyen
2025-11-18 9:05 ` [PATCH 02/11] drm/xe: Reset tlb fence timeout on invalid seqno received Brian Nguyen
@ 2025-11-18 9:05 ` Brian Nguyen
2025-11-21 18:02 ` Lin, Shuicheng
2025-11-22 19:32 ` Matthew Brost
2025-11-18 9:05 ` [PATCH 04/11] drm/xe: Add page reclamation info to device info Brian Nguyen
` (10 subsequent siblings)
13 siblings, 2 replies; 51+ messages in thread
From: Brian Nguyen @ 2025-11-18 9:05 UTC (permalink / raw)
To: intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers,
Brian Nguyen
Allow TLB invalidation to configure whether the driver flushes the
Private Physical Cache (PPC) as part of the TLB invalidation process.
The default behavior is still to always flush the PPC, but the driver
now has the option to disable it.
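The MAKE_INVAL_OP_FLUSH() change below simply makes the flush-cache bit
conditional. A sketch of the encoding with placeholder shift/flag values
(the real ones live in the GuC ABI headers):

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values for illustration only. */
#define TLB_INVAL_TYPE_SHIFT	0
#define TLB_INVAL_MODE_SHIFT	8
#define TLB_INVAL_MODE_HEAVY	1
#define TLB_INVAL_FLUSH_CACHE	(1u << 31)

/* Type and mode are encoded as before; only the flush bit is optional. */
static uint32_t make_inval_op_flush(uint32_t type, int flush_cache)
{
	return (type << TLB_INVAL_TYPE_SHIFT) |
	       (TLB_INVAL_MODE_HEAVY << TLB_INVAL_MODE_SHIFT) |
	       (flush_cache ? TLB_INVAL_FLUSH_CACHE : 0);
}
```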
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
---
drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 11 +++++++----
drivers/gpu/drm/xe/xe_tlb_inval.c | 21 ++++++++++++++++++---
drivers/gpu/drm/xe/xe_tlb_inval.h | 5 +++--
drivers/gpu/drm/xe/xe_tlb_inval_job.c | 2 +-
drivers/gpu/drm/xe/xe_tlb_inval_types.h | 5 ++++-
drivers/gpu/drm/xe/xe_vm.c | 4 ++--
6 files changed, 35 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
index cd126c53faab..c05709a5bc98 100644
--- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
+++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
@@ -34,9 +34,12 @@ static int send_tlb_inval(struct xe_guc *guc, const u32 *action, int len)
G2H_LEN_DW_TLB_INVALIDATE, 1);
}
-#define MAKE_INVAL_OP(type) ((type << XE_GUC_TLB_INVAL_TYPE_SHIFT) | \
+#define MAKE_INVAL_OP_FLUSH(type, flush_cache) ((type << XE_GUC_TLB_INVAL_TYPE_SHIFT) | \
XE_GUC_TLB_INVAL_MODE_HEAVY << XE_GUC_TLB_INVAL_MODE_SHIFT | \
- XE_GUC_TLB_INVAL_FLUSH_CACHE)
+ (flush_cache ? \
+ XE_GUC_TLB_INVAL_FLUSH_CACHE : 0))
+
+#define MAKE_INVAL_OP(type) MAKE_INVAL_OP_FLUSH(type, true)
static int send_tlb_inval_all(struct xe_tlb_inval *tlb_inval, u32 seqno)
{
@@ -100,7 +103,7 @@ static int send_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval, u32 seqno)
#define MAX_RANGE_TLB_INVALIDATION_LENGTH (rounddown_pow_of_two(ULONG_MAX))
static int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
- u64 start, u64 end, u32 asid)
+ u64 start, u64 end, u32 asid, bool flush_cache)
{
#define MAX_TLB_INVALIDATION_LEN 7
struct xe_guc *guc = tlb_inval->private;
@@ -154,7 +157,7 @@ static int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
ilog2(SZ_2M) + 1)));
xe_gt_assert(gt, IS_ALIGNED(start, length));
- action[len++] = MAKE_INVAL_OP(XE_GUC_TLB_INVAL_PAGE_SELECTIVE);
+ action[len++] = MAKE_INVAL_OP_FLUSH(XE_GUC_TLB_INVAL_PAGE_SELECTIVE, flush_cache);
action[len++] = asid;
action[len++] = lower_32_bits(start);
action[len++] = upper_32_bits(start);
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
index 50f05d6b5672..de275759743c 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval.c
+++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
@@ -324,10 +324,10 @@ int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval)
*/
int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
struct xe_tlb_inval_fence *fence, u64 start, u64 end,
- u32 asid)
+ u32 asid, bool flush_cache)
{
return xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->ppgtt,
- start, end, asid);
+ start, end, asid, flush_cache);
}
/**
@@ -343,7 +343,7 @@ void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval, struct xe_vm *vm)
u64 range = 1ull << vm->xe->info.va_bits;
xe_tlb_inval_fence_init(tlb_inval, &fence, true);
- xe_tlb_inval_range(tlb_inval, &fence, 0, range, vm->usm.asid);
+ xe_tlb_inval_range(tlb_inval, &fence, 0, range, vm->usm.asid, true);
xe_tlb_inval_fence_wait(&fence);
}
@@ -420,6 +420,20 @@ static const struct dma_fence_ops inval_fence_ops = {
.get_timeline_name = xe_inval_fence_get_timeline_name,
};
+/**
+ * xe_tlb_inval_fence_flush_cache() - Control PPC flush at invalidation
+ * @fence: TLB inval fence
+ * @flush_cache: whether to perform PPC cache flush
+ *
+ * Helper function to modify the tlb_inval fence to control the PPC flush.
+ * Other components shouldn't modify the fence directly.
+ */
+void xe_tlb_inval_fence_flush_cache(struct xe_tlb_inval_fence *fence,
+ bool flush_cache)
+{
+ fence->flush_cache = flush_cache;
+}
+
/**
* xe_tlb_inval_fence_init() - Initialize TLB invalidation fence
* @tlb_inval: TLB invalidation client
@@ -446,4 +460,5 @@ void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
else
dma_fence_get(&fence->base);
fence->tlb_inval = tlb_inval;
+ fence->flush_cache = true;
}
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.h b/drivers/gpu/drm/xe/xe_tlb_inval.h
index 9dbddc310eb9..b84ce3e6f294 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval.h
+++ b/drivers/gpu/drm/xe/xe_tlb_inval.h
@@ -24,8 +24,9 @@ int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval);
void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval, struct xe_vm *vm);
int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
struct xe_tlb_inval_fence *fence,
- u64 start, u64 end, u32 asid);
-
+ u64 start, u64 end, u32 asid, bool flush_cache);
+void xe_tlb_inval_fence_flush_cache(struct xe_tlb_inval_fence *fence,
+ bool flush_cache);
void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
struct xe_tlb_inval_fence *fence,
bool stack);
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_job.c b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
index 1ae0dec2cf31..6248f90323a9 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval_job.c
+++ b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
@@ -49,7 +49,7 @@ static struct dma_fence *xe_tlb_inval_job_run(struct xe_dep_job *dep_job)
container_of(job->fence, typeof(*ifence), base);
xe_tlb_inval_range(job->tlb_inval, ifence, job->start,
- job->end, job->vm->usm.asid);
+ job->end, job->vm->usm.asid, ifence->flush_cache);
return job->fence;
}
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_types.h b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
index 7a6967ce3b76..c3c3943fb07e 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
+++ b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
@@ -40,12 +40,13 @@ struct xe_tlb_inval_ops {
* @start: Start address
* @end: End address
* @asid: Address space ID
+ * @flush_cache: PPC flush control
*
* Return 0 on success, -ECANCELED if backend is mid-reset, error on
* failure
*/
int (*ppgtt)(struct xe_tlb_inval *tlb_inval, u32 seqno, u64 start,
- u64 end, u32 asid);
+ u64 end, u32 asid, bool flush_cache);
/**
* @initialized: Backend is initialized
@@ -126,6 +127,8 @@ struct xe_tlb_inval_fence {
int seqno;
/** @inval_time: time of TLB invalidation */
ktime_t inval_time;
+ /** @flush_cache: bool for PPC flush, default is true */
+ bool flush_cache;
};
#endif
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 7cac646bdf1c..5fb5226574c5 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -3907,7 +3907,7 @@ int xe_vm_range_tilemask_tlb_inval(struct xe_vm *vm, u64 start,
err = xe_tlb_inval_range(&tile->primary_gt->tlb_inval,
&fence[fence_id], start, end,
- vm->usm.asid);
+ vm->usm.asid, true);
if (err)
goto wait;
++fence_id;
@@ -3920,7 +3920,7 @@ int xe_vm_range_tilemask_tlb_inval(struct xe_vm *vm, u64 start,
err = xe_tlb_inval_range(&tile->media_gt->tlb_inval,
&fence[fence_id], start, end,
- vm->usm.asid);
+ vm->usm.asid, true);
if (err)
goto wait;
++fence_id;
--
2.51.2
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH 04/11] drm/xe: Add page reclamation info to device info
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
` (2 preceding siblings ...)
2025-11-18 9:05 ` [PATCH 03/11] drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush Brian Nguyen
@ 2025-11-18 9:05 ` Brian Nguyen
2025-11-21 18:15 ` Lin, Shuicheng
2025-11-22 18:31 ` Matthew Brost
2025-11-18 9:05 ` [PATCH 05/11] drm/xe/guc: Add page reclamation interface to GuC Brian Nguyen
` (9 subsequent siblings)
13 siblings, 2 replies; 51+ messages in thread
From: Brian Nguyen @ 2025-11-18 9:05 UTC (permalink / raw)
To: intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers,
Oak Zeng, Brian Nguyen
From: Oak Zeng <oak.zeng@intel.com>
Starting from Xe3p, HW adds a feature assisting range based page
reclamation. Introduce a bit in device info to indicate whether
device has such capability.
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
---
drivers/gpu/drm/xe/xe_device_types.h | 2 ++
drivers/gpu/drm/xe/xe_pci.c | 1 +
drivers/gpu/drm/xe/xe_pci_types.h | 1 +
3 files changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 0b2fa7c56d38..268c8e28601a 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -308,6 +308,8 @@ struct xe_device {
u8 has_mbx_power_limits:1;
/** @info.has_mem_copy_instr: Device supports MEM_COPY instruction */
u8 has_mem_copy_instr:1;
+ /** @info.has_page_reclaim_hw_assist: Device supports page reclamation feature */
+ u8 has_page_reclaim_hw_assist:1;
/** @info.has_pxp: Device has PXP support */
u8 has_pxp:1;
/** @info.has_range_tlb_inval: Has range based TLB invalidations */
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index cd03b4b3ebdb..43c47426313e 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -673,6 +673,7 @@ static int xe_info_init_early(struct xe_device *xe,
xe->info.has_heci_cscfi = desc->has_heci_cscfi;
xe->info.has_late_bind = desc->has_late_bind;
xe->info.has_llc = desc->has_llc;
+ xe->info.has_page_reclaim_hw_assist = desc->has_page_reclaim_hw_assist;
xe->info.has_pxp = desc->has_pxp;
xe->info.has_sriov = xe_configfs_primary_gt_allowed(to_pci_dev(xe->drm.dev)) &&
desc->has_sriov;
diff --git a/drivers/gpu/drm/xe/xe_pci_types.h b/drivers/gpu/drm/xe/xe_pci_types.h
index 9892c063a9c5..151743d4cf72 100644
--- a/drivers/gpu/drm/xe/xe_pci_types.h
+++ b/drivers/gpu/drm/xe/xe_pci_types.h
@@ -47,6 +47,7 @@ struct xe_device_desc {
u8 has_llc:1;
u8 has_mbx_power_limits:1;
u8 has_mem_copy_instr:1;
+ u8 has_page_reclaim_hw_assist:1;
u8 has_pxp:1;
u8 has_sriov:1;
u8 needs_scratch:1;
--
2.51.2
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH 05/11] drm/xe/guc: Add page reclamation interface to GuC
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
` (3 preceding siblings ...)
2025-11-18 9:05 ` [PATCH 04/11] drm/xe: Add page reclamation info to device info Brian Nguyen
@ 2025-11-18 9:05 ` Brian Nguyen
2025-11-21 18:32 ` Lin, Shuicheng
2025-11-18 9:05 ` [PATCH 06/11] drm/xe: Create page reclaim list on unbind Brian Nguyen
` (8 subsequent siblings)
13 siblings, 1 reply; 51+ messages in thread
From: Brian Nguyen @ 2025-11-18 9:05 UTC (permalink / raw)
To: intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers,
Brian Nguyen
Add page reclamation related changes to the GuC interface, handlers,
and senders.
Currently, TLB invalidations perform a full PPC flush in order to
prevent stale memory access for non-coherent system memory. Page
reclamation is an extension of the typical TLB invalidation workflow,
allowing the full PPC flush to be disabled in favor of selective PPC
flushing. Selective flushing is driven by a list of pages whose
addresses are passed to the GuC at the time of the action.
The page reclamation interface requires at least GuC firmware version
70.31.0.
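The new H2G message built by send_page_reclaim() in the diff below is four
dwords: the 0x7003 action id, the invalidation seqno, then the PRL GPU
address split into low/high dwords. A userspace sketch of that layout:

```c
#include <assert.h>
#include <stdint.h>

#define GUC_ACTION_PAGE_RECLAMATION 0x7003

static void build_page_reclaim_action(uint32_t *action, uint32_t seqno,
				      uint64_t gpu_addr)
{
	action[0] = GUC_ACTION_PAGE_RECLAMATION;
	action[1] = seqno;
	action[2] = (uint32_t)gpu_addr;		/* lower_32_bits() */
	action[3] = (uint32_t)(gpu_addr >> 32);	/* upper_32_bits() */
}
```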
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
---
drivers/gpu/drm/xe/abi/guc_actions_abi.h | 2 ++
drivers/gpu/drm/xe/xe_guc_ct.c | 4 ++++
drivers/gpu/drm/xe/xe_guc_fwif.h | 1 +
drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 14 ++++++++++++++
4 files changed, 21 insertions(+)
diff --git a/drivers/gpu/drm/xe/abi/guc_actions_abi.h b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
index 47756e4674a1..11de3bdf69b5 100644
--- a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
@@ -151,6 +151,8 @@ enum xe_guc_action {
XE_GUC_ACTION_TLB_INVALIDATION = 0x7000,
XE_GUC_ACTION_TLB_INVALIDATION_DONE = 0x7001,
XE_GUC_ACTION_TLB_INVALIDATION_ALL = 0x7002,
+ XE_GUC_ACTION_PAGE_RECLAMATION = 0x7003,
+ XE_GUC_ACTION_PAGE_RECLAMATION_DONE = 0x7004,
XE_GUC_ACTION_STATE_CAPTURE_NOTIFICATION = 0x8002,
XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE = 0x8003,
XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED = 0x8004,
diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
index 2697d711adb2..e13704e61032 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct.c
+++ b/drivers/gpu/drm/xe/xe_guc_ct.c
@@ -1311,6 +1311,7 @@ static int parse_g2h_event(struct xe_guc_ct *ct, u32 *msg, u32 len)
case XE_GUC_ACTION_DEREGISTER_CONTEXT_DONE:
case XE_GUC_ACTION_SCHED_ENGINE_MODE_DONE:
case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
+ case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
g2h_release_space(ct, len);
}
@@ -1546,6 +1547,7 @@ static int process_g2h_msg(struct xe_guc_ct *ct, u32 *msg, u32 len)
ret = xe_guc_pagefault_handler(guc, payload, adj_len);
break;
case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
+ case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
break;
case XE_GUC_ACTION_GUC2PF_RELAY_FROM_VF:
@@ -1711,6 +1713,7 @@ static int g2h_read(struct xe_guc_ct *ct, u32 *msg, bool fast_path)
switch (action) {
case XE_GUC_ACTION_REPORT_PAGE_FAULT_REQ_DESC:
case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
+ case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
break; /* Process these in fast-path */
default:
return 0;
@@ -1747,6 +1750,7 @@ static void g2h_fast_path(struct xe_guc_ct *ct, u32 *msg, u32 len)
ret = xe_guc_pagefault_handler(guc, payload, adj_len);
break;
case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
+ case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
__g2h_release_space(ct, len);
ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
break;
diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
index c90dd266e9cf..34d74a71c4f0 100644
--- a/drivers/gpu/drm/xe/xe_guc_fwif.h
+++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
@@ -16,6 +16,7 @@
#define G2H_LEN_DW_DEREGISTER_CONTEXT 3
#define G2H_LEN_DW_TLB_INVALIDATE 3
#define G2H_LEN_DW_G2G_NOTIFY_MIN 3
+#define G2H_LEN_DW_PAGE_RECLAMATION 3
#define GUC_ID_MAX 65535
#define GUC_ID_UNKNOWN 0xffffffff
diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
index c05709a5bc98..3185f8dc00c4 100644
--- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
+++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
@@ -95,6 +95,20 @@ static int send_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval, u32 seqno)
return -ECANCELED;
}
+static int send_page_reclaim(struct xe_guc *guc, u32 seqno,
+ u64 gpu_addr)
+{
+ u32 action[] = {
+ XE_GUC_ACTION_PAGE_RECLAMATION,
+ seqno,
+ lower_32_bits(gpu_addr),
+ upper_32_bits(gpu_addr),
+ };
+
+ return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action),
+ G2H_LEN_DW_PAGE_RECLAMATION, 1);
+}
+
/*
* Ensure that roundup_pow_of_two(length) doesn't overflow.
* Note that roundup_pow_of_two() operates on unsigned long,
--
2.51.2
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH 06/11] drm/xe: Create page reclaim list on unbind
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
` (4 preceding siblings ...)
2025-11-18 9:05 ` [PATCH 05/11] drm/xe/guc: Add page reclamation interface to GuC Brian Nguyen
@ 2025-11-18 9:05 ` Brian Nguyen
2025-11-21 21:29 ` Lin, Shuicheng
2025-11-22 19:18 ` Matthew Brost
2025-11-18 9:05 ` [PATCH 07/11] drm/xe: Suballocate BO for page reclaim Brian Nguyen
` (7 subsequent siblings)
13 siblings, 2 replies; 51+ messages in thread
From: Brian Nguyen @ 2025-11-18 9:05 UTC (permalink / raw)
To: intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers,
Brian Nguyen
The page reclaim list (PRL) is preparation work for the page reclaim
feature. The PRL is initially owned by pt_update_ops, and all other page
reclaim operations point back to it. PRL entries are generated during
the unbind page walk.
The PRL is restricted to a single 4K page, i.e. at most 512 entries.
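The 512-entry limit follows from the entry format in the diff below: valid
bit, size encoding and a 40-bit physical address (split 20/20) pack into
64 bits, so one 4K page holds exactly 512 entries. A sketch mirroring the
declaration (bitfield packing is compiler/ABI dependent):

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors xe_guc_page_reclaim_entry from the patch, for illustration. */
struct page_reclaim_entry {
	uint32_t valid:1;
	uint32_t reclamation_size:6;	/* page size = 2^(size + 12) */
	uint32_t reserved:5;
	uint32_t address_lo:20;		/* first 32-bit unit: 1+6+5+20 */
	uint32_t address_hi:20;		/* second 32-bit unit: 20+12 */
	uint32_t reserved1:12;
};
```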
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
---
drivers/gpu/drm/xe/Makefile | 1 +
drivers/gpu/drm/xe/regs/xe_gtt_defs.h | 1 +
drivers/gpu/drm/xe/xe_page_reclaim.c | 52 ++++++++++++
drivers/gpu/drm/xe/xe_page_reclaim.h | 49 ++++++++++++
drivers/gpu/drm/xe/xe_pt.c | 109 ++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_pt_types.h | 5 ++
6 files changed, 217 insertions(+)
create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.c
create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.h
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index e4b273b025d2..048e6c93271c 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -95,6 +95,7 @@ xe-y += xe_bb.o \
xe_oa.o \
xe_observation.o \
xe_pagefault.o \
+ xe_page_reclaim.o \
xe_pat.o \
xe_pci.o \
xe_pcode.o \
diff --git a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
index 4389e5a76f89..4d83461e538b 100644
--- a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
+++ b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
@@ -9,6 +9,7 @@
#define XELPG_GGTT_PTE_PAT0 BIT_ULL(52)
#define XELPG_GGTT_PTE_PAT1 BIT_ULL(53)
+#define XE_PTE_ADDR_MASK GENMASK_ULL(51, 12)
#define GGTT_PTE_VFID GENMASK_ULL(11, 2)
#define GUC_GGTT_TOP 0xFEE00000
diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c b/drivers/gpu/drm/xe/xe_page_reclaim.c
new file mode 100644
index 000000000000..a0d15efff58c
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
@@ -0,0 +1,52 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include <linux/bitfield.h>
+#include <linux/kref.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+
+#include "xe_page_reclaim.h"
+
+#include "regs/xe_gt_regs.h"
+#include "xe_assert.h"
+#include "xe_macros.h"
+
+/**
+ * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
+ * @prl: Page reclaim list to reset
+ *
+ * Clears the entries pointer and marks the list as invalid so
+ * future users know the PRL is unusable. It is expected that the
+ * entries have already been released.
+ */
+void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl)
+{
+ prl->entries = NULL;
+ prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
+}
+
+/**
+ * xe_page_reclaim_list_alloc_entries() - Allocate page reclaim list entries
+ * @prl: Page reclaim list to allocate entries for
+ *
+ * Allocate one zeroed 4K page for the PRL entries; return 0 on success, -ENOMEM on failure.
+ */
+int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl)
+{
+ struct page *page;
+
+ XE_WARN_ON(prl->entries != NULL);
+ if (prl->entries)
+ return 0;
+
+ page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+ if (page) {
+ prl->entries = page_address(page);
+ prl->num_entries = 0;
+ }
+
+ return page ? 0 : -ENOMEM;
+}
diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h b/drivers/gpu/drm/xe/xe_page_reclaim.h
new file mode 100644
index 000000000000..d066d7d97f79
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_PAGE_RECLAIM_H_
+#define _XE_PAGE_RECLAIM_H_
+
+#include <linux/kref.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/workqueue.h>
+
+#define XE_PAGE_RECLAIM_MAX_ENTRIES 512
+#define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
+
+struct xe_guc_page_reclaim_entry {
+ u32 valid:1;
+ u32 reclamation_size:6;
+ u32 reserved:5;
+ u32 address_lo:20;
+ u32 address_hi:20;
+ u32 reserved1:12;
+} __packed;
+
+struct xe_page_reclaim_list {
+ /** @entries: array of page reclaim entries, page allocated */
+ struct xe_guc_page_reclaim_entry *entries;
+ /** @num_entries: number of entries */
+ int num_entries;
+#define XE_PAGE_RECLAIM_INVALID_LIST -1
+};
+
+void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
+int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
+static inline void xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries)
+{
+ if (entries)
+ get_page(virt_to_page(entries));
+}
+
+static inline void xe_page_reclaim_entries_put(struct xe_guc_page_reclaim_entry *entries)
+{
+ if (entries)
+ put_page(virt_to_page(entries));
+}
+
+#endif /* _XE_PAGE_RECLAIM_H_ */
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index 884127b4d97d..532a047676d4 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -12,6 +12,7 @@
#include "xe_exec_queue.h"
#include "xe_gt.h"
#include "xe_migrate.h"
+#include "xe_page_reclaim.h"
#include "xe_pt_types.h"
#include "xe_pt_walk.h"
#include "xe_res_cursor.h"
@@ -1538,6 +1539,9 @@ struct xe_pt_stage_unbind_walk {
/* Output */
/* @wupd: Structure to track the page-table updates we're building */
struct xe_walk_update wupd;
+
+ /** @prl: Backing pointer to page reclaim list in pt_update_ops */
+ struct xe_page_reclaim_list *prl;
};
/*
@@ -1572,6 +1576,69 @@ static bool xe_pt_check_kill(u64 addr, u64 next, unsigned int level,
return false;
}
+/* Huge 2MB leaf lives directly in a level-1 table and has no children */
+static bool is_large_pte(struct xe_pt *pte)
+{
+ return pte->level == 1 && !pte->base.children;
+}
+
+/* page_size = 2^(reclamation_size + 12) */
+#define COMPUTE_RECLAIM_ADDRESS_MASK(page_size) \
+({ \
+ BUILD_BUG_ON(!__builtin_constant_p(page_size)); \
+ ilog2(page_size) - 12; \
+})
+
+static void generate_reclaim_entry(struct xe_tile *tile,
+ struct xe_page_reclaim_list *prl,
+ u64 pte,
+ struct xe_pt *xe_child)
+{
+ struct xe_guc_page_reclaim_entry *reclaim_entries = prl->entries;
+ u64 phys_addr = pte & XE_PTE_ADDR_MASK;
+ const u64 field_mask = GENMASK_ULL(19, 0);
+ u32 reclamation_size;
+ const uint max_entries = XE_PAGE_RECLAIM_MAX_ENTRIES;
+ int num_entries = prl->num_entries;
+
+ xe_tile_assert(tile, xe_child->level <= MAX_HUGEPTE_LEVEL);
+ xe_tile_assert(tile, reclaim_entries);
+
+ if (num_entries == XE_PAGE_RECLAIM_INVALID_LIST)
+ return;
+
+ /* Overflow: mark as invalid through num_entries */
+ if (num_entries >= max_entries) {
+ prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
+ return;
+ }
+
+ /*
+ * reclamation_size indicates the size of the page to be
+ * invalidated and flushed from non-coherent cache.
+ * Page size is computed as 2^(reclamation_size+12) bytes.
+ * Only valid for these specific levels.
+ */
+
+ if (xe_child->level == 0 && !(pte & XE_PTE_PS64))
+ reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_4K); /* reclamation_size = 0 */
+ else if (xe_child->level == 0)
+ reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_64K); /* reclamation_size = 4 */
+ else if (is_large_pte(xe_child))
+ reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_2M); /* reclamation_size = 9 */
+ else
+ return;
+
+ reclaim_entries[num_entries].valid = 1;
+ reclaim_entries[num_entries].reclamation_size =
+ reclamation_size;
+ reclaim_entries[num_entries].address_lo =
+ FIELD_GET(field_mask, phys_addr);
+ reclaim_entries[num_entries].address_hi =
+ FIELD_GET(field_mask, phys_addr >> 20);
+ prl->num_entries++;
+}
+
static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
unsigned int level, u64 addr, u64 next,
struct xe_ptw **child,
@@ -1579,10 +1646,27 @@ static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
struct xe_pt_walk *walk)
{
struct xe_pt *xe_child = container_of(*child, typeof(*xe_child), base);
+ struct xe_pt_stage_unbind_walk *xe_walk =
+ container_of(walk, typeof(*xe_walk), base);
+ struct xe_device *xe = tile_to_xe(xe_walk->tile);
XE_WARN_ON(!*child);
XE_WARN_ON(!level);
+ /* 4K and 64K pages live at level 0; large (2M) PTEs need extra handling. */
+ if (xe_walk->prl && (xe_child->level == 0 || is_large_pte(xe_child))) {
+ struct iosys_map *leaf_map = &xe_child->bo->vmap;
+ pgoff_t first = xe_pt_offset(addr, 0, walk);
+ pgoff_t count = xe_pt_num_entries(addr, next, 0, walk);
+
+ for (pgoff_t i = 0; i < count; i++) {
+ u64 pte = xe_map_rd(xe, leaf_map, (first + i) * sizeof(u64), u64);
+
+ generate_reclaim_entry(xe_walk->tile, xe_walk->prl,
+ pte, xe_child);
+ }
+ }
+
xe_pt_check_kill(addr, next, level - 1, xe_child, action, walk);
return 0;
@@ -1654,6 +1738,8 @@ static unsigned int xe_pt_stage_unbind(struct xe_tile *tile,
{
u64 start = range ? xe_svm_range_start(range) : xe_vma_start(vma);
u64 end = range ? xe_svm_range_end(range) : xe_vma_end(vma);
+ struct xe_vm_pgtable_update_op *pt_update_op =
+ container_of(entries, struct xe_vm_pgtable_update_op, entries[0]);
struct xe_pt_stage_unbind_walk xe_walk = {
.base = {
.ops = &xe_pt_stage_unbind_ops,
@@ -1665,6 +1751,7 @@ static unsigned int xe_pt_stage_unbind(struct xe_tile *tile,
.modified_start = start,
.modified_end = end,
.wupd.entries = entries,
+ .prl = pt_update_op->prl,
};
struct xe_pt *pt = vm->pt_root[tile->id];
@@ -1897,6 +1984,7 @@ static int unbind_op_prepare(struct xe_tile *tile,
struct xe_vm_pgtable_update_ops *pt_update_ops,
struct xe_vma *vma)
{
+ struct xe_device *xe = tile_to_xe(tile);
u32 current_op = pt_update_ops->current_op;
struct xe_vm_pgtable_update_op *pt_op = &pt_update_ops->ops[current_op];
int err;
@@ -1914,6 +2002,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
pt_op->vma = vma;
pt_op->bind = false;
pt_op->rebind = false;
+ /* Maintain a single PRL in pt_update_ops that every unbind op references */
+ if (xe->info.has_page_reclaim_hw_assist && !pt_update_ops->prl.entries) {
+ err = xe_page_reclaim_list_alloc_entries(&pt_update_ops->prl);
+ if (err < 0)
+ xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
+ }
+ pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl : NULL;
err = vma_reserve_fences(tile_to_xe(tile), vma);
if (err)
@@ -1921,6 +2016,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
pt_op->num_entries = xe_pt_stage_unbind(tile, xe_vma_vm(vma),
vma, NULL, pt_op->entries);
+ /* Free PRL if list declared as invalid */
+ if (pt_update_ops->prl.entries &&
+ pt_update_ops->prl.num_entries == XE_PAGE_RECLAIM_INVALID_LIST) {
+ xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
+ pt_op->prl = NULL;
+ pt_update_ops->prl.entries = NULL;
+ }
xe_vm_dbg_print_entries(tile_to_xe(tile), pt_op->entries,
pt_op->num_entries, false);
@@ -1979,6 +2081,7 @@ static int unbind_range_prepare(struct xe_vm *vm,
pt_op->vma = XE_INVALID_VMA;
pt_op->bind = false;
pt_op->rebind = false;
+ pt_op->prl = NULL;
pt_op->num_entries = xe_pt_stage_unbind(tile, vm, NULL, range,
pt_op->entries);
@@ -2096,6 +2199,7 @@ xe_pt_update_ops_init(struct xe_vm_pgtable_update_ops *pt_update_ops)
init_llist_head(&pt_update_ops->deferred);
pt_update_ops->start = ~0x0ull;
pt_update_ops->last = 0x0ull;
+ xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
}
/**
@@ -2518,6 +2622,11 @@ void xe_pt_update_ops_fini(struct xe_tile *tile, struct xe_vma_ops *vops)
&vops->pt_update_ops[tile->id];
int i;
+ if (pt_update_ops->prl.entries) {
+ xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
+ xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
+ }
+
lockdep_assert_held(&vops->vm->lock);
xe_vm_assert_held(vops->vm);
diff --git a/drivers/gpu/drm/xe/xe_pt_types.h b/drivers/gpu/drm/xe/xe_pt_types.h
index 881f01e14db8..26e5295f118e 100644
--- a/drivers/gpu/drm/xe/xe_pt_types.h
+++ b/drivers/gpu/drm/xe/xe_pt_types.h
@@ -8,6 +8,7 @@
#include <linux/types.h>
+#include "xe_page_reclaim.h"
#include "xe_pt_walk.h"
struct xe_bo;
@@ -85,6 +86,8 @@ struct xe_vm_pgtable_update_op {
bool bind;
/** @rebind: is a rebind */
bool rebind;
+ /** @prl: Backing pointer to page reclaim list of pt_update_ops */
+ struct xe_page_reclaim_list *prl;
};
/** struct xe_vm_pgtable_update_ops: page table update operations */
@@ -119,6 +122,8 @@ struct xe_vm_pgtable_update_ops {
* slots are idle.
*/
bool wait_vm_kernel;
+ /** @prl: embedded page reclaim list */
+ struct xe_page_reclaim_list prl;
};
#endif
--
2.51.2
^ permalink raw reply related [flat|nested] 51+ messages in thread
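As a plain-C illustration of the generate_reclaim_entry() hunk above (a userspace sketch with no Xe dependencies; it follows the stated formula page_size = 2^(reclamation_size + 12) and the GENMASK_ULL(19, 0) field split, not the exact driver helpers):

```c
#include <assert.h>
#include <stdint.h>

#define FIELD_MASK 0xFFFFFu /* GENMASK_ULL(19, 0): 20-bit PRL address fields */

/* Split a (<= 40-bit) physical page address into the PRL entry's
 * address_lo/address_hi fields, as generate_reclaim_entry() does. */
static void prl_pack_addr(uint64_t phys, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)(phys & FIELD_MASK);
	*hi = (uint32_t)((phys >> 20) & FIELD_MASK);
}

/* page_size = 2^(reclamation_size + 12): 4K -> 0, 64K -> 4, 2M -> 9 */
static unsigned int prl_reclamation_size(uint64_t page_size)
{
	unsigned int shift = 0;

	while ((1ull << shift) < page_size)
		shift++;
	return shift - 12;
}
```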
* [PATCH 07/11] drm/xe: Suballocate BO for page reclaim
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
` (5 preceding siblings ...)
2025-11-18 9:05 ` [PATCH 06/11] drm/xe: Create page reclaim list on unbind Brian Nguyen
@ 2025-11-18 9:05 ` Brian Nguyen
2025-11-22 19:42 ` Matthew Brost
2025-11-18 9:05 ` [PATCH 08/11] drm/xe: Prep page reclaim in tlb inval job Brian Nguyen
` (6 subsequent siblings)
13 siblings, 1 reply; 51+ messages in thread
From: Brian Nguyen @ 2025-11-18 9:05 UTC (permalink / raw)
To: intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers,
Brian Nguyen
The page reclamation feature needs the PRL to be suballocated into a
GGTT-mapped BO. On allocation failure, fall back to the default TLB
invalidation with a full PPC flush.
The PRL's BO is managed in a separate suballocation pool to guarantee
the 4K alignment required for its GGTT address.
Pass the BO into the TLB invalidation backend and extend the fence to
carry it.
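The pool's role can be illustrated with a toy fixed-size suballocator (a hypothetical userspace model, not the drm_suballoc API): carving fixed 4K chunks out of a 4K-aligned pool guarantees every PRL gets a 4K-aligned GGTT address, and an exhausted pool maps to the full-PPC-flush fallback:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PRL_CHUNK 4096 /* XE_PAGE_RECLAIM_LIST_MAX_SIZE */

/* Toy model of the dedicated reclaim pool. */
struct toy_pool {
	uint64_t base; /* assumed 4K-aligned, like the GGTT-mapped BO */
	size_t size;
	size_t next;
};

/* Returns a chunk address, or 0 when the pool is exhausted (the driver
 * would then fall back to a full PPC flush). */
static uint64_t toy_prl_alloc(struct toy_pool *p)
{
	uint64_t addr;

	if (p->next + PRL_CHUNK > p->size)
		return 0;
	addr = p->base + p->next;
	p->next += PRL_CHUNK;
	return addr;
}
```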
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/xe/xe_device_types.h | 7 ++++++
drivers/gpu/drm/xe/xe_page_reclaim.c | 33 +++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_page_reclaim.h | 4 +++
drivers/gpu/drm/xe/xe_tile.c | 5 ++++
drivers/gpu/drm/xe/xe_tlb_inval.c | 18 ++++++++++++--
drivers/gpu/drm/xe/xe_tlb_inval_types.h | 5 ++++
6 files changed, 70 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 268c8e28601a..057df3f9dc1d 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -184,6 +184,13 @@ struct xe_tile {
* Media GT shares a pool with its primary GT.
*/
struct xe_sa_manager *kernel_bb_pool;
+
+ /**
+ * @mem.reclaim_pool: Pool for suballocating PRLs.
+ *
+ * Only the primary GT allocates page reclaim lists.
+ */
+ struct xe_sa_manager *reclaim_pool;
} mem;
/** @sriov: tile level virtualization data */
diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c b/drivers/gpu/drm/xe/xe_page_reclaim.c
index a0d15efff58c..801a7f1731c0 100644
--- a/drivers/gpu/drm/xe/xe_page_reclaim.c
+++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
@@ -13,6 +13,39 @@
#include "regs/xe_gt_regs.h"
#include "xe_assert.h"
#include "xe_macros.h"
+#include "xe_sa.h"
+#include "xe_tlb_inval_types.h"
+
+/**
+ * xe_page_reclaim_create_prl_bo() - Back a PRL with a suballocated GGTT BO
+ * @tlb_inval: TLB invalidation frontend associated with the request
+ * @fence: Fence carrying the PRL metadata
+ *
+ * Suballocates a 4K BO out of the tile reclaim pool, copies the PRL CPU
+ * copy into the BO and queues the buffer for release when @fence signals.
+ *
+ * Return: 0 on success or -ENOMEM if the suballocation fails.
+ */
+int xe_page_reclaim_create_prl_bo(struct xe_tlb_inval *tlb_inval, struct xe_tlb_inval_fence *fence)
+{
+ struct xe_gt *gt = container_of(tlb_inval, struct xe_gt, tlb_inval);
+ struct xe_tile *tile = gt_to_tile(gt);
+
+ /* Maximum size of a PRL is one 4K page */
+ fence->prl_sa = __xe_sa_bo_new(tile->mem.reclaim_pool,
+ XE_PAGE_RECLAIM_LIST_MAX_SIZE, GFP_ATOMIC);
+ if (IS_ERR(fence->prl_sa))
+ return -ENOMEM;
+
+ memcpy(xe_sa_bo_cpu_addr(fence->prl_sa), fence->reclaim_entries,
+ XE_PAGE_RECLAIM_LIST_MAX_SIZE);
+ xe_sa_bo_flush_write(fence->prl_sa);
+
+ /* Queue up sa_bo_free on fence signal */
+ xe_sa_bo_free(fence->prl_sa, &fence->base);
+
+ return 0;
+}
/**
* xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h b/drivers/gpu/drm/xe/xe_page_reclaim.h
index d066d7d97f79..f82b4d0865e0 100644
--- a/drivers/gpu/drm/xe/xe_page_reclaim.h
+++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
@@ -15,6 +15,9 @@
#define XE_PAGE_RECLAIM_MAX_ENTRIES 512
#define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
+struct xe_tlb_inval;
+struct xe_tlb_inval_fence;
+
struct xe_guc_page_reclaim_entry {
u32 valid:1;
u32 reclamation_size:6;
@@ -32,6 +35,7 @@ struct xe_page_reclaim_list {
#define XE_PAGE_RECLAIM_INVALID_LIST -1
};
+int xe_page_reclaim_create_prl_bo(struct xe_tlb_inval *tlb_inval, struct xe_tlb_inval_fence *fence);
void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
static inline void xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries)
diff --git a/drivers/gpu/drm/xe/xe_tile.c b/drivers/gpu/drm/xe/xe_tile.c
index 4f4f9a5c43af..63c060c2ea5c 100644
--- a/drivers/gpu/drm/xe/xe_tile.c
+++ b/drivers/gpu/drm/xe/xe_tile.c
@@ -209,6 +209,11 @@ int xe_tile_init(struct xe_tile *tile)
if (IS_ERR(tile->mem.kernel_bb_pool))
return PTR_ERR(tile->mem.kernel_bb_pool);
+ /* Optimistically anticipate at most 256 TLB fences with PRL */
+ tile->mem.reclaim_pool = xe_sa_bo_manager_init(tile, SZ_1M, XE_PAGE_RECLAIM_LIST_MAX_SIZE);
+ if (IS_ERR(tile->mem.reclaim_pool))
+ return PTR_ERR(tile->mem.reclaim_pool);
+
return 0;
}
void xe_tile_migrate_wait(struct xe_tile *tile)
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
index de275759743c..67a047521165 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval.c
+++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
@@ -15,6 +15,7 @@
#include "xe_guc_ct.h"
#include "xe_guc_tlb_inval.h"
#include "xe_mmio.h"
+#include "xe_page_reclaim.h"
#include "xe_pm.h"
#include "xe_tlb_inval.h"
#include "xe_trace.h"
@@ -326,8 +327,19 @@ int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
struct xe_tlb_inval_fence *fence, u64 start, u64 end,
u32 asid, bool flush_cache)
{
- return xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->ppgtt,
- start, end, asid, flush_cache);
+ int err;
+
+ if (fence->reclaim_entries) {
+ err = xe_page_reclaim_create_prl_bo(tlb_inval, fence);
+ if (err) {
+ flush_cache = true;
+ fence->prl_sa = NULL;
+ }
+ }
+ err = xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->ppgtt,
+ start, end, asid, flush_cache);
+
+ return err;
}
/**
@@ -461,4 +473,6 @@ void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
dma_fence_get(&fence->base);
fence->tlb_inval = tlb_inval;
fence->flush_cache = true;
+ fence->reclaim_entries = NULL;
+ fence->prl_sa = NULL;
}
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_types.h b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
index c3c3943fb07e..7cf741e6a0c7 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
+++ b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
@@ -9,6 +9,7 @@
#include <linux/workqueue.h>
#include <linux/dma-fence.h>
+struct xe_guc_page_reclaim_entry;
struct xe_tlb_inval;
/** struct xe_tlb_inval_ops - TLB invalidation ops (backend) */
@@ -129,6 +130,10 @@ struct xe_tlb_inval_fence {
ktime_t inval_time;
/** @flush_cache: bool for PPC flush, default is true */
bool flush_cache;
+ /** @reclaim_entries: list of pages to reclaim */
+ struct xe_guc_page_reclaim_entry *reclaim_entries;
+ /** @prl_sa: BO allocation for page reclaim list */
+ struct drm_suballoc *prl_sa;
};
#endif
--
2.51.2
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH 08/11] drm/xe: Prep page reclaim in tlb inval job
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
` (6 preceding siblings ...)
2025-11-18 9:05 ` [PATCH 07/11] drm/xe: Suballocate BO for page reclaim Brian Nguyen
@ 2025-11-18 9:05 ` Brian Nguyen
2025-11-22 13:52 ` Michal Wajdeczko
2025-11-18 9:05 ` [PATCH 09/11] drm/xe: Append page reclamation action to tlb inval Brian Nguyen
` (5 subsequent siblings)
13 siblings, 1 reply; 51+ messages in thread
From: Brian Nguyen @ 2025-11-18 9:05 UTC (permalink / raw)
To: intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers,
Brian Nguyen
Use the page reclaim list as the indicator that a page reclaim action is
desired and pass it to the TLB invalidation fence to handle.
The job maintains its own embedded copy so the PRL stays alive until the
job has run.
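The reference flow the hunks below implement (alloc holds a ref, the job takes its own before the unbind op drops the creation ref, and the destroy path drops the last one) can be modeled with a toy refcount; the helpers are hypothetical stand-ins, not the xe_page_reclaim API:

```c
#include <assert.h>

struct toy_entries {
	int refs;
	int freed;
};

static void toy_get(struct toy_entries *e)
{
	e->refs++;
}

static void toy_put(struct toy_entries *e)
{
	if (--e->refs == 0)
		e->freed = 1; /* stand-in for freeing the entry page */
}
```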
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
---
drivers/gpu/drm/xe/xe_pt.c | 6 ++++++
drivers/gpu/drm/xe/xe_tlb_inval.c | 15 ++++++++++++++
drivers/gpu/drm/xe/xe_tlb_inval.h | 3 +++
drivers/gpu/drm/xe/xe_tlb_inval_job.c | 29 +++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_tlb_inval_job.h | 4 ++++
5 files changed, 57 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index 532a047676d4..03723c8d2601 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -2497,6 +2497,12 @@ xe_pt_update_ops_run(struct xe_tile *tile, struct xe_vma_ops *vops)
goto kill_vm_tile1;
}
update.ijob = ijob;
+ if (pt_update_ops->prl.num_entries != XE_PAGE_RECLAIM_INVALID_LIST) {
+ xe_tlb_inval_job_add_page_reclaim(ijob, &pt_update_ops->prl);
+ /* Release ref from alloc, job will now handle it */
+ xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
+ pt_update_ops->prl.entries = NULL;
+ }
if (tile->media_gt) {
dep_scheduler = to_dep_scheduler(q, tile->media_gt);
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
index 67a047521165..18d49e017828 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval.c
+++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
@@ -476,3 +476,18 @@ void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
fence->reclaim_entries = NULL;
fence->prl_sa = NULL;
}
+
+/**
+ * xe_tlb_inval_fence_add_page_reclaim() - Attach PRL state to a TLB fence
+ * @fence: Fence issued for the invalidate
+ * @prl: Page reclaim list describing pages to reclaim
+ *
+ * Copies the PRL pointer into the fence and disables PPC flushing so the
+ * reclamation message can be sent instead.
+ */
+void xe_tlb_inval_fence_add_page_reclaim(struct xe_tlb_inval_fence *fence,
+ struct xe_page_reclaim_list *prl)
+{
+ fence->reclaim_entries = prl->entries;
+ fence->flush_cache = false;
+}
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.h b/drivers/gpu/drm/xe/xe_tlb_inval.h
index b84ce3e6f294..a1cd9afe2ca7 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval.h
+++ b/drivers/gpu/drm/xe/xe_tlb_inval.h
@@ -13,6 +13,7 @@
struct xe_gt;
struct xe_guc;
struct xe_vm;
+struct xe_page_reclaim_list;
int xe_gt_tlb_inval_init_early(struct xe_gt *gt);
@@ -30,6 +31,8 @@ void xe_tlb_inval_fence_flush_cache(struct xe_tlb_inval_fence *fence,
void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
struct xe_tlb_inval_fence *fence,
bool stack);
+void xe_tlb_inval_fence_add_page_reclaim(struct xe_tlb_inval_fence *fence,
+ struct xe_page_reclaim_list *prl);
/**
* xe_tlb_inval_fence_wait() - TLB invalidiation fence wait
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_job.c b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
index 6248f90323a9..5206a751c3d3 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval_job.c
+++ b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
@@ -8,6 +8,7 @@
#include "xe_dep_scheduler.h"
#include "xe_exec_queue.h"
#include "xe_gt_types.h"
+#include "xe_page_reclaim.h"
#include "xe_tlb_inval.h"
#include "xe_tlb_inval_job.h"
#include "xe_migrate.h"
@@ -39,6 +40,8 @@ struct xe_tlb_inval_job {
int type;
/** @fence_armed: Fence has been armed */
bool fence_armed;
+ /** @prl: Embedded copy of page reclaim list */
+ struct xe_page_reclaim_list prl;
};
static struct dma_fence *xe_tlb_inval_job_run(struct xe_dep_job *dep_job)
@@ -107,6 +110,7 @@ xe_tlb_inval_job_create(struct xe_exec_queue *q, struct xe_tlb_inval *tlb_inval,
job->start = start;
job->end = end;
job->fence_armed = false;
+ xe_page_reclaim_list_invalidate(&job->prl);
job->dep.ops = &dep_job_ops;
job->type = type;
kref_init(&job->refcount);
@@ -140,6 +144,25 @@ xe_tlb_inval_job_create(struct xe_exec_queue *q, struct xe_tlb_inval *tlb_inval,
return ERR_PTR(err);
}
+/**
+ * xe_tlb_inval_job_add_page_reclaim() - Embed PRL into a TLB job
+ * @job: TLB invalidation job that may trigger reclamation
+ * @prl: Page reclaim list populated during unbind
+ *
+ * Copies @prl into the job and takes an extra reference to the entry page so
+ * ownership can transfer to the TLB fence when the job is pushed.
+ */
+void xe_tlb_inval_job_add_page_reclaim(struct xe_tlb_inval_job *job,
+ struct xe_page_reclaim_list *prl)
+{
+ struct xe_device *xe = gt_to_xe(job->q->gt);
+
+ WARN_ON(!xe->info.has_page_reclaim_hw_assist);
+ job->prl = *prl;
+ /* Pair with put after bo creation */
+ xe_page_reclaim_entries_get(job->prl.entries);
+}
+
static void xe_tlb_inval_job_destroy(struct kref *ref)
{
struct xe_tlb_inval_job *job = container_of(ref, typeof(*job),
@@ -150,6 +173,10 @@ static void xe_tlb_inval_job_destroy(struct kref *ref)
struct xe_device *xe = gt_to_xe(q->gt);
struct xe_vm *vm = job->vm;
+ /* The PRL BO (if created) holds its own copy, so drop the job's ref */
+ if (job->prl.entries)
+ xe_page_reclaim_entries_put(job->prl.entries);
+
if (!job->fence_armed)
kfree(ifence);
else
@@ -234,6 +261,8 @@ struct dma_fence *xe_tlb_inval_job_push(struct xe_tlb_inval_job *job,
/* Creation ref pairs with put in xe_tlb_inval_job_destroy */
xe_tlb_inval_fence_init(job->tlb_inval, ifence, false);
dma_fence_get(job->fence); /* Pairs with put in DRM scheduler */
+ if (job->prl.num_entries != XE_PAGE_RECLAIM_INVALID_LIST)
+ xe_tlb_inval_fence_add_page_reclaim(ifence, &job->prl);
drm_sched_job_arm(&job->dep.drm);
/*
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_job.h b/drivers/gpu/drm/xe/xe_tlb_inval_job.h
index 4d6df1a6c6ca..03d6e21cd611 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval_job.h
+++ b/drivers/gpu/drm/xe/xe_tlb_inval_job.h
@@ -12,6 +12,7 @@ struct dma_fence;
struct xe_dep_scheduler;
struct xe_exec_queue;
struct xe_migrate;
+struct xe_page_reclaim_list;
struct xe_tlb_inval;
struct xe_tlb_inval_job;
struct xe_vm;
@@ -21,6 +22,9 @@ xe_tlb_inval_job_create(struct xe_exec_queue *q, struct xe_tlb_inval *tlb_inval,
struct xe_dep_scheduler *dep_scheduler,
struct xe_vm *vm, u64 start, u64 end, int type);
+void xe_tlb_inval_job_add_page_reclaim(struct xe_tlb_inval_job *job,
+ struct xe_page_reclaim_list *prl);
+
int xe_tlb_inval_job_alloc_dep(struct xe_tlb_inval_job *job);
struct dma_fence *xe_tlb_inval_job_push(struct xe_tlb_inval_job *job,
--
2.51.2
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH 09/11] drm/xe: Append page reclamation action to tlb inval
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
` (7 preceding siblings ...)
2025-11-18 9:05 ` [PATCH 08/11] drm/xe: Prep page reclaim in tlb inval job Brian Nguyen
@ 2025-11-18 9:05 ` Brian Nguyen
2025-11-18 9:05 ` [PATCH 10/11] drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim Brian Nguyen
` (4 subsequent siblings)
13 siblings, 0 replies; 51+ messages in thread
From: Brian Nguyen @ 2025-11-18 9:05 UTC (permalink / raw)
To: intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers,
Brian Nguyen
Add a page reclamation action to the TLB invalidation backend. The page
reclamation action is paired with range TLB invalidations so both are
issued at the same time.
With page reclamation, the TLB invalidation is sent with an invalid
seqno while the H2G page reclamation action carries the fence's seqno;
the fence is then handled in the page reclaim done handler.
If page reclamation fails, the TLB timeout handler is responsible for
signalling the fence and cleaning up.
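The seqno routing described above reduces to a small decision. A minimal sketch (the sentinel name and value here are assumptions; the real TLB_INVALIDATION_SEQNO_INVALID comes from the prerequisite series):

```c
#include <assert.h>
#include <stdint.h>

#define SEQNO_INVALID 0u /* assumed stand-in for TLB_INVALIDATION_SEQNO_INVALID */

/* With a PRL BO present, the invalidation goes out with an invalid seqno
 * and the real seqno is carried by the page reclamation H2G, so the fence
 * is signalled from the reclaim-done handler instead. */
static uint32_t tlb_action_seqno(uint32_t seqno, int have_prl_bo)
{
	return have_prl_bo ? SEQNO_INVALID : seqno;
}
```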
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
---
drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 13 +++++++++----
drivers/gpu/drm/xe/xe_tlb_inval.c | 2 +-
drivers/gpu/drm/xe/xe_tlb_inval_types.h | 3 ++-
3 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
index 3185f8dc00c4..f42dcaf17aab 100644
--- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
+++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
@@ -13,6 +13,7 @@
#include "xe_guc_tlb_inval.h"
#include "xe_force_wake.h"
#include "xe_mmio.h"
+#include "xe_sa.h"
#include "xe_tlb_inval.h"
#include "regs/xe_guc_regs.h"
@@ -117,20 +118,21 @@ static int send_page_reclaim(struct xe_guc *guc, u32 seqno,
#define MAX_RANGE_TLB_INVALIDATION_LENGTH (rounddown_pow_of_two(ULONG_MAX))
static int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
- u64 start, u64 end, u32 asid, bool flush_cache)
+ u64 start, u64 end, u32 asid, bool flush_cache,
+ struct drm_suballoc *prl_sa)
{
#define MAX_TLB_INVALIDATION_LEN 7
struct xe_guc *guc = tlb_inval->private;
struct xe_gt *gt = guc_to_gt(guc);
u32 action[MAX_TLB_INVALIDATION_LEN];
u64 length = end - start;
- int len = 0;
+ int len = 0, err;
if (guc_to_xe(guc)->info.force_execlist)
return -ECANCELED;
action[len++] = XE_GUC_ACTION_TLB_INVALIDATION;
- action[len++] = seqno;
+ action[len++] = !prl_sa ? seqno : TLB_INVALIDATION_SEQNO_INVALID;
if (!gt_to_xe(gt)->info.has_range_tlb_inval ||
length > MAX_RANGE_TLB_INVALIDATION_LENGTH) {
action[len++] = MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL);
@@ -180,7 +182,10 @@ static int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
xe_gt_assert(gt, len <= MAX_TLB_INVALIDATION_LEN);
- return send_tlb_inval(guc, action, len);
+ err = send_tlb_inval(guc, action, len);
+ if (!err && prl_sa)
+ err = send_page_reclaim(guc, seqno, xe_sa_bo_gpu_addr(prl_sa));
+ return err;
}
static bool tlb_inval_initialized(struct xe_tlb_inval *tlb_inval)
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
index 18d49e017828..8ab967f47b45 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval.c
+++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
@@ -337,7 +337,7 @@ int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
}
}
err = xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->ppgtt,
- start, end, asid, flush_cache);
+ start, end, asid, flush_cache, fence->prl_sa);
return err;
}
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_types.h b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
index 7cf741e6a0c7..386f51db5a1c 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
+++ b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
@@ -9,6 +9,7 @@
#include <linux/workqueue.h>
#include <linux/dma-fence.h>
+struct drm_suballoc;
struct xe_guc_page_reclaim_entry;
struct xe_tlb_inval;
@@ -47,7 +48,7 @@ struct xe_tlb_inval_ops {
* failure
*/
int (*ppgtt)(struct xe_tlb_inval *tlb_inval, u32 seqno, u64 start,
- u64 end, u32 asid, bool flush_cache);
+ u64 end, u32 asid, bool flush_cache, struct drm_suballoc *sa);
/**
* @initialized: Backend is initialized
--
2.51.2
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH 10/11] drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
` (8 preceding siblings ...)
2025-11-18 9:05 ` [PATCH 09/11] drm/xe: Append page reclamation action to tlb inval Brian Nguyen
@ 2025-11-18 9:05 ` Brian Nguyen
2025-11-24 12:29 ` Matthew Auld
2025-11-18 9:05 ` [PATCH 11/11] drm/xe: Add debugfs support for page reclamation Brian Nguyen
` (3 subsequent siblings)
13 siblings, 1 reply; 51+ messages in thread
From: Brian Nguyen @ 2025-11-18 9:05 UTC (permalink / raw)
To: intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers,
Brian Nguyen
On Xe3p and beyond, the hardware additionally manages L2$ flushing for
buffers deemed transient display or transient app. For those ranges,
page reclamation would only produce redundant cacheline flushes, so
skip it.
Read a chicken bit reporting the media engine's power status to decide
when transient-app flushes can be skipped.
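The skip decision implemented below boils down to a two-bit PAT field test plus the media status. A userspace sketch (the pat_index values 18/19 and policy encodings are taken from the patch; helper names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* XE2_L3_POLICY is bits 5:4 of a PAT entry:
 * 0 = WB, 1 = XD (WB transient display), 2 = XA (WB transient app), 3 = UC */
static unsigned int l3_policy(uint32_t pat_value)
{
	return (pat_value >> 4) & 0x3;
}

/* Transient display is always HW-flushed; transient app (pat_index 18/19
 * on Xe3p) is HW-flushed only while media is off. */
static int skip_page_reclaim(uint32_t pat_value, unsigned int pat_index,
			     int media_awake)
{
	return l3_policy(pat_value) == 1 ||
	       (!media_awake && (pat_index == 18 || pat_index == 19));
}
```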
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
---
drivers/gpu/drm/xe/regs/xe_gt_regs.h | 11 +++++++
drivers/gpu/drm/xe/xe_page_reclaim.c | 43 ++++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_page_reclaim.h | 3 ++
drivers/gpu/drm/xe/xe_pat.c | 9 +-----
drivers/gpu/drm/xe/xe_pt.c | 3 +-
5 files changed, 60 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
index 917a088c28f2..a18a2d59153e 100644
--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
@@ -99,6 +99,14 @@
#define VE1_AUX_INV XE_REG(0x42b8)
#define AUX_INV REG_BIT(0)
+#define _PAT_PTA 0x4820
+#define XE2_NO_PROMOTE REG_BIT(10)
+#define XE2_COMP_EN REG_BIT(9)
+#define XE2_L3_CLOS REG_GENMASK(7, 6)
+#define XE2_L3_POLICY REG_GENMASK(5, 4)
+#define XE2_L4_POLICY REG_GENMASK(3, 2)
+#define XE2_COH_MODE REG_GENMASK(1, 0)
+
#define XE2_LMEM_CFG XE_REG(0x48b0)
#define XEHP_FLAT_CCS_BASE_ADDR XE_REG_MCR(0x4910)
@@ -429,6 +437,9 @@
#define XE2_GLOBAL_INVAL XE_REG(0xb404)
+#define LTISEQCHK XE_REG(0xb49c)
+#define XE3P_MEDIA_IS_ON REG_BIT(2)
+
#define XE2LPM_L3SQCREG2 XE_REG_MCR(0xb604)
#define XE2LPM_L3SQCREG3 XE_REG_MCR(0xb608)
diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c b/drivers/gpu/drm/xe/xe_page_reclaim.c
index 801a7f1731c0..2f0e7547732c 100644
--- a/drivers/gpu/drm/xe/xe_page_reclaim.c
+++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
@@ -13,8 +13,51 @@
#include "regs/xe_gt_regs.h"
#include "xe_assert.h"
#include "xe_macros.h"
+#include "xe_mmio.h"
+#include "xe_pat.h"
#include "xe_sa.h"
#include "xe_tlb_inval_types.h"
+#include "xe_vm.h"
+
+/**
+ * xe_page_reclaim_skip() - Decide whether PRL should be skipped for a VMA
+ * @tile: Tile owning the VMA
+ * @vma: VMA under consideration
+ *
+ * On Xe3p and beyond, hardware handles PPC flushing for specific PAT
+ * encodings. Skip page reclamation in both scenarios below.
+ * - pat_index is transient display (1)
+ * - pat_index is transient app (2) and Media is off
+ *
+ * Return: true when page reclamation is unnecessary, false otherwise.
+ */
+bool xe_page_reclaim_skip(struct xe_tile *tile, struct xe_vma *vma)
+{
+ struct xe_device *xe = xe_vma_vm(vma)->xe;
+ struct xe_mmio *mmio = &tile->primary_gt->mmio;
+ u16 pat_index = vma->attr.pat_index;
+ u32 pat_value;
+ u8 l3_policy;
+ bool is_media_awake;
+
+ /* Ensure called only with Xe3p due to associated PAT index */
+ xe_assert(tile->xe, GRAPHICS_VER(tile->xe) >= 35);
+ xe_assert(tile->xe, pat_index < xe->pat.n_entries);
+
+ pat_value = xe->pat.table[pat_index].value;
+ l3_policy = REG_FIELD_GET(XE2_L3_POLICY, pat_value);
+ is_media_awake = xe_mmio_read32(mmio, LTISEQCHK) & XE3P_MEDIA_IS_ON;
+
+ /*
+ * - l3_policy: 0=WB, 1=XD ("WB - Transient Display"),
+ * 2=XA ("WB - Transient App" for Xe3p), 3=UC
+ * From Xe3p, transient display flush is taken care by HW, l3_policy = 1
+ *
+ * Also with Xe3p, pat_index=18/19 corresponds to transient app flushing
+ * which is handled by HW when media is off.
+ */
+ return (l3_policy == 1 || (!is_media_awake && (pat_index == 18 || pat_index == 19)));
+}
/**
* xe_page_reclaim_create_prl_bo() - Back a PRL with a suballocated GGTT BO
diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h b/drivers/gpu/drm/xe/xe_page_reclaim.h
index f82b4d0865e0..dafd4edd6f61 100644
--- a/drivers/gpu/drm/xe/xe_page_reclaim.h
+++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
@@ -17,6 +17,8 @@
struct xe_tlb_inval;
struct xe_tlb_inval_fence;
+struct xe_tile;
+struct xe_vma;
struct xe_guc_page_reclaim_entry {
u32 valid:1;
@@ -35,6 +37,7 @@ struct xe_page_reclaim_list {
#define XE_PAGE_RECLAIM_INVALID_LIST -1
};
+bool xe_page_reclaim_skip(struct xe_tile *tile, struct xe_vma *vma);
int xe_page_reclaim_create_prl_bo(struct xe_tlb_inval *tlb_inval, struct xe_tlb_inval_fence *fence);
void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
diff --git a/drivers/gpu/drm/xe/xe_pat.c b/drivers/gpu/drm/xe/xe_pat.c
index 1b4d5d3def0f..4783acd1f027 100644
--- a/drivers/gpu/drm/xe/xe_pat.c
+++ b/drivers/gpu/drm/xe/xe_pat.c
@@ -9,6 +9,7 @@
#include <generated/xe_wa_oob.h>
+#include "regs/xe_gt_regs.h"
#include "regs/xe_reg_defs.h"
#include "xe_assert.h"
#include "xe_device.h"
@@ -23,14 +24,6 @@
#define _PAT_INDEX(index) _PICK_EVEN_2RANGES(index, 8, \
0x4800, 0x4804, \
0x4848, 0x484c)
-#define _PAT_PTA 0x4820
-
-#define XE2_NO_PROMOTE REG_BIT(10)
-#define XE2_COMP_EN REG_BIT(9)
-#define XE2_L3_CLOS REG_GENMASK(7, 6)
-#define XE2_L3_POLICY REG_GENMASK(5, 4)
-#define XE2_L4_POLICY REG_GENMASK(3, 2)
-#define XE2_COH_MODE REG_GENMASK(1, 0)
#define XELPG_L4_POLICY_MASK REG_GENMASK(3, 2)
#define XELPG_PAT_3_UC REG_FIELD_PREP(XELPG_L4_POLICY_MASK, 3)
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index 03723c8d2601..8ccab39c2599 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -2008,7 +2008,8 @@ static int unbind_op_prepare(struct xe_tile *tile,
if (err < 0)
xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
}
- pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl : NULL;
+ pt_op->prl = (pt_update_ops->prl.entries &&
+ !xe_page_reclaim_skip(tile, vma)) ? &pt_update_ops->prl : NULL;
err = vma_reserve_fences(tile_to_xe(tile), vma);
if (err)
--
2.51.2
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH 11/11] drm/xe: Add debugfs support for page reclamation
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
` (9 preceding siblings ...)
2025-11-18 9:05 ` [PATCH 10/11] drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim Brian Nguyen
@ 2025-11-18 9:05 ` Brian Nguyen
2025-11-21 22:32 ` Lin, Shuicheng
2025-11-22 14:18 ` Michal Wajdeczko
2025-11-18 9:52 ` ✗ CI.checkpatch: warning for Page Reclamation Support for Xe3p Platforms Patchwork
` (2 subsequent siblings)
13 siblings, 2 replies; 51+ messages in thread
From: Brian Nguyen @ 2025-11-18 9:05 UTC (permalink / raw)
To: intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers,
Brian Nguyen
Allow runtime modification of the page reclamation feature through a
debugfs knob. The knob only takes effect if the platform supports page
reclamation in the first place.
Move xe_match_desc() to a common header so debugfs can read the driver's
default device descriptor for the current platform.
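The write-side gate amounts to: reject when the device descriptor lacks the feature, otherwise normalize the value to a bool. A toy model (struct and helper names are hypothetical):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_desc {
	int has_page_reclaim_hw_assist;
};

struct toy_info {
	int has_page_reclaim_hw_assist;
};

/* Mirrors page_reclaim_hw_assist_set(): the knob only works on HW whose
 * device descriptor already advertises the feature. */
static int toy_set(struct toy_info *info, const struct toy_desc *desc,
		   unsigned int val)
{
	if (!desc || !desc->has_page_reclaim_hw_assist)
		return -ENODEV;
	info->has_page_reclaim_hw_assist = !!val;
	return 0;
}
```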
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
---
drivers/gpu/drm/xe/xe_configfs.c | 11 +-------
drivers/gpu/drm/xe/xe_debugfs.c | 47 ++++++++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_device.c | 10 +++++++
drivers/gpu/drm/xe/xe_device.h | 2 ++
4 files changed, 60 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
index 9f6251b1008b..efc6d0690b27 100644
--- a/drivers/gpu/drm/xe/xe_configfs.c
+++ b/drivers/gpu/drm/xe/xe_configfs.c
@@ -15,6 +15,7 @@
#include "instructions/xe_mi_commands.h"
#include "xe_configfs.h"
+#include "xe_device.h"
#include "xe_gt_types.h"
#include "xe_hw_engine_types.h"
#include "xe_module.h"
@@ -925,16 +926,6 @@ static const struct config_item_type xe_config_sriov_type = {
.ct_attrs = xe_config_sriov_attrs,
};
-static const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev)
-{
- struct device_driver *driver = driver_find("xe", &pci_bus_type);
- struct pci_driver *drv = to_pci_driver(driver);
- const struct pci_device_id *ids = drv ? drv->id_table : NULL;
- const struct pci_device_id *found = pci_match_id(ids, pdev);
-
- return found ? (const void *)found->driver_data : NULL;
-}
-
static struct pci_dev *get_physfn_instead(struct pci_dev *virtfn)
{
struct pci_dev *physfn = pci_physfn(virtfn);
diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
index e91da9589c5f..572c61ee1e29 100644
--- a/drivers/gpu/drm/xe/xe_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_debugfs.c
@@ -19,6 +19,7 @@
#include "xe_gt_printk.h"
#include "xe_guc_ads.h"
#include "xe_mmio.h"
+#include "xe_pci_types.h"
#include "xe_pm.h"
#include "xe_psmi.h"
#include "xe_pxp_debugfs.h"
@@ -297,6 +298,49 @@ static const struct file_operations wedged_mode_fops = {
.write = wedged_mode_set,
};
+static ssize_t page_reclaim_hw_assist_show(struct file *f, char __user *ubuf,
+ size_t size, loff_t *pos)
+{
+ struct xe_device *xe = file_inode(f)->i_private;
+ char buf[8];
+ int len;
+
+ len = scnprintf(buf, sizeof(buf), "%d\n", xe->info.has_page_reclaim_hw_assist);
+ return simple_read_from_buffer(ubuf, size, pos, buf, len);
+}
+
+static ssize_t page_reclaim_hw_assist_set(struct file *f, const char __user *ubuf,
+ size_t size, loff_t *pos)
+{
+ struct xe_device *xe = file_inode(f)->i_private;
+ struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
+ const struct xe_device_desc *desc = xe_match_desc(pdev);
+ unsigned int val;
+ ssize_t ret;
+
+ ret = kstrtouint_from_user(ubuf, size, 0, &val);
+ if (ret)
+ return ret;
+
+ /*
+ * Don't modify if the page reclamation feature isn't
+ * supported by the HW by default.
+ */
+
+ if (!desc || !desc->has_page_reclaim_hw_assist)
+ return -ENODEV;
+
+ xe->info.has_page_reclaim_hw_assist = !!val;
+
+ return size;
+}
+
+static const struct file_operations page_reclaim_hw_assist_fops = {
+ .owner = THIS_MODULE,
+ .read = page_reclaim_hw_assist_show,
+ .write = page_reclaim_hw_assist_set,
+};
+
static ssize_t atomic_svm_timeslice_ms_show(struct file *f, char __user *ubuf,
size_t size, loff_t *pos)
{
@@ -403,6 +447,9 @@ void xe_debugfs_register(struct xe_device *xe)
debugfs_create_file("disable_late_binding", 0600, root, xe,
&disable_late_binding_fops);
+ debugfs_create_file("page_reclaim_hw_assist", 0600, root, xe,
+ &page_reclaim_hw_assist_fops);
+
for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1; ++mem_type) {
man = ttm_manager_type(bdev, mem_type);
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index c7d373c70f0f..16afddc5e35e 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -1295,3 +1295,13 @@ void xe_device_declare_wedged(struct xe_device *xe)
drm_dev_wedged_event(&xe->drm, xe->wedged.method, NULL);
}
}
+
+const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev)
+{
+ struct device_driver *driver = driver_find("xe", &pci_bus_type);
+ struct pci_driver *drv = to_pci_driver(driver);
+ const struct pci_device_id *ids = drv ? drv->id_table : NULL;
+ const struct pci_device_id *found = pci_match_id(ids, pdev);
+
+ return found ? (const void *)found->driver_data : NULL;
+}
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 32cc6323b7f6..a66e8e4b3e01 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -193,6 +193,8 @@ void xe_device_declare_wedged(struct xe_device *xe);
struct xe_file *xe_file_get(struct xe_file *xef);
void xe_file_put(struct xe_file *xef);
+const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev);
+
int xe_is_injection_active(void);
/*
--
2.51.2
^ permalink raw reply related [flat|nested] 51+ messages in thread
* ✗ CI.checkpatch: warning for Page Reclamation Support for Xe3p Platforms
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
` (10 preceding siblings ...)
2025-11-18 9:05 ` [PATCH 11/11] drm/xe: Add debugfs support for page reclamation Brian Nguyen
@ 2025-11-18 9:52 ` Patchwork
2025-11-18 9:53 ` ✓ CI.KUnit: success " Patchwork
2025-11-18 13:02 ` ✗ Xe.CI.Full: failure " Patchwork
13 siblings, 0 replies; 51+ messages in thread
From: Patchwork @ 2025-11-18 9:52 UTC (permalink / raw)
To: Brian Nguyen; +Cc: intel-xe
== Series Details ==
Series: Page Reclamation Support for Xe3p Platforms
URL : https://patchwork.freedesktop.org/series/157698/
State : warning
== Summary ==
+ KERNEL=/kernel
+ git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt
Cloning into 'mt'...
warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/
+ git -C mt rev-list -n1 origin/master
2de9a3901bc28757c7906b454717b64e2a214021
+ cd /kernel
+ git config --global --add safe.directory /kernel
+ git log -n1
commit ffbb6692b205691569fa88ee2715c717ee46b762
Author: Brian Nguyen <brian3.nguyen@intel.com>
Date: Tue Nov 18 17:05:52 2025 +0800
drm/xe: Add debugfs support for page reclamation
Allow for runtime modification to page reclamation feature through
debugfs configuration. This parameter will only take effect if the
platform supports the page reclamation feature by default.
Move xe_match_desc to common header for debugfs access to read default
device values of xe driver for current platform.
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
+ /mt/dim checkpatch 91fc6d984707c9bfd4a60550e6a85f1a991e7ec8 drm-intel
c80c36a867f4 drm/xe: Do not forward invalid TLB invalidation seqnos to upper layers
-:9: WARNING:COMMIT_LOG_USE_LINK: Unknown link reference 'review:', use 'Link:' or 'Closes:' instead
#9:
review: https://patchwork.freedesktop.org/series/156874/
total: 0 errors, 1 warnings, 0 checks, 16 lines checked
2adb1a2c4b56 drm/xe: Reset tlb fence timeout on invalid seqno received
ff3049866943 drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush
9ed5753590f2 drm/xe: Add page reclamation info to device info
76ea654c09c9 drm/xe/guc: Add page reclamation interface to GuC
07326f2e30d0 drm/xe: Create page reclaim list on unbind
-:40: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#40:
new file mode 100644
-:85: CHECK:COMPARISON_TO_NULL: Comparison to NULL could be written "prl->entries"
#85: FILE: drivers/gpu/drm/xe/xe_page_reclaim.c:41:
+ XE_WARN_ON(prl->entries != NULL);
total: 0 errors, 1 warnings, 1 checks, 323 lines checked
fc5e7552663a drm/xe: Suballocate BO for page reclaim
b3fb1c631f7a drm/xe: Prep page reclaim in tlb inval job
ae7c87d5ae9a drm/xe: Append page reclamation action to tlb inval
d99e1cd9c590 drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim
ffbb6692b205 drm/xe: Add debugfs support for page reclamation
^ permalink raw reply [flat|nested] 51+ messages in thread
* ✓ CI.KUnit: success for Page Reclamation Support for Xe3p Platforms
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
` (11 preceding siblings ...)
2025-11-18 9:52 ` ✗ CI.checkpatch: warning for Page Reclamation Support for Xe3p Platforms Patchwork
@ 2025-11-18 9:53 ` Patchwork
2025-11-18 13:02 ` ✗ Xe.CI.Full: failure " Patchwork
13 siblings, 0 replies; 51+ messages in thread
From: Patchwork @ 2025-11-18 9:53 UTC (permalink / raw)
To: Brian Nguyen; +Cc: intel-xe
== Series Details ==
Series: Page Reclamation Support for Xe3p Platforms
URL : https://patchwork.freedesktop.org/series/157698/
State : success
== Summary ==
+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
[09:52:37] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[09:52:41] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[09:53:12] Starting KUnit Kernel (1/1)...
[09:53:12] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[09:53:12] ================== guc_buf (11 subtests) ===================
[09:53:12] [PASSED] test_smallest
[09:53:12] [PASSED] test_largest
[09:53:12] [PASSED] test_granular
[09:53:12] [PASSED] test_unique
[09:53:12] [PASSED] test_overlap
[09:53:12] [PASSED] test_reusable
[09:53:12] [PASSED] test_too_big
[09:53:12] [PASSED] test_flush
[09:53:12] [PASSED] test_lookup
[09:53:12] [PASSED] test_data
[09:53:12] [PASSED] test_class
[09:53:12] ===================== [PASSED] guc_buf =====================
[09:53:12] =================== guc_dbm (7 subtests) ===================
[09:53:12] [PASSED] test_empty
[09:53:12] [PASSED] test_default
[09:53:12] ======================== test_size ========================
[09:53:12] [PASSED] 4
[09:53:12] [PASSED] 8
[09:53:12] [PASSED] 32
[09:53:12] [PASSED] 256
[09:53:12] ==================== [PASSED] test_size ====================
[09:53:12] ======================= test_reuse ========================
[09:53:12] [PASSED] 4
[09:53:12] [PASSED] 8
[09:53:12] [PASSED] 32
[09:53:12] [PASSED] 256
[09:53:12] =================== [PASSED] test_reuse ====================
[09:53:12] =================== test_range_overlap ====================
[09:53:12] [PASSED] 4
[09:53:12] [PASSED] 8
[09:53:12] [PASSED] 32
[09:53:12] [PASSED] 256
[09:53:12] =============== [PASSED] test_range_overlap ================
[09:53:12] =================== test_range_compact ====================
[09:53:12] [PASSED] 4
[09:53:12] [PASSED] 8
[09:53:12] [PASSED] 32
[09:53:12] [PASSED] 256
[09:53:12] =============== [PASSED] test_range_compact ================
[09:53:12] ==================== test_range_spare =====================
[09:53:12] [PASSED] 4
[09:53:12] [PASSED] 8
[09:53:12] [PASSED] 32
[09:53:12] [PASSED] 256
[09:53:12] ================ [PASSED] test_range_spare =================
[09:53:12] ===================== [PASSED] guc_dbm =====================
[09:53:12] =================== guc_idm (6 subtests) ===================
[09:53:12] [PASSED] bad_init
[09:53:12] [PASSED] no_init
[09:53:12] [PASSED] init_fini
[09:53:12] [PASSED] check_used
[09:53:12] [PASSED] check_quota
[09:53:12] [PASSED] check_all
[09:53:12] ===================== [PASSED] guc_idm =====================
[09:53:12] ================== no_relay (3 subtests) ===================
[09:53:12] [PASSED] xe_drops_guc2pf_if_not_ready
[09:53:12] [PASSED] xe_drops_guc2vf_if_not_ready
[09:53:12] [PASSED] xe_rejects_send_if_not_ready
[09:53:12] ==================== [PASSED] no_relay =====================
[09:53:12] ================== pf_relay (14 subtests) ==================
[09:53:12] [PASSED] pf_rejects_guc2pf_too_short
[09:53:12] [PASSED] pf_rejects_guc2pf_too_long
[09:53:12] [PASSED] pf_rejects_guc2pf_no_payload
[09:53:12] [PASSED] pf_fails_no_payload
[09:53:12] [PASSED] pf_fails_bad_origin
[09:53:12] [PASSED] pf_fails_bad_type
[09:53:12] [PASSED] pf_txn_reports_error
[09:53:12] [PASSED] pf_txn_sends_pf2guc
[09:53:12] [PASSED] pf_sends_pf2guc
[09:53:12] [SKIPPED] pf_loopback_nop
[09:53:12] [SKIPPED] pf_loopback_echo
[09:53:13] [SKIPPED] pf_loopback_fail
[09:53:13] [SKIPPED] pf_loopback_busy
[09:53:13] [SKIPPED] pf_loopback_retry
[09:53:13] ==================== [PASSED] pf_relay =====================
[09:53:13] ================== vf_relay (3 subtests) ===================
[09:53:13] [PASSED] vf_rejects_guc2vf_too_short
[09:53:13] [PASSED] vf_rejects_guc2vf_too_long
[09:53:13] [PASSED] vf_rejects_guc2vf_no_payload
[09:53:13] ==================== [PASSED] vf_relay =====================
[09:53:13] ================ pf_gt_config (6 subtests) =================
[09:53:13] [PASSED] fair_contexts_1vf
[09:53:13] [PASSED] fair_doorbells_1vf
[09:53:13] [PASSED] fair_ggtt_1vf
[09:53:13] ====================== fair_contexts ======================
[09:53:13] [PASSED] 1 VF
[09:53:13] [PASSED] 2 VFs
[09:53:13] [PASSED] 3 VFs
[09:53:13] [PASSED] 4 VFs
[09:53:13] [PASSED] 5 VFs
[09:53:13] [PASSED] 6 VFs
[09:53:13] [PASSED] 7 VFs
[09:53:13] [PASSED] 8 VFs
[09:53:13] [PASSED] 9 VFs
[09:53:13] [PASSED] 10 VFs
[09:53:13] [PASSED] 11 VFs
[09:53:13] [PASSED] 12 VFs
[09:53:13] [PASSED] 13 VFs
[09:53:13] [PASSED] 14 VFs
[09:53:13] [PASSED] 15 VFs
[09:53:13] [PASSED] 16 VFs
[09:53:13] [PASSED] 17 VFs
[09:53:13] [PASSED] 18 VFs
[09:53:13] [PASSED] 19 VFs
[09:53:13] [PASSED] 20 VFs
[09:53:13] [PASSED] 21 VFs
[09:53:13] [PASSED] 22 VFs
[09:53:13] [PASSED] 23 VFs
[09:53:13] [PASSED] 24 VFs
[09:53:13] [PASSED] 25 VFs
[09:53:13] [PASSED] 26 VFs
[09:53:13] [PASSED] 27 VFs
[09:53:13] [PASSED] 28 VFs
[09:53:13] [PASSED] 29 VFs
[09:53:13] [PASSED] 30 VFs
[09:53:13] [PASSED] 31 VFs
[09:53:13] [PASSED] 32 VFs
[09:53:13] [PASSED] 33 VFs
[09:53:13] [PASSED] 34 VFs
[09:53:13] [PASSED] 35 VFs
[09:53:13] [PASSED] 36 VFs
[09:53:13] [PASSED] 37 VFs
[09:53:13] [PASSED] 38 VFs
[09:53:13] [PASSED] 39 VFs
[09:53:13] [PASSED] 40 VFs
[09:53:13] [PASSED] 41 VFs
[09:53:13] [PASSED] 42 VFs
[09:53:13] [PASSED] 43 VFs
[09:53:13] [PASSED] 44 VFs
[09:53:13] [PASSED] 45 VFs
[09:53:13] [PASSED] 46 VFs
[09:53:13] [PASSED] 47 VFs
[09:53:13] [PASSED] 48 VFs
[09:53:13] [PASSED] 49 VFs
[09:53:13] [PASSED] 50 VFs
[09:53:13] [PASSED] 51 VFs
[09:53:13] [PASSED] 52 VFs
[09:53:13] [PASSED] 53 VFs
[09:53:13] [PASSED] 54 VFs
[09:53:13] [PASSED] 55 VFs
[09:53:13] [PASSED] 56 VFs
[09:53:13] [PASSED] 57 VFs
[09:53:13] [PASSED] 58 VFs
[09:53:13] [PASSED] 59 VFs
[09:53:13] [PASSED] 60 VFs
[09:53:13] [PASSED] 61 VFs
[09:53:13] [PASSED] 62 VFs
[09:53:13] [PASSED] 63 VFs
[09:53:13] ================== [PASSED] fair_contexts ==================
[09:53:13] ===================== fair_doorbells ======================
[09:53:13] [PASSED] 1 VF
[09:53:13] [PASSED] 2 VFs
[09:53:13] [PASSED] 3 VFs
[09:53:13] [PASSED] 4 VFs
[09:53:13] [PASSED] 5 VFs
[09:53:13] [PASSED] 6 VFs
[09:53:13] [PASSED] 7 VFs
[09:53:13] [PASSED] 8 VFs
[09:53:13] [PASSED] 9 VFs
[09:53:13] [PASSED] 10 VFs
[09:53:13] [PASSED] 11 VFs
[09:53:13] [PASSED] 12 VFs
[09:53:13] [PASSED] 13 VFs
[09:53:13] [PASSED] 14 VFs
[09:53:13] [PASSED] 15 VFs
[09:53:13] [PASSED] 16 VFs
[09:53:13] [PASSED] 17 VFs
[09:53:13] [PASSED] 18 VFs
[09:53:13] [PASSED] 19 VFs
[09:53:13] [PASSED] 20 VFs
[09:53:13] [PASSED] 21 VFs
[09:53:13] [PASSED] 22 VFs
[09:53:13] [PASSED] 23 VFs
[09:53:13] [PASSED] 24 VFs
[09:53:13] [PASSED] 25 VFs
[09:53:13] [PASSED] 26 VFs
[09:53:13] [PASSED] 27 VFs
[09:53:13] [PASSED] 28 VFs
[09:53:13] [PASSED] 29 VFs
[09:53:13] [PASSED] 30 VFs
[09:53:13] [PASSED] 31 VFs
[09:53:13] [PASSED] 32 VFs
[09:53:13] [PASSED] 33 VFs
[09:53:13] [PASSED] 34 VFs
[09:53:13] [PASSED] 35 VFs
[09:53:13] [PASSED] 36 VFs
[09:53:13] [PASSED] 37 VFs
[09:53:13] [PASSED] 38 VFs
[09:53:13] [PASSED] 39 VFs
[09:53:13] [PASSED] 40 VFs
[09:53:13] [PASSED] 41 VFs
[09:53:13] [PASSED] 42 VFs
[09:53:13] [PASSED] 43 VFs
[09:53:13] [PASSED] 44 VFs
[09:53:13] [PASSED] 45 VFs
[09:53:13] [PASSED] 46 VFs
[09:53:13] [PASSED] 47 VFs
[09:53:13] [PASSED] 48 VFs
[09:53:13] [PASSED] 49 VFs
[09:53:13] [PASSED] 50 VFs
[09:53:13] [PASSED] 51 VFs
[09:53:13] [PASSED] 52 VFs
[09:53:13] [PASSED] 53 VFs
[09:53:13] [PASSED] 54 VFs
[09:53:13] [PASSED] 55 VFs
[09:53:13] [PASSED] 56 VFs
[09:53:13] [PASSED] 57 VFs
[09:53:13] [PASSED] 58 VFs
[09:53:13] [PASSED] 59 VFs
[09:53:13] [PASSED] 60 VFs
[09:53:13] [PASSED] 61 VFs
[09:53:13] [PASSED] 62 VFs
[09:53:13] [PASSED] 63 VFs
[09:53:13] ================= [PASSED] fair_doorbells ==================
[09:53:13] ======================== fair_ggtt ========================
[09:53:13] [PASSED] 1 VF
[09:53:13] [PASSED] 2 VFs
[09:53:13] [PASSED] 3 VFs
[09:53:13] [PASSED] 4 VFs
[09:53:13] [PASSED] 5 VFs
[09:53:13] [PASSED] 6 VFs
[09:53:13] [PASSED] 7 VFs
[09:53:13] [PASSED] 8 VFs
[09:53:13] [PASSED] 9 VFs
[09:53:13] [PASSED] 10 VFs
[09:53:13] [PASSED] 11 VFs
[09:53:13] [PASSED] 12 VFs
[09:53:13] [PASSED] 13 VFs
[09:53:13] [PASSED] 14 VFs
[09:53:13] [PASSED] 15 VFs
[09:53:13] [PASSED] 16 VFs
[09:53:13] [PASSED] 17 VFs
[09:53:13] [PASSED] 18 VFs
[09:53:13] [PASSED] 19 VFs
[09:53:13] [PASSED] 20 VFs
[09:53:13] [PASSED] 21 VFs
[09:53:13] [PASSED] 22 VFs
[09:53:13] [PASSED] 23 VFs
[09:53:13] [PASSED] 24 VFs
[09:53:13] [PASSED] 25 VFs
[09:53:13] [PASSED] 26 VFs
[09:53:13] [PASSED] 27 VFs
[09:53:13] [PASSED] 28 VFs
[09:53:13] [PASSED] 29 VFs
[09:53:13] [PASSED] 30 VFs
[09:53:13] [PASSED] 31 VFs
[09:53:13] [PASSED] 32 VFs
[09:53:13] [PASSED] 33 VFs
[09:53:13] [PASSED] 34 VFs
[09:53:13] [PASSED] 35 VFs
[09:53:13] [PASSED] 36 VFs
[09:53:13] [PASSED] 37 VFs
[09:53:13] [PASSED] 38 VFs
[09:53:13] [PASSED] 39 VFs
[09:53:13] [PASSED] 40 VFs
[09:53:13] [PASSED] 41 VFs
[09:53:13] [PASSED] 42 VFs
[09:53:13] [PASSED] 43 VFs
[09:53:13] [PASSED] 44 VFs
[09:53:13] [PASSED] 45 VFs
[09:53:13] [PASSED] 46 VFs
[09:53:13] [PASSED] 47 VFs
[09:53:13] [PASSED] 48 VFs
[09:53:13] [PASSED] 49 VFs
[09:53:13] [PASSED] 50 VFs
[09:53:13] [PASSED] 51 VFs
[09:53:13] [PASSED] 52 VFs
[09:53:13] [PASSED] 53 VFs
[09:53:13] [PASSED] 54 VFs
[09:53:13] [PASSED] 55 VFs
[09:53:13] [PASSED] 56 VFs
[09:53:13] [PASSED] 57 VFs
[09:53:13] [PASSED] 58 VFs
[09:53:13] [PASSED] 59 VFs
[09:53:13] [PASSED] 60 VFs
[09:53:13] [PASSED] 61 VFs
[09:53:13] [PASSED] 62 VFs
[09:53:13] [PASSED] 63 VFs
[09:53:13] ==================== [PASSED] fair_ggtt ====================
[09:53:13] ================== [PASSED] pf_gt_config ===================
[09:53:13] ===================== lmtt (1 subtest) =====================
[09:53:13] ======================== test_ops =========================
[09:53:13] [PASSED] 2-level
[09:53:13] [PASSED] multi-level
[09:53:13] ==================== [PASSED] test_ops =====================
[09:53:13] ====================== [PASSED] lmtt =======================
[09:53:13] ================= pf_service (11 subtests) =================
[09:53:13] [PASSED] pf_negotiate_any
[09:53:13] [PASSED] pf_negotiate_base_match
[09:53:13] [PASSED] pf_negotiate_base_newer
[09:53:13] [PASSED] pf_negotiate_base_next
[09:53:13] [SKIPPED] pf_negotiate_base_older
[09:53:13] [PASSED] pf_negotiate_base_prev
[09:53:13] [PASSED] pf_negotiate_latest_match
[09:53:13] [PASSED] pf_negotiate_latest_newer
[09:53:13] [PASSED] pf_negotiate_latest_next
[09:53:13] [SKIPPED] pf_negotiate_latest_older
[09:53:13] [SKIPPED] pf_negotiate_latest_prev
[09:53:13] =================== [PASSED] pf_service ====================
[09:53:13] ================= xe_guc_g2g (2 subtests) ==================
[09:53:13] ============== xe_live_guc_g2g_kunit_default ==============
[09:53:13] ========= [SKIPPED] xe_live_guc_g2g_kunit_default ==========
[09:53:13] ============== xe_live_guc_g2g_kunit_allmem ===============
[09:53:13] ========== [SKIPPED] xe_live_guc_g2g_kunit_allmem ==========
[09:53:13] =================== [SKIPPED] xe_guc_g2g ===================
[09:53:13] =================== xe_mocs (2 subtests) ===================
[09:53:13] ================ xe_live_mocs_kernel_kunit ================
[09:53:13] =========== [SKIPPED] xe_live_mocs_kernel_kunit ============
[09:53:13] ================ xe_live_mocs_reset_kunit =================
[09:53:13] ============ [SKIPPED] xe_live_mocs_reset_kunit ============
[09:53:13] ==================== [SKIPPED] xe_mocs =====================
[09:53:13] ================= xe_migrate (2 subtests) ==================
[09:53:13] ================= xe_migrate_sanity_kunit =================
[09:53:13] ============ [SKIPPED] xe_migrate_sanity_kunit =============
[09:53:13] ================== xe_validate_ccs_kunit ==================
[09:53:13] ============= [SKIPPED] xe_validate_ccs_kunit ==============
[09:53:13] =================== [SKIPPED] xe_migrate ===================
[09:53:13] ================== xe_dma_buf (1 subtest) ==================
[09:53:13] ==================== xe_dma_buf_kunit =====================
[09:53:13] ================ [SKIPPED] xe_dma_buf_kunit ================
[09:53:13] =================== [SKIPPED] xe_dma_buf ===================
[09:53:13] ================= xe_bo_shrink (1 subtest) =================
[09:53:13] =================== xe_bo_shrink_kunit ====================
[09:53:13] =============== [SKIPPED] xe_bo_shrink_kunit ===============
[09:53:13] ================== [SKIPPED] xe_bo_shrink ==================
[09:53:13] ==================== xe_bo (2 subtests) ====================
[09:53:13] ================== xe_ccs_migrate_kunit ===================
[09:53:13] ============== [SKIPPED] xe_ccs_migrate_kunit ==============
[09:53:13] ==================== xe_bo_evict_kunit ====================
[09:53:13] =============== [SKIPPED] xe_bo_evict_kunit ================
[09:53:13] ===================== [SKIPPED] xe_bo ======================
[09:53:13] ==================== args (11 subtests) ====================
[09:53:13] [PASSED] count_args_test
[09:53:13] [PASSED] call_args_example
[09:53:13] [PASSED] call_args_test
[09:53:13] [PASSED] drop_first_arg_example
[09:53:13] [PASSED] drop_first_arg_test
[09:53:13] [PASSED] first_arg_example
[09:53:13] [PASSED] first_arg_test
[09:53:13] [PASSED] last_arg_example
[09:53:13] [PASSED] last_arg_test
[09:53:13] [PASSED] pick_arg_example
[09:53:13] [PASSED] sep_comma_example
[09:53:13] ====================== [PASSED] args =======================
[09:53:13] =================== xe_pci (3 subtests) ====================
[09:53:13] ==================== check_graphics_ip ====================
[09:53:13] [PASSED] 12.00 Xe_LP
[09:53:13] [PASSED] 12.10 Xe_LP+
[09:53:13] [PASSED] 12.55 Xe_HPG
[09:53:13] [PASSED] 12.60 Xe_HPC
[09:53:13] [PASSED] 12.70 Xe_LPG
[09:53:13] [PASSED] 12.71 Xe_LPG
[09:53:13] [PASSED] 12.74 Xe_LPG+
[09:53:13] [PASSED] 20.01 Xe2_HPG
[09:53:13] [PASSED] 20.02 Xe2_HPG
[09:53:13] [PASSED] 20.04 Xe2_LPG
[09:53:13] [PASSED] 30.00 Xe3_LPG
[09:53:13] [PASSED] 30.01 Xe3_LPG
[09:53:13] [PASSED] 30.03 Xe3_LPG
[09:53:13] [PASSED] 30.04 Xe3_LPG
[09:53:13] [PASSED] 30.05 Xe3_LPG
[09:53:13] [PASSED] 35.11 Xe3p_XPC
[09:53:13] ================ [PASSED] check_graphics_ip ================
[09:53:13] ===================== check_media_ip ======================
[09:53:13] [PASSED] 12.00 Xe_M
[09:53:13] [PASSED] 12.55 Xe_HPM
[09:53:13] [PASSED] 13.00 Xe_LPM+
[09:53:13] [PASSED] 13.01 Xe2_HPM
[09:53:13] [PASSED] 20.00 Xe2_LPM
[09:53:13] [PASSED] 30.00 Xe3_LPM
[09:53:13] [PASSED] 30.02 Xe3_LPM
[09:53:13] [PASSED] 35.00 Xe3p_LPM
[09:53:13] [PASSED] 35.03 Xe3p_HPM
[09:53:13] ================= [PASSED] check_media_ip ==================
[09:53:13] =================== check_platform_desc ===================
[09:53:13] [PASSED] 0x9A60 (TIGERLAKE)
[09:53:13] [PASSED] 0x9A68 (TIGERLAKE)
[09:53:13] [PASSED] 0x9A70 (TIGERLAKE)
[09:53:13] [PASSED] 0x9A40 (TIGERLAKE)
[09:53:13] [PASSED] 0x9A49 (TIGERLAKE)
[09:53:13] [PASSED] 0x9A59 (TIGERLAKE)
[09:53:13] [PASSED] 0x9A78 (TIGERLAKE)
[09:53:13] [PASSED] 0x9AC0 (TIGERLAKE)
[09:53:13] [PASSED] 0x9AC9 (TIGERLAKE)
[09:53:13] [PASSED] 0x9AD9 (TIGERLAKE)
[09:53:13] [PASSED] 0x9AF8 (TIGERLAKE)
[09:53:13] [PASSED] 0x4C80 (ROCKETLAKE)
[09:53:13] [PASSED] 0x4C8A (ROCKETLAKE)
[09:53:13] [PASSED] 0x4C8B (ROCKETLAKE)
[09:53:13] [PASSED] 0x4C8C (ROCKETLAKE)
[09:53:13] [PASSED] 0x4C90 (ROCKETLAKE)
[09:53:13] [PASSED] 0x4C9A (ROCKETLAKE)
[09:53:13] [PASSED] 0x4680 (ALDERLAKE_S)
[09:53:13] [PASSED] 0x4682 (ALDERLAKE_S)
[09:53:13] [PASSED] 0x4688 (ALDERLAKE_S)
[09:53:13] [PASSED] 0x468A (ALDERLAKE_S)
[09:53:13] [PASSED] 0x468B (ALDERLAKE_S)
[09:53:13] [PASSED] 0x4690 (ALDERLAKE_S)
[09:53:13] [PASSED] 0x4692 (ALDERLAKE_S)
[09:53:13] [PASSED] 0x4693 (ALDERLAKE_S)
[09:53:13] [PASSED] 0x46A0 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46A1 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46A2 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46A3 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46A6 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46A8 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46AA (ALDERLAKE_P)
[09:53:13] [PASSED] 0x462A (ALDERLAKE_P)
[09:53:13] [PASSED] 0x4626 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x4628 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46B0 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46B1 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46B2 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46B3 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46C0 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46C1 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46C2 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46C3 (ALDERLAKE_P)
[09:53:13] [PASSED] 0x46D0 (ALDERLAKE_N)
[09:53:13] [PASSED] 0x46D1 (ALDERLAKE_N)
[09:53:13] [PASSED] 0x46D2 (ALDERLAKE_N)
[09:53:13] [PASSED] 0x46D3 (ALDERLAKE_N)
[09:53:13] [PASSED] 0x46D4 (ALDERLAKE_N)
[09:53:13] [PASSED] 0xA721 (ALDERLAKE_P)
[09:53:13] [PASSED] 0xA7A1 (ALDERLAKE_P)
[09:53:13] [PASSED] 0xA7A9 (ALDERLAKE_P)
[09:53:13] [PASSED] 0xA7AC (ALDERLAKE_P)
[09:53:13] [PASSED] 0xA7AD (ALDERLAKE_P)
[09:53:13] [PASSED] 0xA720 (ALDERLAKE_P)
[09:53:13] [PASSED] 0xA7A0 (ALDERLAKE_P)
[09:53:13] [PASSED] 0xA7A8 (ALDERLAKE_P)
[09:53:13] [PASSED] 0xA7AA (ALDERLAKE_P)
[09:53:13] [PASSED] 0xA7AB (ALDERLAKE_P)
[09:53:13] [PASSED] 0xA780 (ALDERLAKE_S)
[09:53:13] [PASSED] 0xA781 (ALDERLAKE_S)
[09:53:13] [PASSED] 0xA782 (ALDERLAKE_S)
[09:53:13] [PASSED] 0xA783 (ALDERLAKE_S)
[09:53:13] [PASSED] 0xA788 (ALDERLAKE_S)
[09:53:13] [PASSED] 0xA789 (ALDERLAKE_S)
[09:53:13] [PASSED] 0xA78A (ALDERLAKE_S)
[09:53:13] [PASSED] 0xA78B (ALDERLAKE_S)
[09:53:13] [PASSED] 0x4905 (DG1)
[09:53:13] [PASSED] 0x4906 (DG1)
[09:53:13] [PASSED] 0x4907 (DG1)
[09:53:13] [PASSED] 0x4908 (DG1)
[09:53:13] [PASSED] 0x4909 (DG1)
[09:53:13] [PASSED] 0x56C0 (DG2)
[09:53:13] [PASSED] 0x56C2 (DG2)
[09:53:13] [PASSED] 0x56C1 (DG2)
[09:53:13] [PASSED] 0x7D51 (METEORLAKE)
[09:53:13] [PASSED] 0x7DD1 (METEORLAKE)
[09:53:13] [PASSED] 0x7D41 (METEORLAKE)
[09:53:13] [PASSED] 0x7D67 (METEORLAKE)
[09:53:13] [PASSED] 0xB640 (METEORLAKE)
[09:53:13] [PASSED] 0x56A0 (DG2)
[09:53:13] [PASSED] 0x56A1 (DG2)
[09:53:13] [PASSED] 0x56A2 (DG2)
[09:53:13] [PASSED] 0x56BE (DG2)
[09:53:13] [PASSED] 0x56BF (DG2)
[09:53:13] [PASSED] 0x5690 (DG2)
[09:53:13] [PASSED] 0x5691 (DG2)
[09:53:13] [PASSED] 0x5692 (DG2)
[09:53:13] [PASSED] 0x56A5 (DG2)
[09:53:13] [PASSED] 0x56A6 (DG2)
[09:53:13] [PASSED] 0x56B0 (DG2)
[09:53:13] [PASSED] 0x56B1 (DG2)
[09:53:13] [PASSED] 0x56BA (DG2)
[09:53:13] [PASSED] 0x56BB (DG2)
[09:53:13] [PASSED] 0x56BC (DG2)
[09:53:13] [PASSED] 0x56BD (DG2)
[09:53:13] [PASSED] 0x5693 (DG2)
[09:53:13] [PASSED] 0x5694 (DG2)
[09:53:13] [PASSED] 0x5695 (DG2)
[09:53:13] [PASSED] 0x56A3 (DG2)
[09:53:13] [PASSED] 0x56A4 (DG2)
[09:53:13] [PASSED] 0x56B2 (DG2)
[09:53:13] [PASSED] 0x56B3 (DG2)
[09:53:13] [PASSED] 0x5696 (DG2)
[09:53:13] [PASSED] 0x5697 (DG2)
[09:53:13] [PASSED] 0xB69 (PVC)
[09:53:13] [PASSED] 0xB6E (PVC)
[09:53:13] [PASSED] 0xBD4 (PVC)
[09:53:13] [PASSED] 0xBD5 (PVC)
[09:53:13] [PASSED] 0xBD6 (PVC)
[09:53:13] [PASSED] 0xBD7 (PVC)
[09:53:13] [PASSED] 0xBD8 (PVC)
[09:53:13] [PASSED] 0xBD9 (PVC)
[09:53:13] [PASSED] 0xBDA (PVC)
[09:53:13] [PASSED] 0xBDB (PVC)
[09:53:13] [PASSED] 0xBE0 (PVC)
[09:53:13] [PASSED] 0xBE1 (PVC)
[09:53:13] [PASSED] 0xBE5 (PVC)
[09:53:13] [PASSED] 0x7D40 (METEORLAKE)
[09:53:13] [PASSED] 0x7D45 (METEORLAKE)
[09:53:13] [PASSED] 0x7D55 (METEORLAKE)
[09:53:13] [PASSED] 0x7D60 (METEORLAKE)
[09:53:13] [PASSED] 0x7DD5 (METEORLAKE)
[09:53:13] [PASSED] 0x6420 (LUNARLAKE)
[09:53:13] [PASSED] 0x64A0 (LUNARLAKE)
[09:53:13] [PASSED] 0x64B0 (LUNARLAKE)
[09:53:13] [PASSED] 0xE202 (BATTLEMAGE)
[09:53:13] [PASSED] 0xE209 (BATTLEMAGE)
[09:53:13] [PASSED] 0xE20B (BATTLEMAGE)
[09:53:13] [PASSED] 0xE20C (BATTLEMAGE)
[09:53:13] [PASSED] 0xE20D (BATTLEMAGE)
[09:53:13] [PASSED] 0xE210 (BATTLEMAGE)
[09:53:13] [PASSED] 0xE211 (BATTLEMAGE)
[09:53:13] [PASSED] 0xE212 (BATTLEMAGE)
[09:53:13] [PASSED] 0xE216 (BATTLEMAGE)
[09:53:13] [PASSED] 0xE220 (BATTLEMAGE)
[09:53:13] [PASSED] 0xE221 (BATTLEMAGE)
[09:53:13] [PASSED] 0xE222 (BATTLEMAGE)
[09:53:13] [PASSED] 0xE223 (BATTLEMAGE)
[09:53:13] [PASSED] 0xB080 (PANTHERLAKE)
[09:53:13] [PASSED] 0xB081 (PANTHERLAKE)
[09:53:13] [PASSED] 0xB082 (PANTHERLAKE)
[09:53:13] [PASSED] 0xB083 (PANTHERLAKE)
[09:53:13] [PASSED] 0xB084 (PANTHERLAKE)
[09:53:13] [PASSED] 0xB085 (PANTHERLAKE)
[09:53:13] [PASSED] 0xB086 (PANTHERLAKE)
[09:53:13] [PASSED] 0xB087 (PANTHERLAKE)
[09:53:13] [PASSED] 0xB08F (PANTHERLAKE)
[09:53:13] [PASSED] 0xB090 (PANTHERLAKE)
[09:53:13] [PASSED] 0xB0A0 (PANTHERLAKE)
[09:53:13] [PASSED] 0xB0B0 (PANTHERLAKE)
[09:53:13] [PASSED] 0xD740 (NOVALAKE_S)
[09:53:13] [PASSED] 0xD741 (NOVALAKE_S)
[09:53:13] [PASSED] 0xD742 (NOVALAKE_S)
[09:53:13] [PASSED] 0xD743 (NOVALAKE_S)
[09:53:13] [PASSED] 0xD744 (NOVALAKE_S)
[09:53:13] [PASSED] 0xD745 (NOVALAKE_S)
[09:53:13] [PASSED] 0x674C (CRESCENTISLAND)
[09:53:13] [PASSED] 0xFD80 (PANTHERLAKE)
[09:53:13] [PASSED] 0xFD81 (PANTHERLAKE)
[09:53:13] =============== [PASSED] check_platform_desc ===============
[09:53:13] ===================== [PASSED] xe_pci ======================
[09:53:13] =================== xe_rtp (2 subtests) ====================
[09:53:13] =============== xe_rtp_process_to_sr_tests ================
[09:53:13] [PASSED] coalesce-same-reg
[09:53:13] [PASSED] no-match-no-add
[09:53:13] [PASSED] match-or
[09:53:13] [PASSED] match-or-xfail
[09:53:13] [PASSED] no-match-no-add-multiple-rules
[09:53:13] [PASSED] two-regs-two-entries
[09:53:13] [PASSED] clr-one-set-other
[09:53:13] [PASSED] set-field
[09:53:13] [PASSED] conflict-duplicate
[09:53:13] [PASSED] conflict-not-disjoint
[09:53:13] [PASSED] conflict-reg-type
[09:53:13] =========== [PASSED] xe_rtp_process_to_sr_tests ============
[09:53:13] ================== xe_rtp_process_tests ===================
[09:53:13] [PASSED] active1
[09:53:13] [PASSED] active2
[09:53:13] [PASSED] active-inactive
[09:53:13] [PASSED] inactive-active
[09:53:13] [PASSED] inactive-1st_or_active-inactive
[09:53:13] [PASSED] inactive-2nd_or_active-inactive
[09:53:13] [PASSED] inactive-last_or_active-inactive
[09:53:13] [PASSED] inactive-no_or_active-inactive
[09:53:13] ============== [PASSED] xe_rtp_process_tests ===============
[09:53:13] ===================== [PASSED] xe_rtp ======================
[09:53:13] ==================== xe_wa (1 subtest) =====================
[09:53:13] ======================== xe_wa_gt =========================
[09:53:13] [PASSED] TIGERLAKE B0
[09:53:13] [PASSED] DG1 A0
[09:53:13] [PASSED] DG1 B0
[09:53:13] [PASSED] ALDERLAKE_S A0
[09:53:13] [PASSED] ALDERLAKE_S B0
[09:53:13] [PASSED] ALDERLAKE_S C0
[09:53:13] [PASSED] ALDERLAKE_S D0
[09:53:13] [PASSED] ALDERLAKE_P A0
[09:53:13] [PASSED] ALDERLAKE_P B0
[09:53:13] [PASSED] ALDERLAKE_P C0
[09:53:13] [PASSED] ALDERLAKE_S RPLS D0
[09:53:13] [PASSED] ALDERLAKE_P RPLU E0
[09:53:13] [PASSED] DG2 G10 C0
[09:53:13] [PASSED] DG2 G11 B1
[09:53:13] [PASSED] DG2 G12 A1
[09:53:13] [PASSED] METEORLAKE 12.70(Xe_LPG) A0 13.00(Xe_LPM+) A0
[09:53:13] [PASSED] METEORLAKE 12.71(Xe_LPG) A0 13.00(Xe_LPM+) A0
[09:53:13] [PASSED] METEORLAKE 12.74(Xe_LPG+) A0 13.00(Xe_LPM+) A0
[09:53:13] [PASSED] LUNARLAKE 20.04(Xe2_LPG) A0 20.00(Xe2_LPM) A0
[09:53:13] [PASSED] LUNARLAKE 20.04(Xe2_LPG) B0 20.00(Xe2_LPM) A0
[09:53:13] [PASSED] BATTLEMAGE 20.01(Xe2_HPG) A0 13.01(Xe2_HPM) A1
[09:53:13] [PASSED] PANTHERLAKE 30.00(Xe3_LPG) A0 30.00(Xe3_LPM) A0
[09:53:13] ==================== [PASSED] xe_wa_gt =====================
[09:53:13] ====================== [PASSED] xe_wa ======================
[09:53:13] ============================================================
[09:53:13] Testing complete. Ran 510 tests: passed: 492, skipped: 18
[09:53:13] Elapsed time: 35.486s total, 4.241s configuring, 30.779s building, 0.457s running
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/tests/.kunitconfig
[09:53:13] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[09:53:15] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[09:53:39] Starting KUnit Kernel (1/1)...
[09:53:39] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[09:53:40] ============ drm_test_pick_cmdline (2 subtests) ============
[09:53:40] [PASSED] drm_test_pick_cmdline_res_1920_1080_60
[09:53:40] =============== drm_test_pick_cmdline_named ===============
[09:53:40] [PASSED] NTSC
[09:53:40] [PASSED] NTSC-J
[09:53:40] [PASSED] PAL
[09:53:40] [PASSED] PAL-M
[09:53:40] =========== [PASSED] drm_test_pick_cmdline_named ===========
[09:53:40] ============== [PASSED] drm_test_pick_cmdline ==============
[09:53:40] == drm_test_atomic_get_connector_for_encoder (1 subtest) ===
[09:53:40] [PASSED] drm_test_drm_atomic_get_connector_for_encoder
[09:53:40] ==== [PASSED] drm_test_atomic_get_connector_for_encoder ====
[09:53:40] =========== drm_validate_clone_mode (2 subtests) ===========
[09:53:40] ============== drm_test_check_in_clone_mode ===============
[09:53:40] [PASSED] in_clone_mode
[09:53:40] [PASSED] not_in_clone_mode
[09:53:40] ========== [PASSED] drm_test_check_in_clone_mode ===========
[09:53:40] =============== drm_test_check_valid_clones ===============
[09:53:40] [PASSED] not_in_clone_mode
[09:53:40] [PASSED] valid_clone
[09:53:40] [PASSED] invalid_clone
[09:53:40] =========== [PASSED] drm_test_check_valid_clones ===========
[09:53:40] ============= [PASSED] drm_validate_clone_mode =============
[09:53:40] ============= drm_validate_modeset (1 subtest) =============
[09:53:40] [PASSED] drm_test_check_connector_changed_modeset
[09:53:40] ============== [PASSED] drm_validate_modeset ===============
[09:53:40] ====== drm_test_bridge_get_current_state (2 subtests) ======
[09:53:40] [PASSED] drm_test_drm_bridge_get_current_state_atomic
[09:53:40] [PASSED] drm_test_drm_bridge_get_current_state_legacy
[09:53:40] ======== [PASSED] drm_test_bridge_get_current_state ========
[09:53:40] ====== drm_test_bridge_helper_reset_crtc (3 subtests) ======
[09:53:40] [PASSED] drm_test_drm_bridge_helper_reset_crtc_atomic
[09:53:40] [PASSED] drm_test_drm_bridge_helper_reset_crtc_atomic_disabled
[09:53:40] [PASSED] drm_test_drm_bridge_helper_reset_crtc_legacy
[09:53:40] ======== [PASSED] drm_test_bridge_helper_reset_crtc ========
[09:53:40] ============== drm_bridge_alloc (2 subtests) ===============
[09:53:40] [PASSED] drm_test_drm_bridge_alloc_basic
[09:53:40] [PASSED] drm_test_drm_bridge_alloc_get_put
[09:53:40] ================ [PASSED] drm_bridge_alloc =================
[09:53:40] ================== drm_buddy (8 subtests) ==================
[09:53:40] [PASSED] drm_test_buddy_alloc_limit
[09:53:40] [PASSED] drm_test_buddy_alloc_optimistic
[09:53:40] [PASSED] drm_test_buddy_alloc_pessimistic
[09:53:40] [PASSED] drm_test_buddy_alloc_pathological
[09:53:40] [PASSED] drm_test_buddy_alloc_contiguous
[09:53:40] [PASSED] drm_test_buddy_alloc_clear
[09:53:40] [PASSED] drm_test_buddy_alloc_range_bias
[09:53:40] [PASSED] drm_test_buddy_fragmentation_performance
[09:53:40] ==================== [PASSED] drm_buddy ====================
[09:53:40] ============= drm_cmdline_parser (40 subtests) =============
[09:53:40] [PASSED] drm_test_cmdline_force_d_only
[09:53:40] [PASSED] drm_test_cmdline_force_D_only_dvi
[09:53:40] [PASSED] drm_test_cmdline_force_D_only_hdmi
[09:53:40] [PASSED] drm_test_cmdline_force_D_only_not_digital
[09:53:40] [PASSED] drm_test_cmdline_force_e_only
[09:53:40] [PASSED] drm_test_cmdline_res
[09:53:40] [PASSED] drm_test_cmdline_res_vesa
[09:53:40] [PASSED] drm_test_cmdline_res_vesa_rblank
[09:53:40] [PASSED] drm_test_cmdline_res_rblank
[09:53:40] [PASSED] drm_test_cmdline_res_bpp
[09:53:40] [PASSED] drm_test_cmdline_res_refresh
[09:53:40] [PASSED] drm_test_cmdline_res_bpp_refresh
[09:53:40] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced
[09:53:40] [PASSED] drm_test_cmdline_res_bpp_refresh_margins
[09:53:40] [PASSED] drm_test_cmdline_res_bpp_refresh_force_off
[09:53:40] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on
[09:53:40] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_analog
[09:53:40] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_digital
[09:53:40] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced_margins_force_on
[09:53:40] [PASSED] drm_test_cmdline_res_margins_force_on
[09:53:40] [PASSED] drm_test_cmdline_res_vesa_margins
[09:53:40] [PASSED] drm_test_cmdline_name
[09:53:40] [PASSED] drm_test_cmdline_name_bpp
[09:53:40] [PASSED] drm_test_cmdline_name_option
[09:53:40] [PASSED] drm_test_cmdline_name_bpp_option
[09:53:40] [PASSED] drm_test_cmdline_rotate_0
[09:53:40] [PASSED] drm_test_cmdline_rotate_90
[09:53:40] [PASSED] drm_test_cmdline_rotate_180
[09:53:40] [PASSED] drm_test_cmdline_rotate_270
[09:53:40] [PASSED] drm_test_cmdline_hmirror
[09:53:40] [PASSED] drm_test_cmdline_vmirror
[09:53:40] [PASSED] drm_test_cmdline_margin_options
[09:53:40] [PASSED] drm_test_cmdline_multiple_options
[09:53:40] [PASSED] drm_test_cmdline_bpp_extra_and_option
[09:53:40] [PASSED] drm_test_cmdline_extra_and_option
[09:53:40] [PASSED] drm_test_cmdline_freestanding_options
[09:53:40] [PASSED] drm_test_cmdline_freestanding_force_e_and_options
[09:53:40] [PASSED] drm_test_cmdline_panel_orientation
[09:53:40] ================ drm_test_cmdline_invalid =================
[09:53:40] [PASSED] margin_only
[09:53:40] [PASSED] interlace_only
[09:53:40] [PASSED] res_missing_x
[09:53:40] [PASSED] res_missing_y
[09:53:40] [PASSED] res_bad_y
[09:53:40] [PASSED] res_missing_y_bpp
[09:53:40] [PASSED] res_bad_bpp
[09:53:40] [PASSED] res_bad_refresh
[09:53:40] [PASSED] res_bpp_refresh_force_on_off
[09:53:40] [PASSED] res_invalid_mode
[09:53:40] [PASSED] res_bpp_wrong_place_mode
[09:53:40] [PASSED] name_bpp_refresh
[09:53:40] [PASSED] name_refresh
[09:53:40] [PASSED] name_refresh_wrong_mode
[09:53:40] [PASSED] name_refresh_invalid_mode
[09:53:40] [PASSED] rotate_multiple
[09:53:40] [PASSED] rotate_invalid_val
[09:53:40] [PASSED] rotate_truncated
[09:53:40] [PASSED] invalid_option
[09:53:40] [PASSED] invalid_tv_option
[09:53:40] [PASSED] truncated_tv_option
[09:53:40] ============ [PASSED] drm_test_cmdline_invalid =============
[09:53:40] =============== drm_test_cmdline_tv_options ===============
[09:53:40] [PASSED] NTSC
[09:53:40] [PASSED] NTSC_443
[09:53:40] [PASSED] NTSC_J
[09:53:40] [PASSED] PAL
[09:53:40] [PASSED] PAL_M
[09:53:40] [PASSED] PAL_N
[09:53:40] [PASSED] SECAM
[09:53:40] [PASSED] MONO_525
[09:53:40] [PASSED] MONO_625
[09:53:40] =========== [PASSED] drm_test_cmdline_tv_options ===========
[09:53:40] =============== [PASSED] drm_cmdline_parser ================
[09:53:40] ========== drmm_connector_hdmi_init (20 subtests) ==========
[09:53:40] [PASSED] drm_test_connector_hdmi_init_valid
[09:53:40] [PASSED] drm_test_connector_hdmi_init_bpc_8
[09:53:40] [PASSED] drm_test_connector_hdmi_init_bpc_10
[09:53:40] [PASSED] drm_test_connector_hdmi_init_bpc_12
[09:53:40] [PASSED] drm_test_connector_hdmi_init_bpc_invalid
[09:53:40] [PASSED] drm_test_connector_hdmi_init_bpc_null
[09:53:40] [PASSED] drm_test_connector_hdmi_init_formats_empty
[09:53:40] [PASSED] drm_test_connector_hdmi_init_formats_no_rgb
[09:53:40] === drm_test_connector_hdmi_init_formats_yuv420_allowed ===
[09:53:40] [PASSED] supported_formats=0x9 yuv420_allowed=1
[09:53:40] [PASSED] supported_formats=0x9 yuv420_allowed=0
[09:53:40] [PASSED] supported_formats=0x3 yuv420_allowed=1
[09:53:40] [PASSED] supported_formats=0x3 yuv420_allowed=0
[09:53:40] === [PASSED] drm_test_connector_hdmi_init_formats_yuv420_allowed ===
[09:53:40] [PASSED] drm_test_connector_hdmi_init_null_ddc
[09:53:40] [PASSED] drm_test_connector_hdmi_init_null_product
[09:53:40] [PASSED] drm_test_connector_hdmi_init_null_vendor
[09:53:40] [PASSED] drm_test_connector_hdmi_init_product_length_exact
[09:53:40] [PASSED] drm_test_connector_hdmi_init_product_length_too_long
[09:53:40] [PASSED] drm_test_connector_hdmi_init_product_valid
[09:53:40] [PASSED] drm_test_connector_hdmi_init_vendor_length_exact
[09:53:40] [PASSED] drm_test_connector_hdmi_init_vendor_length_too_long
[09:53:40] [PASSED] drm_test_connector_hdmi_init_vendor_valid
[09:53:40] ========= drm_test_connector_hdmi_init_type_valid =========
[09:53:40] [PASSED] HDMI-A
[09:53:40] [PASSED] HDMI-B
[09:53:40] ===== [PASSED] drm_test_connector_hdmi_init_type_valid =====
[09:53:40] ======== drm_test_connector_hdmi_init_type_invalid ========
[09:53:40] [PASSED] Unknown
[09:53:40] [PASSED] VGA
[09:53:40] [PASSED] DVI-I
[09:53:40] [PASSED] DVI-D
[09:53:40] [PASSED] DVI-A
[09:53:40] [PASSED] Composite
[09:53:40] [PASSED] SVIDEO
[09:53:40] [PASSED] LVDS
[09:53:40] [PASSED] Component
[09:53:40] [PASSED] DIN
[09:53:40] [PASSED] DP
[09:53:40] [PASSED] TV
[09:53:40] [PASSED] eDP
[09:53:40] [PASSED] Virtual
[09:53:40] [PASSED] DSI
[09:53:40] [PASSED] DPI
[09:53:40] [PASSED] Writeback
[09:53:40] [PASSED] SPI
[09:53:40] [PASSED] USB
[09:53:40] ==== [PASSED] drm_test_connector_hdmi_init_type_invalid ====
[09:53:40] ============ [PASSED] drmm_connector_hdmi_init =============
[09:53:40] ============= drmm_connector_init (3 subtests) =============
[09:53:40] [PASSED] drm_test_drmm_connector_init
[09:53:40] [PASSED] drm_test_drmm_connector_init_null_ddc
[09:53:40] ========= drm_test_drmm_connector_init_type_valid =========
[09:53:40] [PASSED] Unknown
[09:53:40] [PASSED] VGA
[09:53:40] [PASSED] DVI-I
[09:53:40] [PASSED] DVI-D
[09:53:40] [PASSED] DVI-A
[09:53:40] [PASSED] Composite
[09:53:40] [PASSED] SVIDEO
[09:53:40] [PASSED] LVDS
[09:53:40] [PASSED] Component
[09:53:40] [PASSED] DIN
[09:53:40] [PASSED] DP
[09:53:40] [PASSED] HDMI-A
[09:53:40] [PASSED] HDMI-B
[09:53:40] [PASSED] TV
[09:53:40] [PASSED] eDP
[09:53:40] [PASSED] Virtual
[09:53:40] [PASSED] DSI
[09:53:40] [PASSED] DPI
[09:53:40] [PASSED] Writeback
[09:53:40] [PASSED] SPI
[09:53:40] [PASSED] USB
[09:53:40] ===== [PASSED] drm_test_drmm_connector_init_type_valid =====
[09:53:40] =============== [PASSED] drmm_connector_init ===============
[09:53:40] ========= drm_connector_dynamic_init (6 subtests) ==========
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_init
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_init_null_ddc
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_init_not_added
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_init_properties
[09:53:40] ===== drm_test_drm_connector_dynamic_init_type_valid ======
[09:53:40] [PASSED] Unknown
[09:53:40] [PASSED] VGA
[09:53:40] [PASSED] DVI-I
[09:53:40] [PASSED] DVI-D
[09:53:40] [PASSED] DVI-A
[09:53:40] [PASSED] Composite
[09:53:40] [PASSED] SVIDEO
[09:53:40] [PASSED] LVDS
[09:53:40] [PASSED] Component
[09:53:40] [PASSED] DIN
[09:53:40] [PASSED] DP
[09:53:40] [PASSED] HDMI-A
[09:53:40] [PASSED] HDMI-B
[09:53:40] [PASSED] TV
[09:53:40] [PASSED] eDP
[09:53:40] [PASSED] Virtual
[09:53:40] [PASSED] DSI
[09:53:40] [PASSED] DPI
[09:53:40] [PASSED] Writeback
[09:53:40] [PASSED] SPI
[09:53:40] [PASSED] USB
[09:53:40] = [PASSED] drm_test_drm_connector_dynamic_init_type_valid ==
[09:53:40] ======== drm_test_drm_connector_dynamic_init_name =========
[09:53:40] [PASSED] Unknown
[09:53:40] [PASSED] VGA
[09:53:40] [PASSED] DVI-I
[09:53:40] [PASSED] DVI-D
[09:53:40] [PASSED] DVI-A
[09:53:40] [PASSED] Composite
[09:53:40] [PASSED] SVIDEO
[09:53:40] [PASSED] LVDS
[09:53:40] [PASSED] Component
[09:53:40] [PASSED] DIN
[09:53:40] [PASSED] DP
[09:53:40] [PASSED] HDMI-A
[09:53:40] [PASSED] HDMI-B
[09:53:40] [PASSED] TV
[09:53:40] [PASSED] eDP
[09:53:40] [PASSED] Virtual
[09:53:40] [PASSED] DSI
[09:53:40] [PASSED] DPI
[09:53:40] [PASSED] Writeback
[09:53:40] [PASSED] SPI
[09:53:40] [PASSED] USB
[09:53:40] ==== [PASSED] drm_test_drm_connector_dynamic_init_name =====
[09:53:40] =========== [PASSED] drm_connector_dynamic_init ============
[09:53:40] ==== drm_connector_dynamic_register_early (4 subtests) =====
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_register_early_on_list
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_register_early_defer
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_register_early_no_init
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_register_early_no_mode_object
[09:53:40] ====== [PASSED] drm_connector_dynamic_register_early =======
[09:53:40] ======= drm_connector_dynamic_register (7 subtests) ========
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_register_on_list
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_register_no_defer
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_register_no_init
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_register_mode_object
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_register_sysfs
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_register_sysfs_name
[09:53:40] [PASSED] drm_test_drm_connector_dynamic_register_debugfs
[09:53:40] ========= [PASSED] drm_connector_dynamic_register ==========
[09:53:40] = drm_connector_attach_broadcast_rgb_property (2 subtests) =
[09:53:40] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property
[09:53:40] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property_hdmi_connector
[09:53:40] === [PASSED] drm_connector_attach_broadcast_rgb_property ===
[09:53:40] ========== drm_get_tv_mode_from_name (2 subtests) ==========
[09:53:40] ========== drm_test_get_tv_mode_from_name_valid ===========
[09:53:40] [PASSED] NTSC
[09:53:40] [PASSED] NTSC-443
[09:53:40] [PASSED] NTSC-J
[09:53:40] [PASSED] PAL
[09:53:40] [PASSED] PAL-M
[09:53:40] [PASSED] PAL-N
[09:53:40] [PASSED] SECAM
[09:53:40] [PASSED] Mono
[09:53:40] ====== [PASSED] drm_test_get_tv_mode_from_name_valid =======
[09:53:40] [PASSED] drm_test_get_tv_mode_from_name_truncated
[09:53:40] ============ [PASSED] drm_get_tv_mode_from_name ============
[09:53:40] = drm_test_connector_hdmi_compute_mode_clock (12 subtests) =
[09:53:40] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb
[09:53:40] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc
[09:53:40] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc_vic_1
[09:53:40] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc
[09:53:40] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc_vic_1
[09:53:40] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_double
[09:53:40] = drm_test_connector_hdmi_compute_mode_clock_yuv420_valid =
[09:53:40] [PASSED] VIC 96
[09:53:40] [PASSED] VIC 97
[09:53:40] [PASSED] VIC 101
[09:53:40] [PASSED] VIC 102
[09:53:40] [PASSED] VIC 106
[09:53:40] [PASSED] VIC 107
[09:53:40] === [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_valid ===
[09:53:40] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_10_bpc
[09:53:40] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_12_bpc
[09:53:40] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_8_bpc
[09:53:40] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_10_bpc
[09:53:40] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_12_bpc
[09:53:40] === [PASSED] drm_test_connector_hdmi_compute_mode_clock ====
[09:53:40] == drm_hdmi_connector_get_broadcast_rgb_name (2 subtests) ==
[09:53:40] === drm_test_drm_hdmi_connector_get_broadcast_rgb_name ====
[09:53:40] [PASSED] Automatic
[09:53:40] [PASSED] Full
[09:53:40] [PASSED] Limited 16:235
[09:53:40] === [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name ===
[09:53:40] [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name_invalid
[09:53:40] ==== [PASSED] drm_hdmi_connector_get_broadcast_rgb_name ====
[09:53:40] == drm_hdmi_connector_get_output_format_name (2 subtests) ==
[09:53:40] === drm_test_drm_hdmi_connector_get_output_format_name ====
[09:53:40] [PASSED] RGB
[09:53:40] [PASSED] YUV 4:2:0
[09:53:40] [PASSED] YUV 4:2:2
[09:53:40] [PASSED] YUV 4:4:4
[09:53:40] === [PASSED] drm_test_drm_hdmi_connector_get_output_format_name ===
[09:53:40] [PASSED] drm_test_drm_hdmi_connector_get_output_format_name_invalid
[09:53:40] ==== [PASSED] drm_hdmi_connector_get_output_format_name ====
[09:53:40] ============= drm_damage_helper (21 subtests) ==============
[09:53:40] [PASSED] drm_test_damage_iter_no_damage
[09:53:40] [PASSED] drm_test_damage_iter_no_damage_fractional_src
[09:53:40] [PASSED] drm_test_damage_iter_no_damage_src_moved
[09:53:40] [PASSED] drm_test_damage_iter_no_damage_fractional_src_moved
[09:53:40] [PASSED] drm_test_damage_iter_no_damage_not_visible
[09:53:40] [PASSED] drm_test_damage_iter_no_damage_no_crtc
[09:53:40] [PASSED] drm_test_damage_iter_no_damage_no_fb
[09:53:40] [PASSED] drm_test_damage_iter_simple_damage
[09:53:40] [PASSED] drm_test_damage_iter_single_damage
[09:53:40] [PASSED] drm_test_damage_iter_single_damage_intersect_src
[09:53:40] [PASSED] drm_test_damage_iter_single_damage_outside_src
[09:53:40] [PASSED] drm_test_damage_iter_single_damage_fractional_src
[09:53:40] [PASSED] drm_test_damage_iter_single_damage_intersect_fractional_src
[09:53:40] [PASSED] drm_test_damage_iter_single_damage_outside_fractional_src
[09:53:40] [PASSED] drm_test_damage_iter_single_damage_src_moved
[09:53:40] [PASSED] drm_test_damage_iter_single_damage_fractional_src_moved
[09:53:40] [PASSED] drm_test_damage_iter_damage
[09:53:40] [PASSED] drm_test_damage_iter_damage_one_intersect
[09:53:40] [PASSED] drm_test_damage_iter_damage_one_outside
[09:53:40] [PASSED] drm_test_damage_iter_damage_src_moved
[09:53:40] [PASSED] drm_test_damage_iter_damage_not_visible
[09:53:40] ================ [PASSED] drm_damage_helper ================
[09:53:40] ============== drm_dp_mst_helper (3 subtests) ==============
[09:53:40] ============== drm_test_dp_mst_calc_pbn_mode ==============
[09:53:40] [PASSED] Clock 154000 BPP 30 DSC disabled
[09:53:40] [PASSED] Clock 234000 BPP 30 DSC disabled
[09:53:40] [PASSED] Clock 297000 BPP 24 DSC disabled
[09:53:40] [PASSED] Clock 332880 BPP 24 DSC enabled
[09:53:40] [PASSED] Clock 324540 BPP 24 DSC enabled
[09:53:40] ========== [PASSED] drm_test_dp_mst_calc_pbn_mode ==========
[09:53:40] ============== drm_test_dp_mst_calc_pbn_div ===============
[09:53:40] [PASSED] Link rate 2000000 lane count 4
[09:53:40] [PASSED] Link rate 2000000 lane count 2
[09:53:40] [PASSED] Link rate 2000000 lane count 1
[09:53:40] [PASSED] Link rate 1350000 lane count 4
[09:53:40] [PASSED] Link rate 1350000 lane count 2
[09:53:40] [PASSED] Link rate 1350000 lane count 1
[09:53:40] [PASSED] Link rate 1000000 lane count 4
[09:53:40] [PASSED] Link rate 1000000 lane count 2
[09:53:40] [PASSED] Link rate 1000000 lane count 1
[09:53:40] [PASSED] Link rate 810000 lane count 4
[09:53:40] [PASSED] Link rate 810000 lane count 2
[09:53:40] [PASSED] Link rate 810000 lane count 1
[09:53:40] [PASSED] Link rate 540000 lane count 4
[09:53:40] [PASSED] Link rate 540000 lane count 2
[09:53:40] [PASSED] Link rate 540000 lane count 1
[09:53:40] [PASSED] Link rate 270000 lane count 4
[09:53:40] [PASSED] Link rate 270000 lane count 2
[09:53:40] [PASSED] Link rate 270000 lane count 1
[09:53:40] [PASSED] Link rate 162000 lane count 4
[09:53:40] [PASSED] Link rate 162000 lane count 2
[09:53:40] [PASSED] Link rate 162000 lane count 1
[09:53:40] ========== [PASSED] drm_test_dp_mst_calc_pbn_div ===========
[09:53:40] ========= drm_test_dp_mst_sideband_msg_req_decode =========
[09:53:40] [PASSED] DP_ENUM_PATH_RESOURCES with port number
[09:53:40] [PASSED] DP_POWER_UP_PHY with port number
[09:53:40] [PASSED] DP_POWER_DOWN_PHY with port number
[09:53:40] [PASSED] DP_ALLOCATE_PAYLOAD with SDP stream sinks
[09:53:40] [PASSED] DP_ALLOCATE_PAYLOAD with port number
[09:53:40] [PASSED] DP_ALLOCATE_PAYLOAD with VCPI
[09:53:40] [PASSED] DP_ALLOCATE_PAYLOAD with PBN
[09:53:40] [PASSED] DP_QUERY_PAYLOAD with port number
[09:53:40] [PASSED] DP_QUERY_PAYLOAD with VCPI
[09:53:40] [PASSED] DP_REMOTE_DPCD_READ with port number
[09:53:40] [PASSED] DP_REMOTE_DPCD_READ with DPCD address
[09:53:40] [PASSED] DP_REMOTE_DPCD_READ with max number of bytes
[09:53:40] [PASSED] DP_REMOTE_DPCD_WRITE with port number
[09:53:40] [PASSED] DP_REMOTE_DPCD_WRITE with DPCD address
[09:53:40] [PASSED] DP_REMOTE_DPCD_WRITE with data array
[09:53:40] [PASSED] DP_REMOTE_I2C_READ with port number
[09:53:40] [PASSED] DP_REMOTE_I2C_READ with I2C device ID
[09:53:40] [PASSED] DP_REMOTE_I2C_READ with transactions array
[09:53:40] [PASSED] DP_REMOTE_I2C_WRITE with port number
[09:53:40] [PASSED] DP_REMOTE_I2C_WRITE with I2C device ID
[09:53:40] [PASSED] DP_REMOTE_I2C_WRITE with data array
[09:53:40] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream ID
[09:53:40] [PASSED] DP_QUERY_STREAM_ENC_STATUS with client ID
[09:53:40] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream event
[09:53:40] [PASSED] DP_QUERY_STREAM_ENC_STATUS with valid stream event
[09:53:40] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream behavior
[09:53:40] [PASSED] DP_QUERY_STREAM_ENC_STATUS with a valid stream behavior
[09:53:40] ===== [PASSED] drm_test_dp_mst_sideband_msg_req_decode =====
[09:53:40] ================ [PASSED] drm_dp_mst_helper ================
[09:53:40] ================== drm_exec (7 subtests) ===================
[09:53:40] [PASSED] sanitycheck
[09:53:40] [PASSED] test_lock
[09:53:40] [PASSED] test_lock_unlock
[09:53:40] [PASSED] test_duplicates
[09:53:40] [PASSED] test_prepare
[09:53:40] [PASSED] test_prepare_array
[09:53:40] [PASSED] test_multiple_loops
[09:53:40] ==================== [PASSED] drm_exec =====================
[09:53:40] =========== drm_format_helper_test (17 subtests) ===========
[09:53:40] ============== drm_test_fb_xrgb8888_to_gray8 ==============
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ========== [PASSED] drm_test_fb_xrgb8888_to_gray8 ==========
[09:53:40] ============= drm_test_fb_xrgb8888_to_rgb332 ==============
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb332 ==========
[09:53:40] ============= drm_test_fb_xrgb8888_to_rgb565 ==============
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb565 ==========
[09:53:40] ============ drm_test_fb_xrgb8888_to_xrgb1555 =============
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ======== [PASSED] drm_test_fb_xrgb8888_to_xrgb1555 =========
[09:53:40] ============ drm_test_fb_xrgb8888_to_argb1555 =============
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ======== [PASSED] drm_test_fb_xrgb8888_to_argb1555 =========
[09:53:40] ============ drm_test_fb_xrgb8888_to_rgba5551 =============
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ======== [PASSED] drm_test_fb_xrgb8888_to_rgba5551 =========
[09:53:40] ============= drm_test_fb_xrgb8888_to_rgb888 ==============
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb888 ==========
[09:53:40] ============= drm_test_fb_xrgb8888_to_bgr888 ==============
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ========= [PASSED] drm_test_fb_xrgb8888_to_bgr888 ==========
[09:53:40] ============ drm_test_fb_xrgb8888_to_argb8888 =============
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ======== [PASSED] drm_test_fb_xrgb8888_to_argb8888 =========
[09:53:40] =========== drm_test_fb_xrgb8888_to_xrgb2101010 ===========
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ======= [PASSED] drm_test_fb_xrgb8888_to_xrgb2101010 =======
[09:53:40] =========== drm_test_fb_xrgb8888_to_argb2101010 ===========
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ======= [PASSED] drm_test_fb_xrgb8888_to_argb2101010 =======
[09:53:40] ============== drm_test_fb_xrgb8888_to_mono ===============
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ========== [PASSED] drm_test_fb_xrgb8888_to_mono ===========
[09:53:40] ==================== drm_test_fb_swab =====================
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ================ [PASSED] drm_test_fb_swab =================
[09:53:40] ============ drm_test_fb_xrgb8888_to_xbgr8888 =============
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ======== [PASSED] drm_test_fb_xrgb8888_to_xbgr8888 =========
[09:53:40] ============ drm_test_fb_xrgb8888_to_abgr8888 =============
[09:53:40] [PASSED] single_pixel_source_buffer
[09:53:40] [PASSED] single_pixel_clip_rectangle
[09:53:40] [PASSED] well_known_colors
[09:53:40] [PASSED] destination_pitch
[09:53:40] ======== [PASSED] drm_test_fb_xrgb8888_to_abgr8888 =========
[09:53:40] ================= drm_test_fb_clip_offset =================
[09:53:40] [PASSED] pass through
[09:53:40] [PASSED] horizontal offset
[09:53:40] [PASSED] vertical offset
[09:53:40] [PASSED] horizontal and vertical offset
[09:53:40] [PASSED] horizontal offset (custom pitch)
[09:53:40] [PASSED] vertical offset (custom pitch)
[09:53:40] [PASSED] horizontal and vertical offset (custom pitch)
[09:53:40] ============= [PASSED] drm_test_fb_clip_offset =============
[09:53:40] =================== drm_test_fb_memcpy ====================
[09:53:40] [PASSED] single_pixel_source_buffer: XR24 little-endian (0x34325258)
[09:53:40] [PASSED] single_pixel_source_buffer: XRA8 little-endian (0x38415258)
[09:53:40] [PASSED] single_pixel_source_buffer: YU24 little-endian (0x34325559)
[09:53:40] [PASSED] single_pixel_clip_rectangle: XB24 little-endian (0x34324258)
[09:53:40] [PASSED] single_pixel_clip_rectangle: XRA8 little-endian (0x38415258)
[09:53:40] [PASSED] single_pixel_clip_rectangle: YU24 little-endian (0x34325559)
[09:53:40] [PASSED] well_known_colors: XB24 little-endian (0x34324258)
[09:53:40] [PASSED] well_known_colors: XRA8 little-endian (0x38415258)
[09:53:40] [PASSED] well_known_colors: YU24 little-endian (0x34325559)
[09:53:40] [PASSED] destination_pitch: XB24 little-endian (0x34324258)
[09:53:40] [PASSED] destination_pitch: XRA8 little-endian (0x38415258)
[09:53:40] [PASSED] destination_pitch: YU24 little-endian (0x34325559)
[09:53:40] =============== [PASSED] drm_test_fb_memcpy ================
[09:53:40] ============= [PASSED] drm_format_helper_test ==============
[09:53:40] ================= drm_format (18 subtests) =================
[09:53:40] [PASSED] drm_test_format_block_width_invalid
[09:53:40] [PASSED] drm_test_format_block_width_one_plane
[09:53:40] [PASSED] drm_test_format_block_width_two_plane
[09:53:40] [PASSED] drm_test_format_block_width_three_plane
[09:53:40] [PASSED] drm_test_format_block_width_tiled
[09:53:40] [PASSED] drm_test_format_block_height_invalid
[09:53:40] [PASSED] drm_test_format_block_height_one_plane
[09:53:40] [PASSED] drm_test_format_block_height_two_plane
[09:53:40] [PASSED] drm_test_format_block_height_three_plane
[09:53:40] [PASSED] drm_test_format_block_height_tiled
[09:53:40] [PASSED] drm_test_format_min_pitch_invalid
[09:53:40] [PASSED] drm_test_format_min_pitch_one_plane_8bpp
[09:53:40] [PASSED] drm_test_format_min_pitch_one_plane_16bpp
[09:53:40] [PASSED] drm_test_format_min_pitch_one_plane_24bpp
[09:53:40] [PASSED] drm_test_format_min_pitch_one_plane_32bpp
[09:53:40] [PASSED] drm_test_format_min_pitch_two_plane
[09:53:40] [PASSED] drm_test_format_min_pitch_three_plane_8bpp
[09:53:40] [PASSED] drm_test_format_min_pitch_tiled
[09:53:40] =================== [PASSED] drm_format ====================
[09:53:40] ============== drm_framebuffer (10 subtests) ===============
[09:53:40] ========== drm_test_framebuffer_check_src_coords ==========
[09:53:40] [PASSED] Success: source fits into fb
[09:53:40] [PASSED] Fail: overflowing fb with x-axis coordinate
[09:53:40] [PASSED] Fail: overflowing fb with y-axis coordinate
[09:53:40] [PASSED] Fail: overflowing fb with source width
[09:53:40] [PASSED] Fail: overflowing fb with source height
[09:53:40] ====== [PASSED] drm_test_framebuffer_check_src_coords ======
[09:53:40] [PASSED] drm_test_framebuffer_cleanup
[09:53:40] =============== drm_test_framebuffer_create ===============
[09:53:40] [PASSED] ABGR8888 normal sizes
[09:53:40] [PASSED] ABGR8888 max sizes
[09:53:40] [PASSED] ABGR8888 pitch greater than min required
[09:53:40] [PASSED] ABGR8888 pitch less than min required
[09:53:40] [PASSED] ABGR8888 Invalid width
[09:53:40] [PASSED] ABGR8888 Invalid buffer handle
[09:53:40] [PASSED] No pixel format
[09:53:40] [PASSED] ABGR8888 Width 0
[09:53:40] [PASSED] ABGR8888 Height 0
[09:53:40] [PASSED] ABGR8888 Out of bound height * pitch combination
[09:53:40] [PASSED] ABGR8888 Large buffer offset
[09:53:40] [PASSED] ABGR8888 Buffer offset for inexistent plane
[09:53:40] [PASSED] ABGR8888 Invalid flag
[09:53:40] [PASSED] ABGR8888 Set DRM_MODE_FB_MODIFIERS without modifiers
[09:53:40] [PASSED] ABGR8888 Valid buffer modifier
[09:53:40] [PASSED] ABGR8888 Invalid buffer modifier(DRM_FORMAT_MOD_SAMSUNG_64_32_TILE)
[09:53:40] [PASSED] ABGR8888 Extra pitches without DRM_MODE_FB_MODIFIERS
[09:53:40] [PASSED] ABGR8888 Extra pitches with DRM_MODE_FB_MODIFIERS
[09:53:40] [PASSED] NV12 Normal sizes
[09:53:40] [PASSED] NV12 Max sizes
[09:53:40] [PASSED] NV12 Invalid pitch
[09:53:40] [PASSED] NV12 Invalid modifier/missing DRM_MODE_FB_MODIFIERS flag
[09:53:40] [PASSED] NV12 different modifier per-plane
[09:53:40] [PASSED] NV12 with DRM_FORMAT_MOD_SAMSUNG_64_32_TILE
[09:53:40] [PASSED] NV12 Valid modifiers without DRM_MODE_FB_MODIFIERS
[09:53:40] [PASSED] NV12 Modifier for inexistent plane
[09:53:40] [PASSED] NV12 Handle for inexistent plane
[09:53:40] [PASSED] NV12 Handle for inexistent plane without DRM_MODE_FB_MODIFIERS
[09:53:40] [PASSED] YVU420 DRM_MODE_FB_MODIFIERS set without modifier
[09:53:40] [PASSED] YVU420 Normal sizes
[09:53:40] [PASSED] YVU420 Max sizes
[09:53:40] [PASSED] YVU420 Invalid pitch
[09:53:40] [PASSED] YVU420 Different pitches
[09:53:40] [PASSED] YVU420 Different buffer offsets/pitches
[09:53:40] [PASSED] YVU420 Modifier set just for plane 0, without DRM_MODE_FB_MODIFIERS
[09:53:40] [PASSED] YVU420 Modifier set just for planes 0, 1, without DRM_MODE_FB_MODIFIERS
[09:53:40] [PASSED] YVU420 Modifier set just for plane 0, 1, with DRM_MODE_FB_MODIFIERS
[09:53:40] [PASSED] YVU420 Valid modifier
[09:53:40] [PASSED] YVU420 Different modifiers per plane
[09:53:40] [PASSED] YVU420 Modifier for inexistent plane
[09:53:40] [PASSED] YUV420_10BIT Invalid modifier(DRM_FORMAT_MOD_LINEAR)
[09:53:40] [PASSED] X0L2 Normal sizes
[09:53:40] [PASSED] X0L2 Max sizes
[09:53:40] [PASSED] X0L2 Invalid pitch
[09:53:40] [PASSED] X0L2 Pitch greater than minimum required
[09:53:40] [PASSED] X0L2 Handle for inexistent plane
[09:53:40] [PASSED] X0L2 Offset for inexistent plane, without DRM_MODE_FB_MODIFIERS set
[09:53:40] [PASSED] X0L2 Modifier without DRM_MODE_FB_MODIFIERS set
[09:53:40] [PASSED] X0L2 Valid modifier
[09:53:40] [PASSED] X0L2 Modifier for inexistent plane
[09:53:40] =========== [PASSED] drm_test_framebuffer_create ===========
[09:53:40] [PASSED] drm_test_framebuffer_free
[09:53:40] [PASSED] drm_test_framebuffer_init
[09:53:40] [PASSED] drm_test_framebuffer_init_bad_format
[09:53:40] [PASSED] drm_test_framebuffer_init_dev_mismatch
[09:53:40] [PASSED] drm_test_framebuffer_lookup
[09:53:40] [PASSED] drm_test_framebuffer_lookup_inexistent
[09:53:40] [PASSED] drm_test_framebuffer_modifiers_not_supported
[09:53:40] ================= [PASSED] drm_framebuffer =================
[09:53:40] ================ drm_gem_shmem (8 subtests) ================
[09:53:40] [PASSED] drm_gem_shmem_test_obj_create
[09:53:40] [PASSED] drm_gem_shmem_test_obj_create_private
[09:53:40] [PASSED] drm_gem_shmem_test_pin_pages
[09:53:40] [PASSED] drm_gem_shmem_test_vmap
[09:53:40] [PASSED] drm_gem_shmem_test_get_pages_sgt
[09:53:40] [PASSED] drm_gem_shmem_test_get_sg_table
[09:53:40] [PASSED] drm_gem_shmem_test_madvise
[09:53:40] [PASSED] drm_gem_shmem_test_purge
[09:53:40] ================== [PASSED] drm_gem_shmem ==================
[09:53:40] === drm_atomic_helper_connector_hdmi_check (27 subtests) ===
[09:53:40] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode
[09:53:40] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode_vic_1
[09:53:40] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode
[09:53:40] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode_vic_1
[09:53:40] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode
[09:53:40] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode_vic_1
[09:53:40] ====== drm_test_check_broadcast_rgb_cea_mode_yuv420 =======
[09:53:40] [PASSED] Automatic
[09:53:40] [PASSED] Full
[09:53:40] [PASSED] Limited 16:235
[09:53:40] == [PASSED] drm_test_check_broadcast_rgb_cea_mode_yuv420 ===
[09:53:40] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_changed
[09:53:40] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_not_changed
[09:53:40] [PASSED] drm_test_check_disable_connector
[09:53:40] [PASSED] drm_test_check_hdmi_funcs_reject_rate
[09:53:40] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_rgb
[09:53:40] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_yuv420
[09:53:40] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv422
[09:53:40] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv420
[09:53:40] [PASSED] drm_test_check_driver_unsupported_fallback_yuv420
[09:53:40] [PASSED] drm_test_check_output_bpc_crtc_mode_changed
[09:53:40] [PASSED] drm_test_check_output_bpc_crtc_mode_not_changed
[09:53:40] [PASSED] drm_test_check_output_bpc_dvi
[09:53:40] [PASSED] drm_test_check_output_bpc_format_vic_1
[09:53:40] [PASSED] drm_test_check_output_bpc_format_display_8bpc_only
[09:53:40] [PASSED] drm_test_check_output_bpc_format_display_rgb_only
[09:53:40] [PASSED] drm_test_check_output_bpc_format_driver_8bpc_only
[09:53:40] [PASSED] drm_test_check_output_bpc_format_driver_rgb_only
[09:53:40] [PASSED] drm_test_check_tmds_char_rate_rgb_8bpc
[09:53:40] [PASSED] drm_test_check_tmds_char_rate_rgb_10bpc
[09:53:40] [PASSED] drm_test_check_tmds_char_rate_rgb_12bpc
[09:53:40] ===== [PASSED] drm_atomic_helper_connector_hdmi_check ======
[09:53:40] === drm_atomic_helper_connector_hdmi_reset (6 subtests) ====
[09:53:40] [PASSED] drm_test_check_broadcast_rgb_value
[09:53:40] [PASSED] drm_test_check_bpc_8_value
[09:53:40] [PASSED] drm_test_check_bpc_10_value
[09:53:40] [PASSED] drm_test_check_bpc_12_value
[09:53:40] [PASSED] drm_test_check_format_value
[09:53:40] [PASSED] drm_test_check_tmds_char_value
[09:53:40] ===== [PASSED] drm_atomic_helper_connector_hdmi_reset ======
[09:53:40] = drm_atomic_helper_connector_hdmi_mode_valid (4 subtests) =
[09:53:40] [PASSED] drm_test_check_mode_valid
[09:53:40] [PASSED] drm_test_check_mode_valid_reject
[09:53:40] [PASSED] drm_test_check_mode_valid_reject_rate
[09:53:40] [PASSED] drm_test_check_mode_valid_reject_max_clock
[09:53:40] === [PASSED] drm_atomic_helper_connector_hdmi_mode_valid ===
[09:53:40] ================= drm_managed (2 subtests) =================
[09:53:40] [PASSED] drm_test_managed_release_action
[09:53:40] [PASSED] drm_test_managed_run_action
[09:53:40] =================== [PASSED] drm_managed ===================
[09:53:40] =================== drm_mm (6 subtests) ====================
[09:53:40] [PASSED] drm_test_mm_init
[09:53:40] [PASSED] drm_test_mm_debug
[09:53:40] [PASSED] drm_test_mm_align32
[09:53:40] [PASSED] drm_test_mm_align64
[09:53:40] [PASSED] drm_test_mm_lowest
[09:53:40] [PASSED] drm_test_mm_highest
[09:53:40] ===================== [PASSED] drm_mm ======================
[09:53:40] ============= drm_modes_analog_tv (5 subtests) =============
[09:53:40] [PASSED] drm_test_modes_analog_tv_mono_576i
[09:53:40] [PASSED] drm_test_modes_analog_tv_ntsc_480i
[09:53:40] [PASSED] drm_test_modes_analog_tv_ntsc_480i_inlined
[09:53:40] [PASSED] drm_test_modes_analog_tv_pal_576i
[09:53:40] [PASSED] drm_test_modes_analog_tv_pal_576i_inlined
[09:53:40] =============== [PASSED] drm_modes_analog_tv ===============
[09:53:40] ============== drm_plane_helper (2 subtests) ===============
[09:53:40] =============== drm_test_check_plane_state ================
[09:53:40] [PASSED] clipping_simple
[09:53:40] [PASSED] clipping_rotate_reflect
[09:53:40] [PASSED] positioning_simple
[09:53:40] [PASSED] upscaling
[09:53:40] [PASSED] downscaling
[09:53:40] [PASSED] rounding1
[09:53:40] [PASSED] rounding2
[09:53:40] [PASSED] rounding3
[09:53:40] [PASSED] rounding4
[09:53:40] =========== [PASSED] drm_test_check_plane_state ============
[09:53:40] =========== drm_test_check_invalid_plane_state ============
[09:53:40] [PASSED] positioning_invalid
[09:53:40] [PASSED] upscaling_invalid
[09:53:40] [PASSED] downscaling_invalid
[09:53:40] ======= [PASSED] drm_test_check_invalid_plane_state ========
[09:53:40] ================ [PASSED] drm_plane_helper =================
[09:53:40] ====== drm_connector_helper_tv_get_modes (1 subtest) =======
[09:53:40] ====== drm_test_connector_helper_tv_get_modes_check =======
[09:53:40] [PASSED] None
[09:53:40] [PASSED] PAL
[09:53:40] [PASSED] NTSC
[09:53:40] [PASSED] Both, NTSC Default
[09:53:40] [PASSED] Both, PAL Default
[09:53:40] [PASSED] Both, NTSC Default, with PAL on command-line
[09:53:40] [PASSED] Both, PAL Default, with NTSC on command-line
[09:53:40] == [PASSED] drm_test_connector_helper_tv_get_modes_check ===
[09:53:40] ======== [PASSED] drm_connector_helper_tv_get_modes ========
[09:53:40] ================== drm_rect (9 subtests) ===================
[09:53:40] [PASSED] drm_test_rect_clip_scaled_div_by_zero
[09:53:40] [PASSED] drm_test_rect_clip_scaled_not_clipped
[09:53:40] [PASSED] drm_test_rect_clip_scaled_clipped
[09:53:40] [PASSED] drm_test_rect_clip_scaled_signed_vs_unsigned
[09:53:40] ================= drm_test_rect_intersect =================
[09:53:40] [PASSED] top-left x bottom-right: 2x2+1+1 x 2x2+0+0
[09:53:40] [PASSED] top-right x bottom-left: 2x2+0+0 x 2x2+1-1
[09:53:40] [PASSED] bottom-left x top-right: 2x2+1-1 x 2x2+0+0
[09:53:40] [PASSED] bottom-right x top-left: 2x2+0+0 x 2x2+1+1
[09:53:40] [PASSED] right x left: 2x1+0+0 x 3x1+1+0
[09:53:40] [PASSED] left x right: 3x1+1+0 x 2x1+0+0
[09:53:40] [PASSED] up x bottom: 1x2+0+0 x 1x3+0-1
[09:53:40] [PASSED] bottom x up: 1x3+0-1 x 1x2+0+0
[09:53:40] [PASSED] touching corner: 1x1+0+0 x 2x2+1+1
[09:53:40] [PASSED] touching side: 1x1+0+0 x 1x1+1+0
[09:53:40] [PASSED] equal rects: 2x2+0+0 x 2x2+0+0
[09:53:40] [PASSED] inside another: 2x2+0+0 x 1x1+1+1
[09:53:40] [PASSED] far away: 1x1+0+0 x 1x1+3+6
[09:53:40] [PASSED] points intersecting: 0x0+5+10 x 0x0+5+10
[09:53:40] [PASSED] points not intersecting: 0x0+0+0 x 0x0+5+10
[09:53:40] ============= [PASSED] drm_test_rect_intersect =============
[09:53:40] ================ drm_test_rect_calc_hscale ================
[09:53:40] [PASSED] normal use
[09:53:40] [PASSED] out of max range
[09:53:40] [PASSED] out of min range
[09:53:40] [PASSED] zero dst
[09:53:40] [PASSED] negative src
[09:53:40] [PASSED] negative dst
[09:53:40] ============ [PASSED] drm_test_rect_calc_hscale ============
[09:53:40] ================ drm_test_rect_calc_vscale ================
[09:53:40] [PASSED] normal use
[09:53:40] [PASSED] out of max range
[09:53:40] [PASSED] out of min range
[09:53:40] [PASSED] zero dst
[09:53:40] [PASSED] negative src
[09:53:40] [PASSED] negative dst
[09:53:40] ============ [PASSED] drm_test_rect_calc_vscale ============
[09:53:40] ================== drm_test_rect_rotate ===================
[09:53:40] [PASSED] reflect-x
[09:53:40] [PASSED] reflect-y
[09:53:40] [PASSED] rotate-0
[09:53:40] [PASSED] rotate-90
[09:53:40] [PASSED] rotate-180
[09:53:40] [PASSED] rotate-270
[09:53:40] ============== [PASSED] drm_test_rect_rotate ===============
[09:53:40] ================ drm_test_rect_rotate_inv =================
[09:53:40] [PASSED] reflect-x
[09:53:40] [PASSED] reflect-y
[09:53:40] [PASSED] rotate-0
[09:53:40] [PASSED] rotate-90
[09:53:40] [PASSED] rotate-180
[09:53:40] [PASSED] rotate-270
[09:53:40] ============ [PASSED] drm_test_rect_rotate_inv =============
[09:53:40] ==================== [PASSED] drm_rect =====================
[09:53:40] ============ drm_sysfb_modeset_test (1 subtest) ============
[09:53:40] ============ drm_test_sysfb_build_fourcc_list =============
[09:53:40] [PASSED] no native formats
[09:53:40] [PASSED] XRGB8888 as native format
[09:53:40] [PASSED] remove duplicates
[09:53:40] [PASSED] convert alpha formats
[09:53:40] [PASSED] random formats
[09:53:40] ======== [PASSED] drm_test_sysfb_build_fourcc_list =========
[09:53:40] ============= [PASSED] drm_sysfb_modeset_test ==============
[09:53:40] ============================================================
[09:53:40] Testing complete. Ran 622 tests: passed: 622
[09:53:40] Elapsed time: 27.094s total, 1.716s configuring, 24.961s building, 0.386s running
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/ttm/tests/.kunitconfig
[09:53:40] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[09:53:42] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[09:53:51] Starting KUnit Kernel (1/1)...
[09:53:51] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[09:53:51] ================= ttm_device (5 subtests) ==================
[09:53:51] [PASSED] ttm_device_init_basic
[09:53:51] [PASSED] ttm_device_init_multiple
[09:53:51] [PASSED] ttm_device_fini_basic
[09:53:51] [PASSED] ttm_device_init_no_vma_man
[09:53:51] ================== ttm_device_init_pools ==================
[09:53:51] [PASSED] No DMA allocations, no DMA32 required
[09:53:51] [PASSED] DMA allocations, DMA32 required
[09:53:51] [PASSED] No DMA allocations, DMA32 required
[09:53:51] [PASSED] DMA allocations, no DMA32 required
[09:53:51] ============== [PASSED] ttm_device_init_pools ==============
[09:53:51] =================== [PASSED] ttm_device ====================
[09:53:51] ================== ttm_pool (8 subtests) ===================
[09:53:51] ================== ttm_pool_alloc_basic ===================
[09:53:51] [PASSED] One page
[09:53:51] [PASSED] More than one page
[09:53:51] [PASSED] Above the allocation limit
[09:53:51] [PASSED] One page, with coherent DMA mappings enabled
[09:53:51] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[09:53:51] ============== [PASSED] ttm_pool_alloc_basic ===============
[09:53:51] ============== ttm_pool_alloc_basic_dma_addr ==============
[09:53:51] [PASSED] One page
[09:53:51] [PASSED] More than one page
[09:53:51] [PASSED] Above the allocation limit
[09:53:51] [PASSED] One page, with coherent DMA mappings enabled
[09:53:51] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[09:53:51] ========== [PASSED] ttm_pool_alloc_basic_dma_addr ==========
[09:53:51] [PASSED] ttm_pool_alloc_order_caching_match
[09:53:51] [PASSED] ttm_pool_alloc_caching_mismatch
[09:53:51] [PASSED] ttm_pool_alloc_order_mismatch
[09:53:51] [PASSED] ttm_pool_free_dma_alloc
[09:53:51] [PASSED] ttm_pool_free_no_dma_alloc
[09:53:51] [PASSED] ttm_pool_fini_basic
[09:53:51] ==================== [PASSED] ttm_pool =====================
[09:53:51] ================ ttm_resource (8 subtests) =================
[09:53:51] ================= ttm_resource_init_basic =================
[09:53:51] [PASSED] Init resource in TTM_PL_SYSTEM
[09:53:51] [PASSED] Init resource in TTM_PL_VRAM
[09:53:51] [PASSED] Init resource in a private placement
[09:53:51] [PASSED] Init resource in TTM_PL_SYSTEM, set placement flags
[09:53:51] ============= [PASSED] ttm_resource_init_basic =============
[09:53:51] [PASSED] ttm_resource_init_pinned
[09:53:51] [PASSED] ttm_resource_fini_basic
[09:53:51] [PASSED] ttm_resource_manager_init_basic
[09:53:51] [PASSED] ttm_resource_manager_usage_basic
[09:53:51] [PASSED] ttm_resource_manager_set_used_basic
[09:53:51] [PASSED] ttm_sys_man_alloc_basic
[09:53:51] [PASSED] ttm_sys_man_free_basic
[09:53:51] ================== [PASSED] ttm_resource ===================
[09:53:51] =================== ttm_tt (15 subtests) ===================
[09:53:51] ==================== ttm_tt_init_basic ====================
[09:53:51] [PASSED] Page-aligned size
[09:53:51] [PASSED] Extra pages requested
[09:53:51] ================ [PASSED] ttm_tt_init_basic ================
[09:53:51] [PASSED] ttm_tt_init_misaligned
[09:53:51] [PASSED] ttm_tt_fini_basic
[09:53:51] [PASSED] ttm_tt_fini_sg
[09:53:51] [PASSED] ttm_tt_fini_shmem
[09:53:51] [PASSED] ttm_tt_create_basic
[09:53:51] [PASSED] ttm_tt_create_invalid_bo_type
[09:53:51] [PASSED] ttm_tt_create_ttm_exists
[09:53:51] [PASSED] ttm_tt_create_failed
[09:53:51] [PASSED] ttm_tt_destroy_basic
[09:53:51] [PASSED] ttm_tt_populate_null_ttm
[09:53:51] [PASSED] ttm_tt_populate_populated_ttm
[09:53:51] [PASSED] ttm_tt_unpopulate_basic
[09:53:51] [PASSED] ttm_tt_unpopulate_empty_ttm
[09:53:51] [PASSED] ttm_tt_swapin_basic
[09:53:51] ===================== [PASSED] ttm_tt ======================
[09:53:51] =================== ttm_bo (14 subtests) ===================
[09:53:51] =========== ttm_bo_reserve_optimistic_no_ticket ===========
[09:53:51] [PASSED] Cannot be interrupted and sleeps
[09:53:51] [PASSED] Cannot be interrupted, locks straight away
[09:53:51] [PASSED] Can be interrupted, sleeps
[09:53:51] ======= [PASSED] ttm_bo_reserve_optimistic_no_ticket =======
[09:53:51] [PASSED] ttm_bo_reserve_locked_no_sleep
[09:53:51] [PASSED] ttm_bo_reserve_no_wait_ticket
[09:53:51] [PASSED] ttm_bo_reserve_double_resv
[09:53:51] [PASSED] ttm_bo_reserve_interrupted
[09:53:51] [PASSED] ttm_bo_reserve_deadlock
[09:53:51] [PASSED] ttm_bo_unreserve_basic
[09:53:51] [PASSED] ttm_bo_unreserve_pinned
[09:53:51] [PASSED] ttm_bo_unreserve_bulk
[09:53:51] [PASSED] ttm_bo_fini_basic
[09:53:51] [PASSED] ttm_bo_fini_shared_resv
[09:53:51] [PASSED] ttm_bo_pin_basic
[09:53:51] [PASSED] ttm_bo_pin_unpin_resource
[09:53:51] [PASSED] ttm_bo_multiple_pin_one_unpin
[09:53:51] ===================== [PASSED] ttm_bo ======================
[09:53:51] ============== ttm_bo_validate (21 subtests) ===============
[09:53:51] ============== ttm_bo_init_reserved_sys_man ===============
[09:53:51] [PASSED] Buffer object for userspace
[09:53:51] [PASSED] Kernel buffer object
[09:53:51] [PASSED] Shared buffer object
[09:53:51] ========== [PASSED] ttm_bo_init_reserved_sys_man ===========
[09:53:51] ============== ttm_bo_init_reserved_mock_man ==============
[09:53:51] [PASSED] Buffer object for userspace
[09:53:51] [PASSED] Kernel buffer object
[09:53:51] [PASSED] Shared buffer object
[09:53:51] ========== [PASSED] ttm_bo_init_reserved_mock_man ==========
[09:53:51] [PASSED] ttm_bo_init_reserved_resv
[09:53:51] ================== ttm_bo_validate_basic ==================
[09:53:51] [PASSED] Buffer object for userspace
[09:53:51] [PASSED] Kernel buffer object
[09:53:51] [PASSED] Shared buffer object
[09:53:51] ============== [PASSED] ttm_bo_validate_basic ==============
[09:53:51] [PASSED] ttm_bo_validate_invalid_placement
[09:53:51] ============= ttm_bo_validate_same_placement ==============
[09:53:51] [PASSED] System manager
[09:53:51] [PASSED] VRAM manager
[09:53:51] ========= [PASSED] ttm_bo_validate_same_placement ==========
[09:53:51] [PASSED] ttm_bo_validate_failed_alloc
[09:53:51] [PASSED] ttm_bo_validate_pinned
[09:53:51] [PASSED] ttm_bo_validate_busy_placement
[09:53:51] ================ ttm_bo_validate_multihop =================
[09:53:51] [PASSED] Buffer object for userspace
[09:53:51] [PASSED] Kernel buffer object
[09:53:51] [PASSED] Shared buffer object
[09:53:51] ============ [PASSED] ttm_bo_validate_multihop =============
[09:53:51] ========== ttm_bo_validate_no_placement_signaled ==========
[09:53:51] [PASSED] Buffer object in system domain, no page vector
[09:53:51] [PASSED] Buffer object in system domain with an existing page vector
[09:53:51] ====== [PASSED] ttm_bo_validate_no_placement_signaled ======
[09:53:51] ======== ttm_bo_validate_no_placement_not_signaled ========
[09:53:51] [PASSED] Buffer object for userspace
[09:53:51] [PASSED] Kernel buffer object
[09:53:51] [PASSED] Shared buffer object
[09:53:51] ==== [PASSED] ttm_bo_validate_no_placement_not_signaled ====
[09:53:51] [PASSED] ttm_bo_validate_move_fence_signaled
[09:53:51] ========= ttm_bo_validate_move_fence_not_signaled =========
[09:53:51] [PASSED] Waits for GPU
[09:53:51] [PASSED] Tries to lock straight away
[09:53:51] ===== [PASSED] ttm_bo_validate_move_fence_not_signaled =====
[09:53:51] [PASSED] ttm_bo_validate_happy_evict
[09:53:51] [PASSED] ttm_bo_validate_all_pinned_evict
[09:53:51] [PASSED] ttm_bo_validate_allowed_only_evict
[09:53:51] [PASSED] ttm_bo_validate_deleted_evict
[09:53:51] [PASSED] ttm_bo_validate_busy_domain_evict
[09:53:51] [PASSED] ttm_bo_validate_evict_gutting
[09:53:51] [PASSED] ttm_bo_validate_recrusive_evict
[09:53:51] ================= [PASSED] ttm_bo_validate =================
[09:53:51] ============================================================
[09:53:51] Testing complete. Ran 101 tests: passed: 101
[09:53:51] Elapsed time: 11.348s total, 1.697s configuring, 9.434s building, 0.186s running
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel
^ permalink raw reply [flat|nested] 51+ messages in thread
* ✗ Xe.CI.Full: failure for Page Reclamation Support for Xe3p Platforms
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
` (12 preceding siblings ...)
2025-11-18 9:53 ` ✓ CI.KUnit: success " Patchwork
@ 2025-11-18 13:02 ` Patchwork
13 siblings, 0 replies; 51+ messages in thread
From: Patchwork @ 2025-11-18 13:02 UTC (permalink / raw)
To: Brian Nguyen; +Cc: intel-xe
[-- Attachment #1: Type: text/plain, Size: 64437 bytes --]
== Series Details ==
Series: Page Reclamation Support for Xe3p Platforms
URL : https://patchwork.freedesktop.org/series/157698/
State : failure
== Summary ==
CI Bug Log - changes from xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8_FULL -> xe-pw-157698v1_FULL
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with xe-pw-157698v1_FULL absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in xe-pw-157698v1_FULL, please notify your bug team (I915-ci-infra@lists.freedesktop.org) to allow them
to document this new failure mode, which will reduce false positives in CI.
Participating hosts (4 -> 4)
------------------------------
No changes in participating hosts
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in xe-pw-157698v1_FULL:
### IGT changes ###
#### Possible regressions ####
* igt@kms_cursor_legacy@cursor-vs-flip-atomic-transitions:
- shard-adlp: [PASS][1] -> [FAIL][2]
[1]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-adlp-1/igt@kms_cursor_legacy@cursor-vs-flip-atomic-transitions.html
[2]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-4/igt@kms_cursor_legacy@cursor-vs-flip-atomic-transitions.html
Known issues
------------
Here are the changes found in xe-pw-157698v1_FULL that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@kms_async_flips@async-flip-with-page-flip-events-linear@pipe-c-edp-1:
- shard-lnl: [PASS][3] -> [FAIL][4] ([Intel XE#5993]) +3 other tests fail
[3]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-lnl-4/igt@kms_async_flips@async-flip-with-page-flip-events-linear@pipe-c-edp-1.html
[4]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-1/igt@kms_async_flips@async-flip-with-page-flip-events-linear@pipe-c-edp-1.html
* igt@kms_big_fb@4-tiled-addfb:
- shard-adlp: NOTRUN -> [SKIP][5] ([Intel XE#619])
[5]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-6/igt@kms_big_fb@4-tiled-addfb.html
* igt@kms_big_fb@4-tiled-max-hw-stride-64bpp-rotate-0:
- shard-adlp: NOTRUN -> [SKIP][6] ([Intel XE#1124]) +1 other test skip
[6]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-3/igt@kms_big_fb@4-tiled-max-hw-stride-64bpp-rotate-0.html
* igt@kms_big_fb@linear-32bpp-rotate-270:
- shard-dg2-set2: NOTRUN -> [SKIP][7] ([Intel XE#316]) +2 other tests skip
[7]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-463/igt@kms_big_fb@linear-32bpp-rotate-270.html
* igt@kms_big_fb@x-tiled-8bpp-rotate-0:
- shard-adlp: [PASS][8] -> [DMESG-FAIL][9] ([Intel XE#4543]) +8 other tests dmesg-fail
[8]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-adlp-9/igt@kms_big_fb@x-tiled-8bpp-rotate-0.html
[9]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-3/igt@kms_big_fb@x-tiled-8bpp-rotate-0.html
* igt@kms_big_fb@y-tiled-addfb:
- shard-dg2-set2: NOTRUN -> [SKIP][10] ([Intel XE#619])
[10]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-466/igt@kms_big_fb@y-tiled-addfb.html
* igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-180-hflip:
- shard-bmg: NOTRUN -> [SKIP][11] ([Intel XE#1124]) +1 other test skip
[11]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-1/igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-180-hflip.html
* igt@kms_big_fb@yf-tiled-64bpp-rotate-180:
- shard-dg2-set2: NOTRUN -> [SKIP][12] ([Intel XE#1124]) +7 other tests skip
[12]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@kms_big_fb@yf-tiled-64bpp-rotate-180.html
* igt@kms_big_fb@yf-tiled-8bpp-rotate-0:
- shard-lnl: NOTRUN -> [SKIP][13] ([Intel XE#1124]) +3 other tests skip
[13]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-7/igt@kms_big_fb@yf-tiled-8bpp-rotate-0.html
* igt@kms_big_fb@yf-tiled-addfb-size-overflow:
- shard-dg2-set2: NOTRUN -> [SKIP][14] ([Intel XE#610])
[14]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-466/igt@kms_big_fb@yf-tiled-addfb-size-overflow.html
* igt@kms_bw@linear-tiling-1-displays-3840x2160p:
- shard-dg2-set2: NOTRUN -> [SKIP][15] ([Intel XE#367]) +2 other tests skip
[15]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-434/igt@kms_bw@linear-tiling-1-displays-3840x2160p.html
* igt@kms_ccs@bad-aux-stride-4-tiled-mtl-mc-ccs@pipe-d-hdmi-a-1:
- shard-adlp: NOTRUN -> [SKIP][16] ([Intel XE#455] / [Intel XE#787]) +5 other tests skip
[16]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-8/igt@kms_ccs@bad-aux-stride-4-tiled-mtl-mc-ccs@pipe-d-hdmi-a-1.html
* igt@kms_ccs@bad-pixel-format-yf-tiled-ccs:
- shard-dg2-set2: NOTRUN -> [SKIP][17] ([Intel XE#455] / [Intel XE#787]) +23 other tests skip
[17]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-464/igt@kms_ccs@bad-pixel-format-yf-tiled-ccs.html
- shard-lnl: NOTRUN -> [SKIP][18] ([Intel XE#2887]) +5 other tests skip
[18]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-7/igt@kms_ccs@bad-pixel-format-yf-tiled-ccs.html
* igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs:
- shard-adlp: NOTRUN -> [SKIP][19] ([Intel XE#3442])
[19]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-9/igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs.html
- shard-dg2-set2: NOTRUN -> [SKIP][20] ([Intel XE#3442])
[20]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-464/igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs.html
* igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs@pipe-a-edp-1:
- shard-lnl: NOTRUN -> [SKIP][21] ([Intel XE#2669] / [Intel XE#3433]) +3 other tests skip
[21]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-7/igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs@pipe-a-edp-1.html
* igt@kms_ccs@crc-sprite-planes-basic-4-tiled-mtl-mc-ccs:
- shard-bmg: NOTRUN -> [SKIP][22] ([Intel XE#2887]) +2 other tests skip
[22]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-4/igt@kms_ccs@crc-sprite-planes-basic-4-tiled-mtl-mc-ccs.html
* igt@kms_ccs@crc-sprite-planes-basic-4-tiled-mtl-mc-ccs@pipe-a-hdmi-a-1:
- shard-adlp: NOTRUN -> [SKIP][23] ([Intel XE#787]) +8 other tests skip
[23]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-1/igt@kms_ccs@crc-sprite-planes-basic-4-tiled-mtl-mc-ccs@pipe-a-hdmi-a-1.html
* igt@kms_ccs@crc-sprite-planes-basic-4-tiled-mtl-rc-ccs-cc@pipe-c-hdmi-a-6:
- shard-dg2-set2: NOTRUN -> [SKIP][24] ([Intel XE#787]) +83 other tests skip
[24]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-436/igt@kms_ccs@crc-sprite-planes-basic-4-tiled-mtl-rc-ccs-cc@pipe-c-hdmi-a-6.html
* igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs:
- shard-dg2-set2: [PASS][25] -> [INCOMPLETE][26] ([Intel XE#1727] / [Intel XE#2705] / [Intel XE#3113] / [Intel XE#4212] / [Intel XE#4345] / [Intel XE#4522])
[25]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-dg2-434/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs.html
[26]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-433/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs.html
* igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs@pipe-c-dp-4:
- shard-dg2-set2: [PASS][27] -> [INCOMPLETE][28] ([Intel XE#1727] / [Intel XE#2705] / [Intel XE#3113] / [Intel XE#4212] / [Intel XE#4522])
[27]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-dg2-434/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs@pipe-c-dp-4.html
[28]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-433/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs@pipe-c-dp-4.html
* igt@kms_chamelium_color@ctm-0-25:
- shard-adlp: NOTRUN -> [SKIP][29] ([Intel XE#306])
[29]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-8/igt@kms_chamelium_color@ctm-0-25.html
- shard-lnl: NOTRUN -> [SKIP][30] ([Intel XE#306])
[30]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-3/igt@kms_chamelium_color@ctm-0-25.html
* igt@kms_chamelium_color@ctm-green-to-red:
- shard-dg2-set2: NOTRUN -> [SKIP][31] ([Intel XE#306]) +1 other test skip
[31]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-464/igt@kms_chamelium_color@ctm-green-to-red.html
* igt@kms_chamelium_frames@hdmi-crc-nonplanar-formats:
- shard-dg2-set2: NOTRUN -> [SKIP][32] ([Intel XE#373]) +7 other tests skip
[32]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@kms_chamelium_frames@hdmi-crc-nonplanar-formats.html
* igt@kms_chamelium_hpd@hdmi-hpd-after-hibernate:
- shard-adlp: NOTRUN -> [SKIP][33] ([Intel XE#373]) +1 other test skip
[33]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-1/igt@kms_chamelium_hpd@hdmi-hpd-after-hibernate.html
- shard-bmg: NOTRUN -> [SKIP][34] ([Intel XE#2252]) +1 other test skip
[34]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-4/igt@kms_chamelium_hpd@hdmi-hpd-after-hibernate.html
- shard-lnl: NOTRUN -> [SKIP][35] ([Intel XE#373]) +3 other tests skip
[35]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-8/igt@kms_chamelium_hpd@hdmi-hpd-after-hibernate.html
* igt@kms_content_protection@dp-mst-type-1:
- shard-bmg: NOTRUN -> [SKIP][36] ([Intel XE#2390])
[36]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-8/igt@kms_content_protection@dp-mst-type-1.html
- shard-adlp: NOTRUN -> [SKIP][37] ([Intel XE#307])
[37]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-1/igt@kms_content_protection@dp-mst-type-1.html
- shard-dg2-set2: NOTRUN -> [SKIP][38] ([Intel XE#307]) +1 other test skip
[38]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-464/igt@kms_content_protection@dp-mst-type-1.html
- shard-lnl: NOTRUN -> [SKIP][39] ([Intel XE#307]) +1 other test skip
[39]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-4/igt@kms_content_protection@dp-mst-type-1.html
* igt@kms_cursor_crc@cursor-rapid-movement-256x85:
- shard-lnl: NOTRUN -> [SKIP][40] ([Intel XE#1424])
[40]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-4/igt@kms_cursor_crc@cursor-rapid-movement-256x85.html
* igt@kms_cursor_crc@cursor-sliding-512x170:
- shard-dg2-set2: NOTRUN -> [SKIP][41] ([Intel XE#308])
[41]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-434/igt@kms_cursor_crc@cursor-sliding-512x170.html
* igt@kms_cursor_crc@cursor-suspend:
- shard-dg2-set2: [PASS][42] -> [INCOMPLETE][43] ([Intel XE#6612]) +1 other test incomplete
[42]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-dg2-434/igt@kms_cursor_crc@cursor-suspend.html
[43]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-466/igt@kms_cursor_crc@cursor-suspend.html
* igt@kms_cursor_legacy@2x-flip-vs-cursor-atomic:
- shard-lnl: NOTRUN -> [SKIP][44] ([Intel XE#309]) +1 other test skip
[44]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-5/igt@kms_cursor_legacy@2x-flip-vs-cursor-atomic.html
* igt@kms_cursor_legacy@cursorb-vs-flipa-atomic-transitions-varying-size:
- shard-bmg: [PASS][45] -> [SKIP][46] ([Intel XE#2291]) +4 other tests skip
[45]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-2/igt@kms_cursor_legacy@cursorb-vs-flipa-atomic-transitions-varying-size.html
[46]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-6/igt@kms_cursor_legacy@cursorb-vs-flipa-atomic-transitions-varying-size.html
* igt@kms_cursor_legacy@cursorb-vs-flipb-toggle:
- shard-adlp: NOTRUN -> [SKIP][47] ([Intel XE#309])
[47]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-9/igt@kms_cursor_legacy@cursorb-vs-flipb-toggle.html
* igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
- shard-bmg: [PASS][48] -> [FAIL][49] ([Intel XE#1475])
[48]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-7/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
[49]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-5/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
* igt@kms_display_modes@extended-mode-basic:
- shard-adlp: NOTRUN -> [SKIP][50] ([Intel XE#4302])
[50]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-8/igt@kms_display_modes@extended-mode-basic.html
- shard-lnl: NOTRUN -> [SKIP][51] ([Intel XE#4302])
[51]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-7/igt@kms_display_modes@extended-mode-basic.html
* igt@kms_dp_linktrain_fallback@dsc-fallback:
- shard-dg2-set2: NOTRUN -> [SKIP][52] ([Intel XE#4331])
[52]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-464/igt@kms_dp_linktrain_fallback@dsc-fallback.html
* igt@kms_fbcon_fbt@fbc-suspend:
- shard-bmg: NOTRUN -> [SKIP][53] ([Intel XE#4156])
[53]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-2/igt@kms_fbcon_fbt@fbc-suspend.html
* igt@kms_fbcon_fbt@psr-suspend:
- shard-dg2-set2: NOTRUN -> [SKIP][54] ([Intel XE#776])
[54]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-434/igt@kms_fbcon_fbt@psr-suspend.html
* igt@kms_feature_discovery@psr1:
- shard-adlp: NOTRUN -> [SKIP][55] ([Intel XE#1135])
[55]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-3/igt@kms_feature_discovery@psr1.html
- shard-bmg: NOTRUN -> [SKIP][56] ([Intel XE#2374])
[56]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-1/igt@kms_feature_discovery@psr1.html
- shard-dg2-set2: NOTRUN -> [SKIP][57] ([Intel XE#1135])
[57]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@kms_feature_discovery@psr1.html
* igt@kms_flip@2x-flip-vs-absolute-wf_vblank-interruptible:
- shard-dg2-set2: [PASS][58] -> [FAIL][59] ([Intel XE#3149] / [Intel XE#5408] / [Intel XE#5416] / [Intel XE#6266]) +1 other test fail
[58]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-dg2-466/igt@kms_flip@2x-flip-vs-absolute-wf_vblank-interruptible.html
[59]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-464/igt@kms_flip@2x-flip-vs-absolute-wf_vblank-interruptible.html
* igt@kms_flip@2x-flip-vs-expired-vblank-interruptible:
- shard-adlp: NOTRUN -> [SKIP][60] ([Intel XE#310])
[60]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-4/igt@kms_flip@2x-flip-vs-expired-vblank-interruptible.html
- shard-bmg: NOTRUN -> [SKIP][61] ([Intel XE#2316])
[61]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-6/igt@kms_flip@2x-flip-vs-expired-vblank-interruptible.html
* igt@kms_flip@2x-flip-vs-panning-vs-hang:
- shard-lnl: NOTRUN -> [SKIP][62] ([Intel XE#1421]) +2 other tests skip
[62]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-1/igt@kms_flip@2x-flip-vs-panning-vs-hang.html
* igt@kms_flip@2x-wf_vblank-ts-check-interruptible:
- shard-bmg: [PASS][63] -> [SKIP][64] ([Intel XE#2316]) +5 other tests skip
[63]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-2/igt@kms_flip@2x-wf_vblank-ts-check-interruptible.html
[64]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-6/igt@kms_flip@2x-wf_vblank-ts-check-interruptible.html
* igt@kms_flip@flip-vs-panning-interruptible:
- shard-adlp: [PASS][65] -> [DMESG-WARN][66] ([Intel XE#4543] / [Intel XE#5208])
[65]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-adlp-4/igt@kms_flip@flip-vs-panning-interruptible.html
[66]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-3/igt@kms_flip@flip-vs-panning-interruptible.html
* igt@kms_flip@flip-vs-suspend@b-hdmi-a1:
- shard-adlp: [PASS][67] -> [DMESG-WARN][68] ([Intel XE#4543]) +19 other tests dmesg-warn
[67]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-adlp-4/igt@kms_flip@flip-vs-suspend@b-hdmi-a1.html
[68]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-8/igt@kms_flip@flip-vs-suspend@b-hdmi-a1.html
* igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytilegen12rcccs-upscaling:
- shard-bmg: NOTRUN -> [SKIP][69] ([Intel XE#2293] / [Intel XE#2380])
[69]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-5/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytilegen12rcccs-upscaling.html
- shard-lnl: NOTRUN -> [SKIP][70] ([Intel XE#1401] / [Intel XE#1745])
[70]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-8/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytilegen12rcccs-upscaling.html
* igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytilegen12rcccs-upscaling@pipe-a-default-mode:
- shard-lnl: NOTRUN -> [SKIP][71] ([Intel XE#1401])
[71]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-8/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytilegen12rcccs-upscaling@pipe-a-default-mode.html
* igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytilegen12rcccs-upscaling@pipe-a-valid-mode:
- shard-bmg: NOTRUN -> [SKIP][72] ([Intel XE#2293])
[72]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-5/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytilegen12rcccs-upscaling@pipe-a-valid-mode.html
* igt@kms_frontbuffer_tracking@drrs-1p-primscrn-pri-indfb-draw-render:
- shard-adlp: NOTRUN -> [SKIP][73] ([Intel XE#651]) +2 other tests skip
[73]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-1/igt@kms_frontbuffer_tracking@drrs-1p-primscrn-pri-indfb-draw-render.html
* igt@kms_frontbuffer_tracking@drrs-2p-scndscrn-cur-indfb-draw-blt:
- shard-bmg: NOTRUN -> [SKIP][74] ([Intel XE#2311]) +6 other tests skip
[74]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-4/igt@kms_frontbuffer_tracking@drrs-2p-scndscrn-cur-indfb-draw-blt.html
* igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-onoff:
- shard-bmg: NOTRUN -> [SKIP][75] ([Intel XE#4141])
[75]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-4/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-onoff.html
- shard-adlp: NOTRUN -> [SKIP][76] ([Intel XE#656]) +8 other tests skip
[76]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-1/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-onoff.html
* igt@kms_frontbuffer_tracking@fbcdrrs-1p-offscreen-pri-indfb-draw-blt:
- shard-lnl: NOTRUN -> [SKIP][77] ([Intel XE#6312])
[77]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-4/igt@kms_frontbuffer_tracking@fbcdrrs-1p-offscreen-pri-indfb-draw-blt.html
- shard-adlp: NOTRUN -> [SKIP][78] ([Intel XE#6312])
[78]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-1/igt@kms_frontbuffer_tracking@fbcdrrs-1p-offscreen-pri-indfb-draw-blt.html
* igt@kms_frontbuffer_tracking@fbcdrrs-1p-offscreen-pri-indfb-draw-mmap-wc:
- shard-dg2-set2: NOTRUN -> [SKIP][79] ([Intel XE#6312]) +4 other tests skip
[79]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-434/igt@kms_frontbuffer_tracking@fbcdrrs-1p-offscreen-pri-indfb-draw-mmap-wc.html
* igt@kms_frontbuffer_tracking@fbcdrrs-1p-primscrn-cur-indfb-draw-render:
- shard-dg2-set2: NOTRUN -> [SKIP][80] ([Intel XE#651]) +20 other tests skip
[80]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-464/igt@kms_frontbuffer_tracking@fbcdrrs-1p-primscrn-cur-indfb-draw-render.html
- shard-lnl: NOTRUN -> [SKIP][81] ([Intel XE#651]) +3 other tests skip
[81]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-4/igt@kms_frontbuffer_tracking@fbcdrrs-1p-primscrn-cur-indfb-draw-render.html
* igt@kms_frontbuffer_tracking@fbcpsr-1p-pri-indfb-multidraw:
- shard-adlp: NOTRUN -> [SKIP][82] ([Intel XE#653]) +2 other tests skip
[82]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-9/igt@kms_frontbuffer_tracking@fbcpsr-1p-pri-indfb-multidraw.html
- shard-bmg: NOTRUN -> [SKIP][83] ([Intel XE#2313]) +4 other tests skip
[83]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-8/igt@kms_frontbuffer_tracking@fbcpsr-1p-pri-indfb-multidraw.html
* igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-cur-indfb-draw-blt:
- shard-bmg: NOTRUN -> [SKIP][84] ([Intel XE#2312]) +2 other tests skip
[84]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-6/igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-cur-indfb-draw-blt.html
* igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-shrfb-draw-blt:
- shard-dg2-set2: NOTRUN -> [SKIP][85] ([Intel XE#653]) +29 other tests skip
[85]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-shrfb-draw-blt.html
* igt@kms_frontbuffer_tracking@psr-2p-scndscrn-pri-indfb-draw-blt:
- shard-lnl: NOTRUN -> [SKIP][86] ([Intel XE#656]) +15 other tests skip
[86]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-7/igt@kms_frontbuffer_tracking@psr-2p-scndscrn-pri-indfb-draw-blt.html
* igt@kms_hdr@brightness-with-hdr:
- shard-dg2-set2: NOTRUN -> [SKIP][87] ([Intel XE#455]) +17 other tests skip
[87]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-463/igt@kms_hdr@brightness-with-hdr.html
- shard-lnl: NOTRUN -> [SKIP][88] ([Intel XE#3374] / [Intel XE#3544])
[88]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-8/igt@kms_hdr@brightness-with-hdr.html
* igt@kms_invalid_mode@clock-too-high@pipe-a-edp-1:
- shard-lnl: NOTRUN -> [SKIP][89] ([Intel XE#1450]) +1 other test skip
[89]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-1/igt@kms_invalid_mode@clock-too-high@pipe-a-edp-1.html
* igt@kms_invalid_mode@clock-too-high@pipe-c-edp-1:
- shard-lnl: NOTRUN -> [SKIP][90] ([Intel XE#1450] / [Intel XE#2568]) +1 other test skip
[90]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-1/igt@kms_invalid_mode@clock-too-high@pipe-c-edp-1.html
* igt@kms_plane_multiple@tiling-none:
- shard-adlp: [PASS][91] -> [DMESG-WARN][92] ([Intel XE#2953] / [Intel XE#4173]) +5 other tests dmesg-warn
[91]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-adlp-8/igt@kms_plane_multiple@tiling-none.html
[92]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-4/igt@kms_plane_multiple@tiling-none.html
* igt@kms_plane_scaling@planes-downscale-factor-0-75@pipe-b:
- shard-lnl: NOTRUN -> [SKIP][93] ([Intel XE#2763]) +7 other tests skip
[93]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-7/igt@kms_plane_scaling@planes-downscale-factor-0-75@pipe-b.html
* igt@kms_pm_dc@dc5-dpms:
- shard-lnl: [PASS][94] -> [FAIL][95] ([Intel XE#718])
[94]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-lnl-5/igt@kms_pm_dc@dc5-dpms.html
[95]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-8/igt@kms_pm_dc@dc5-dpms.html
* igt@kms_pm_dc@dc5-psr:
- shard-dg2-set2: NOTRUN -> [SKIP][96] ([Intel XE#1129])
[96]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@kms_pm_dc@dc5-psr.html
* igt@kms_pm_dc@deep-pkgc:
- shard-dg2-set2: NOTRUN -> [SKIP][97] ([Intel XE#908])
[97]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-464/igt@kms_pm_dc@deep-pkgc.html
* igt@kms_pm_rpm@dpms-lpsp:
- shard-bmg: NOTRUN -> [SKIP][98] ([Intel XE#1439] / [Intel XE#3141] / [Intel XE#836])
[98]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-1/igt@kms_pm_rpm@dpms-lpsp.html
* igt@kms_psr2_sf@fbc-pr-cursor-plane-move-continuous-exceed-sf:
- shard-adlp: NOTRUN -> [SKIP][99] ([Intel XE#1406] / [Intel XE#1489]) +1 other test skip
[99]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-1/igt@kms_psr2_sf@fbc-pr-cursor-plane-move-continuous-exceed-sf.html
- shard-bmg: NOTRUN -> [SKIP][100] ([Intel XE#1406] / [Intel XE#1489]) +1 other test skip
[100]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-8/igt@kms_psr2_sf@fbc-pr-cursor-plane-move-continuous-exceed-sf.html
* igt@kms_psr2_sf@pr-cursor-plane-update-sf:
- shard-lnl: NOTRUN -> [SKIP][101] ([Intel XE#1406] / [Intel XE#2893]) +1 other test skip
[101]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-3/igt@kms_psr2_sf@pr-cursor-plane-update-sf.html
* igt@kms_psr2_sf@psr2-overlay-primary-update-sf-dmg-area:
- shard-dg2-set2: NOTRUN -> [SKIP][102] ([Intel XE#1406] / [Intel XE#1489]) +3 other tests skip
[102]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-436/igt@kms_psr2_sf@psr2-overlay-primary-update-sf-dmg-area.html
* igt@kms_psr@fbc-psr-primary-blt:
- shard-adlp: NOTRUN -> [SKIP][103] ([Intel XE#1406] / [Intel XE#2850] / [Intel XE#929]) +1 other test skip
[103]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-8/igt@kms_psr@fbc-psr-primary-blt.html
* igt@kms_psr@fbc-psr-sprite-plane-onoff:
- shard-dg2-set2: NOTRUN -> [SKIP][104] ([Intel XE#1406] / [Intel XE#2850] / [Intel XE#929]) +7 other tests skip
[104]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-463/igt@kms_psr@fbc-psr-sprite-plane-onoff.html
* igt@kms_psr@fbc-psr2-sprite-blt@edp-1:
- shard-lnl: NOTRUN -> [SKIP][105] ([Intel XE#1406] / [Intel XE#4609])
[105]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-7/igt@kms_psr@fbc-psr2-sprite-blt@edp-1.html
* igt@kms_psr@pr-primary-blt:
- shard-lnl: NOTRUN -> [SKIP][106] ([Intel XE#1406]) +1 other test skip
[106]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-5/igt@kms_psr@pr-primary-blt.html
- shard-bmg: NOTRUN -> [SKIP][107] ([Intel XE#1406] / [Intel XE#2234] / [Intel XE#2850])
[107]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-2/igt@kms_psr@pr-primary-blt.html
* igt@kms_rotation_crc@primary-y-tiled-reflect-x-90:
- shard-dg2-set2: NOTRUN -> [SKIP][108] ([Intel XE#3414])
[108]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@kms_rotation_crc@primary-y-tiled-reflect-x-90.html
* igt@kms_setmode@basic:
- shard-adlp: [PASS][109] -> [FAIL][110] ([Intel XE#6361]) +2 other tests fail
[109]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-adlp-3/igt@kms_setmode@basic.html
[110]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-4/igt@kms_setmode@basic.html
* igt@kms_setmode@basic@pipe-b-edp-1:
- shard-lnl: [PASS][111] -> [FAIL][112] ([Intel XE#6361]) +2 other tests fail
[111]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-lnl-2/igt@kms_setmode@basic@pipe-b-edp-1.html
[112]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-1/igt@kms_setmode@basic@pipe-b-edp-1.html
* igt@kms_setmode@clone-exclusive-crtc:
- shard-adlp: NOTRUN -> [SKIP][113] ([Intel XE#455]) +2 other tests skip
[113]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-6/igt@kms_setmode@clone-exclusive-crtc.html
- shard-lnl: NOTRUN -> [SKIP][114] ([Intel XE#1435])
[114]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-5/igt@kms_setmode@clone-exclusive-crtc.html
* igt@kms_sharpness_filter@filter-formats:
- shard-lnl: [PASS][115] -> [DMESG-WARN][116] ([Intel XE#4537])
[115]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-lnl-2/igt@kms_sharpness_filter@filter-formats.html
[116]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-1/igt@kms_sharpness_filter@filter-formats.html
* igt@kms_vrr@cmrr@pipe-a-edp-1:
- shard-lnl: [PASS][117] -> [FAIL][118] ([Intel XE#4459]) +1 other test fail
[117]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-lnl-4/igt@kms_vrr@cmrr@pipe-a-edp-1.html
[118]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-7/igt@kms_vrr@cmrr@pipe-a-edp-1.html
* igt@xe_compute_preempt@compute-preempt-many:
- shard-dg2-set2: NOTRUN -> [SKIP][119] ([Intel XE#6360])
[119]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-436/igt@xe_compute_preempt@compute-preempt-many.html
* igt@xe_configfs@survivability-mode:
- shard-dg2-set2: NOTRUN -> [SKIP][120] ([Intel XE#6010])
[120]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@xe_configfs@survivability-mode.html
- shard-lnl: NOTRUN -> [SKIP][121] ([Intel XE#6010])
[121]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-3/igt@xe_configfs@survivability-mode.html
- shard-adlp: NOTRUN -> [SKIP][122] ([Intel XE#6010])
[122]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-3/igt@xe_configfs@survivability-mode.html
* igt@xe_copy_basic@mem-set-linear-0xfffe:
- shard-dg2-set2: NOTRUN -> [SKIP][123] ([Intel XE#1126]) +1 other test skip
[123]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@xe_copy_basic@mem-set-linear-0xfffe.html
- shard-adlp: NOTRUN -> [SKIP][124] ([Intel XE#1126])
[124]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-1/igt@xe_copy_basic@mem-set-linear-0xfffe.html
* igt@xe_eudebug@basic-client-th:
- shard-adlp: NOTRUN -> [SKIP][125] ([Intel XE#4837] / [Intel XE#5565]) +2 other tests skip
[125]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-9/igt@xe_eudebug@basic-client-th.html
- shard-bmg: NOTRUN -> [SKIP][126] ([Intel XE#4837]) +2 other tests skip
[126]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-5/igt@xe_eudebug@basic-client-th.html
* igt@xe_eudebug@sysfs-toggle:
- shard-dg2-set2: NOTRUN -> [SKIP][127] ([Intel XE#4837]) +13 other tests skip
[127]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@xe_eudebug@sysfs-toggle.html
* igt@xe_eudebug_online@interrupt-all-set-breakpoint:
- shard-lnl: NOTRUN -> [SKIP][128] ([Intel XE#4837]) +5 other tests skip
[128]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-2/igt@xe_eudebug_online@interrupt-all-set-breakpoint.html
* igt@xe_evict@evict-beng-mixed-many-threads-small:
- shard-adlp: NOTRUN -> [SKIP][129] ([Intel XE#261])
[129]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-9/igt@xe_evict@evict-beng-mixed-many-threads-small.html
* igt@xe_evict@evict-mixed-threads-small:
- shard-adlp: NOTRUN -> [SKIP][130] ([Intel XE#261] / [Intel XE#688])
[130]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-3/igt@xe_evict@evict-mixed-threads-small.html
* igt@xe_evict_ccs@evict-overcommit-parallel-nofree-reopen:
- shard-lnl: NOTRUN -> [SKIP][131] ([Intel XE#688]) +3 other tests skip
[131]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-4/igt@xe_evict_ccs@evict-overcommit-parallel-nofree-reopen.html
* igt@xe_exec_basic@multigpu-no-exec-userptr-invalidate-race:
- shard-lnl: NOTRUN -> [SKIP][132] ([Intel XE#1392]) +2 other tests skip
[132]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-2/igt@xe_exec_basic@multigpu-no-exec-userptr-invalidate-race.html
- shard-adlp: NOTRUN -> [SKIP][133] ([Intel XE#1392] / [Intel XE#5575])
[133]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-6/igt@xe_exec_basic@multigpu-no-exec-userptr-invalidate-race.html
- shard-bmg: NOTRUN -> [SKIP][134] ([Intel XE#2322])
[134]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-6/igt@xe_exec_basic@multigpu-no-exec-userptr-invalidate-race.html
* igt@xe_exec_fault_mode@twice-userptr-invalidate-imm:
- shard-adlp: NOTRUN -> [SKIP][135] ([Intel XE#288] / [Intel XE#5561]) +3 other tests skip
[135]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-2/igt@xe_exec_fault_mode@twice-userptr-invalidate-imm.html
* igt@xe_exec_fault_mode@twice-userptr-prefetch:
- shard-dg2-set2: NOTRUN -> [SKIP][136] ([Intel XE#288]) +13 other tests skip
[136]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-464/igt@xe_exec_fault_mode@twice-userptr-prefetch.html
* igt@xe_exec_mix_modes@exec-simple-batch-store-lr:
- shard-dg2-set2: NOTRUN -> [SKIP][137] ([Intel XE#2360])
[137]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-436/igt@xe_exec_mix_modes@exec-simple-batch-store-lr.html
- shard-adlp: NOTRUN -> [SKIP][138] ([Intel XE#2360])
[138]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-9/igt@xe_exec_mix_modes@exec-simple-batch-store-lr.html
* igt@xe_exec_system_allocator@madvise-range-invalidate-change-attr:
- shard-lnl: NOTRUN -> [WARN][139] ([Intel XE#5786]) +2 other tests warn
[139]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-1/igt@xe_exec_system_allocator@madvise-range-invalidate-change-attr.html
* igt@xe_exec_system_allocator@many-execqueues-mmap-huge-nomemset:
- shard-bmg: NOTRUN -> [SKIP][140] ([Intel XE#4943]) +4 other tests skip
[140]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-2/igt@xe_exec_system_allocator@many-execqueues-mmap-huge-nomemset.html
* igt@xe_exec_system_allocator@threads-shared-vm-many-execqueues-mmap-free-madvise:
- shard-adlp: NOTRUN -> [SKIP][141] ([Intel XE#4915]) +64 other tests skip
[141]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-4/igt@xe_exec_system_allocator@threads-shared-vm-many-execqueues-mmap-free-madvise.html
* igt@xe_exec_system_allocator@threads-shared-vm-many-large-malloc:
- shard-dg2-set2: NOTRUN -> [SKIP][142] ([Intel XE#4915]) +256 other tests skip
[142]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-436/igt@xe_exec_system_allocator@threads-shared-vm-many-large-malloc.html
* igt@xe_exec_system_allocator@threads-shared-vm-many-large-mmap-new-huge:
- shard-lnl: NOTRUN -> [SKIP][143] ([Intel XE#4943]) +9 other tests skip
[143]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-1/igt@xe_exec_system_allocator@threads-shared-vm-many-large-mmap-new-huge.html
* igt@xe_gt_freq@freq_fixed_idle:
- shard-dg2-set2: [PASS][144] -> [FAIL][145] ([Intel XE#6407])
[144]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-dg2-432/igt@xe_gt_freq@freq_fixed_idle.html
[145]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-463/igt@xe_gt_freq@freq_fixed_idle.html
* igt@xe_media_fill@media-fill:
- shard-dg2-set2: NOTRUN -> [SKIP][146] ([Intel XE#560])
[146]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@xe_media_fill@media-fill.html
- shard-lnl: NOTRUN -> [SKIP][147] ([Intel XE#560])
[147]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-4/igt@xe_media_fill@media-fill.html
* igt@xe_mmap@pci-membarrier-bad-object:
- shard-adlp: NOTRUN -> [SKIP][148] ([Intel XE#5100])
[148]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-4/igt@xe_mmap@pci-membarrier-bad-object.html
- shard-lnl: NOTRUN -> [SKIP][149] ([Intel XE#5100])
[149]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-1/igt@xe_mmap@pci-membarrier-bad-object.html
* igt@xe_module_load@force-load:
- shard-dg2-set2: NOTRUN -> [SKIP][150] ([Intel XE#378])
[150]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@xe_module_load@force-load.html
* igt@xe_oa@oa-unit-exclusive-stream-sample-oa:
- shard-dg2-set2: NOTRUN -> [SKIP][151] ([Intel XE#3573]) +3 other tests skip
[151]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-436/igt@xe_oa@oa-unit-exclusive-stream-sample-oa.html
* igt@xe_pat@pat-index-xelp:
- shard-lnl: NOTRUN -> [SKIP][152] ([Intel XE#977])
[152]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-3/igt@xe_pat@pat-index-xelp.html
- shard-bmg: NOTRUN -> [SKIP][153] ([Intel XE#2245])
[153]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-1/igt@xe_pat@pat-index-xelp.html
* igt@xe_pm@d3cold-mmap-system:
- shard-dg2-set2: NOTRUN -> [SKIP][154] ([Intel XE#2284] / [Intel XE#366])
[154]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-434/igt@xe_pm@d3cold-mmap-system.html
* igt@xe_pmu@engine-activity-accuracy-90:
- shard-lnl: NOTRUN -> [FAIL][155] ([Intel XE#6251]) +3 other tests fail
[155]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-3/igt@xe_pmu@engine-activity-accuracy-90.html
* igt@xe_pmu@gt-c6-idle:
- shard-dg2-set2: NOTRUN -> [FAIL][156] ([Intel XE#6366])
[156]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@xe_pmu@gt-c6-idle.html
* igt@xe_pxp@pxp-stale-bo-exec-post-termination-irq:
- shard-dg2-set2: NOTRUN -> [SKIP][157] ([Intel XE#4733]) +2 other tests skip
[157]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@xe_pxp@pxp-stale-bo-exec-post-termination-irq.html
* igt@xe_pxp@regular-src-to-pxp-dest-rendercopy:
- shard-adlp: NOTRUN -> [SKIP][158] ([Intel XE#4733] / [Intel XE#5594])
[158]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-8/igt@xe_pxp@regular-src-to-pxp-dest-rendercopy.html
- shard-bmg: NOTRUN -> [SKIP][159] ([Intel XE#4733])
[159]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-1/igt@xe_pxp@regular-src-to-pxp-dest-rendercopy.html
* igt@xe_sriov_flr@flr-twice:
- shard-dg2-set2: NOTRUN -> [SKIP][160] ([Intel XE#4273])
[160]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-435/igt@xe_sriov_flr@flr-twice.html
- shard-lnl: NOTRUN -> [SKIP][161] ([Intel XE#4273])
[161]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-2/igt@xe_sriov_flr@flr-twice.html
* igt@xe_sriov_flr@flr-vf1-clear:
- shard-dg2-set2: NOTRUN -> [SKIP][162] ([Intel XE#3342])
[162]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-466/igt@xe_sriov_flr@flr-vf1-clear.html
* igt@xe_sriov_flr@flr-vfs-parallel:
- shard-bmg: [PASS][163] -> [FAIL][164] ([Intel XE#6569]) +1 other test fail
[163]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-5/igt@xe_sriov_flr@flr-vfs-parallel.html
[164]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-8/igt@xe_sriov_flr@flr-vfs-parallel.html
* igt@xe_sriov_vram@vf-access-after-resize-up:
- shard-bmg: [PASS][165] -> [FAIL][166] ([Intel XE#5937])
[165]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-6/igt@xe_sriov_vram@vf-access-after-resize-up.html
[166]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-5/igt@xe_sriov_vram@vf-access-after-resize-up.html
- shard-dg2-set2: NOTRUN -> [SKIP][167] ([Intel XE#6318])
[167]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-436/igt@xe_sriov_vram@vf-access-after-resize-up.html
#### Possible fixes ####
* igt@kms_async_flips@crc-atomic@pipe-d-hdmi-a-1:
- shard-adlp: [FAIL][168] ([Intel XE#3884]) -> [PASS][169]
[168]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-adlp-9/igt@kms_async_flips@crc-atomic@pipe-d-hdmi-a-1.html
[169]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-8/igt@kms_async_flips@crc-atomic@pipe-d-hdmi-a-1.html
* igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-180-async-flip:
- shard-adlp: [DMESG-FAIL][170] ([Intel XE#4543]) -> [PASS][171] +7 other tests pass
[170]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-adlp-2/igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-180-async-flip.html
[171]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-8/igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-180-async-flip.html
* igt@kms_ccs@crc-primary-suspend-4-tiled-dg2-rc-ccs@pipe-c-dp-4:
- shard-dg2-set2: [INCOMPLETE][172] ([Intel XE#3862]) -> [PASS][173] +3 other tests pass
[172]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-dg2-466/igt@kms_ccs@crc-primary-suspend-4-tiled-dg2-rc-ccs@pipe-c-dp-4.html
[173]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-466/igt@kms_ccs@crc-primary-suspend-4-tiled-dg2-rc-ccs@pipe-c-dp-4.html
* igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs:
- shard-dg2-set2: [INCOMPLETE][174] ([Intel XE#1727] / [Intel XE#2705] / [Intel XE#3113] / [Intel XE#4212] / [Intel XE#4345] / [Intel XE#4522]) -> [PASS][175]
[174]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-dg2-436/igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs.html
[175]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs.html
* igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs@pipe-a-hdmi-a-6:
- shard-dg2-set2: [INCOMPLETE][176] ([Intel XE#1727] / [Intel XE#2705] / [Intel XE#3113] / [Intel XE#4212] / [Intel XE#4522]) -> [PASS][177]
[176]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-dg2-436/igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs@pipe-a-hdmi-a-6.html
[177]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs@pipe-a-hdmi-a-6.html
* igt@kms_cursor_legacy@cursora-vs-flipb-atomic-transitions:
- shard-bmg: [SKIP][178] ([Intel XE#2291]) -> [PASS][179]
[178]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-6/igt@kms_cursor_legacy@cursora-vs-flipb-atomic-transitions.html
[179]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-2/igt@kms_cursor_legacy@cursora-vs-flipb-atomic-transitions.html
* igt@kms_cursor_legacy@cursora-vs-flipb-atomic-transitions-varying-size:
- shard-bmg: [DMESG-WARN][180] ([Intel XE#5354]) -> [PASS][181] +1 other test pass
[180]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-1/igt@kms_cursor_legacy@cursora-vs-flipb-atomic-transitions-varying-size.html
[181]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-4/igt@kms_cursor_legacy@cursora-vs-flipb-atomic-transitions-varying-size.html
* igt@kms_flip@2x-flip-vs-expired-vblank:
- shard-dg2-set2: [FAIL][182] ([Intel XE#301] / [Intel XE#3321]) -> [PASS][183]
[182]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-dg2-434/igt@kms_flip@2x-flip-vs-expired-vblank.html
[183]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@kms_flip@2x-flip-vs-expired-vblank.html
* igt@kms_flip@2x-flip-vs-expired-vblank@ab-hdmi-a6-dp4:
- shard-dg2-set2: [FAIL][184] ([Intel XE#301]) -> [PASS][185]
[184]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-dg2-434/igt@kms_flip@2x-flip-vs-expired-vblank@ab-hdmi-a6-dp4.html
[185]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-432/igt@kms_flip@2x-flip-vs-expired-vblank@ab-hdmi-a6-dp4.html
* igt@kms_flip@2x-plain-flip-interruptible:
- shard-bmg: [SKIP][186] ([Intel XE#2316]) -> [PASS][187] +4 other tests pass
[186]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-6/igt@kms_flip@2x-plain-flip-interruptible.html
[187]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-8/igt@kms_flip@2x-plain-flip-interruptible.html
* igt@kms_flip@flip-vs-expired-vblank@a-edp1:
- shard-lnl: [FAIL][188] ([Intel XE#301]) -> [PASS][189] +2 other tests pass
[188]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-lnl-3/igt@kms_flip@flip-vs-expired-vblank@a-edp1.html
[189]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-lnl-2/igt@kms_flip@flip-vs-expired-vblank@a-edp1.html
* igt@kms_flip@plain-flip-interruptible@b-hdmi-a1:
- shard-adlp: [DMESG-WARN][190] ([Intel XE#4543]) -> [PASS][191] +6 other tests pass
[190]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-adlp-6/igt@kms_flip@plain-flip-interruptible@b-hdmi-a1.html
[191]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-2/igt@kms_flip@plain-flip-interruptible@b-hdmi-a1.html
* igt@kms_flip_scaled_crc@flip-64bpp-xtile-to-32bpp-xtile-downscaling@pipe-a-valid-mode:
- shard-adlp: [DMESG-FAIL][192] ([Intel XE#4543] / [Intel XE#4921]) -> [PASS][193] +1 other test pass
[192]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-adlp-2/igt@kms_flip_scaled_crc@flip-64bpp-xtile-to-32bpp-xtile-downscaling@pipe-a-valid-mode.html
[193]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-1/igt@kms_flip_scaled_crc@flip-64bpp-xtile-to-32bpp-xtile-downscaling@pipe-a-valid-mode.html
* igt@kms_hdr@static-toggle:
- shard-bmg: [SKIP][194] ([Intel XE#1503]) -> [PASS][195]
[194]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-6/igt@kms_hdr@static-toggle.html
[195]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-5/igt@kms_hdr@static-toggle.html
* igt@kms_plane_multiple@2x-tiling-none:
- shard-bmg: [SKIP][196] ([Intel XE#4596]) -> [PASS][197]
[196]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-6/igt@kms_plane_multiple@2x-tiling-none.html
[197]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-8/igt@kms_plane_multiple@2x-tiling-none.html
* igt@kms_vrr@negative-basic:
- shard-bmg: [SKIP][198] ([Intel XE#1499]) -> [PASS][199]
[198]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-6/igt@kms_vrr@negative-basic.html
[199]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-8/igt@kms_vrr@negative-basic.html
* igt@xe_exec_compute_mode@twice-bindexecqueue-userptr-invalidate:
- shard-dg2-set2: [INCOMPLETE][200] -> [PASS][201]
[200]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-dg2-433/igt@xe_exec_compute_mode@twice-bindexecqueue-userptr-invalidate.html
[201]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-dg2-464/igt@xe_exec_compute_mode@twice-bindexecqueue-userptr-invalidate.html
* igt@xe_pm@s4-vm-bind-prefetch:
- shard-adlp: [DMESG-WARN][202] ([Intel XE#2953] / [Intel XE#4173]) -> [PASS][203] +2 other tests pass
[202]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-adlp-8/igt@xe_pm@s4-vm-bind-prefetch.html
[203]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-adlp-6/igt@xe_pm@s4-vm-bind-prefetch.html
* igt@xe_sriov_auto_provisioning@exclusive-ranges@numvfs-random:
- shard-bmg: [FAIL][204] ([Intel XE#5937]) -> [PASS][205] +3 other tests pass
[204]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-6/igt@xe_sriov_auto_provisioning@exclusive-ranges@numvfs-random.html
[205]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-4/igt@xe_sriov_auto_provisioning@exclusive-ranges@numvfs-random.html
#### Warnings ####
* igt@kms_content_protection@atomic:
- shard-bmg: [FAIL][206] ([Intel XE#1178]) -> [SKIP][207] ([Intel XE#2341])
[206]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-1/igt@kms_content_protection@atomic.html
[207]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-6/igt@kms_content_protection@atomic.html
* igt@kms_frontbuffer_tracking@drrs-2p-scndscrn-pri-indfb-draw-mmap-wc:
- shard-bmg: [SKIP][208] ([Intel XE#2311]) -> [SKIP][209] ([Intel XE#2312]) +12 other tests skip
[208]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-8/igt@kms_frontbuffer_tracking@drrs-2p-scndscrn-pri-indfb-draw-mmap-wc.html
[209]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-6/igt@kms_frontbuffer_tracking@drrs-2p-scndscrn-pri-indfb-draw-mmap-wc.html
* igt@kms_frontbuffer_tracking@fbc-2p-primscrn-cur-indfb-draw-mmap-wc:
- shard-bmg: [SKIP][210] ([Intel XE#4141]) -> [SKIP][211] ([Intel XE#2312]) +3 other tests skip
[210]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-2/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-cur-indfb-draw-mmap-wc.html
[211]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-6/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-cur-indfb-draw-mmap-wc.html
* igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-draw-render:
- shard-bmg: [SKIP][212] ([Intel XE#2312]) -> [SKIP][213] ([Intel XE#4141]) +5 other tests skip
[212]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-6/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-draw-render.html
[213]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-1/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-draw-render.html
* igt@kms_frontbuffer_tracking@fbcdrrs-2p-primscrn-spr-indfb-draw-render:
- shard-bmg: [SKIP][214] ([Intel XE#2312]) -> [SKIP][215] ([Intel XE#2311]) +11 other tests skip
[214]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-6/igt@kms_frontbuffer_tracking@fbcdrrs-2p-primscrn-spr-indfb-draw-render.html
[215]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-7/igt@kms_frontbuffer_tracking@fbcdrrs-2p-primscrn-spr-indfb-draw-render.html
* igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-cur-indfb-move:
- shard-bmg: [SKIP][216] ([Intel XE#2313]) -> [SKIP][217] ([Intel XE#2312]) +9 other tests skip
[216]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-1/igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-cur-indfb-move.html
[217]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-6/igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-cur-indfb-move.html
* igt@kms_frontbuffer_tracking@psr-2p-scndscrn-indfb-msflip-blt:
- shard-bmg: [SKIP][218] ([Intel XE#2312]) -> [SKIP][219] ([Intel XE#2313]) +6 other tests skip
[218]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-6/igt@kms_frontbuffer_tracking@psr-2p-scndscrn-indfb-msflip-blt.html
[219]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-5/igt@kms_frontbuffer_tracking@psr-2p-scndscrn-indfb-msflip-blt.html
* igt@kms_tiled_display@basic-test-pattern-with-chamelium:
- shard-bmg: [SKIP][220] ([Intel XE#2509]) -> [SKIP][221] ([Intel XE#2426])
[220]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8/shard-bmg-1/igt@kms_tiled_display@basic-test-pattern-with-chamelium.html
[221]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/shard-bmg-4/igt@kms_tiled_display@basic-test-pattern-with-chamelium.html
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
[Intel XE#1124]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1124
[Intel XE#1126]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1126
[Intel XE#1129]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1129
[Intel XE#1135]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1135
[Intel XE#1178]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1178
[Intel XE#1392]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1392
[Intel XE#1401]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1401
[Intel XE#1406]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1406
[Intel XE#1421]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1421
[Intel XE#1424]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1424
[Intel XE#1435]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1435
[Intel XE#1439]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1439
[Intel XE#1450]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1450
[Intel XE#1475]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1475
[Intel XE#1489]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1489
[Intel XE#1499]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1499
[Intel XE#1503]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1503
[Intel XE#1727]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1727
[Intel XE#1745]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1745
[Intel XE#2234]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2234
[Intel XE#2245]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2245
[Intel XE#2252]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2252
[Intel XE#2284]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2284
[Intel XE#2291]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2291
[Intel XE#2293]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2293
[Intel XE#2311]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2311
[Intel XE#2312]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2312
[Intel XE#2313]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2313
[Intel XE#2316]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2316
[Intel XE#2322]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2322
[Intel XE#2341]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2341
[Intel XE#2360]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2360
[Intel XE#2374]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2374
[Intel XE#2380]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2380
[Intel XE#2390]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2390
[Intel XE#2426]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2426
[Intel XE#2509]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2509
[Intel XE#2568]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2568
[Intel XE#261]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/261
[Intel XE#2669]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2669
[Intel XE#2705]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2705
[Intel XE#2763]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2763
[Intel XE#2850]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2850
[Intel XE#288]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/288
[Intel XE#2887]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2887
[Intel XE#2893]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2893
[Intel XE#2953]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2953
[Intel XE#301]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/301
[Intel XE#306]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/306
[Intel XE#307]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/307
[Intel XE#308]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/308
[Intel XE#309]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/309
[Intel XE#310]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/310
[Intel XE#3113]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3113
[Intel XE#3141]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3141
[Intel XE#3149]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3149
[Intel XE#316]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/316
[Intel XE#3321]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3321
[Intel XE#3342]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3342
[Intel XE#3374]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3374
[Intel XE#3414]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3414
[Intel XE#3433]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3433
[Intel XE#3442]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3442
[Intel XE#3544]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3544
[Intel XE#3573]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3573
[Intel XE#366]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/366
[Intel XE#367]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/367
[Intel XE#373]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/373
[Intel XE#378]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/378
[Intel XE#3862]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3862
[Intel XE#3884]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3884
[Intel XE#4141]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4141
[Intel XE#4156]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4156
[Intel XE#4173]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4173
[Intel XE#4212]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4212
[Intel XE#4273]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4273
[Intel XE#4302]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4302
[Intel XE#4331]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4331
[Intel XE#4345]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4345
[Intel XE#4459]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4459
[Intel XE#4522]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4522
[Intel XE#4537]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4537
[Intel XE#4543]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4543
[Intel XE#455]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/455
[Intel XE#4596]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4596
[Intel XE#4609]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4609
[Intel XE#4733]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4733
[Intel XE#4837]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4837
[Intel XE#4915]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4915
[Intel XE#4921]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4921
[Intel XE#4943]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4943
[Intel XE#5100]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5100
[Intel XE#5208]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5208
[Intel XE#5354]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5354
[Intel XE#5408]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5408
[Intel XE#5416]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5416
[Intel XE#5561]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5561
[Intel XE#5565]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5565
[Intel XE#5575]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5575
[Intel XE#5594]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5594
[Intel XE#560]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/560
[Intel XE#5786]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5786
[Intel XE#5937]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5937
[Intel XE#5993]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5993
[Intel XE#6010]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6010
[Intel XE#610]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/610
[Intel XE#619]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/619
[Intel XE#6251]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6251
[Intel XE#6266]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6266
[Intel XE#6312]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6312
[Intel XE#6318]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6318
[Intel XE#6360]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6360
[Intel XE#6361]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6361
[Intel XE#6366]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6366
[Intel XE#6407]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6407
[Intel XE#651]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/651
[Intel XE#653]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/653
[Intel XE#656]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/656
[Intel XE#6569]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6569
[Intel XE#6612]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6612
[Intel XE#688]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/688
[Intel XE#718]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/718
[Intel XE#776]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/776
[Intel XE#787]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/787
[Intel XE#836]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/836
[Intel XE#908]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/908
[Intel XE#929]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/929
[Intel XE#977]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/977
Build changes
-------------
* IGT: IGT_8629 -> IGT_8630
* Linux: xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8 -> xe-pw-157698v1
IGT_8629: 3813b62c3c6200f69d6cc8be1c5e621243ace24f @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
IGT_8630: 8630
xe-4121-91fc6d984707c9bfd4a60550e6a85f1a991e7ec8: 91fc6d984707c9bfd4a60550e6a85f1a991e7ec8
xe-pw-157698v1: 157698v1
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-157698v1/index.html
[-- Attachment #2: Type: text/html, Size: 75172 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH 02/11] drm/xe: Reset tlb fence timeout on invalid seqno received
2025-11-18 9:05 ` [PATCH 02/11] drm/xe: Reset tlb fence timeout on invalid seqno received Brian Nguyen
@ 2025-11-21 17:23 ` Lin, Shuicheng
2025-11-22 1:53 ` Nguyen, Brian3
2025-11-22 18:25 ` Matthew Brost
1 sibling, 1 reply; 51+ messages in thread
From: Lin, Shuicheng @ 2025-11-21 17:23 UTC (permalink / raw)
To: Nguyen, Brian3, intel-xe@lists.freedesktop.org
Cc: Upadhyay, Tejas, Brost, Matthew, Summers, Stuart
On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> TLB_INVALIDATION_SEQNO_INVALID is now used to indicate in-progress
> multi-step TLB invalidations, so reset the TDR to ensure it won't
> trigger prematurely while G2H actions are still ongoing.
I am not sure whether we should re-use SEQNO_INVALID to indicate multi-step TLB invalidations.
Before this reuse, was there any case where SEQNO_INVALID could be sent?
If yes, we should use another value to indicate the multi-step case.
If no, why did the previous patch add the msg[0] != SEQNO_INVALID check?
>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> ---
> drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 2 ++
> drivers/gpu/drm/xe/xe_tlb_inval.c | 16 ++++++++++++++++
> drivers/gpu/drm/xe/xe_tlb_inval.h | 1 +
> 3 files changed, 19 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> index f1fd2dd90742..cd126c53faab 100644
> --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> @@ -238,6 +238,8 @@ int xe_guc_tlb_inval_done_handler(struct xe_guc
> *guc, u32 *msg, u32 len)
>
> if (msg[0] != TLB_INVALIDATION_SEQNO_INVALID)
> xe_tlb_inval_done_handler(>->tlb_inval, msg[0]);
> + else
> + xe_tlb_inval_reset_timeout(>->tlb_inval);
So SEQNO_INVALID is re-used for page reclaim, and with the previous code this else path should never be hit.
The GuC will not set the seqno to SEQNO_INVALID in any failure case. Is that right?
Shuicheng
>
> return 0;
> }
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c
> b/drivers/gpu/drm/xe/xe_tlb_inval.c
> index 918a59e686ea..50f05d6b5672 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> @@ -199,6 +199,22 @@ void xe_tlb_inval_reset(struct xe_tlb_inval
> *tlb_inval)
> mutex_unlock(&tlb_inval->seqno_lock);
> }
>
> +/**
> + * xe_tlb_inval_reset_timeout() - Reset TLB inval fence timeout
> + * @tlb_inval: TLB invalidation client
> + *
> + * Reset the TLB invalidation timeout timer.
> + */
> +void xe_tlb_inval_reset_timeout(struct xe_tlb_inval *tlb_inval)
> +{
> + unsigned long flags;
> +
> + spin_lock_irqsave(&tlb_inval->pending_lock, flags);
> + mod_delayed_work(system_wq, &tlb_inval->fence_tdr,
> + tlb_inval->ops->timeout_delay(tlb_inval));
> +	spin_unlock_irqrestore(&tlb_inval->pending_lock, flags);
> +}
> +
> static bool xe_tlb_inval_seqno_past(struct xe_tlb_inval *tlb_inval, int seqno)
> {
> 	int seqno_recv = READ_ONCE(tlb_inval->seqno_recv);
> diff --git
> a/drivers/gpu/drm/xe/xe_tlb_inval.h b/drivers/gpu/drm/xe/xe_tlb_inval.h
> index 05614915463a..9dbddc310eb9 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval.h
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval.h
> @@ -17,6 +17,7 @@ struct xe_vm;
> int xe_gt_tlb_inval_init_early(struct xe_gt *gt);
>
> void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval);
> +void xe_tlb_inval_reset_timeout(struct xe_tlb_inval *tlb_inval);
> int xe_tlb_inval_all(struct xe_tlb_inval *tlb_inval,
> struct xe_tlb_inval_fence *fence); int
> xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval);
> --
> 2.51.2
* RE: [PATCH 03/11] drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush
2025-11-18 9:05 ` [PATCH 03/11] drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush Brian Nguyen
@ 2025-11-21 18:02 ` Lin, Shuicheng
2025-11-22 1:54 ` Nguyen, Brian3
2025-11-22 19:32 ` Matthew Brost
1 sibling, 1 reply; 51+ messages in thread
From: Lin, Shuicheng @ 2025-11-21 18:02 UTC (permalink / raw)
To: Nguyen, Brian3, intel-xe@lists.freedesktop.org
Cc: Upadhyay, Tejas, Brost, Matthew, Summers, Stuart
On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> Allow for tlb_invalidation to configure when driver wants to flush the Private
> Physical Cache (PPC) as a process of the tlb invalidation process.
How about "Allow tlb_invalidation to control whether the driver flushes the Private Physical Cache (PPC) as part of the TLB invalidation process."
>
> Default behavior is still to always flush the PPC, but the driver now has
> the option to disable it.
>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> ---
> drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 11 +++++++----
> drivers/gpu/drm/xe/xe_tlb_inval.c | 21 ++++++++++++++++++---
> drivers/gpu/drm/xe/xe_tlb_inval.h | 5 +++--
> drivers/gpu/drm/xe/xe_tlb_inval_job.c | 2 +-
> drivers/gpu/drm/xe/xe_tlb_inval_types.h | 5 ++++-
> drivers/gpu/drm/xe/xe_vm.c | 4 ++--
> 6 files changed, 35 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> index cd126c53faab..c05709a5bc98 100644
> --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> @@ -34,9 +34,12 @@ static int send_tlb_inval(struct xe_guc *guc, const u32
> *action, int len)
> G2H_LEN_DW_TLB_INVALIDATE, 1); }
>
> -#define MAKE_INVAL_OP(type) ((type << XE_GUC_TLB_INVAL_TYPE_SHIFT) | \
> +#define MAKE_INVAL_OP_FLUSH(type, flush_cache) ((type << XE_GUC_TLB_INVAL_TYPE_SHIFT) | \
> 	XE_GUC_TLB_INVAL_MODE_HEAVY << XE_GUC_TLB_INVAL_MODE_SHIFT | \
> -	XE_GUC_TLB_INVAL_FLUSH_CACHE)
> +	(flush_cache ? \
> +	 XE_GUC_TLB_INVAL_FLUSH_CACHE : 0))
> +
> +#define MAKE_INVAL_OP(type) MAKE_INVAL_OP_FLUSH(type, true)
>
>  static int send_tlb_inval_all(struct xe_tlb_inval *tlb_inval, u32 seqno)
>  {
> @@ -100,7 +103,7 @@ static int send_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval, u32 seqno)
>  #define MAX_RANGE_TLB_INVALIDATION_LENGTH (rounddown_pow_of_two(ULONG_MAX))
>
> static int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
> -				u64 start, u64 end, u32 asid)
> +				u64 start, u64 end, u32 asid, bool flush_cache)
> {
> #define MAX_TLB_INVALIDATION_LEN 7
> 	struct xe_guc *guc = tlb_inval->private;
> @@ -154,7 +157,7 @@ static int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
> 					 ilog2(SZ_2M) + 1)));
> xe_gt_assert(gt, IS_ALIGNED(start, length));
>
> -	action[len++] = MAKE_INVAL_OP(XE_GUC_TLB_INVAL_PAGE_SELECTIVE);
> +	action[len++] = MAKE_INVAL_OP_FLUSH(XE_GUC_TLB_INVAL_PAGE_SELECTIVE, flush_cache);
> action[len++] = asid;
> action[len++] = lower_32_bits(start);
> action[len++] = upper_32_bits(start); diff --git
> a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
> index 50f05d6b5672..de275759743c 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> @@ -324,10 +324,10 @@ int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval)
> */
> int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
> struct xe_tlb_inval_fence *fence, u64 start, u64 end,
> - u32 asid)
> + u32 asid, bool flush_cache)
> {
> return xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->ppgtt,
> - start, end, asid);
> + start, end, asid, flush_cache);
> }
>
> /**
> @@ -343,7 +343,7 @@ void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval,
> struct xe_vm *vm)
> u64 range = 1ull << vm->xe->info.va_bits;
>
> xe_tlb_inval_fence_init(tlb_inval, &fence, true);
> - xe_tlb_inval_range(tlb_inval, &fence, 0, range, vm->usm.asid);
> + xe_tlb_inval_range(tlb_inval, &fence, 0, range, vm->usm.asid, true);
> xe_tlb_inval_fence_wait(&fence);
> }
>
> @@ -420,6 +420,20 @@ static const struct dma_fence_ops inval_fence_ops =
> {
> .get_timeline_name = xe_inval_fence_get_timeline_name, };
>
> +/**
> + * xe_tlb_inval_fence_flush_cache - Control PPC flush at invalidation
> + * @fence: TLB inval fence
> + * @flush_cache: whether to perform PPC cache flush
> + *
> + * Helper function to modify the tlb_inval fence to control the PPC flush.
> + * Other components shouldn't modify fence directly.
> + */
> +void xe_tlb_inval_fence_flush_cache(struct xe_tlb_inval_fence *fence,
> + bool flush_cache)
> +{
> + fence->flush_cache = flush_cache;
> +}
I don't see this function used anywhere in the series patches. Could you please double-check that?
Also, it would be better to make it inline and put the code in the header file.
BTW, how about renaming it to xe_tlb_inval_fence_set_flush_cache?
Shuicheng
> +
> /**
> * xe_tlb_inval_fence_init() - Initialize TLB invalidation fence
> * @tlb_inval: TLB invalidation client
> @@ -446,4 +460,5 @@ void xe_tlb_inval_fence_init(struct xe_tlb_inval
> *tlb_inval,
> else
> dma_fence_get(&fence->base);
> fence->tlb_inval = tlb_inval;
> + fence->flush_cache = true;
> }
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.h
> b/drivers/gpu/drm/xe/xe_tlb_inval.h
> index 9dbddc310eb9..b84ce3e6f294 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval.h
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval.h
> @@ -24,8 +24,9 @@ int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval); void
> xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval, struct xe_vm *vm); int
> xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
> struct xe_tlb_inval_fence *fence,
> -		       u64 start, u64 end, u32 asid);
> -
> +		       u64 start, u64 end, u32 asid, bool flush_cache);
> +void xe_tlb_inval_fence_flush_cache(struct xe_tlb_inval_fence *fence,
> +				    bool flush_cache);
> void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
> struct xe_tlb_inval_fence *fence,
> bool stack);
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> index 1ae0dec2cf31..6248f90323a9 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> @@ -49,7 +49,7 @@ static struct dma_fence *xe_tlb_inval_job_run(struct
> xe_dep_job *dep_job)
> container_of(job->fence, typeof(*ifence), base);
>
> xe_tlb_inval_range(job->tlb_inval, ifence, job->start,
> - job->end, job->vm->usm.asid);
> + job->end, job->vm->usm.asid, ifence->flush_cache);
>
> return job->fence;
> }
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> index 7a6967ce3b76..c3c3943fb07e 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> @@ -40,12 +40,13 @@ struct xe_tlb_inval_ops {
> * @start: Start address
> * @end: End address
> * @asid: Address space ID
> + * @flush_cache: PPC flush control
> *
> * Return 0 on success, -ECANCELED if backend is mid-reset, error on
> * failure
> */
> int (*ppgtt)(struct xe_tlb_inval *tlb_inval, u32 seqno, u64 start,
> - u64 end, u32 asid);
> + u64 end, u32 asid, bool flush_cache);
>
> 	/**
> 	 * @initialized: Backend is initialized
> @@ -126,6 +127,8 @@ struct xe_tlb_inval_fence {
> int seqno;
> /** @inval_time: time of TLB invalidation */
> ktime_t inval_time;
> + /** @flush_cache: bool for PPC flush, default is true */
> + bool flush_cache;
> };
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 7cac646bdf1c..5fb5226574c5 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -3907,7 +3907,7 @@ int xe_vm_range_tilemask_tlb_inval(struct xe_vm
> *vm, u64 start,
>
> err = xe_tlb_inval_range(&tile->primary_gt->tlb_inval,
> &fence[fence_id], start, end,
> - vm->usm.asid);
> + vm->usm.asid, true);
> if (err)
> goto wait;
> ++fence_id;
> @@ -3920,7 +3920,7 @@ int xe_vm_range_tilemask_tlb_inval(struct xe_vm
> *vm, u64 start,
>
> err = xe_tlb_inval_range(&tile->media_gt->tlb_inval,
> &fence[fence_id], start, end,
> - vm->usm.asid);
> + vm->usm.asid, true);
> if (err)
> goto wait;
> ++fence_id;
> --
> 2.51.2
* RE: [PATCH 04/11] drm/xe: Add page reclamation info to device info
2025-11-18 9:05 ` [PATCH 04/11] drm/xe: Add page reclamation info to device info Brian Nguyen
@ 2025-11-21 18:15 ` Lin, Shuicheng
2025-11-22 18:31 ` Matthew Brost
1 sibling, 0 replies; 51+ messages in thread
From: Lin, Shuicheng @ 2025-11-21 18:15 UTC (permalink / raw)
To: Nguyen, Brian3, intel-xe@lists.freedesktop.org
Cc: Upadhyay, Tejas, Brost, Matthew, Summers, Stuart, Oak Zeng
On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> From: Oak Zeng <oak.zeng@intel.com>
>
> Starting from Xe3p, HW adds a feature assisting range-based page
> reclamation. Introduce a bit in device info to indicate whether the
> device has this capability.
>
> Signed-off-by: Oak Zeng <oak.zeng@intel.com>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> ---
LGTM.
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
> drivers/gpu/drm/xe/xe_device_types.h | 2 ++
> drivers/gpu/drm/xe/xe_pci.c | 1 +
> drivers/gpu/drm/xe/xe_pci_types.h | 1 +
> 3 files changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h
> b/drivers/gpu/drm/xe/xe_device_types.h
> index 0b2fa7c56d38..268c8e28601a 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -308,6 +308,8 @@ struct xe_device {
> u8 has_mbx_power_limits:1;
> /** @info.has_mem_copy_instr: Device supports MEM_COPY
> instruction */
> u8 has_mem_copy_instr:1;
> + /** @info.has_page_reclaim_hw_assist: Device supports page
> reclamation feature */
> + u8 has_page_reclaim_hw_assist:1;
> /** @info.has_pxp: Device has PXP support */
> u8 has_pxp:1;
> 	/** @info.has_range_tlb_inval: Has range based TLB invalidations */
> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> index cd03b4b3ebdb..43c47426313e 100644
> --- a/drivers/gpu/drm/xe/xe_pci.c
> +++ b/drivers/gpu/drm/xe/xe_pci.c
> @@ -673,6 +673,7 @@ static int xe_info_init_early(struct xe_device *xe,
> xe->info.has_heci_cscfi = desc->has_heci_cscfi;
> xe->info.has_late_bind = desc->has_late_bind;
> xe->info.has_llc = desc->has_llc;
> +	xe->info.has_page_reclaim_hw_assist = desc->has_page_reclaim_hw_assist;
> xe->info.has_pxp = desc->has_pxp;
> 	xe->info.has_sriov = xe_configfs_primary_gt_allowed(to_pci_dev(xe->drm.dev)) &&
> 			     desc->has_sriov;
> diff --git a/drivers/gpu/drm/xe/xe_pci_types.h
> b/drivers/gpu/drm/xe/xe_pci_types.h
> index 9892c063a9c5..151743d4cf72 100644
> --- a/drivers/gpu/drm/xe/xe_pci_types.h
> +++ b/drivers/gpu/drm/xe/xe_pci_types.h
> @@ -47,6 +47,7 @@ struct xe_device_desc {
> u8 has_llc:1;
> u8 has_mbx_power_limits:1;
> u8 has_mem_copy_instr:1;
> + u8 has_page_reclaim_hw_assist:1;
> u8 has_pxp:1;
> u8 has_sriov:1;
> u8 needs_scratch:1;
> --
> 2.51.2
* RE: [PATCH 05/11] drm/xe/guc: Add page reclamation interface to GuC
2025-11-18 9:05 ` [PATCH 05/11] drm/xe/guc: Add page reclamation interface to GuC Brian Nguyen
@ 2025-11-21 18:32 ` Lin, Shuicheng
2025-11-22 1:56 ` Nguyen, Brian3
0 siblings, 1 reply; 51+ messages in thread
From: Lin, Shuicheng @ 2025-11-21 18:32 UTC (permalink / raw)
To: Nguyen, Brian3, intel-xe@lists.freedesktop.org
Cc: Upadhyay, Tejas, Brost, Matthew, Summers, Stuart
On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> Add page reclamation related changes to GuC interface, handlers, and senders
> to support page reclamation.
>
> Currently, TLB invalidations perform a full PPC flush in order to prevent
> stale memory accesses for non-coherent system memory. Page reclamation is an
> extension of the typical TLB invalidation workflow, allowing the full PPC
> flush to be disabled in favor of selective PPC flushing. Selective flushing
> is driven by a list of pages whose addresses are passed to the GuC at the
> time of the action.
>
> Page reclamation interfaces require at least GuC FW ver 70.31.0.
Should the driver disable this feature if the running FW is < 70.31.0?
What will happen if the driver sends this action while the GuC doesn't support it yet?
Shuicheng
>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> ---
> drivers/gpu/drm/xe/abi/guc_actions_abi.h | 2 ++
> drivers/gpu/drm/xe/xe_guc_ct.c | 4 ++++
> drivers/gpu/drm/xe/xe_guc_fwif.h | 1 +
> drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 14 ++++++++++++++
> 4 files changed, 21 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/abi/guc_actions_abi.h b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> index 47756e4674a1..11de3bdf69b5 100644
> --- a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> +++ b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> @@ -151,6 +151,8 @@ enum xe_guc_action {
> XE_GUC_ACTION_TLB_INVALIDATION = 0x7000,
> XE_GUC_ACTION_TLB_INVALIDATION_DONE = 0x7001,
> XE_GUC_ACTION_TLB_INVALIDATION_ALL = 0x7002,
> + XE_GUC_ACTION_PAGE_RECLAMATION = 0x7003,
> + XE_GUC_ACTION_PAGE_RECLAMATION_DONE = 0x7004,
> XE_GUC_ACTION_STATE_CAPTURE_NOTIFICATION = 0x8002,
> XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE = 0x8003,
> 	XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED = 0x8004,
> diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> index 2697d711adb2..e13704e61032 100644
> --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> @@ -1311,6 +1311,7 @@ static int parse_g2h_event(struct xe_guc_ct *ct, u32 *msg, u32 len)
> case XE_GUC_ACTION_DEREGISTER_CONTEXT_DONE:
> case XE_GUC_ACTION_SCHED_ENGINE_MODE_DONE:
> case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> g2h_release_space(ct, len);
> }
>
> @@ -1546,6 +1547,7 @@ static int process_g2h_msg(struct xe_guc_ct *ct, u32 *msg, u32 len)
> ret = xe_guc_pagefault_handler(guc, payload, adj_len);
> break;
> case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
> break;
> case XE_GUC_ACTION_GUC2PF_RELAY_FROM_VF:
> @@ -1711,6 +1713,7 @@ static int g2h_read(struct xe_guc_ct *ct, u32 *msg, bool fast_path)
> switch (action) {
> case XE_GUC_ACTION_REPORT_PAGE_FAULT_REQ_DESC:
> case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> break; /* Process these in fast-path */
> default:
> return 0;
> @@ -1747,6 +1750,7 @@ static void g2h_fast_path(struct xe_guc_ct *ct, u32 *msg, u32 len)
> ret = xe_guc_pagefault_handler(guc, payload, adj_len);
> break;
> case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> __g2h_release_space(ct, len);
> ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
> break;
> diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
> index c90dd266e9cf..34d74a71c4f0 100644
> --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> @@ -16,6 +16,7 @@
> #define G2H_LEN_DW_DEREGISTER_CONTEXT 3
> #define G2H_LEN_DW_TLB_INVALIDATE 3
> #define G2H_LEN_DW_G2G_NOTIFY_MIN 3
> +#define G2H_LEN_DW_PAGE_RECLAMATION 3
>
> #define GUC_ID_MAX 65535
> #define GUC_ID_UNKNOWN 0xffffffff
> diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> index c05709a5bc98..3185f8dc00c4 100644
> --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> @@ -95,6 +95,20 @@ static int send_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval, u32 seqno)
> return -ECANCELED;
> }
>
> +static int send_page_reclaim(struct xe_guc *guc, u32 seqno,
> + u64 gpu_addr)
> +{
> + u32 action[] = {
> + XE_GUC_ACTION_PAGE_RECLAMATION,
> + seqno,
> + lower_32_bits(gpu_addr),
> + upper_32_bits(gpu_addr),
> + };
> +
> +	return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action),
> +			      G2H_LEN_DW_PAGE_RECLAMATION, 1);
> +}
> +
> /*
> * Ensure that roundup_pow_of_two(length) doesn't overflow.
> * Note that roundup_pow_of_two() operates on unsigned long,
> --
> 2.51.2
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH 06/11] drm/xe: Create page reclaim list on unbind
2025-11-18 9:05 ` [PATCH 06/11] drm/xe: Create page reclaim list on unbind Brian Nguyen
@ 2025-11-21 21:29 ` Lin, Shuicheng
2025-11-22 1:57 ` Nguyen, Brian3
2025-11-22 19:18 ` Matthew Brost
1 sibling, 1 reply; 51+ messages in thread
From: Lin, Shuicheng @ 2025-11-21 21:29 UTC (permalink / raw)
To: Nguyen, Brian3, intel-xe@lists.freedesktop.org
Cc: Upadhyay, Tejas, Brost, Matthew, Summers, Stuart
On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> The page reclaim list (PRL) is preparation work for the page reclaim feature.
> The PRL is owned by pt_update_ops, and all other page reclaim operations
> point back to it. The PRL's entries are generated during the unbind page
> walk.
>
> The PRL is restricted to a single 4K page, so 512 page entries at most.
>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> ---
> drivers/gpu/drm/xe/Makefile | 1 +
> drivers/gpu/drm/xe/regs/xe_gtt_defs.h | 1 +
> drivers/gpu/drm/xe/xe_page_reclaim.c | 52 ++++++++++++
> drivers/gpu/drm/xe/xe_page_reclaim.h | 49 ++++++++++++
> drivers/gpu/drm/xe/xe_pt.c | 109 ++++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_pt_types.h | 5 ++
> 6 files changed, 217 insertions(+)
> create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.c
> create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.h
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index e4b273b025d2..048e6c93271c 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -95,6 +95,7 @@ xe-y += xe_bb.o \
> xe_oa.o \
> xe_observation.o \
> xe_pagefault.o \
> + xe_page_reclaim.o \
> xe_pat.o \
> xe_pci.o \
> xe_pcode.o \
> diff --git a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> index 4389e5a76f89..4d83461e538b 100644
> --- a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> +++ b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> @@ -9,6 +9,7 @@
> #define XELPG_GGTT_PTE_PAT0 BIT_ULL(52)
> #define XELPG_GGTT_PTE_PAT1 BIT_ULL(53)
>
> +#define XE_PTE_ADDR_MASK GENMASK_ULL(51, 12)
> #define GGTT_PTE_VFID GENMASK_ULL(11, 2)
>
> #define GUC_GGTT_TOP 0xFEE00000
> diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c b/drivers/gpu/drm/xe/xe_page_reclaim.c
> new file mode 100644
> index 000000000000..a0d15efff58c
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> @@ -0,0 +1,52 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#include <linux/bitfield.h>
> +#include <linux/kref.h>
> +#include <linux/mm.h>
> +#include <linux/slab.h>
> +
> +#include "xe_page_reclaim.h"
> +
> +#include "regs/xe_gt_regs.h"
> +#include "xe_assert.h"
> +#include "xe_macros.h"
> +
> +/**
> + * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
> + * @prl: Page reclaim list to reset
> + *
> + * Clears the entries pointer and marks the list as invalid so
> + * future use know PRL is unusable. It is expected that the entries
s/know/knows
> + * have already been released.
> + */
> +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl)
> +{
> + prl->entries = NULL;
> +	prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> +}
> +
> +/**
> + * xe_page_reclaim_list_alloc_entries() - Allocate page reclaim list entries
> + * @prl: Page reclaim list to allocate entries for
> + *
> + * Allocate one 4K page for the PRL entries, otherwise assign prl->entries to
> NULL.
> + */
> +int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl)
> +{
> + struct page *page;
> +
> + XE_WARN_ON(prl->entries != NULL);
> + if (prl->entries)
> + return 0;
These lines could be combined like this:
if (XE_WARN_ON(prl->entries))
return 0;
> +
> + page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> + if (page) {
> + prl->entries = page_address(page);
> + prl->num_entries = 0;
> + }
> +
> + return page ? 0 : -ENOMEM;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h b/drivers/gpu/drm/xe/xe_page_reclaim.h
> new file mode 100644
> index 000000000000..d066d7d97f79
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_PAGE_RECLAIM_H_
> +#define _XE_PAGE_RECLAIM_H_
> +
> +#include <linux/kref.h>
> +#include <linux/mm.h>
> +#include <linux/slab.h>
> +#include <linux/types.h>
> +#include <linux/workqueue.h>
> +
> +#define XE_PAGE_RECLAIM_MAX_ENTRIES 512
> +#define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
> +
> +struct xe_guc_page_reclaim_entry {
> + u32 valid:1;
> + u32 reclamation_size:6;
Maybe add a comment explaining what this size means?
"the size of the page to be invalidated and flushed from non-coherent cache."
> + u32 reserved:5;
s/reserved/reserved0
As there is reserved1 below.
> + u32 address_lo:20;
> + u32 address_hi:20;
> + u32 reserved1:12;
> +} __packed;
> +
> +struct xe_page_reclaim_list {
> + /** @entries: array of page reclaim entries, page allocated */
> + struct xe_guc_page_reclaim_entry *entries;
> + /** @num_entries: number of entries */
> + int num_entries;
> +#define XE_PAGE_RECLAIM_INVALID_LIST -1
> +};
> +
> +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
> +int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
> +
> +static inline void xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries)
> +{
> +	if (entries)
> +		get_page(virt_to_page(entries));
> +}
> +
> +static inline void xe_page_reclaim_entries_put(struct xe_guc_page_reclaim_entry *entries)
> +{
> + if (entries)
> + put_page(virt_to_page(entries));
> +}
> +
> +#endif /* _XE_PAGE_RECLAIM_H_ */
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index 884127b4d97d..532a047676d4 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -12,6 +12,7 @@
> #include "xe_exec_queue.h"
> #include "xe_gt.h"
> #include "xe_migrate.h"
> +#include "xe_page_reclaim.h"
> #include "xe_pt_types.h"
> #include "xe_pt_walk.h"
> #include "xe_res_cursor.h"
> @@ -1538,6 +1539,9 @@ struct xe_pt_stage_unbind_walk {
> /* Output */
> /* @wupd: Structure to track the page-table updates we're building */
> struct xe_walk_update wupd;
> +
> + /** @prl: Backing pointer to page reclaim list in pt_update_ops */
> + struct xe_page_reclaim_list *prl;
> };
>
> /*
> @@ -1572,6 +1576,69 @@ static bool xe_pt_check_kill(u64 addr, u64 next,
> unsigned int level,
> return false;
> }
>
> +/* Huge 2MB leaf lives directly in a level-1 table and has no children */
> +static bool is_large_pte(struct xe_pt *pte)
> +{
> +	return pte->level == 1 && !pte->base.children;
> +}
> +
> +/* page_size = 2^(reclamation_size + 12) */
> +#define COMPUTE_RECLAIM_ADDRESS_MASK(page_size)			\
> +({									\
> +	BUILD_BUG_ON(!__builtin_constant_p(page_size));			\
> +	ilog2(page_size) - 12;						\
> +})
> +
> +static void generate_reclaim_entry(struct xe_tile *tile,
> + struct xe_page_reclaim_list *prl,
> + u64 pte,
> + struct xe_pt *xe_child)
> +{
> + struct xe_guc_page_reclaim_entry *reclaim_entries = prl->entries;
> + u64 phys_addr = pte & XE_PTE_ADDR_MASK;
> + const u64 field_mask = GENMASK_ULL(19, 0);
> + u32 reclamation_size;
> + const uint max_entries = XE_PAGE_RECLAIM_MAX_ENTRIES;
It seems we don't need this "max_entries" local; just use XE_PAGE_RECLAIM_MAX_ENTRIES directly in the code.
> + int num_entries = prl->num_entries;
> +
> + xe_tile_assert(tile, xe_child->level <= MAX_HUGEPTE_LEVEL);
> + xe_tile_assert(tile, reclaim_entries);
> +
> + if (num_entries == XE_PAGE_RECLAIM_INVALID_LIST)
> + return;
> +
> + /* Overflow: mark as invalid through num_entries */
> + if (num_entries >= max_entries) {
> + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> + return;
> + }
> +
> + /**
> + * reclamation_size indicates the size of the page to be
> + * invalidated and flushed from non-coherent cache.
> + * Page size is computed as 2^(reclamation_size+12) bytes.
> + * Only valid for these specific levels.
> + */
> +
> + if (xe_child->level == 0 && !(pte & XE_PTE_PS64))
> +		reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_4K); /* reclamation_size = 0 */
I'm not sure this COMPUTE_RECLAIM_ADDRESS_MASK macro is needed.
How about:
reclamation_size = 0; /* reclaim page size: SZ_4K */
> + else if (xe_child->level == 0)
> +		reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_64K); /* reclamation_size = 1 */
> + else if (is_large_pte(xe_child))
> +		reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_2M); /* reclamation_size = 2 */
> + else
> + return;
Is it expected to enter the last else path?
If it is a failure path, how about adding a WARN_ON_ONCE there?
> +
> + reclaim_entries[num_entries].valid = 1;
> + reclaim_entries[num_entries].reclamation_size =
> + reclamation_size;
> + reclaim_entries[num_entries].address_lo =
> + FIELD_GET(field_mask, phys_addr);
> + reclaim_entries[num_entries].address_hi =
> + FIELD_GET(field_mask, phys_addr >> 20);
> + prl->num_entries++;
> +}
> +
> static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> unsigned int level, u64 addr, u64 next,
> struct xe_ptw **child,
> @@ -1579,10 +1646,27 @@ static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> 				    struct xe_pt_walk *walk)
> {
> struct xe_pt *xe_child = container_of(*child, typeof(*xe_child), base);
> + struct xe_pt_stage_unbind_walk *xe_walk =
> + container_of(walk, typeof(*xe_walk), base);
> + struct xe_device *xe = tile_to_xe(xe_walk->tile);
>
> XE_WARN_ON(!*child);
> XE_WARN_ON(!level);
>
> + /* 4K and 64K Pages are level 0, large pte needs additional handling. */
> + if (xe_walk->prl && (xe_child->level == 0 || is_large_pte(xe_child))) {
> + struct iosys_map *leaf_map = &xe_child->bo->vmap;
> + pgoff_t first = xe_pt_offset(addr, 0, walk);
> + pgoff_t count = xe_pt_num_entries(addr, next, 0, walk);
If count > 512, generate_reclaim_entry() will fail once it reaches the max entries.
How about checking count <= 512 - prl->num_entries before the loop?
Shuicheng
> +
> + for (pgoff_t i = 0; i < count; i++) {
> +			u64 pte = xe_map_rd(xe, leaf_map, (first + i) * sizeof(u64), u64);
> +
> + generate_reclaim_entry(xe_walk->tile, xe_walk->prl,
> + pte, xe_child);
> + }
> + }
> +
> xe_pt_check_kill(addr, next, level - 1, xe_child, action, walk);
>
> return 0;
> @@ -1654,6 +1738,8 @@ static unsigned int xe_pt_stage_unbind(struct xe_tile *tile,
> {
> u64 start = range ? xe_svm_range_start(range) : xe_vma_start(vma);
> u64 end = range ? xe_svm_range_end(range) : xe_vma_end(vma);
> + struct xe_vm_pgtable_update_op *pt_update_op =
> +		container_of(entries, struct xe_vm_pgtable_update_op, entries[0]);
> struct xe_pt_stage_unbind_walk xe_walk = {
> .base = {
> .ops = &xe_pt_stage_unbind_ops,
> @@ -1665,6 +1751,7 @@ static unsigned int xe_pt_stage_unbind(struct xe_tile *tile,
> .modified_start = start,
> .modified_end = end,
> .wupd.entries = entries,
> + .prl = pt_update_op->prl,
> };
> struct xe_pt *pt = vm->pt_root[tile->id];
>
> @@ -1897,6 +1984,7 @@ static int unbind_op_prepare(struct xe_tile *tile,
> 			     struct xe_vm_pgtable_update_ops *pt_update_ops,
> 			     struct xe_vma *vma)
> {
> + struct xe_device *xe = tile_to_xe(tile);
> u32 current_op = pt_update_ops->current_op;
> 	struct xe_vm_pgtable_update_op *pt_op = &pt_update_ops->ops[current_op];
> int err;
> @@ -1914,6 +2002,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
> pt_op->vma = vma;
> pt_op->bind = false;
> pt_op->rebind = false;
> +	/* Maintain one PRL located in pt_update_ops that all others in unbind op reference */
> +	if (xe->info.has_page_reclaim_hw_assist && !pt_update_ops->prl.entries) {
> +		err = xe_page_reclaim_list_alloc_entries(&pt_update_ops->prl);
> +		if (err < 0)
> +			xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> +	}
> +	pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl : NULL;
>
> err = vma_reserve_fences(tile_to_xe(tile), vma);
> if (err)
> @@ -1921,6 +2016,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
>
> pt_op->num_entries = xe_pt_stage_unbind(tile, xe_vma_vm(vma),
> vma, NULL, pt_op->entries);
> + /* Free PRL if list declared as invalid */
> +	if (pt_update_ops->prl.entries &&
> +	    pt_update_ops->prl.num_entries == XE_PAGE_RECLAIM_INVALID_LIST) {
> + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> + pt_op->prl = NULL;
> + pt_update_ops->prl.entries = NULL;
> + }
>
> xe_vm_dbg_print_entries(tile_to_xe(tile), pt_op->entries,
> pt_op->num_entries, false);
> @@ -1979,6 +2081,7 @@ static int unbind_range_prepare(struct xe_vm *vm,
> pt_op->vma = XE_INVALID_VMA;
> pt_op->bind = false;
> pt_op->rebind = false;
> + pt_op->prl = NULL;
>
> pt_op->num_entries = xe_pt_stage_unbind(tile, vm, NULL, range,
> pt_op->entries);
> @@ -2096,6 +2199,7 @@ xe_pt_update_ops_init(struct xe_vm_pgtable_update_ops *pt_update_ops)
> init_llist_head(&pt_update_ops->deferred);
> pt_update_ops->start = ~0x0ull;
> pt_update_ops->last = 0x0ull;
> + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> }
>
> /**
> @@ -2518,6 +2622,11 @@ void xe_pt_update_ops_fini(struct xe_tile *tile, struct xe_vma_ops *vops)
> &vops->pt_update_ops[tile->id];
> int i;
>
> + if (pt_update_ops->prl.entries) {
> + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> + }
> +
> lockdep_assert_held(&vops->vm->lock);
> xe_vm_assert_held(vops->vm);
>
> diff --git a/drivers/gpu/drm/xe/xe_pt_types.h b/drivers/gpu/drm/xe/xe_pt_types.h
> index 881f01e14db8..26e5295f118e 100644
> --- a/drivers/gpu/drm/xe/xe_pt_types.h
> +++ b/drivers/gpu/drm/xe/xe_pt_types.h
> @@ -8,6 +8,7 @@
>
> #include <linux/types.h>
>
> +#include "xe_page_reclaim.h"
> #include "xe_pt_walk.h"
>
> struct xe_bo;
> @@ -85,6 +86,8 @@ struct xe_vm_pgtable_update_op {
> bool bind;
> /** @rebind: is a rebind */
> bool rebind;
> + /** @prl: Backing pointer to page reclaim list of pt_update_ops */
> + struct xe_page_reclaim_list *prl;
> };
>
> /** struct xe_vm_pgtable_update_ops: page table update operations */
> @@ -119,6 +122,8 @@ struct xe_vm_pgtable_update_ops {
> * slots are idle.
> */
> bool wait_vm_kernel;
> + /** @prl: embedded page reclaim list */
> + struct xe_page_reclaim_list prl;
> };
>
> #endif
> --
> 2.51.2
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH 11/11] drm/xe: Add debugfs support for page reclamation
2025-11-18 9:05 ` [PATCH 11/11] drm/xe: Add debugfs support for page reclamation Brian Nguyen
@ 2025-11-21 22:32 ` Lin, Shuicheng
2025-11-22 1:57 ` Nguyen, Brian3
2025-11-22 14:18 ` Michal Wajdeczko
1 sibling, 1 reply; 51+ messages in thread
From: Lin, Shuicheng @ 2025-11-21 22:32 UTC (permalink / raw)
To: Nguyen, Brian3, intel-xe@lists.freedesktop.org
Cc: Upadhyay, Tejas, Brost, Matthew, Summers, Stuart
On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> Allow runtime modification of the page reclamation feature through a
> debugfs entry. The setting only takes effect if the platform supports
> the page reclamation feature by default.
>
> Move xe_match_desc() to a common header so debugfs can read the xe
> driver's default device values for the current platform.
>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> ---
> drivers/gpu/drm/xe/xe_configfs.c | 11 +-------
> drivers/gpu/drm/xe/xe_debugfs.c | 47 ++++++++++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_device.c | 10 +++++++
> drivers/gpu/drm/xe/xe_device.h | 2 ++
> 4 files changed, 60 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
> index 9f6251b1008b..efc6d0690b27 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.c
> +++ b/drivers/gpu/drm/xe/xe_configfs.c
> @@ -15,6 +15,7 @@
>
> #include "instructions/xe_mi_commands.h"
> #include "xe_configfs.h"
> +#include "xe_device.h"
> #include "xe_gt_types.h"
> #include "xe_hw_engine_types.h"
> #include "xe_module.h"
> @@ -925,16 +926,6 @@ static const struct config_item_type
> xe_config_sriov_type = {
> .ct_attrs = xe_config_sriov_attrs,
> };
>
> -static const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev)
> -{
> - struct device_driver *driver = driver_find("xe", &pci_bus_type);
> - struct pci_driver *drv = to_pci_driver(driver);
> - const struct pci_device_id *ids = drv ? drv->id_table : NULL;
> - const struct pci_device_id *found = pci_match_id(ids, pdev);
> -
> - return found ? (const void *)found->driver_data : NULL;
> -}
> -
> static struct pci_dev *get_physfn_instead(struct pci_dev *virtfn)
> {
> 	struct pci_dev *physfn = pci_physfn(virtfn);
> diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> index e91da9589c5f..572c61ee1e29 100644
> --- a/drivers/gpu/drm/xe/xe_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> @@ -19,6 +19,7 @@
> #include "xe_gt_printk.h"
> #include "xe_guc_ads.h"
> #include "xe_mmio.h"
> +#include "xe_pci_types.h"
> #include "xe_pm.h"
> #include "xe_psmi.h"
> #include "xe_pxp_debugfs.h"
> @@ -297,6 +298,49 @@ static const struct file_operations
> wedged_mode_fops = {
> .write = wedged_mode_set,
> };
>
> +static ssize_t page_reclaim_hw_assist_show(struct file *f, char __user *ubuf,
> + size_t size, loff_t *pos)
> +{
> + struct xe_device *xe = file_inode(f)->i_private;
> + char buf[8];
> + int len;
> +
> +	len = scnprintf(buf, sizeof(buf), "%d\n", xe->info.has_page_reclaim_hw_assist);
> +	return simple_read_from_buffer(ubuf, size, pos, buf, len);
> +}
> +
> +static ssize_t page_reclaim_hw_assist_set(struct file *f, const char __user
> *ubuf,
> + size_t size, loff_t *pos)
> +{
> + struct xe_device *xe = file_inode(f)->i_private;
> + struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> + const struct xe_device_desc *desc = xe_match_desc(pdev);
> + unsigned int val;
> + ssize_t ret;
> +
> + ret = kstrtouint_from_user(ubuf, size, 0, &val);
> + if (ret)
> + return ret;
> +
> + /**
> + * Don't modify if page reclamation support isn't normally
> + * supported by the HW.
> + */
Nit:
/** is reserved for kernel-doc comments, so it should be /* here for a normal comment.
How about "Don't modify it if page reclamation isn't supported by the hardware."?
Other code LGTM.
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
> +
> + if (!desc || !desc->has_page_reclaim_hw_assist)
> + return -ENODEV;
> +
> + xe->info.has_page_reclaim_hw_assist = !!val;
> +
> + return size;
> +}
> +
> +static const struct file_operations page_reclaim_hw_assist_fops = {
> + .owner = THIS_MODULE,
> + .read = page_reclaim_hw_assist_show,
> + .write = page_reclaim_hw_assist_set,
> +};
> +
> static ssize_t atomic_svm_timeslice_ms_show(struct file *f, char __user
> *ubuf,
> size_t size, loff_t *pos)
> {
> @@ -403,6 +447,9 @@ void xe_debugfs_register(struct xe_device *xe)
> debugfs_create_file("disable_late_binding", 0600, root, xe,
> &disable_late_binding_fops);
>
> + debugfs_create_file("page_reclaim_hw_assist", 0600, root, xe,
> + &page_reclaim_hw_assist_fops);
> +
> for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1;
> ++mem_type) {
> man = ttm_manager_type(bdev, mem_type);
>
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index c7d373c70f0f..16afddc5e35e 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -1295,3 +1295,13 @@ void xe_device_declare_wedged(struct xe_device *xe)
> drm_dev_wedged_event(&xe->drm, xe->wedged.method,
> NULL);
> }
> }
> +
> +const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev)
> +{
> + struct device_driver *driver = driver_find("xe", &pci_bus_type);
> + struct pci_driver *drv = to_pci_driver(driver);
> + const struct pci_device_id *ids = drv ? drv->id_table : NULL;
> + const struct pci_device_id *found = pci_match_id(ids, pdev);
> +
> +	return found ? (const void *)found->driver_data : NULL;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index 32cc6323b7f6..a66e8e4b3e01 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -193,6 +193,8 @@ void xe_device_declare_wedged(struct xe_device *xe);
> struct xe_file *xe_file_get(struct xe_file *xef);
> void xe_file_put(struct xe_file *xef);
>
> +const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev);
> +
> int xe_is_injection_active(void);
>
> /*
> --
> 2.51.2
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH 02/11] drm/xe: Reset tlb fence timeout on invalid seqno received
2025-11-21 17:23 ` Lin, Shuicheng
@ 2025-11-22 1:53 ` Nguyen, Brian3
0 siblings, 0 replies; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-22 1:53 UTC (permalink / raw)
To: Lin, Shuicheng, intel-xe@lists.freedesktop.org
Cc: Upadhyay, Tejas, Brost, Matthew, Summers, Stuart
On Friday, November 21, 2025 9:24 AM Lin, Shuicheng wrote:
> On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> > TLB_INVALIDATION_SEQNO_INVALID is now used to indicate in-progress
> > multi-step TLB invalidations, so reset the TDR to ensure it won't
> > trigger prematurely while G2H actions are still ongoing.
>
> I am not sure whether we should re-use SEQNO_INVALID to indicate
> multi-step TLB invalidations.
My understanding is that SEQNO_INVALID is already used for the
intermediate steps of the "Context based TLB invalidations" patchset [1],
which should also benefit from this. Since the fence timeout starts at the
first TLB CTB action, it makes sense to reset the TDR on each G2H ack:
the GuC has processed one CTB action, but we may still be waiting through
one or more further timeout windows for the remaining TLB actions, e.g.
page reclaim sends 2 CTB actions.
> Before this reuse, is there any possible case where SEQNO_INVALID would be sent?
SEQNO_INVALID is a value the seqno can never take normally, so it is
fine to continue using it; that is the whole purpose of adding this new
value. The only other case so far comes from that patch series, where
multiple TLB invalidations are sent per seqno, so every TLB invalidation
action except the last one carries SEQNO_INVALID.
> If yes, we should use another value to indicate the multi-step.
> If no, why does the previous patch add the msg[0] != SEQNO_INVALID check?
>
The previous patch added this check because we only want to handle the
TLB invalidation fence after all actions have completed; there is no need
to call the done_handler early. We are just tagging this H2G action as an
intermediate operation so that we don't ack the fence completion early.
[1] https://patchwork.freedesktop.org/series/156874/
> >
> > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 2 ++
> > drivers/gpu/drm/xe/xe_tlb_inval.c | 16 ++++++++++++++++
> > drivers/gpu/drm/xe/xe_tlb_inval.h | 1 +
> > 3 files changed, 19 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > index f1fd2dd90742..cd126c53faab 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > @@ -238,6 +238,8 @@ int xe_guc_tlb_inval_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
> >
> > if (msg[0] != TLB_INVALIDATION_SEQNO_INVALID)
> > xe_tlb_inval_done_handler(>->tlb_inval, msg[0]);
> > + else
> > + xe_tlb_inval_reset_timeout(>->tlb_inval);
>
> So SEQNO_INVALID is re-used for page reclaim, and with the previous code
> this else path should never be hit.
> The GuC will not set this seqno to SEQNO_INVALID in any failure case, is that right?
>
> Shuicheng
>
Yes. Existing code, before context TLB invalidation and page reclamation,
will not hit this path. My understanding is that the GuC shouldn't modify
any seqno that the KMD provides it.
Brian
> >
> > return 0;
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > index 918a59e686ea..50f05d6b5672 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > @@ -199,6 +199,22 @@ void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval)
> > mutex_unlock(&tlb_inval->seqno_lock);
> > }
> >
> > +/**
> > + * xe_tlb_inval_reset_timeout() - Reset TLB inval fence timeout
> > + * @tlb_inval: TLB invalidation client
> > + *
> > + * Reset the TLB invalidation timeout timer.
> > + */
> > +void xe_tlb_inval_reset_timeout(struct xe_tlb_inval *tlb_inval)
> > +{
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&tlb_inval->pending_lock, flags);
> > + mod_delayed_work(system_wq, &tlb_inval->fence_tdr,
> > + tlb_inval->ops->timeout_delay(tlb_inval));
> > +	spin_unlock_irqrestore(&tlb_inval->pending_lock, flags);
> > +}
> > +
> > static bool xe_tlb_inval_seqno_past(struct xe_tlb_inval *tlb_inval, int seqno)
> > {
> > 	int seqno_recv = READ_ONCE(tlb_inval->seqno_recv);
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.h b/drivers/gpu/drm/xe/xe_tlb_inval.h
> > index 05614915463a..9dbddc310eb9 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval.h
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.h
> > @@ -17,6 +17,7 @@ struct xe_vm;
> > int xe_gt_tlb_inval_init_early(struct xe_gt *gt);
> >
> > void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval);
> > +void xe_tlb_inval_reset_timeout(struct xe_tlb_inval *tlb_inval);
> > int xe_tlb_inval_all(struct xe_tlb_inval *tlb_inval,
> > 		     struct xe_tlb_inval_fence *fence);
> > int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval);
> > --
> > 2.51.2
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH 03/11] drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush
2025-11-21 18:02 ` Lin, Shuicheng
@ 2025-11-22 1:54 ` Nguyen, Brian3
0 siblings, 0 replies; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-22 1:54 UTC (permalink / raw)
To: Lin, Shuicheng, intel-xe@lists.freedesktop.org
Cc: Upadhyay, Tejas, Brost, Matthew, Summers, Stuart
On Friday, November 21, 2025 10:03 AM Lin, Shuicheng wrote:
> On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> > Allow for tlb_invalidation to configure when driver wants to flush the
> > Private Physical Cache (PPC) as a process of the tlb invalidation process.
>
> How about "Allow tlb_invalidation to control whether the driver flushes the
> Private Physical Cache (PPC) as part of the TLB invalidation process."
>
Sure, will change in next patchset.
> >
> > Default behavior is still to always flush the PPC but driver now has
> > the option to disable it.
> >
> > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 11 +++++++----
> > drivers/gpu/drm/xe/xe_tlb_inval.c | 21 ++++++++++++++++++---
> > drivers/gpu/drm/xe/xe_tlb_inval.h | 5 +++--
> > drivers/gpu/drm/xe/xe_tlb_inval_job.c | 2 +-
> > drivers/gpu/drm/xe/xe_tlb_inval_types.h | 5 ++++-
> > drivers/gpu/drm/xe/xe_vm.c | 4 ++--
> > 6 files changed, 35 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > index cd126c53faab..c05709a5bc98 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > @@ -34,9 +34,12 @@ static int send_tlb_inval(struct xe_guc *guc, const
> > u32 *action, int len)
> > G2H_LEN_DW_TLB_INVALIDATE, 1); }
> >
> > -#define MAKE_INVAL_OP(type) ((type << XE_GUC_TLB_INVAL_TYPE_SHIFT) | \
> > +#define MAKE_INVAL_OP_FLUSH(type, flush_cache) ((type << XE_GUC_TLB_INVAL_TYPE_SHIFT) | \
> > 		XE_GUC_TLB_INVAL_MODE_HEAVY << XE_GUC_TLB_INVAL_MODE_SHIFT | \
> > -		XE_GUC_TLB_INVAL_FLUSH_CACHE)
> > +		(flush_cache ? XE_GUC_TLB_INVAL_FLUSH_CACHE : 0))
> > +
> > +#define MAKE_INVAL_OP(type) MAKE_INVAL_OP_FLUSH(type, true)
> >
> > static int send_tlb_inval_all(struct xe_tlb_inval *tlb_inval, u32 seqno)
> > {
> > @@ -100,7 +103,7 @@ static int send_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval, u32 seqno)
> > #define MAX_RANGE_TLB_INVALIDATION_LENGTH (rounddown_pow_of_two(ULONG_MAX))
> >
> > static int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
> > - u64 start, u64 end, u32 asid)
> > +				u64 start, u64 end, u32 asid, bool flush_cache)
> > {
> > #define MAX_TLB_INVALIDATION_LEN 7
> > 	struct xe_guc *guc = tlb_inval->private;
> > @@ -154,7 +157,7 @@ static int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
> > ilog2(SZ_2M) + 1)));
> > xe_gt_assert(gt, IS_ALIGNED(start, length));
> >
> > -	action[len++] = MAKE_INVAL_OP(XE_GUC_TLB_INVAL_PAGE_SELECTIVE);
> > +	action[len++] = MAKE_INVAL_OP_FLUSH(XE_GUC_TLB_INVAL_PAGE_SELECTIVE, flush_cache);
> > action[len++] = asid;
> > action[len++] = lower_32_bits(start);
> > 	action[len++] = upper_32_bits(start);
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > index 50f05d6b5672..de275759743c 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > @@ -324,10 +324,10 @@ int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval)
> > */
> > int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
> > struct xe_tlb_inval_fence *fence, u64 start, u64 end,
> > - u32 asid)
> > + u32 asid, bool flush_cache)
> > {
> > return xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->ppgtt,
> > - start, end, asid);
> > + start, end, asid, flush_cache);
> > }
> >
> > /**
> > @@ -343,7 +343,7 @@ void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval, struct xe_vm *vm)
> > u64 range = 1ull << vm->xe->info.va_bits;
> >
> > xe_tlb_inval_fence_init(tlb_inval, &fence, true);
> > - xe_tlb_inval_range(tlb_inval, &fence, 0, range, vm->usm.asid);
> > + xe_tlb_inval_range(tlb_inval, &fence, 0, range, vm->usm.asid, true);
> > xe_tlb_inval_fence_wait(&fence);
> > }
> >
> > @@ -420,6 +420,20 @@ static const struct dma_fence_ops inval_fence_ops = {
> > 	.get_timeline_name = xe_inval_fence_get_timeline_name,
> > };
> >
> > +/**
> > + * xe_tlb_inval_fence_flush_cache - Control PPC flush at invalidation
> > + * @fence: TLB inval fence
> > + * @flush_cache: whether to perform PPC cache flush
> > + *
> > + * Helper function to modify the tlb_inval fence to control the PPC flush.
> > + * Other components shouldn't modify fence directly.
> > + */
> > +void xe_tlb_inval_fence_flush_cache(struct xe_tlb_inval_fence *fence,
> > + bool flush_cache)
> > +{
> > + fence->flush_cache = flush_cache;
> > +}
>
> I don't see this function is used in the series patches. Could you please
> double confirm that?
> Also, it would be better to add inline and just put the code in the header file.
> BTW, how about rename it to xe_tlb_inval_fence_set_flush_cache?
>
> Shuicheng
>
I'll remove it in the next patchset as well; it is no longer needed since
we are integrating page reclaim into the TLB invalidation fence itself. Thanks.
Brian
> > +
> > /**
> > * xe_tlb_inval_fence_init() - Initialize TLB invalidation fence
> > * @tlb_inval: TLB invalidation client
> > @@ -446,4 +460,5 @@ void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
> > else
> > dma_fence_get(&fence->base);
> > fence->tlb_inval = tlb_inval;
> > + fence->flush_cache = true;
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.h
> > b/drivers/gpu/drm/xe/xe_tlb_inval.h
> > index 9dbddc310eb9..b84ce3e6f294 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval.h
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.h
> > @@ -24,8 +24,9 @@ int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval);
> > void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval, struct xe_vm *vm);
> > int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
> > 		       struct xe_tlb_inval_fence *fence,
> > -		       u64 start, u64 end, u32 asid);
> > -
> > +		       u64 start, u64 end, u32 asid, bool flush_cache);
> > +void xe_tlb_inval_fence_flush_cache(struct xe_tlb_inval_fence *fence,
> > +				    bool flush_cache);
> > void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
> > struct xe_tlb_inval_fence *fence,
> > bool stack);
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> > b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> > index 1ae0dec2cf31..6248f90323a9 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> > @@ -49,7 +49,7 @@ static struct dma_fence *xe_tlb_inval_job_run(struct xe_dep_job *dep_job)
> > 		container_of(job->fence, typeof(*ifence), base);
> >
> > xe_tlb_inval_range(job->tlb_inval, ifence, job->start,
> > - job->end, job->vm->usm.asid);
> > + job->end, job->vm->usm.asid, ifence->flush_cache);
> >
> > return job->fence;
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > index 7a6967ce3b76..c3c3943fb07e 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > @@ -40,12 +40,13 @@ struct xe_tlb_inval_ops {
> > * @start: Start address
> > * @end: End address
> > * @asid: Address space ID
> > + * @flush_cache: PPC flush control
> > *
> > * Return 0 on success, -ECANCELED if backend is mid-reset, error on
> > * failure
> > */
> > int (*ppgtt)(struct xe_tlb_inval *tlb_inval, u32 seqno, u64 start,
> > - u64 end, u32 asid);
> > + u64 end, u32 asid, bool flush_cache);
> >
> > /**
> > 	 * @initialized: Backend is initialized
> > @@ -126,6 +127,8 @@ struct xe_tlb_inval_fence {
> > int seqno;
> > /** @inval_time: time of TLB invalidation */
> > ktime_t inval_time;
> > + /** @flush_cache: bool for PPC flush, default is true */
> > + bool flush_cache;
> > };
> >
> > #endif
> > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > index 7cac646bdf1c..5fb5226574c5 100644
> > --- a/drivers/gpu/drm/xe/xe_vm.c
> > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > @@ -3907,7 +3907,7 @@ int xe_vm_range_tilemask_tlb_inval(struct xe_vm
> > *vm, u64 start,
> >
> > err = xe_tlb_inval_range(&tile->primary_gt->tlb_inval,
> > &fence[fence_id], start, end,
> > - vm->usm.asid);
> > + vm->usm.asid, true);
> > if (err)
> > goto wait;
> > ++fence_id;
> > @@ -3920,7 +3920,7 @@ int xe_vm_range_tilemask_tlb_inval(struct xe_vm
> > *vm, u64 start,
> >
> > err = xe_tlb_inval_range(&tile->media_gt->tlb_inval,
> > &fence[fence_id], start, end,
> > - vm->usm.asid);
> > + vm->usm.asid, true);
> > if (err)
> > goto wait;
> > ++fence_id;
> > --
> > 2.51.2
* RE: [PATCH 05/11] drm/xe/guc: Add page reclamation interface to GuC
2025-11-21 18:32 ` Lin, Shuicheng
@ 2025-11-22 1:56 ` Nguyen, Brian3
2025-11-22 18:39 ` Matthew Brost
0 siblings, 1 reply; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-22 1:56 UTC (permalink / raw)
To: Lin, Shuicheng, intel-xe@lists.freedesktop.org
Cc: Upadhyay, Tejas, Brost, Matthew, Summers, Stuart
On Friday, November 21, 2025 10:33 AM Lin, Shuicheng wrote:
> On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> > Add page reclamation related changes to GuC interface, handlers, and
> > senders to support page reclamation.
> >
> > Currently TLB invalidations will perform an entire PPC flush in order
> > to prevent stale memory access for noncoherent system memory. Page
> > reclamation is an extension of the typical TLB invalidation workflow,
> > allowing disabling of the full PPC flush and enabling selective PPC
> > flushing. Selective flushing will be decided by a list of pages whose
> > addresses are passed to GuC at the time of the action.
> >
> > Page reclamation interfaces require at least GuC FW ver 70.31.0.
>
> Should driver disable this feature if the running FW is < 70.31.0?
The default FW version was above this at the time of patchset submission, so
I had assumed it not to be a problem, since the danger is a user forcibly
using a bad FW, which already has unpredictable results.
However, in hindsight, it is easy enough to skip if the FW version is lower,
and we can safely fall back to the default TLB invalidation, so I'll proceed
with adding a check in the later patches that disables page reclamation
within xe_guc_tlb_inval.c, unless there are any objections.
> What will happen if driver send this action while GuC doesn't support it yet?
>
> Shuicheng
>
AFAIK, if the action is sent to FW older than the required version, it'll
report GUC_HXG_TYPE_RESPONSE_FAILURE in the G2H due to an illegal operation,
eventually triggering a reset.
Brian
> >
> > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > ---
> > drivers/gpu/drm/xe/abi/guc_actions_abi.h | 2 ++
> > drivers/gpu/drm/xe/xe_guc_ct.c | 4 ++++
> > drivers/gpu/drm/xe/xe_guc_fwif.h | 1 +
> > drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 14 ++++++++++++++
> > 4 files changed, 21 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > index 47756e4674a1..11de3bdf69b5 100644
> > --- a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > +++ b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > @@ -151,6 +151,8 @@ enum xe_guc_action {
> > XE_GUC_ACTION_TLB_INVALIDATION = 0x7000,
> > XE_GUC_ACTION_TLB_INVALIDATION_DONE = 0x7001,
> > XE_GUC_ACTION_TLB_INVALIDATION_ALL = 0x7002,
> > + XE_GUC_ACTION_PAGE_RECLAMATION = 0x7003,
> > + XE_GUC_ACTION_PAGE_RECLAMATION_DONE = 0x7004,
> > XE_GUC_ACTION_STATE_CAPTURE_NOTIFICATION = 0x8002,
> > XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE = 0x8003,
> > 	XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED = 0x8004,
> > diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> > index 2697d711adb2..e13704e61032 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> > @@ -1311,6 +1311,7 @@ static int parse_g2h_event(struct xe_guc_ct *ct,
> > u32 *msg, u32 len)
> > case XE_GUC_ACTION_DEREGISTER_CONTEXT_DONE:
> > case XE_GUC_ACTION_SCHED_ENGINE_MODE_DONE:
> > case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > g2h_release_space(ct, len);
> > }
> >
> > @@ -1546,6 +1547,7 @@ static int process_g2h_msg(struct xe_guc_ct *ct,
> > u32 *msg, u32 len)
> > ret = xe_guc_pagefault_handler(guc, payload, adj_len);
> > break;
> > case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
> > break;
> > case XE_GUC_ACTION_GUC2PF_RELAY_FROM_VF:
> > @@ -1711,6 +1713,7 @@ static int g2h_read(struct xe_guc_ct *ct, u32
> > *msg, bool fast_path)
> > switch (action) {
> > case XE_GUC_ACTION_REPORT_PAGE_FAULT_REQ_DESC:
> > case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > break; /* Process these in fast-path */
> > default:
> > return 0;
> > @@ -1747,6 +1750,7 @@ static void g2h_fast_path(struct xe_guc_ct *ct,
> > u32 *msg, u32 len)
> > ret = xe_guc_pagefault_handler(guc, payload, adj_len);
> > break;
> > case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > __g2h_release_space(ct, len);
> > ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
> > break;
> > diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > index c90dd266e9cf..34d74a71c4f0 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > @@ -16,6 +16,7 @@
> > #define G2H_LEN_DW_DEREGISTER_CONTEXT 3
> > #define G2H_LEN_DW_TLB_INVALIDATE 3
> > #define G2H_LEN_DW_G2G_NOTIFY_MIN 3
> > +#define G2H_LEN_DW_PAGE_RECLAMATION 3
> >
> > #define GUC_ID_MAX 65535
> > #define GUC_ID_UNKNOWN 0xffffffff
> > diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > index c05709a5bc98..3185f8dc00c4 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > @@ -95,6 +95,20 @@ static int send_tlb_inval_ggtt(struct xe_tlb_inval
> > *tlb_inval, u32 seqno)
> > return -ECANCELED;
> > }
> >
> > +static int send_page_reclaim(struct xe_guc *guc, u32 seqno,
> > + u64 gpu_addr)
> > +{
> > + u32 action[] = {
> > + XE_GUC_ACTION_PAGE_RECLAMATION,
> > + seqno,
> > + lower_32_bits(gpu_addr),
> > + upper_32_bits(gpu_addr),
> > + };
> > +
> > +	return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action),
> > +			      G2H_LEN_DW_PAGE_RECLAMATION, 1);
> > +}
> > +
> > /*
> > * Ensure that roundup_pow_of_two(length) doesn't overflow.
> > * Note that roundup_pow_of_two() operates on unsigned long,
> > --
> > 2.51.2
* RE: [PATCH 06/11] drm/xe: Create page reclaim list on unbind
2025-11-21 21:29 ` Lin, Shuicheng
@ 2025-11-22 1:57 ` Nguyen, Brian3
0 siblings, 0 replies; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-22 1:57 UTC (permalink / raw)
To: Lin, Shuicheng, intel-xe@lists.freedesktop.org
Cc: Upadhyay, Tejas, Brost, Matthew, Summers, Stuart
On Friday, November 21, 2025 1:30 PM Lin, Shuicheng wrote:
> On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> > Page reclaim list (PRL) is preparation work for the page reclaim feature.
> > The PRL is firstly owned by pt_update_ops and all other page reclaim
> > operations will point back to this PRL. PRL generates its entries
> > during the unbind page walker, updating the PRL.
> >
> > This PRL is restricted to a 4K page, so 512 page entries at most.
> >
> > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > ---
> > drivers/gpu/drm/xe/Makefile | 1 +
> > drivers/gpu/drm/xe/regs/xe_gtt_defs.h | 1 +
> > drivers/gpu/drm/xe/xe_page_reclaim.c | 52 ++++++++++++
> > drivers/gpu/drm/xe/xe_page_reclaim.h | 49 ++++++++++++
> > drivers/gpu/drm/xe/xe_pt.c | 109 ++++++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_pt_types.h | 5 ++
> > 6 files changed, 217 insertions(+)
> > create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.c
> > create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.h
> >
> > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > index e4b273b025d2..048e6c93271c 100644
> > --- a/drivers/gpu/drm/xe/Makefile
> > +++ b/drivers/gpu/drm/xe/Makefile
> > @@ -95,6 +95,7 @@ xe-y += xe_bb.o \
> > xe_oa.o \
> > xe_observation.o \
> > xe_pagefault.o \
> > + xe_page_reclaim.o \
> > xe_pat.o \
> > xe_pci.o \
> > xe_pcode.o \
> > diff --git a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > index 4389e5a76f89..4d83461e538b 100644
> > --- a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > +++ b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > @@ -9,6 +9,7 @@
> > #define XELPG_GGTT_PTE_PAT0 BIT_ULL(52)
> > #define XELPG_GGTT_PTE_PAT1 BIT_ULL(53)
> >
> > +#define XE_PTE_ADDR_MASK GENMASK_ULL(51, 12)
> > #define GGTT_PTE_VFID GENMASK_ULL(11, 2)
> >
> > #define GUC_GGTT_TOP 0xFEE00000
> > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > new file mode 100644
> > index 000000000000..a0d15efff58c
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > @@ -0,0 +1,52 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#include <linux/bitfield.h>
> > +#include <linux/kref.h>
> > +#include <linux/mm.h>
> > +#include <linux/slab.h>
> > +
> > +#include "xe_page_reclaim.h"
> > +
> > +#include "regs/xe_gt_regs.h"
> > +#include "xe_assert.h"
> > +#include "xe_macros.h"
> > +
> > +/**
> > + * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
> > + * @prl: Page reclaim list to reset
> > + *
> > + * Clears the entries pointer and marks the list as invalid so
> > + * future use know PRL is unusable. It is expected that the entries
>
> s/know/knows
Changed.
>
> > + * have already been released.
> > + */
> > +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl)
> > +{
> > +	prl->entries = NULL;
> > +	prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> > +}
> > +
> > +/**
> > + * xe_page_reclaim_list_alloc_entries() - Allocate page reclaim list entries
> > + * @prl: Page reclaim list to allocate entries for
> > + *
> > + * Allocate one 4K page for the PRL entries, otherwise assign prl->entries to NULL.
> > + */
> > +int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl)
> > +{
> > + struct page *page;
> > +
> > + XE_WARN_ON(prl->entries != NULL);
> > + if (prl->entries)
> > + return 0;
>
> These lines could be combined like this:
> if (XE_WARN_ON(prl->entries))
> return 0;
>
Changed.
> > +
> > + page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> > + if (page) {
> > + prl->entries = page_address(page);
> > + prl->num_entries = 0;
> > + }
> > +
> > + return page ? 0 : -ENOMEM;
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h
> > b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > new file mode 100644
> > index 000000000000..d066d7d97f79
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > @@ -0,0 +1,49 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_PAGE_RECLAIM_H_
> > +#define _XE_PAGE_RECLAIM_H_
> > +
> > +#include <linux/kref.h>
> > +#include <linux/mm.h>
> > +#include <linux/slab.h>
> > +#include <linux/types.h>
> > +#include <linux/workqueue.h>
> > +
> > +#define XE_PAGE_RECLAIM_MAX_ENTRIES 512
> > +#define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
> > +
> > +struct xe_guc_page_reclaim_entry {
> > + u32 valid:1;
> > + u32 reclamation_size:6;
>
> Maybe add comments for what does this size mean?
> "the size of the page to be invalidated and flushed from non-coherent cache."
>
Let me add some comments for each member.
> > + u32 reserved:5;
>
> s/reserved/reserved0
> As there is reserved1 below.
Ack the change.
>
> > + u32 address_lo:20;
> > + u32 address_hi:20;
> > + u32 reserved1:12;
> > +} __packed;
> > +
> > +struct xe_page_reclaim_list {
> > + /** @entries: array of page reclaim entries, page allocated */
> > + struct xe_guc_page_reclaim_entry *entries;
> > + /** @num_entries: number of entries */
> > + int num_entries;
> > +#define XE_PAGE_RECLAIM_INVALID_LIST -1
> > +};
> > +
> > +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
> > +int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
> > +static inline void xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries)
> > +{
> > + if (entries)
> > + get_page(virt_to_page(entries));
> > +}
> > +
> > +static inline void xe_page_reclaim_entries_put(struct xe_guc_page_reclaim_entry *entries)
> > +{
> > + if (entries)
> > + put_page(virt_to_page(entries));
> > +}
> > +
> > +#endif /* _XE_PAGE_RECLAIM_H_ */
> > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > index 884127b4d97d..532a047676d4 100644
> > --- a/drivers/gpu/drm/xe/xe_pt.c
> > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > @@ -12,6 +12,7 @@
> > #include "xe_exec_queue.h"
> > #include "xe_gt.h"
> > #include "xe_migrate.h"
> > +#include "xe_page_reclaim.h"
> > #include "xe_pt_types.h"
> > #include "xe_pt_walk.h"
> > #include "xe_res_cursor.h"
> > @@ -1538,6 +1539,9 @@ struct xe_pt_stage_unbind_walk {
> > /* Output */
> > /* @wupd: Structure to track the page-table updates we're building
> */
> > struct xe_walk_update wupd;
> > +
> > + /** @prl: Backing pointer to page reclaim list in pt_update_ops */
> > + struct xe_page_reclaim_list *prl;
> > };
> >
> > /*
> > @@ -1572,6 +1576,69 @@ static bool xe_pt_check_kill(u64 addr, u64
> > next, unsigned int level,
> > return false;
> > }
> >
> > +/* Huge 2MB leaf lives directly in a level-1 table and has no children */
> > +static bool is_large_pte(struct xe_pt *pte)
> > +{
> > +	return pte->level == 1 && !pte->base.children;
> > +}
> > +
> > +/* page_size = 2^(reclamation_size + 12) */
> > +#define COMPUTE_RECLAIM_ADDRESS_MASK(page_size)			\
> > +({								\
> > +	BUILD_BUG_ON(!__builtin_constant_p(page_size));		\
> > +	ilog2(page_size) - 12;					\
> > +})
> > +
> > +static void generate_reclaim_entry(struct xe_tile *tile,
> > + struct xe_page_reclaim_list *prl,
> > + u64 pte,
> > + struct xe_pt *xe_child)
> > +{
> > + struct xe_guc_page_reclaim_entry *reclaim_entries = prl->entries;
> > + u64 phys_addr = pte & XE_PTE_ADDR_MASK;
> > + const u64 field_mask = GENMASK_ULL(19, 0);
> > + u32 reclamation_size;
> > + const uint max_entries = XE_PAGE_RECLAIM_MAX_ENTRIES;
>
> It seems we don't need this "max_entries", just use MAX_ENTRIES directly in
> the code.
>
Dropped the max_entries.
> > + int num_entries = prl->num_entries;
> > +
> > + xe_tile_assert(tile, xe_child->level <= MAX_HUGEPTE_LEVEL);
> > + xe_tile_assert(tile, reclaim_entries);
> > +
> > + if (num_entries == XE_PAGE_RECLAIM_INVALID_LIST)
> > + return;
> > +
> > + /* Overflow: mark as invalid through num_entries */
> > + if (num_entries >= max_entries) {
> > + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> > + return;
> > + }
> > +
> > + /**
> > + * reclamation_size indicates the size of the page to be
> > + * invalidated and flushed from non-coherent cache.
> > + * Page size is computed as 2^(reclamation_size+12) bytes.
> > + * Only valid for these specific levels.
> > + */
> > +
> > + if (xe_child->level == 0 && !(pte & XE_PTE_PS64))
> > + reclamation_size =
> > COMPUTE_RECLAIM_ADDRESS_MASK(SZ_4K); /* reclamation_size = 0 */
>
> Not sure this COMPUTE_RECLAIM_ADDRESS_MASK is needed or not.
> How about:
> reclamation_size = 0; /* reclaim page size: SZ_4K */
>
I thought about this originally, but the raw values did seem like magic numbers.
The page size appeared more meaningful in code. This is also intended to be a
compile-time computation, so it shouldn't be a concern there. The values of
0, 4, and 9 feel less useful than the page sizes they correspond to. So I would
like to keep this macro, but let me know your thoughts.
BTW, the values in the comments are wrong; it's 4 and 9 respectively.
Need to update that as well...
> > + else if (xe_child->level == 0)
> > + reclamation_size =
> > COMPUTE_RECLAIM_ADDRESS_MASK(SZ_64K); /* reclamation_size = 1 */
/* reclamation_size = 4 */
> > + else if (is_large_pte(xe_child))
> > + reclamation_size =
> > COMPUTE_RECLAIM_ADDRESS_MASK(SZ_2M); /* reclamation_size = 2 */
/* reclamation_size = 9 */
> > + else
> > + return;
>
> Is it expected to enter the last else path?
> If it is failure path, how about add a WARN_ON_ONCE message?
>
Not expected, I'll add that WARN_ON_ONCE message.
> > +
> > + reclaim_entries[num_entries].valid = 1;
> > + reclaim_entries[num_entries].reclamation_size =
> > + reclamation_size;
> > + reclaim_entries[num_entries].address_lo =
> > + FIELD_GET(field_mask, phys_addr);
> > + reclaim_entries[num_entries].address_hi =
> > + FIELD_GET(field_mask, phys_addr >> 20);
> > + prl->num_entries++;
> > +}
> > +
> > static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> > unsigned int level, u64 addr, u64 next,
> > struct xe_ptw **child,
> > @@ -1579,10 +1646,27 @@ static int xe_pt_stage_unbind_entry(struct
> > xe_ptw *parent, pgoff_t offset,
> > struct xe_pt_walk *walk)
> > {
> > struct xe_pt *xe_child = container_of(*child, typeof(*xe_child),
> > base);
> > + struct xe_pt_stage_unbind_walk *xe_walk =
> > + container_of(walk, typeof(*xe_walk), base);
> > + struct xe_device *xe = tile_to_xe(xe_walk->tile);
> >
> > XE_WARN_ON(!*child);
> > XE_WARN_ON(!level);
> >
> > + /* 4K and 64K Pages are level 0, large pte needs additional handling.
> */
> > + if (xe_walk->prl && (xe_child->level == 0 || is_large_pte(xe_child))) {
> > + struct iosys_map *leaf_map = &xe_child->bo->vmap;
> > + pgoff_t first = xe_pt_offset(addr, 0, walk);
> > + pgoff_t count = xe_pt_num_entries(addr, next, 0, walk);
>
> If count > 512, generate_reclaim_entry() should fail due to reach max entries.
> How about try to check count <= 512 - prl->num_entries before the loop?
>
> Shuicheng
>
Ha... That is a lot smarter than checking in generate*. Let me move some checks
around and switch over to that: I'll perform the check before the loop and
invalidate the list if it would overflow.
Brian
> > +
> > + for (pgoff_t i = 0; i < count; i++) {
> > + u64 pte = xe_map_rd(xe, leaf_map, (first + i) *
> > sizeof(u64), u64);
> > +
> > + generate_reclaim_entry(xe_walk->tile, xe_walk->prl,
> > + pte, xe_child);
> > + }
> > + }
> > +
> > xe_pt_check_kill(addr, next, level - 1, xe_child, action, walk);
> >
> > return 0;
> > @@ -1654,6 +1738,8 @@ static unsigned int xe_pt_stage_unbind(struct
> > xe_tile *tile, {
> > u64 start = range ? xe_svm_range_start(range) : xe_vma_start(vma);
> > u64 end = range ? xe_svm_range_end(range) : xe_vma_end(vma);
> > + struct xe_vm_pgtable_update_op *pt_update_op =
> > + container_of(entries, struct xe_vm_pgtable_update_op,
> > entries[0]);
> > struct xe_pt_stage_unbind_walk xe_walk = {
> > .base = {
> > .ops = &xe_pt_stage_unbind_ops,
> > @@ -1665,6 +1751,7 @@ static unsigned int xe_pt_stage_unbind(struct
> > xe_tile *tile,
> > .modified_start = start,
> > .modified_end = end,
> > .wupd.entries = entries,
> > + .prl = pt_update_op->prl,
> > };
> > struct xe_pt *pt = vm->pt_root[tile->id];
> >
> > @@ -1897,6 +1984,7 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > struct xe_vm_pgtable_update_ops
> *pt_update_ops,
> > struct xe_vma *vma)
> > {
> > + struct xe_device *xe = tile_to_xe(tile);
> > u32 current_op = pt_update_ops->current_op;
> > struct xe_vm_pgtable_update_op *pt_op = &pt_update_ops-
> > >ops[current_op];
> > int err;
> > @@ -1914,6 +2002,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > pt_op->vma = vma;
> > pt_op->bind = false;
> > pt_op->rebind = false;
> > +	/* Maintain one PRL located in pt_update_ops that all others in unbind op reference */
> > +	if (xe->info.has_page_reclaim_hw_assist && !pt_update_ops->prl.entries) {
> > +		err = xe_page_reclaim_list_alloc_entries(&pt_update_ops->prl);
> > +		if (err < 0)
> > +			xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > +	}
> > +	pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl : NULL;
> >
> > err = vma_reserve_fences(tile_to_xe(tile), vma);
> > if (err)
> > @@ -1921,6 +2016,13 @@ static int unbind_op_prepare(struct xe_tile
> > *tile,
> >
> > pt_op->num_entries = xe_pt_stage_unbind(tile, xe_vma_vm(vma),
> > vma, NULL, pt_op->entries);
> > +	/* Free PRL if list declared as invalid */
> > +	if (pt_update_ops->prl.entries &&
> > +	    pt_update_ops->prl.num_entries == XE_PAGE_RECLAIM_INVALID_LIST) {
> > +		xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > +		pt_op->prl = NULL;
> > +		pt_update_ops->prl.entries = NULL;
> > +	}
> >
> > xe_vm_dbg_print_entries(tile_to_xe(tile), pt_op->entries,
> > pt_op->num_entries, false);
> > @@ -1979,6 +2081,7 @@ static int unbind_range_prepare(struct xe_vm
> > *vm,
> > pt_op->vma = XE_INVALID_VMA;
> > pt_op->bind = false;
> > pt_op->rebind = false;
> > + pt_op->prl = NULL;
> >
> > pt_op->num_entries = xe_pt_stage_unbind(tile, vm, NULL, range,
> > pt_op->entries);
> > @@ -2096,6 +2199,7 @@ xe_pt_update_ops_init(struct
> > xe_vm_pgtable_update_ops *pt_update_ops)
> > init_llist_head(&pt_update_ops->deferred);
> > pt_update_ops->start = ~0x0ull;
> > pt_update_ops->last = 0x0ull;
> > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > }
> >
> > /**
> > @@ -2518,6 +2622,11 @@ void xe_pt_update_ops_fini(struct xe_tile
> > *tile, struct xe_vma_ops *vops)
> > &vops->pt_update_ops[tile->id];
> > int i;
> >
> > + if (pt_update_ops->prl.entries) {
> > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > + }
> > +
> > lockdep_assert_held(&vops->vm->lock);
> > xe_vm_assert_held(vops->vm);
> >
> > diff --git a/drivers/gpu/drm/xe/xe_pt_types.h
> > b/drivers/gpu/drm/xe/xe_pt_types.h
> > index 881f01e14db8..26e5295f118e 100644
> > --- a/drivers/gpu/drm/xe/xe_pt_types.h
> > +++ b/drivers/gpu/drm/xe/xe_pt_types.h
> > @@ -8,6 +8,7 @@
> >
> > #include <linux/types.h>
> >
> > +#include "xe_page_reclaim.h"
> > #include "xe_pt_walk.h"
> >
> > struct xe_bo;
> > @@ -85,6 +86,8 @@ struct xe_vm_pgtable_update_op {
> > bool bind;
> > /** @rebind: is a rebind */
> > bool rebind;
> > + /** @prl: Backing pointer to page reclaim list of pt_update_ops */
> > + struct xe_page_reclaim_list *prl;
> > };
> >
> > /** struct xe_vm_pgtable_update_ops: page table update operations */
> > @@ -
> > 119,6 +122,8 @@ struct xe_vm_pgtable_update_ops {
> > * slots are idle.
> > */
> > bool wait_vm_kernel;
> > + /** @prl: embedded page reclaim list */
> > + struct xe_page_reclaim_list prl;
> > };
> >
> > #endif
> > --
> > 2.51.2
* RE: [PATCH 11/11] drm/xe: Add debugfs support for page reclamation
2025-11-21 22:32 ` Lin, Shuicheng
@ 2025-11-22 1:57 ` Nguyen, Brian3
0 siblings, 0 replies; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-22 1:57 UTC (permalink / raw)
To: Lin, Shuicheng, intel-xe@lists.freedesktop.org
Cc: Upadhyay, Tejas, Brost, Matthew, Summers, Stuart
On Friday, November 21, 2025 2:32 PM Lin, Shuicheng wrote:
> On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> > Allow for runtime modification to page reclamation feature through
> > debugfs configuration. This parameter will only take effect if the
> > platform supports the page reclamation feature by default.
> >
> > Move xe_match_desc to common header for debugfs access to read default
> > device values of xe driver for current platform.
> >
> > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_configfs.c | 11 +-------
> > drivers/gpu/drm/xe/xe_debugfs.c | 47
> > ++++++++++++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_device.c | 10 +++++++
> > drivers/gpu/drm/xe/xe_device.h | 2 ++
> > 4 files changed, 60 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_configfs.c
> > b/drivers/gpu/drm/xe/xe_configfs.c
> > index 9f6251b1008b..efc6d0690b27 100644
> > --- a/drivers/gpu/drm/xe/xe_configfs.c
> > +++ b/drivers/gpu/drm/xe/xe_configfs.c
> > @@ -15,6 +15,7 @@
> >
> > #include "instructions/xe_mi_commands.h"
> > #include "xe_configfs.h"
> > +#include "xe_device.h"
> > #include "xe_gt_types.h"
> > #include "xe_hw_engine_types.h"
> > #include "xe_module.h"
> > @@ -925,16 +926,6 @@ static const struct config_item_type
> > xe_config_sriov_type = {
> > .ct_attrs = xe_config_sriov_attrs,
> > };
> >
> > -static const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev)
> > -{
> > - struct device_driver *driver = driver_find("xe", &pci_bus_type);
> > - struct pci_driver *drv = to_pci_driver(driver);
> > - const struct pci_device_id *ids = drv ? drv->id_table : NULL;
> > - const struct pci_device_id *found = pci_match_id(ids, pdev);
> > -
> > - return found ? (const void *)found->driver_data : NULL;
> > -}
> > -
> > static struct pci_dev *get_physfn_instead(struct pci_dev *virtfn)
> > {
> > 	struct pci_dev *physfn = pci_physfn(virtfn);
> > diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> > index e91da9589c5f..572c61ee1e29 100644
> > --- a/drivers/gpu/drm/xe/xe_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> > @@ -19,6 +19,7 @@
> > #include "xe_gt_printk.h"
> > #include "xe_guc_ads.h"
> > #include "xe_mmio.h"
> > +#include "xe_pci_types.h"
> > #include "xe_pm.h"
> > #include "xe_psmi.h"
> > #include "xe_pxp_debugfs.h"
> > @@ -297,6 +298,49 @@ static const struct file_operations
> > wedged_mode_fops = {
> > .write = wedged_mode_set,
> > };
> >
> > +static ssize_t page_reclaim_hw_assist_show(struct file *f, char __user *ubuf,
> > + size_t size, loff_t *pos)
> > +{
> > + struct xe_device *xe = file_inode(f)->i_private;
> > + char buf[8];
> > + int len;
> > +
> > +	len = scnprintf(buf, sizeof(buf), "%d\n", xe->info.has_page_reclaim_hw_assist);
> > +	return simple_read_from_buffer(ubuf, size, pos, buf, len);
> > +}
> > +
> > +static ssize_t page_reclaim_hw_assist_set(struct file *f, const char __user *ubuf,
> > +					  size_t size, loff_t *pos)
> > +{
> > + struct xe_device *xe = file_inode(f)->i_private;
> > + struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> > + const struct xe_device_desc *desc = xe_match_desc(pdev);
> > + unsigned int val;
> > + ssize_t ret;
> > +
> > + ret = kstrtouint_from_user(ubuf, size, 0, &val);
> > + if (ret)
> > + return ret;
> > +
> > + /**
> > + * Don't modify if page reclamation support isn't normally
> > + * supported by the HW.
> > + */
>
> Nit:
> /** is reserved for kernel-doc comment. So it should be /* here for normal
> comment.
> How about "Don't modify it if page reclamation isn't supported by the
> hardware."?
>
> Other code LGTM.
> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
>
Thanks, will revise comment on next patch.
> > +
> > + if (!desc || !desc->has_page_reclaim_hw_assist)
> > + return -ENODEV;
> > +
> > + xe->info.has_page_reclaim_hw_assist = !!val;
> > +
> > + return size;
> > +}
> > +
> > +static const struct file_operations page_reclaim_hw_assist_fops = {
> > + .owner = THIS_MODULE,
> > + .read = page_reclaim_hw_assist_show,
> > +	.write = page_reclaim_hw_assist_set,
> > +};
> > +
> > static ssize_t atomic_svm_timeslice_ms_show(struct file *f, char __user *ubuf,
> > size_t size, loff_t *pos)
> > {
> > @@ -403,6 +447,9 @@ void xe_debugfs_register(struct xe_device *xe)
> > debugfs_create_file("disable_late_binding", 0600, root, xe,
> > &disable_late_binding_fops);
> >
> > + debugfs_create_file("page_reclaim_hw_assist", 0600, root, xe,
> > + &page_reclaim_hw_assist_fops);
> > +
> > 	for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1; ++mem_type) {
> > man = ttm_manager_type(bdev, mem_type);
> >
> > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > index c7d373c70f0f..16afddc5e35e 100644
> > --- a/drivers/gpu/drm/xe/xe_device.c
> > +++ b/drivers/gpu/drm/xe/xe_device.c
> > @@ -1295,3 +1295,13 @@ void xe_device_declare_wedged(struct xe_device *xe)
> > drm_dev_wedged_event(&xe->drm, xe->wedged.method, NULL);
> > }
> > }
> > +
> > +const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev)
> > +{
> > + struct device_driver *driver = driver_find("xe", &pci_bus_type);
> > + struct pci_driver *drv = to_pci_driver(driver);
> > + const struct pci_device_id *ids = drv ? drv->id_table : NULL;
> > + const struct pci_device_id *found = pci_match_id(ids, pdev);
> > +
> > +	return found ? (const void *)found->driver_data : NULL;
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> > index 32cc6323b7f6..a66e8e4b3e01 100644
> > --- a/drivers/gpu/drm/xe/xe_device.h
> > +++ b/drivers/gpu/drm/xe/xe_device.h
> > @@ -193,6 +193,8 @@ void xe_device_declare_wedged(struct xe_device *xe);
> > struct xe_file *xe_file_get(struct xe_file *xef);
> > void xe_file_put(struct xe_file *xef);
> >
> > +const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev);
> > +
> > int xe_is_injection_active(void);
> >
> > /*
> > --
> > 2.51.2
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 08/11] drm/xe: Prep page reclaim in tlb inval job
2025-11-18 9:05 ` [PATCH 08/11] drm/xe: Prep page reclaim in tlb inval job Brian Nguyen
@ 2025-11-22 13:52 ` Michal Wajdeczko
2025-11-25 11:20 ` Nguyen, Brian3
0 siblings, 1 reply; 51+ messages in thread
From: Michal Wajdeczko @ 2025-11-22 13:52 UTC (permalink / raw)
To: Brian Nguyen, intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers
On 11/18/2025 10:05 AM, Brian Nguyen wrote:
> Use the page reclaim list as an indicator that a page reclaim action is
> desired and pass it to the TLB inval fence to handle.
>
> The job will need to maintain its own embedded copy to ensure the
> lifetime of the PRL extends until the job has run.
>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> ---
...
>
> +/**
> + * xe_tlb_inval_job_add_page_reclaim() - Embed PRL into a TLB job
> + * @job: TLB invalidation job that may trigger reclamation
> + * @prl: Page reclaim list populated during unbind
> + *
> + * Copies @prl into the job and takes an extra reference to the entry page so
> + * ownership can transfer to the TLB fence when the job is pushed.
> + */
> +void xe_tlb_inval_job_add_page_reclaim(struct xe_tlb_inval_job *job,
> + struct xe_page_reclaim_list *prl)
> +{
> + struct xe_device *xe = gt_to_xe(job->q->gt);
> +
> + WARN_ON(!xe->info.has_page_reclaim_hw_assist);
you could use the debug-only variant here:
xe_gt_assert(job->q->gt, xe->info.has_page_reclaim_hw_assist);
or, if you want to keep it in production builds:
xe_gt_WARN_ON(...
> + job->prl = *prl;
> + /* Pair with put after bo creation */
> + xe_page_reclaim_entries_get(job->prl.entries);
> +}
> +
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 11/11] drm/xe: Add debugfs support for page reclamation
2025-11-18 9:05 ` [PATCH 11/11] drm/xe: Add debugfs support for page reclamation Brian Nguyen
2025-11-21 22:32 ` Lin, Shuicheng
@ 2025-11-22 14:18 ` Michal Wajdeczko
2025-11-25 11:21 ` Nguyen, Brian3
1 sibling, 1 reply; 51+ messages in thread
From: Michal Wajdeczko @ 2025-11-22 14:18 UTC (permalink / raw)
To: Brian Nguyen, intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers
On 11/18/2025 10:05 AM, Brian Nguyen wrote:
> Allow for runtime modification to page reclamation feature through
> debugfs configuration. This parameter will only take effect if the
> platform supports the page reclamation feature by default.
>
> Move xe_match_desc to common header for debugfs access to read default
> device values of xe driver for current platform.
this seems to be unnecessary, see below
>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> ---
> drivers/gpu/drm/xe/xe_configfs.c | 11 +-------
> drivers/gpu/drm/xe/xe_debugfs.c | 47 ++++++++++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_device.c | 10 +++++++
> drivers/gpu/drm/xe/xe_device.h | 2 ++
> 4 files changed, 60 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
> index 9f6251b1008b..efc6d0690b27 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.c
> +++ b/drivers/gpu/drm/xe/xe_configfs.c
> @@ -15,6 +15,7 @@
>
> #include "instructions/xe_mi_commands.h"
> #include "xe_configfs.h"
> +#include "xe_device.h"
> #include "xe_gt_types.h"
> #include "xe_hw_engine_types.h"
> #include "xe_module.h"
> @@ -925,16 +926,6 @@ static const struct config_item_type xe_config_sriov_type = {
> .ct_attrs = xe_config_sriov_attrs,
> };
>
> -static const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev)
> -{
> - struct device_driver *driver = driver_find("xe", &pci_bus_type);
> - struct pci_driver *drv = to_pci_driver(driver);
> - const struct pci_device_id *ids = drv ? drv->id_table : NULL;
> - const struct pci_device_id *found = pci_match_id(ids, pdev);
> -
> - return found ? (const void *)found->driver_data : NULL;
> -}
> -
> static struct pci_dev *get_physfn_instead(struct pci_dev *virtfn)
> {
> struct pci_dev *physfn = pci_physfn(virtfn);
> diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> index e91da9589c5f..572c61ee1e29 100644
> --- a/drivers/gpu/drm/xe/xe_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> @@ -19,6 +19,7 @@
> #include "xe_gt_printk.h"
> #include "xe_guc_ads.h"
> #include "xe_mmio.h"
> +#include "xe_pci_types.h"
> #include "xe_pm.h"
> #include "xe_psmi.h"
> #include "xe_pxp_debugfs.h"
> @@ -297,6 +298,49 @@ static const struct file_operations wedged_mode_fops = {
> .write = wedged_mode_set,
> };
>
> +static ssize_t page_reclaim_hw_assist_show(struct file *f, char __user *ubuf,
> + size_t size, loff_t *pos)
> +{
> + struct xe_device *xe = file_inode(f)->i_private;
> + char buf[8];
> + int len;
> +
> + len = scnprintf(buf, sizeof(buf), "%d\n", xe->info.has_page_reclaim_hw_assist);
> + return simple_read_from_buffer(ubuf, size, pos, buf, len);
> +}
> +
> +static ssize_t page_reclaim_hw_assist_set(struct file *f, const char __user *ubuf,
> + size_t size, loff_t *pos)
> +{
> + struct xe_device *xe = file_inode(f)->i_private;
> + struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> + const struct xe_device_desc *desc = xe_match_desc(pdev);
> + unsigned int val;
> + ssize_t ret;
> +
> + ret = kstrtouint_from_user(ubuf, size, 0, &val);
kstrtobool_from_user
> + if (ret)
> + return ret;
> +
> + /**
> + * Don't modify if page reclamation support isn't normally
> + * supported by the HW.
> + */
> +
> + if (!desc || !desc->has_page_reclaim_hw_assist)
> + return -ENODEV;
instead of checking desc->has_page_reclaim_hw_assist capability here
> +
> + xe->info.has_page_reclaim_hw_assist = !!val;
> +
> + return size;
> +}
> +
> +static const struct file_operations page_reclaim_hw_assist_fops = {
> + .owner = THIS_MODULE,
> + .read = page_reclaim_hw_assist_show,
> + .write = page_reclaim_hw_assist_set,
> +};
> +
> static ssize_t atomic_svm_timeslice_ms_show(struct file *f, char __user *ubuf,
> size_t size, loff_t *pos)
> {
> @@ -403,6 +447,9 @@ void xe_debugfs_register(struct xe_device *xe)
> debugfs_create_file("disable_late_binding", 0600, root, xe,
> &disable_late_binding_fops);
>
better to expose "page_reclaim_hw_assist" file *only* if required
capability is present and we can get that flag directly from the xe:
if (xe->info.has_page_reclaim_hw_assist)
> + debugfs_create_file("page_reclaim_hw_assist", 0600, root, xe,
> + &page_reclaim_hw_assist_fops);
> +
> for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1; ++mem_type) {
> man = ttm_manager_type(bdev, mem_type);
>
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index c7d373c70f0f..16afddc5e35e 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -1295,3 +1295,13 @@ void xe_device_declare_wedged(struct xe_device *xe)
> drm_dev_wedged_event(&xe->drm, xe->wedged.method, NULL);
> }
> }
> +
> +const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev)
note that this function was specific for configfs case where might not
have the xe device, hence the manual lookup was needed
if in the future for some reasons we would like to get access to the desc
from the xe, then we should rather consider adding a const pointer to it
> +{
> + struct device_driver *driver = driver_find("xe", &pci_bus_type);
> + struct pci_driver *drv = to_pci_driver(driver);
> + const struct pci_device_id *ids = drv ? drv->id_table : NULL;
> + const struct pci_device_id *found = pci_match_id(ids, pdev);
> +
> + return found ? (const void *)found->driver_data : NULL;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index 32cc6323b7f6..a66e8e4b3e01 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -193,6 +193,8 @@ void xe_device_declare_wedged(struct xe_device *xe);
> struct xe_file *xe_file_get(struct xe_file *xef);
> void xe_file_put(struct xe_file *xef);
>
> +const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev);
> +
> int xe_is_injection_active(void);
>
> /*
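Putting the two suggestions above together (create the file only when the capability is present, and parse with kstrtobool_from_user), the write handler shrinks considerably. A rough userspace sketch of the resulting shape — `parse_bool` here mocks kstrtobool() semantics ("0"/"1"/"y"/"n"/"t"/"f"/"on"/"off"), and the names are stand-ins, not the actual driver code:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Mock of kstrtobool(): accepts 1/0, y/n, t/f, on/off (case-insensitive
 * first characters), rejects anything else with -EINVAL. */
static int parse_bool(const char *s, bool *res)
{
	switch (s[0]) {
	case '1': case 'y': case 'Y': case 't': case 'T':
		*res = true;
		return 0;
	case '0': case 'n': case 'N': case 'f': case 'F':
		*res = false;
		return 0;
	case 'o': case 'O':
		if (s[1] == 'n' || s[1] == 'N') {
			*res = true;
			return 0;
		}
		if (s[1] == 'f' || s[1] == 'F') {
			*res = false;
			return 0;
		}
		return -EINVAL;
	default:
		return -EINVAL;
	}
}

/* Simplified write path: no desc lookup needed once the debugfs file is
 * only created when the cached capability flag is already set. */
static int set_page_reclaim(bool *cap, const char *buf)
{
	bool val;
	int ret = parse_bool(buf, &val);

	if (ret)
		return ret;
	*cap = val;
	return 0;
}
```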
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 02/11] drm/xe: Reset tlb fence timeout on invalid seqno received
2025-11-18 9:05 ` [PATCH 02/11] drm/xe: Reset tlb fence timeout on invalid seqno received Brian Nguyen
2025-11-21 17:23 ` Lin, Shuicheng
@ 2025-11-22 18:25 ` Matthew Brost
2025-11-25 11:01 ` Nguyen, Brian3
1 sibling, 1 reply; 51+ messages in thread
From: Matthew Brost @ 2025-11-22 18:25 UTC (permalink / raw)
To: Brian Nguyen; +Cc: intel-xe, tejas.upadhyay, shuicheng.lin, stuart.summers
On Tue, Nov 18, 2025 at 05:05:43PM +0800, Brian Nguyen wrote:
> TLB_INVALIDATION_SEQNO_INVALID is now used to indicate in-progress
> multi-step TLB invalidations, so reset the TDR to ensure the timeout
> won't prematurely trigger while G2H actions are still ongoing.
>
I think this patch makes sense, but see comments below.
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> ---
> drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 2 ++
> drivers/gpu/drm/xe/xe_tlb_inval.c | 16 ++++++++++++++++
> drivers/gpu/drm/xe/xe_tlb_inval.h | 1 +
> 3 files changed, 19 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> index f1fd2dd90742..cd126c53faab 100644
> --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> @@ -238,6 +238,8 @@ int xe_guc_tlb_inval_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
>
> if (msg[0] != TLB_INVALIDATION_SEQNO_INVALID)
> xe_tlb_inval_done_handler(>->tlb_inval, msg[0]);
> + else
> + xe_tlb_inval_reset_timeout(>->tlb_inval);
>
> return 0;
> }
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
> index 918a59e686ea..50f05d6b5672 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> @@ -199,6 +199,22 @@ void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval)
> mutex_unlock(&tlb_inval->seqno_lock);
> }
>
> +/**
> + * xe_tlb_inval_reset_timeout() - Reset TLB inval fence timeout
> + * @tlb_inval: TLB invalidation client
> + *
> + * Reset the TLB invalidation timeout timer.
> + */
> +void xe_tlb_inval_reset_timeout(struct xe_tlb_inval *tlb_inval)
> +{
> + unsigned long flags;
> +
> + spin_lock_irqsave(&tlb_inval->pending_lock, flags);
> + mod_delayed_work(system_wq, &tlb_inval->fence_tdr,
> + tlb_inval->ops->timeout_delay(tlb_inval));
You don't need a lock for this. In xe_tlb_inval_done_handler it is done
under this lock because the pending list of TLB invalidations, which
pending_lock protects, is tied to whether we want to reschedule the
timeout or cancel it. So please drop the lock here and then also update
xe_tlb_inval_done_handler to call this new helper.
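Roughly the shape I mean — a userspace mock just to illustrate the split (the real locking and mod_delayed_work()/cancel_delayed_work() calls are shown as comments; struct fields are fakes for testability, not the driver's):

```c
#include <stdbool.h>

/* Stand-in for struct xe_tlb_inval: tdr_armed mimics whether fence_tdr
 * delayed work is queued, rearm_count counts timeout resets. */
struct tlb_inval_mock {
	bool tdr_armed;
	int rearm_count;
};

/* Lock-free helper: callers either don't need pending_lock or already
 * hold it. In the driver this is just mod_delayed_work(...). */
static void xe_tlb_inval_reset_timeout(struct tlb_inval_mock *ti)
{
	ti->tdr_armed = true;
	ti->rearm_count++;
}

/* Done handler already holds pending_lock around this decision, so it
 * can reuse the helper for the reschedule case. */
static void xe_tlb_inval_done_handler_mock(struct tlb_inval_mock *ti,
					   bool pending_left)
{
	/* spin_lock_irqsave(&tlb_inval->pending_lock, flags); */
	if (pending_left)
		xe_tlb_inval_reset_timeout(ti);
	else
		ti->tdr_armed = false;	/* cancel_delayed_work(...) */
	/* spin_unlock_irqrestore(&tlb_inval->pending_lock, flags); */
}
```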
Matt
> + spin_unlock_irqrestore(&tlb_inval->pending_lock, flags);
> +}
> +
> static bool xe_tlb_inval_seqno_past(struct xe_tlb_inval *tlb_inval, int seqno)
> {
> int seqno_recv = READ_ONCE(tlb_inval->seqno_recv);
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.h b/drivers/gpu/drm/xe/xe_tlb_inval.h
> index 05614915463a..9dbddc310eb9 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval.h
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval.h
> @@ -17,6 +17,7 @@ struct xe_vm;
> int xe_gt_tlb_inval_init_early(struct xe_gt *gt);
>
> void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval);
> +void xe_tlb_inval_reset_timeout(struct xe_tlb_inval *tlb_inval);
> int xe_tlb_inval_all(struct xe_tlb_inval *tlb_inval,
> struct xe_tlb_inval_fence *fence);
> int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval);
> --
> 2.51.2
>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 04/11] drm/xe: Add page reclamation info to device info
2025-11-18 9:05 ` [PATCH 04/11] drm/xe: Add page reclamation info to device info Brian Nguyen
2025-11-21 18:15 ` Lin, Shuicheng
@ 2025-11-22 18:31 ` Matthew Brost
1 sibling, 0 replies; 51+ messages in thread
From: Matthew Brost @ 2025-11-22 18:31 UTC (permalink / raw)
To: Brian Nguyen
Cc: intel-xe, tejas.upadhyay, shuicheng.lin, stuart.summers, Oak Zeng
On Tue, Nov 18, 2025 at 05:05:45PM +0800, Brian Nguyen wrote:
> From: Oak Zeng <oak.zeng@intel.com>
>
> Starting from Xe3p, HW adds a feature assisting range based page
> reclamation. Introduce a bit in device info to indicate whether
> device has such capability.
>
> Signed-off-by: Oak Zeng <oak.zeng@intel.com>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/xe/xe_device_types.h | 2 ++
> drivers/gpu/drm/xe/xe_pci.c | 1 +
> drivers/gpu/drm/xe/xe_pci_types.h | 1 +
> 3 files changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 0b2fa7c56d38..268c8e28601a 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -308,6 +308,8 @@ struct xe_device {
> u8 has_mbx_power_limits:1;
> /** @info.has_mem_copy_instr: Device supports MEM_COPY instruction */
> u8 has_mem_copy_instr:1;
> + /** @info.has_page_reclaim_hw_assist: Device supports page reclamation feature */
> + u8 has_page_reclaim_hw_assist:1;
> /** @info.has_pxp: Device has PXP support */
> u8 has_pxp:1;
> /** @info.has_range_tlb_inval: Has range based TLB invalidations */
> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> index cd03b4b3ebdb..43c47426313e 100644
> --- a/drivers/gpu/drm/xe/xe_pci.c
> +++ b/drivers/gpu/drm/xe/xe_pci.c
> @@ -673,6 +673,7 @@ static int xe_info_init_early(struct xe_device *xe,
> xe->info.has_heci_cscfi = desc->has_heci_cscfi;
> xe->info.has_late_bind = desc->has_late_bind;
> xe->info.has_llc = desc->has_llc;
> + xe->info.has_page_reclaim_hw_assist = desc->has_page_reclaim_hw_assist;
> xe->info.has_pxp = desc->has_pxp;
> xe->info.has_sriov = xe_configfs_primary_gt_allowed(to_pci_dev(xe->drm.dev)) &&
> desc->has_sriov;
> diff --git a/drivers/gpu/drm/xe/xe_pci_types.h b/drivers/gpu/drm/xe/xe_pci_types.h
> index 9892c063a9c5..151743d4cf72 100644
> --- a/drivers/gpu/drm/xe/xe_pci_types.h
> +++ b/drivers/gpu/drm/xe/xe_pci_types.h
> @@ -47,6 +47,7 @@ struct xe_device_desc {
> u8 has_llc:1;
> u8 has_mbx_power_limits:1;
> u8 has_mem_copy_instr:1;
> + u8 has_page_reclaim_hw_assist:1;
> u8 has_pxp:1;
> u8 has_sriov:1;
> u8 needs_scratch:1;
> --
> 2.51.2
>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 05/11] drm/xe/guc: Add page reclamation interface to GuC
2025-11-22 1:56 ` Nguyen, Brian3
@ 2025-11-22 18:39 ` Matthew Brost
2025-11-25 11:13 ` Nguyen, Brian3
0 siblings, 1 reply; 51+ messages in thread
From: Matthew Brost @ 2025-11-22 18:39 UTC (permalink / raw)
To: Nguyen, Brian3
Cc: Lin, Shuicheng, intel-xe@lists.freedesktop.org, Upadhyay, Tejas,
Summers, Stuart
On Fri, Nov 21, 2025 at 06:56:27PM -0700, Nguyen, Brian3 wrote:
>
> On Friday, November 21, 2025 10:33 AM Lin, Shuicheng wrote:
> > On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> > > Add page reclamation related changes to GuC interface, handlers, and
> > > senders to support page reclamation.
> > >
> > > Currently TLB invalidations will perform an entire PPC flush in order
> > > to prevent stale memory access for noncoherent system memory. Page
> > > reclamation is an extension of the typical TLB invalidation workflow,
> > > allowing disabling of the full PPC flush in favor of selective PPC
> > > flushing. Selective flushing is driven by a list of pages whose
> > > addresses are passed to the GuC at the time of the action.
> > >
> > > Page reclamation interfaces require at least GuC FW ver 70.31.0.
> >
> > Should driver disable this feature if the running FW is < 70.31.0?
>
> The default FW version is above this at the time of patchset submission,
> so I had assumed it would not be a problem, since the danger is a user
> forcibly using a bad FW, which already has unpredictable results.
>
> However, in hindsight, it is easy enough to skip if FW version is lower,
> and we can safely fallback to default TLB invalidation, so I'll proceed
> with adding a check within the later patches that'll disable
> page reclamation within the xe_guc_tlb_inval.c unless there are
> any objections.
>
I would just flip 'xe->info.has_page_reclaim_hw_assist' to false very
early in driver load, once we have the GuC version, if the GuC version
doesn't support it.
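E.g., something along these lines once the version is known — rough userspace sketch; the struct and helper names are invented stand-ins for the version info the driver already caches in xe_uc_fw:

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the GuC FW version the driver already tracks. */
struct guc_fw_ver_mock {
	uint32_t major, minor, patch;
};

/* Lexicographic major.minor.patch comparison against a minimum. */
static bool guc_ver_ge(const struct guc_fw_ver_mock *v,
		       uint32_t major, uint32_t minor, uint32_t patch)
{
	if (v->major != major)
		return v->major > major;
	if (v->minor != minor)
		return v->minor > minor;
	return v->patch >= patch;
}

/* Gate the platform capability on the minimum GuC FW (70.31.0);
 * the result would be written back into the info flag at load time. */
static bool page_reclaim_supported(bool hw_cap,
				   const struct guc_fw_ver_mock *v)
{
	return hw_cap && guc_ver_ge(v, 70, 31, 0);
}
```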
> > What will happen if driver send this action while GuC doesn't support it yet?
> >
> > Shuicheng
> >
>
> AFAIK, if the action is sent on an older FW version, it'll report
> GUC_HXG_TYPE_RESPONSE_FAILURE in the G2H
> due to the illegal operation, eventually triggering a reset.
>
> Brian
>
> > >
> > > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > > ---
> > > drivers/gpu/drm/xe/abi/guc_actions_abi.h | 2 ++
> > > drivers/gpu/drm/xe/xe_guc_ct.c | 4 ++++
> > > drivers/gpu/drm/xe/xe_guc_fwif.h | 1 +
> > > drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 14 ++++++++++++++
> > > 4 files changed, 21 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > > b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > > index 47756e4674a1..11de3bdf69b5 100644
> > > --- a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > > +++ b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > > @@ -151,6 +151,8 @@ enum xe_guc_action {
> > > XE_GUC_ACTION_TLB_INVALIDATION = 0x7000,
> > > XE_GUC_ACTION_TLB_INVALIDATION_DONE = 0x7001,
> > > XE_GUC_ACTION_TLB_INVALIDATION_ALL = 0x7002,
> > > + XE_GUC_ACTION_PAGE_RECLAMATION = 0x7003,
> > > + XE_GUC_ACTION_PAGE_RECLAMATION_DONE = 0x7004,
> > > XE_GUC_ACTION_STATE_CAPTURE_NOTIFICATION = 0x8002,
> > > XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE = 0x8003,
> > > XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED = 0x8004, diff --git
> > > a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> > > index
> > > 2697d711adb2..e13704e61032 100644
> > > --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> > > +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> > > @@ -1311,6 +1311,7 @@ static int parse_g2h_event(struct xe_guc_ct *ct,
> > > u32 *msg, u32 len)
> > > case XE_GUC_ACTION_DEREGISTER_CONTEXT_DONE:
> > > case XE_GUC_ACTION_SCHED_ENGINE_MODE_DONE:
> > > case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > > + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > > g2h_release_space(ct, len);
> > > }
> > >
> > > @@ -1546,6 +1547,7 @@ static int process_g2h_msg(struct xe_guc_ct *ct,
> > > u32 *msg, u32 len)
> > > ret = xe_guc_pagefault_handler(guc, payload, adj_len);
> > > break;
> > > case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > > + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > > ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
I get what is happening here - page reclaim G2H just uses a seqno shared
with TLB invalidations - but this looks very odd to a reader of the code
who hasn't worked on this or doesn't understand it, as a shared function
is called for both G2H messages. Can you add a comment here explaining why
it is ok to use a single G2H handler for both TLB invalidations and page
reclaim?
Matt
> > > break;
> > > case XE_GUC_ACTION_GUC2PF_RELAY_FROM_VF:
> > > @@ -1711,6 +1713,7 @@ static int g2h_read(struct xe_guc_ct *ct, u32
> > > *msg, bool fast_path)
> > > switch (action) {
> > > case XE_GUC_ACTION_REPORT_PAGE_FAULT_REQ_DESC:
> > > case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > > + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > > break; /* Process these in fast-path */
> > > default:
> > > return 0;
> > > @@ -1747,6 +1750,7 @@ static void g2h_fast_path(struct xe_guc_ct *ct,
> > > u32 *msg, u32 len)
> > > ret = xe_guc_pagefault_handler(guc, payload, adj_len);
> > > break;
> > > case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > > + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > > __g2h_release_space(ct, len);
> > > ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
> > > break;
> > > diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > index c90dd266e9cf..34d74a71c4f0 100644
> > > --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > @@ -16,6 +16,7 @@
> > > #define G2H_LEN_DW_DEREGISTER_CONTEXT 3
> > > #define G2H_LEN_DW_TLB_INVALIDATE 3
> > > #define G2H_LEN_DW_G2G_NOTIFY_MIN 3
> > > +#define G2H_LEN_DW_PAGE_RECLAMATION 3
> > >
> > > #define GUC_ID_MAX 65535
> > > #define GUC_ID_UNKNOWN 0xffffffff
> > > diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > index c05709a5bc98..3185f8dc00c4 100644
> > > --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > @@ -95,6 +95,20 @@ static int send_tlb_inval_ggtt(struct xe_tlb_inval
> > > *tlb_inval, u32 seqno)
> > > return -ECANCELED;
> > > }
> > >
> > > +static int send_page_reclaim(struct xe_guc *guc, u32 seqno,
> > > + u64 gpu_addr)
> > > +{
> > > + u32 action[] = {
> > > + XE_GUC_ACTION_PAGE_RECLAMATION,
> > > + seqno,
> > > + lower_32_bits(gpu_addr),
> > > + upper_32_bits(gpu_addr),
> > > + };
> > > +
> > > + return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action),
> > > + G2H_LEN_DW_PAGE_RECLAMATION, 1); }
> > > +
> > > /*
> > > * Ensure that roundup_pow_of_two(length) doesn't overflow.
> > > * Note that roundup_pow_of_two() operates on unsigned long,
> > > --
> > > 2.51.2
>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 06/11] drm/xe: Create page reclaim list on unbind
2025-11-18 9:05 ` [PATCH 06/11] drm/xe: Create page reclaim list on unbind Brian Nguyen
2025-11-21 21:29 ` Lin, Shuicheng
@ 2025-11-22 19:18 ` Matthew Brost
2025-11-25 11:18 ` Nguyen, Brian3
1 sibling, 1 reply; 51+ messages in thread
From: Matthew Brost @ 2025-11-22 19:18 UTC (permalink / raw)
To: Brian Nguyen; +Cc: intel-xe, tejas.upadhyay, shuicheng.lin, stuart.summers
On Tue, Nov 18, 2025 at 05:05:47PM +0800, Brian Nguyen wrote:
> The page reclaim list (PRL) is preparation work for the page reclaim
> feature. The PRL is initially owned by pt_update_ops, and all other page
> reclaim operations point back to this PRL. The PRL entries are generated
> during the unbind page walk.
>
> This PRL is restricted to a 4K page, so 512 page entries at most.
>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> ---
> drivers/gpu/drm/xe/Makefile | 1 +
> drivers/gpu/drm/xe/regs/xe_gtt_defs.h | 1 +
> drivers/gpu/drm/xe/xe_page_reclaim.c | 52 ++++++++++++
> drivers/gpu/drm/xe/xe_page_reclaim.h | 49 ++++++++++++
> drivers/gpu/drm/xe/xe_pt.c | 109 ++++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_pt_types.h | 5 ++
> 6 files changed, 217 insertions(+)
> create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.c
> create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.h
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index e4b273b025d2..048e6c93271c 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -95,6 +95,7 @@ xe-y += xe_bb.o \
> xe_oa.o \
> xe_observation.o \
> xe_pagefault.o \
> + xe_page_reclaim.o \
> xe_pat.o \
> xe_pci.o \
> xe_pcode.o \
> diff --git a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> index 4389e5a76f89..4d83461e538b 100644
> --- a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> +++ b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> @@ -9,6 +9,7 @@
> #define XELPG_GGTT_PTE_PAT0 BIT_ULL(52)
> #define XELPG_GGTT_PTE_PAT1 BIT_ULL(53)
>
> +#define XE_PTE_ADDR_MASK GENMASK_ULL(51, 12)
> #define GGTT_PTE_VFID GENMASK_ULL(11, 2)
>
> #define GUC_GGTT_TOP 0xFEE00000
> diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c b/drivers/gpu/drm/xe/xe_page_reclaim.c
> new file mode 100644
> index 000000000000..a0d15efff58c
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> @@ -0,0 +1,52 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#include <linux/bitfield.h>
> +#include <linux/kref.h>
> +#include <linux/mm.h>
> +#include <linux/slab.h>
> +
> +#include "xe_page_reclaim.h"
> +
> +#include "regs/xe_gt_regs.h"
> +#include "xe_assert.h"
> +#include "xe_macros.h"
> +
> +/**
> + * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
> + * @prl: Page reclaim list to reset
> + *
> + * Clears the entries pointer and marks the list as invalid so
> + * future users know the PRL is unusable. It is expected that the entries
> + * have already been released.
> + */
> +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl)
> +{
> + prl->entries = NULL;
> + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> +}
> +
> +/**
> + * xe_page_reclaim_list_alloc_entries() - Allocate page reclaim list entries
> + * @prl: Page reclaim list to allocate entries for
> + *
> + * Allocate one 4K page for the PRL entries, otherwise assign prl->entries to NULL.
> + */
> +int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl)
> +{
> + struct page *page;
> +
> + XE_WARN_ON(prl->entries != NULL);
> + if (prl->entries)
> + return 0;
> +
> + page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> + if (page) {
> + prl->entries = page_address(page);
> + prl->num_entries = 0;
> + }
> +
> + return page ? 0 : -ENOMEM;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h b/drivers/gpu/drm/xe/xe_page_reclaim.h
> new file mode 100644
> index 000000000000..d066d7d97f79
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_PAGE_RECLAIM_H_
> +#define _XE_PAGE_RECLAIM_H_
> +
> +#include <linux/kref.h>
> +#include <linux/mm.h>
> +#include <linux/slab.h>
> +#include <linux/types.h>
> +#include <linux/workqueue.h>
> +
> +#define XE_PAGE_RECLAIM_MAX_ENTRIES 512
> +#define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
> +
> +struct xe_guc_page_reclaim_entry {
> + u32 valid:1;
> + u32 reclamation_size:6;
> + u32 reserved:5;
> + u32 address_lo:20;
> + u32 address_hi:20;
> + u32 reserved1:12;
This is a wire interface with the GuC. Bitfield layout can vary based on the
endianness of the CPU. I know this is an iGPU feature for now, but it could
possibly change in the future; with that, to future proof this, can the
layout be set up via defines / macros?
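E.g., a define-based encoding along these lines would be endian-stable. Userspace sketch only — the macro names are invented, and the field placement is simply copied from the bitfield struct above (valid at bit 0, reclamation_size at bits 6:1, address_lo at dw0 bits 31:12, address_hi at dw1 bits 19:0):

```c
#include <stdint.h>

/* Fixed-layout PRL entry as two little-endian u32 words, mirroring the
 * bitfield struct in the patch:
 *   dw0 [0]     valid
 *   dw0 [6:1]   reclamation_size
 *   dw0 [11:7]  reserved (MBZ)
 *   dw0 [31:12] address_lo (phys_addr[19:0]; low 12 bits are zero)
 *   dw1 [19:0]  address_hi (phys_addr[39:20])
 *   dw1 [31:20] reserved (MBZ)
 */
#define PRL_DW0_VALID             (1u << 0)
#define PRL_DW0_RECLAIM_SIZE(sz)  (((uint32_t)(sz) & 0x3fu) << 1)
#define PRL_DW0_ADDR_LO(pa)       ((uint32_t)((pa) & 0xfffffull) << 12)
#define PRL_DW1_ADDR_HI(pa)       ((uint32_t)(((pa) >> 20) & 0xfffffull))

/* Pack one entry; in the driver FIELD_PREP()/GENMASK() would replace the
 * open-coded shifts and masks. */
static void prl_encode(uint32_t dw[2], uint64_t phys_addr, unsigned int size)
{
	dw[0] = PRL_DW0_VALID | PRL_DW0_RECLAIM_SIZE(size) |
		PRL_DW0_ADDR_LO(phys_addr);
	dw[1] = PRL_DW1_ADDR_HI(phys_addr);
}
```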
> +} __packed;
> +
> +struct xe_page_reclaim_list {
> + /** @entries: array of page reclaim entries, page allocated */
> + struct xe_guc_page_reclaim_entry *entries;
> + /** @num_entries: number of entries */
> + int num_entries;
> +#define XE_PAGE_RECLAIM_INVALID_LIST -1
> +};
> +
> +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
> +int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
> +static inline void xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries)
> +{
> + if (entries)
> + get_page(virt_to_page(entries));
> +}
> +
> +static inline void xe_page_reclaim_entries_put(struct xe_guc_page_reclaim_entry *entries)
> +{
> + if (entries)
> + put_page(virt_to_page(entries));
> +}
Kernel doc for static inlines.
> +
> +#endif /* _XE_PAGE_RECLAIM_H_ */
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index 884127b4d97d..532a047676d4 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -12,6 +12,7 @@
> #include "xe_exec_queue.h"
> #include "xe_gt.h"
> #include "xe_migrate.h"
> +#include "xe_page_reclaim.h"
> #include "xe_pt_types.h"
> #include "xe_pt_walk.h"
> #include "xe_res_cursor.h"
> @@ -1538,6 +1539,9 @@ struct xe_pt_stage_unbind_walk {
> /* Output */
> /* @wupd: Structure to track the page-table updates we're building */
> struct xe_walk_update wupd;
> +
> + /** @prl: Backing pointer to page reclaim list in pt_update_ops */
> + struct xe_page_reclaim_list *prl;
> };
>
> /*
> @@ -1572,6 +1576,69 @@ static bool xe_pt_check_kill(u64 addr, u64 next, unsigned int level,
> return false;
> }
>
> +/* Huge 2MB leaf lives directly in a level-1 table and has no children */
> +static bool is_large_pte(struct xe_pt *pte)
> +{
> + return pte->level == 1 && !pte->base.children;
> +}
> +
> +/* page_size = 2^(reclamation_size + 12) */
> +#define COMPUTE_RECLAIM_ADDRESS_MASK(page_size) \
> +({ \
> + BUILD_BUG_ON(!__builtin_constant_p(page_size)); \
> + ilog2(page_size) - 12; \
s/12/XE_PTE_SHIFT ?
> +})
> +
> +static void generate_reclaim_entry(struct xe_tile *tile,
> + struct xe_page_reclaim_list *prl,
> + u64 pte,
> + struct xe_pt *xe_child)
Nit, xe_pt can be on the same line as 'u64 pte'.
> +{
> + struct xe_guc_page_reclaim_entry *reclaim_entries = prl->entries;
> + u64 phys_addr = pte & XE_PTE_ADDR_MASK;
> + const u64 field_mask = GENMASK_ULL(19, 0);
> + u32 reclamation_size;
Nit, I'd make the last variable declared on the stack for readability.
> + const uint max_entries = XE_PAGE_RECLAIM_MAX_ENTRIES;
> + int num_entries = prl->num_entries;
> +
> + xe_tile_assert(tile, xe_child->level <= MAX_HUGEPTE_LEVEL);
> + xe_tile_assert(tile, reclaim_entries);
> +
> + if (num_entries == XE_PAGE_RECLAIM_INVALID_LIST)
> + return;
> +
> + /* Overflow: mark as invalid through num_entries */
> + if (num_entries >= max_entries) {
> + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> + return;
> + }
> +
> + /**
> + * reclamation_size indicates the size of the page to be
> + * invalidated and flushed from non-coherent cache.
> + * Page size is computed as 2^(reclamation_size+12) bytes.
> + * Only valid for these specific levels.
> + */
> +
> + if (xe_child->level == 0 && !(pte & XE_PTE_PS64))
> + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_4K); /* reclamation_size = 0 */
> + else if (xe_child->level == 0)
> + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_64K); /* reclamation_size = 1 */
> + else if (is_large_pte(xe_child))
> + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_2M); /* reclamation_size = 2 */
What happens if we have a 1G page? That doesn't seem to be handled.
> + else
> + return;
> +
> + reclaim_entries[num_entries].valid = 1;
> + reclaim_entries[num_entries].reclamation_size =
> + reclamation_size;
> + reclaim_entries[num_entries].address_lo =
> + FIELD_GET(field_mask, phys_addr);
> + reclaim_entries[num_entries].address_hi =
> + FIELD_GET(field_mask, phys_addr >> 20);
As suggested above, use macros/defines here to setup the entry.
> + prl->num_entries++;
> +}
> +
> static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> unsigned int level, u64 addr, u64 next,
> struct xe_ptw **child,
> @@ -1579,10 +1646,27 @@ static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> struct xe_pt_walk *walk)
> {
> struct xe_pt *xe_child = container_of(*child, typeof(*xe_child), base);
> + struct xe_pt_stage_unbind_walk *xe_walk =
> + container_of(walk, typeof(*xe_walk), base);
> + struct xe_device *xe = tile_to_xe(xe_walk->tile);
>
> XE_WARN_ON(!*child);
> XE_WARN_ON(!level);
>
> + /* 4K and 64K Pages are level 0, large pte needs additional handling. */
> + if (xe_walk->prl && (xe_child->level == 0 || is_large_pte(xe_child))) {
And also here? 1G pages are unhandled? Please explain.
> + struct iosys_map *leaf_map = &xe_child->bo->vmap;
> + pgoff_t first = xe_pt_offset(addr, 0, walk);
> + pgoff_t count = xe_pt_num_entries(addr, next, 0, walk);
> +
> + for (pgoff_t i = 0; i < count; i++) {
> + u64 pte = xe_map_rd(xe, leaf_map, (first + i) * sizeof(u64), u64);
> +
> + generate_reclaim_entry(xe_walk->tile, xe_walk->prl,
> + pte, xe_child);
> + }
> + }
> +
> xe_pt_check_kill(addr, next, level - 1, xe_child, action, walk);
>
> return 0;
> @@ -1654,6 +1738,8 @@ static unsigned int xe_pt_stage_unbind(struct xe_tile *tile,
> {
> u64 start = range ? xe_svm_range_start(range) : xe_vma_start(vma);
> u64 end = range ? xe_svm_range_end(range) : xe_vma_end(vma);
> + struct xe_vm_pgtable_update_op *pt_update_op =
> + container_of(entries, struct xe_vm_pgtable_update_op, entries[0]);
> struct xe_pt_stage_unbind_walk xe_walk = {
> .base = {
> .ops = &xe_pt_stage_unbind_ops,
> @@ -1665,6 +1751,7 @@ static unsigned int xe_pt_stage_unbind(struct xe_tile *tile,
> .modified_start = start,
> .modified_end = end,
> .wupd.entries = entries,
> + .prl = pt_update_op->prl,
> };
> struct xe_pt *pt = vm->pt_root[tile->id];
>
> @@ -1897,6 +1984,7 @@ static int unbind_op_prepare(struct xe_tile *tile,
> struct xe_vm_pgtable_update_ops *pt_update_ops,
> struct xe_vma *vma)
> {
> + struct xe_device *xe = tile_to_xe(tile);
> u32 current_op = pt_update_ops->current_op;
> struct xe_vm_pgtable_update_op *pt_op = &pt_update_ops->ops[current_op];
> int err;
> @@ -1914,6 +2002,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
> pt_op->vma = vma;
> pt_op->bind = false;
> pt_op->rebind = false;
> + /* Maintain one PRL located in pt_update_ops that all others in unbind op reference */
> + if (xe->info.has_page_reclaim_hw_assist && !pt_update_ops->prl.entries) {
> + err = xe_page_reclaim_list_alloc_entries(&pt_update_ops->prl);
> + if (err < 0)
> + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
I don't think you need to call xe_page_reclaim_list_invalidate, right?
If xe_page_reclaim_list_alloc_entries fails the prl should be in the
init state.
> + }
> + pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl : NULL;
>
> err = vma_reserve_fences(tile_to_xe(tile), vma);
> if (err)
> @@ -1921,6 +2016,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
>
> pt_op->num_entries = xe_pt_stage_unbind(tile, xe_vma_vm(vma),
> vma, NULL, pt_op->entries);
> + /* Free PRL if list declared as invalid */
> + if (pt_update_ops->prl.entries &&
> + pt_update_ops->prl.num_entries == XE_PAGE_RECLAIM_INVALID_LIST) {
> + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> + pt_op->prl = NULL;
> + pt_update_ops->prl.entries = NULL;
Call xe_page_reclaim_list_invalidate for clarity?
> + }
>
> xe_vm_dbg_print_entries(tile_to_xe(tile), pt_op->entries,
> pt_op->num_entries, false);
> @@ -1979,6 +2081,7 @@ static int unbind_range_prepare(struct xe_vm *vm,
> pt_op->vma = XE_INVALID_VMA;
> pt_op->bind = false;
> pt_op->rebind = false;
> + pt_op->prl = NULL;
>
> pt_op->num_entries = xe_pt_stage_unbind(tile, vm, NULL, range,
> pt_op->entries);
> @@ -2096,6 +2199,7 @@ xe_pt_update_ops_init(struct xe_vm_pgtable_update_ops *pt_update_ops)
> init_llist_head(&pt_update_ops->deferred);
> pt_update_ops->start = ~0x0ull;
> pt_update_ops->last = 0x0ull;
> + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
Can we introduce a function called xe_page_reclaim_list_init? It might do
the same thing as xe_page_reclaim_list_invalidate, but it would make this
a little clearer. Likewise later in the series, when a job is created, you
can call xe_page_reclaim_list_init there too.
> }
>
> /**
> @@ -2518,6 +2622,11 @@ void xe_pt_update_ops_fini(struct xe_tile *tile, struct xe_vma_ops *vops)
> &vops->pt_update_ops[tile->id];
> int i;
>
> + if (pt_update_ops->prl.entries) {
> + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> + }
> +
> lockdep_assert_held(&vops->vm->lock);
> xe_vm_assert_held(vops->vm);
>
> diff --git a/drivers/gpu/drm/xe/xe_pt_types.h b/drivers/gpu/drm/xe/xe_pt_types.h
> index 881f01e14db8..26e5295f118e 100644
> --- a/drivers/gpu/drm/xe/xe_pt_types.h
> +++ b/drivers/gpu/drm/xe/xe_pt_types.h
> @@ -8,6 +8,7 @@
>
> #include <linux/types.h>
>
> +#include "xe_page_reclaim.h"
> #include "xe_pt_walk.h"
>
> struct xe_bo;
> @@ -85,6 +86,8 @@ struct xe_vm_pgtable_update_op {
> bool bind;
> /** @rebind: is a rebind */
> bool rebind;
> + /** @prl: Backing pointer to page reclaim list of pt_update_ops */
> + struct xe_page_reclaim_list *prl;
Can you move this above the bools in the layout of
xe_vm_pgtable_update_op, likely just below "struct xe_vma".
> };
>
> /** struct xe_vm_pgtable_update_ops: page table update operations */
> @@ -119,6 +122,8 @@ struct xe_vm_pgtable_update_ops {
> * slots are idle.
> */
> bool wait_vm_kernel;
> + /** @prl: embedded page reclaim list */
> + struct xe_page_reclaim_list prl;
Same thing here, move just below "struct xe_exec_queue".
Matt
> };
>
> #endif
> --
> 2.51.2
>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 03/11] drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush
2025-11-18 9:05 ` [PATCH 03/11] drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush Brian Nguyen
2025-11-21 18:02 ` Lin, Shuicheng
@ 2025-11-22 19:32 ` Matthew Brost
2025-11-25 11:07 ` Nguyen, Brian3
1 sibling, 1 reply; 51+ messages in thread
From: Matthew Brost @ 2025-11-22 19:32 UTC (permalink / raw)
To: Brian Nguyen; +Cc: intel-xe, tejas.upadhyay, shuicheng.lin, stuart.summers
On Tue, Nov 18, 2025 at 05:05:44PM +0800, Brian Nguyen wrote:
> Allow TLB invalidation callers to configure whether the driver flushes the
> Private Physical Cache (PPC) as part of the TLB invalidation process.
>
> The default behavior is still to always flush the PPC, but the driver now
> has the option to disable it.
>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> ---
> drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 11 +++++++----
> drivers/gpu/drm/xe/xe_tlb_inval.c | 21 ++++++++++++++++++---
> drivers/gpu/drm/xe/xe_tlb_inval.h | 5 +++--
> drivers/gpu/drm/xe/xe_tlb_inval_job.c | 2 +-
> drivers/gpu/drm/xe/xe_tlb_inval_types.h | 5 ++++-
> drivers/gpu/drm/xe/xe_vm.c | 4 ++--
> 6 files changed, 35 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> index cd126c53faab..c05709a5bc98 100644
> --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> @@ -34,9 +34,12 @@ static int send_tlb_inval(struct xe_guc *guc, const u32 *action, int len)
> G2H_LEN_DW_TLB_INVALIDATE, 1);
> }
>
> -#define MAKE_INVAL_OP(type) ((type << XE_GUC_TLB_INVAL_TYPE_SHIFT) | \
> +#define MAKE_INVAL_OP_FLUSH(type, flush_cache) ((type << XE_GUC_TLB_INVAL_TYPE_SHIFT) | \
> XE_GUC_TLB_INVAL_MODE_HEAVY << XE_GUC_TLB_INVAL_MODE_SHIFT | \
> - XE_GUC_TLB_INVAL_FLUSH_CACHE)
> + (flush_cache ? \
> + XE_GUC_TLB_INVAL_FLUSH_CACHE : 0))
> +
> +#define MAKE_INVAL_OP(type) MAKE_INVAL_OP_FLUSH(type, true)
>
> static int send_tlb_inval_all(struct xe_tlb_inval *tlb_inval, u32 seqno)
> {
> @@ -100,7 +103,7 @@ static int send_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval, u32 seqno)
> #define MAX_RANGE_TLB_INVALIDATION_LENGTH (rounddown_pow_of_two(ULONG_MAX))
>
> static int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
> - u64 start, u64 end, u32 asid)
> + u64 start, u64 end, u32 asid, bool flush_cache)
Later in the series a drm_suballoc is passed in as an argument here.
Isn't that enough to know if we need to flush the cache?
> {
> #define MAX_TLB_INVALIDATION_LEN 7
> struct xe_guc *guc = tlb_inval->private;
> @@ -154,7 +157,7 @@ static int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
> ilog2(SZ_2M) + 1)));
> xe_gt_assert(gt, IS_ALIGNED(start, length));
>
> - action[len++] = MAKE_INVAL_OP(XE_GUC_TLB_INVAL_PAGE_SELECTIVE);
> + action[len++] = MAKE_INVAL_OP_FLUSH(XE_GUC_TLB_INVAL_PAGE_SELECTIVE, flush_cache);
> action[len++] = asid;
> action[len++] = lower_32_bits(start);
> action[len++] = upper_32_bits(start);
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
> index 50f05d6b5672..de275759743c 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> @@ -324,10 +324,10 @@ int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval)
> */
> int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
> struct xe_tlb_inval_fence *fence, u64 start, u64 end,
> - u32 asid)
> + u32 asid, bool flush_cache)
Then here: later in the series the PRL is attached to the fence, but can we
change that to an argument here instead?
> {
> return xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->ppgtt,
> - start, end, asid);
> + start, end, asid, flush_cache);
> }
>
> /**
> @@ -343,7 +343,7 @@ void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval, struct xe_vm *vm)
> u64 range = 1ull << vm->xe->info.va_bits;
>
> xe_tlb_inval_fence_init(tlb_inval, &fence, true);
> - xe_tlb_inval_range(tlb_inval, &fence, 0, range, vm->usm.asid);
> + xe_tlb_inval_range(tlb_inval, &fence, 0, range, vm->usm.asid, true);
> xe_tlb_inval_fence_wait(&fence);
> }
>
> @@ -420,6 +420,20 @@ static const struct dma_fence_ops inval_fence_ops = {
> .get_timeline_name = xe_inval_fence_get_timeline_name,
> };
>
> +/**
> + * xe_tlb_inval_fence_flush_cache - Control PPC flush at invalidation
> + * @fence: TLB inval fence
> + * @flush_cache: whether to perform PPC cache flush
> + *
> + * Helper function to modify the tlb_inval fence to control the PPC flush.
> + * Other components shouldn't modify fence directly.
> + */
> +void xe_tlb_inval_fence_flush_cache(struct xe_tlb_inval_fence *fence,
> + bool flush_cache)
> +{
> + fence->flush_cache = flush_cache;
> +}
> +
> /**
> * xe_tlb_inval_fence_init() - Initialize TLB invalidation fence
> * @tlb_inval: TLB invalidation client
> @@ -446,4 +460,5 @@ void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
> else
> dma_fence_get(&fence->base);
> fence->tlb_inval = tlb_inval;
> + fence->flush_cache = true;
I don't think we want the PRL (added later in the series) or flush_cache
stored in the fence (i.e., don't modify the fence structure in this
series). Instead, store the PRL in the job and pass it into
xe_tlb_inval_range as an argument; a NULL PRL implicitly means flush the
cache.
Matt
> }
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.h b/drivers/gpu/drm/xe/xe_tlb_inval.h
> index 9dbddc310eb9..b84ce3e6f294 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval.h
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval.h
> @@ -24,8 +24,9 @@ int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval);
> void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval, struct xe_vm *vm);
> int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
> struct xe_tlb_inval_fence *fence,
> - u64 start, u64 end, u32 asid);
> -
> + u64 start, u64 end, u32 asid, bool flush_cache);
> +void xe_tlb_inval_fence_flush_cache(struct xe_tlb_inval_fence *fence,
> + bool flush_cache);
> void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
> struct xe_tlb_inval_fence *fence,
> bool stack);
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_job.c b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> index 1ae0dec2cf31..6248f90323a9 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> @@ -49,7 +49,7 @@ static struct dma_fence *xe_tlb_inval_job_run(struct xe_dep_job *dep_job)
> container_of(job->fence, typeof(*ifence), base);
>
> xe_tlb_inval_range(job->tlb_inval, ifence, job->start,
> - job->end, job->vm->usm.asid);
> + job->end, job->vm->usm.asid, ifence->flush_cache);
>
> return job->fence;
> }
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_types.h b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> index 7a6967ce3b76..c3c3943fb07e 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> @@ -40,12 +40,13 @@ struct xe_tlb_inval_ops {
> * @start: Start address
> * @end: End address
> * @asid: Address space ID
> + * @flush_cache: PPC flush control
> *
> * Return 0 on success, -ECANCELED if backend is mid-reset, error on
> * failure
> */
> int (*ppgtt)(struct xe_tlb_inval *tlb_inval, u32 seqno, u64 start,
> - u64 end, u32 asid);
> + u64 end, u32 asid, bool flush_cache);
>
> /**
> * @initialized: Backend is initialized
> @@ -126,6 +127,8 @@ struct xe_tlb_inval_fence {
> int seqno;
> /** @inval_time: time of TLB invalidation */
> ktime_t inval_time;
> + /** @flush_cache: bool for PPC flush, default is true */
> + bool flush_cache;
> };
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 7cac646bdf1c..5fb5226574c5 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -3907,7 +3907,7 @@ int xe_vm_range_tilemask_tlb_inval(struct xe_vm *vm, u64 start,
>
> err = xe_tlb_inval_range(&tile->primary_gt->tlb_inval,
> &fence[fence_id], start, end,
> - vm->usm.asid);
> + vm->usm.asid, true);
> if (err)
> goto wait;
> ++fence_id;
> @@ -3920,7 +3920,7 @@ int xe_vm_range_tilemask_tlb_inval(struct xe_vm *vm, u64 start,
>
> err = xe_tlb_inval_range(&tile->media_gt->tlb_inval,
> &fence[fence_id], start, end,
> - vm->usm.asid);
> + vm->usm.asid, true);
> if (err)
> goto wait;
> ++fence_id;
> --
> 2.51.2
>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 07/11] drm/xe: Suballocate BO for page reclaim
2025-11-18 9:05 ` [PATCH 07/11] drm/xe: Suballocate BO for page reclaim Brian Nguyen
@ 2025-11-22 19:42 ` Matthew Brost
2025-11-25 11:20 ` Nguyen, Brian3
0 siblings, 1 reply; 51+ messages in thread
From: Matthew Brost @ 2025-11-22 19:42 UTC (permalink / raw)
To: Brian Nguyen; +Cc: intel-xe, tejas.upadhyay, shuicheng.lin, stuart.summers
On Tue, Nov 18, 2025 at 05:05:48PM +0800, Brian Nguyen wrote:
> The page reclamation feature needs the PRL to be suballocated into a
> GGTT-mapped BO. On allocation failure, fall back to the default TLB
> invalidation with a full PPC flush.
>
> The PRL's BO allocation is managed in a separate pool to ensure the 4K
> alignment needed for a proper GGTT address.
>
> With the BO in place, pass it into the TLB invalidation backend and
> modify the fence to accommodate it.
>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> Suggested-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/xe/xe_device_types.h | 7 ++++++
> drivers/gpu/drm/xe/xe_page_reclaim.c | 33 +++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_page_reclaim.h | 4 +++
> drivers/gpu/drm/xe/xe_tile.c | 5 ++++
> drivers/gpu/drm/xe/xe_tlb_inval.c | 18 ++++++++++++--
> drivers/gpu/drm/xe/xe_tlb_inval_types.h | 5 ++++
> 6 files changed, 70 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 268c8e28601a..057df3f9dc1d 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -184,6 +184,13 @@ struct xe_tile {
> * Media GT shares a pool with its primary GT.
> */
> struct xe_sa_manager *kernel_bb_pool;
> +
> + /**
> + * @mem.reclaim_pool: Pool for PRL allocations.
> + *
> + * Only main GT has page reclaim list allocations.
> + */
> + struct xe_sa_manager *reclaim_pool;
> } mem;
>
> /** @sriov: tile level virtualization data */
> diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c b/drivers/gpu/drm/xe/xe_page_reclaim.c
> index a0d15efff58c..801a7f1731c0 100644
> --- a/drivers/gpu/drm/xe/xe_page_reclaim.c
> +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> @@ -13,6 +13,39 @@
> #include "regs/xe_gt_regs.h"
> #include "xe_assert.h"
> #include "xe_macros.h"
> +#include "xe_sa.h"
> +#include "xe_tlb_inval_types.h"
> +
> +/**
> + * xe_page_reclaim_create_prl_bo() - Back a PRL with a suballocated GGTT BO
> + * @tlb_inval: TLB invalidation frontend associated with the request
> + * @fence: Fence carrying the PRL metadata
> + *
> + * Suballocates a 4K BO out of the tile reclaim pool, copies the PRL CPU
> + * copy into the BO and queues the buffer for release when @fence signals.
> + *
> + * Return: 0 on success or -ENOMEM if the suballocation fails.
> + */
> +int xe_page_reclaim_create_prl_bo(struct xe_tlb_inval *tlb_inval, struct xe_tlb_inval_fence *fence)
As discussed here [1], let's try to avoid storing anything related to the
PRL in "struct xe_tlb_inval_fence". So I think reclaim_entries + the
number of entries should be arguments to this function, and it should
return a "struct drm_suballoc *" or an ERR_PTR here.
[1] https://patchwork.freedesktop.org/patch/689042/?series=157698&rev=1#comment_1267062
> +{
> + struct xe_gt *gt = container_of(tlb_inval, struct xe_gt, tlb_inval);
> + struct xe_tile *tile = gt_to_tile(gt);
> +
> + /* Maximum size of PRL is 1 4K-page */
> + fence->prl_sa = __xe_sa_bo_new(tile->mem.reclaim_pool,
> + XE_PAGE_RECLAIM_LIST_MAX_SIZE, GFP_ATOMIC);
Any reason we can't pass in the number of entries for a better-sized
suballocation? Or does the PRL in the GuC interface need to be page aligned?
> + if (IS_ERR(fence->prl_sa))
> + return -ENOMEM;
> +
> + memcpy(xe_sa_bo_cpu_addr(fence->prl_sa), fence->reclaim_entries,
> + XE_PAGE_RECLAIM_LIST_MAX_SIZE);
If we had the number of entries we could save a few instructions on the
memory copy too.
> + xe_sa_bo_flush_write(fence->prl_sa);
> +
> + /* Queue up sa_bo_free on fence signal */
> + xe_sa_bo_free(fence->prl_sa, &fence->base);
> +
> + return 0;
> +}
>
> /**
> * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
> diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h b/drivers/gpu/drm/xe/xe_page_reclaim.h
> index d066d7d97f79..f82b4d0865e0 100644
> --- a/drivers/gpu/drm/xe/xe_page_reclaim.h
> +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> @@ -15,6 +15,9 @@
> #define XE_PAGE_RECLAIM_MAX_ENTRIES 512
> #define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
>
> +struct xe_tlb_inval;
> +struct xe_tlb_inval_fence;
> +
> struct xe_guc_page_reclaim_entry {
> u32 valid:1;
> u32 reclamation_size:6;
> @@ -32,6 +35,7 @@ struct xe_page_reclaim_list {
> #define XE_PAGE_RECLAIM_INVALID_LIST -1
> };
>
> +int xe_page_reclaim_create_prl_bo(struct xe_tlb_inval *tlb_inval, struct xe_tlb_inval_fence *fence);
> void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
> int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
> static inline void xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries)
> diff --git a/drivers/gpu/drm/xe/xe_tile.c b/drivers/gpu/drm/xe/xe_tile.c
> index 4f4f9a5c43af..63c060c2ea5c 100644
> --- a/drivers/gpu/drm/xe/xe_tile.c
> +++ b/drivers/gpu/drm/xe/xe_tile.c
> @@ -209,6 +209,11 @@ int xe_tile_init(struct xe_tile *tile)
> if (IS_ERR(tile->mem.kernel_bb_pool))
> return PTR_ERR(tile->mem.kernel_bb_pool);
>
> + /* Optimistically anticipate at most 256 TLB fences with PRL */
> + tile->mem.reclaim_pool = xe_sa_bo_manager_init(tile, SZ_1M, XE_PAGE_RECLAIM_LIST_MAX_SIZE);
> + if (IS_ERR(tile->mem.reclaim_pool))
> + return PTR_ERR(tile->mem.reclaim_pool);
> +
> return 0;
> }
> void xe_tile_migrate_wait(struct xe_tile *tile)
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
> index de275759743c..67a047521165 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> @@ -15,6 +15,7 @@
> #include "xe_guc_ct.h"
> #include "xe_guc_tlb_inval.h"
> #include "xe_mmio.h"
> +#include "xe_page_reclaim.h"
> #include "xe_pm.h"
> #include "xe_tlb_inval.h"
> #include "xe_trace.h"
> @@ -326,8 +327,19 @@ int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
> struct xe_tlb_inval_fence *fence, u64 start, u64 end,
> u32 asid, bool flush_cache)
> {
> - return xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->ppgtt,
> - start, end, asid, flush_cache);
> + int err;
> +
> + if (fence->reclaim_entries) {
> + err = xe_page_reclaim_create_prl_bo(tlb_inval, fence);
> + if (err) {
> + flush_cache = true;
> + fence->prl_sa = NULL;
> + }
> + }
Should we do the above step in run_job of the TLB invalidation job? I
think that might be cleaner wrt layering and make it clear only TLB
invalidation jobs can use PRL. I don't see an easy way to implement
non-job based TLB invalidations with a PRL as those are typically in the
path of reclaim (no memory allocations).
> + err = xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->ppgtt,
> + start, end, asid, flush_cache);
> +
> + return err;
> }
>
> /**
> @@ -461,4 +473,6 @@ void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
> dma_fence_get(&fence->base);
> fence->tlb_inval = tlb_inval;
> fence->flush_cache = true;
> + fence->reclaim_entries = NULL;
> + fence->prl_sa = NULL;
> }
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_types.h b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> index c3c3943fb07e..7cf741e6a0c7 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> @@ -9,6 +9,7 @@
> #include <linux/workqueue.h>
> #include <linux/dma-fence.h>
>
> +struct xe_guc_page_reclaim_entry;
> struct xe_tlb_inval;
>
> /** struct xe_tlb_inval_ops - TLB invalidation ops (backend) */
> @@ -129,6 +130,10 @@ struct xe_tlb_inval_fence {
> ktime_t inval_time;
> /** @flush_cache: bool for PPC flush, default is true */
> bool flush_cache;
> + /** @reclaim_entries: list of pages to reclaim */
> + struct xe_guc_page_reclaim_entry *reclaim_entries;
> + /** @prl_sa: BO allocation for page reclaim list */
> + struct drm_suballoc *prl_sa;
Again, let's try hard to move all of these things out of the fence (store
them in the job if needed).
Matt
> };
>
> #endif
> --
> 2.51.2
>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 10/11] drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim
2025-11-18 9:05 ` [PATCH 10/11] drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim Brian Nguyen
@ 2025-11-24 12:29 ` Matthew Auld
2025-11-25 6:12 ` Nguyen, Brian3
2025-11-25 11:48 ` Upadhyay, Tejas
0 siblings, 2 replies; 51+ messages in thread
From: Matthew Auld @ 2025-11-24 12:29 UTC (permalink / raw)
To: Brian Nguyen, intel-xe
Cc: tejas.upadhyay, matthew.brost, shuicheng.lin, stuart.summers
On 18/11/2025 09:05, Brian Nguyen wrote:
> In Xe3p and beyond, there is additional hardware-managed L2$ flushing
> for buffers deemed transient display and transient app. In those
> scenarios page reclamation is unnecessary and results in redundant
> cacheline flushes, so skip over the corresponding ranges.
>
> Add a chicken bit to determine media engine status to help facilitate
> the decision on skipping the L2$ flush.
>
> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
> ---
> drivers/gpu/drm/xe/regs/xe_gt_regs.h | 11 +++++++
> drivers/gpu/drm/xe/xe_page_reclaim.c | 43 ++++++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_page_reclaim.h | 3 ++
> drivers/gpu/drm/xe/xe_pat.c | 9 +-----
> drivers/gpu/drm/xe/xe_pt.c | 3 +-
> 5 files changed, 60 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> index 917a088c28f2..a18a2d59153e 100644
> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> @@ -99,6 +99,14 @@
> #define VE1_AUX_INV XE_REG(0x42b8)
> #define AUX_INV REG_BIT(0)
>
> +#define _PAT_PTA 0x4820
> +#define XE2_NO_PROMOTE REG_BIT(10)
> +#define XE2_COMP_EN REG_BIT(9)
> +#define XE2_L3_CLOS REG_GENMASK(7, 6)
> +#define XE2_L3_POLICY REG_GENMASK(5, 4)
> +#define XE2_L4_POLICY REG_GENMASK(3, 2)
> +#define XE2_COH_MODE REG_GENMASK(1, 0)
> +
> #define XE2_LMEM_CFG XE_REG(0x48b0)
>
> #define XEHP_FLAT_CCS_BASE_ADDR XE_REG_MCR(0x4910)
> @@ -429,6 +437,9 @@
>
> #define XE2_GLOBAL_INVAL XE_REG(0xb404)
>
> +#define LTISEQCHK XE_REG(0xb49c)
> +#define XE3P_MEDIA_IS_ON REG_BIT(2)
> +
> #define XE2LPM_L3SQCREG2 XE_REG_MCR(0xb604)
>
> #define XE2LPM_L3SQCREG3 XE_REG_MCR(0xb608)
> diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c b/drivers/gpu/drm/xe/xe_page_reclaim.c
> index 801a7f1731c0..2f0e7547732c 100644
> --- a/drivers/gpu/drm/xe/xe_page_reclaim.c
> +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> @@ -13,8 +13,51 @@
> #include "regs/xe_gt_regs.h"
> #include "xe_assert.h"
> #include "xe_macros.h"
> +#include "xe_mmio.h"
> +#include "xe_pat.h"
> #include "xe_sa.h"
> #include "xe_tlb_inval_types.h"
> +#include "xe_vm.h"
> +
> +/**
> + * xe_page_reclaim_skip() - Decide whether PRL should be skipped for a VMA
> + * @tile: Tile owning the VMA
> + * @vma: VMA under consideration
> + *
> + * Xe3p and beyond can handle PPC flushing for specific PAT encodings.
> + * Skip PPC flushing in both scenarios below.
> + * - pat_index is transient display (1)
> + * - pat_index is transient app (2) and Media is off
> + *
> + * Return: true when page reclamation is unnecessary, false otherwise.
> + */
> +bool xe_page_reclaim_skip(struct xe_tile *tile, struct xe_vma *vma)
> +{
> + struct xe_device *xe = xe_vma_vm(vma)->xe;
> + struct xe_mmio *mmio = &tile->primary_gt->mmio;
> + u16 pat_index = vma->attr.pat_index;
> + u32 pat_value;
> + u8 l3_policy;
> + bool is_media_awake;
> +
> + /* Ensure called only with Xe3p due to associated PAT index */
> + xe_assert(tile->xe, GRAPHICS_VER(tile->xe) >= 35);
> + xe_assert(tile->xe, pat_index < xe->pat.n_entries);
> +
> + pat_value = xe->pat.table[pat_index].value;
> + l3_policy = REG_FIELD_GET(XE2_L3_POLICY, pat_value);
I think if we need something like this, it might make sense to create a
helper in xe_pat and use that here? Not sure we want stuff outside of
xe_pat looking at such internals.
> + is_media_awake = xe_mmio_read32(mmio, LTISEQCHK) & XE3P_MEDIA_IS_ON;
Do we need this? Whether media is off/on should be an internal detail for
fw/hw, not the KMD, I think, and will influence whether fw/hw flushes only
the cachelines shared with the CPU or the entire cache at various points,
like end of submission. Also this seems racy, since media can turn on/off
after we check this?
> +
> + /**
> + * - l3_policy: 0=WB, 1=XD ("WB - Transient Display"),
Why do we skip Transient Display? Can you share some more details or
maybe add a comment here? AFAIK transient display just allows using the
GPU caches for display surfaces, with the idea of then doing a targeted
transient flush only when doing the actual scanout. On newer hw this
flush is done by hw, I think, instead of KMD, but I assume it is only
done when doing the scanout step? Or is that now handled differently?
Concern here is that user does render copy to display surface with
transient display PAT index but then never does an actual scanout, and
then just deletes the memory. Where is the flush in that flow?
> + * 2=XA ("WB - Transient App" for Xe3p), 3=UC
> + * From Xe3p, transient display flush is taken care by HW, l3_policy = 1
> + *
> + * Also with Xe3p, pat_index=18/19 corresponds to transient app flushing
> + * which is handled by HW when media is off.
> + */
> + return (l3_policy == 1 || (!is_media_awake && (pat_index == 18 || pat_index == 19)));
> +}
>
> /**
> * xe_page_reclaim_create_prl_bo() - Back a PRL with a suballocated GGTT BO
> diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h b/drivers/gpu/drm/xe/xe_page_reclaim.h
> index f82b4d0865e0..dafd4edd6f61 100644
> --- a/drivers/gpu/drm/xe/xe_page_reclaim.h
> +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> @@ -17,6 +17,8 @@
>
> struct xe_tlb_inval;
> struct xe_tlb_inval_fence;
> +struct xe_tile;
> +struct xe_vma;
>
> struct xe_guc_page_reclaim_entry {
> u32 valid:1;
> @@ -35,6 +37,7 @@ struct xe_page_reclaim_list {
> #define XE_PAGE_RECLAIM_INVALID_LIST -1
> };
>
> +bool xe_page_reclaim_skip(struct xe_tile *tile, struct xe_vma *vma);
> int xe_page_reclaim_create_prl_bo(struct xe_tlb_inval *tlb_inval, struct xe_tlb_inval_fence *fence);
> void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
> int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
> diff --git a/drivers/gpu/drm/xe/xe_pat.c b/drivers/gpu/drm/xe/xe_pat.c
> index 1b4d5d3def0f..4783acd1f027 100644
> --- a/drivers/gpu/drm/xe/xe_pat.c
> +++ b/drivers/gpu/drm/xe/xe_pat.c
> @@ -9,6 +9,7 @@
>
> #include <generated/xe_wa_oob.h>
>
> +#include "regs/xe_gt_regs.h"
> #include "regs/xe_reg_defs.h"
> #include "xe_assert.h"
> #include "xe_device.h"
> @@ -23,14 +24,6 @@
> #define _PAT_INDEX(index) _PICK_EVEN_2RANGES(index, 8, \
> 0x4800, 0x4804, \
> 0x4848, 0x484c)
> -#define _PAT_PTA 0x4820
> -
> -#define XE2_NO_PROMOTE REG_BIT(10)
> -#define XE2_COMP_EN REG_BIT(9)
> -#define XE2_L3_CLOS REG_GENMASK(7, 6)
> -#define XE2_L3_POLICY REG_GENMASK(5, 4)
> -#define XE2_L4_POLICY REG_GENMASK(3, 2)
> -#define XE2_COH_MODE REG_GENMASK(1, 0)
>
> #define XELPG_L4_POLICY_MASK REG_GENMASK(3, 2)
> #define XELPG_PAT_3_UC REG_FIELD_PREP(XELPG_L4_POLICY_MASK, 3)
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index 03723c8d2601..8ccab39c2599 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -2008,7 +2008,8 @@ static int unbind_op_prepare(struct xe_tile *tile,
> if (err < 0)
> xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> }
> - pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl : NULL;
> + pt_op->prl = (pt_update_ops->prl.entries &&
> + !xe_page_reclaim_skip(tile, vma)) ? &pt_update_ops->prl : NULL;
>
> err = vma_reserve_fences(tile_to_xe(tile), vma);
> if (err)
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH 10/11] drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim
2025-11-24 12:29 ` Matthew Auld
@ 2025-11-25 6:12 ` Nguyen, Brian3
2025-11-25 11:48 ` Upadhyay, Tejas
1 sibling, 0 replies; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-25 6:12 UTC (permalink / raw)
To: Auld, Matthew, intel-xe@lists.freedesktop.org, Upadhyay, Tejas
Cc: Brost, Matthew, Lin, Shuicheng, Summers, Stuart
On Monday, November 24, 2025 4:30 AM, Matthew Auld wrote:
> On 18/11/2025 09:05, Brian Nguyen wrote:
> > In Xe3p and beyond, there is additional hardware-managed L2$ flushing
> > for buffers deemed transient display and transient app. In those
> > scenarios page reclamation is unnecessary and results in redundant
> > cacheline flushes, so skip over the corresponding ranges.
> >
> > Add a chicken bit to determine media engine status to help facilitate
> > the decision on skipping the L2$ flush.
> >
> > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
> > ---
> > drivers/gpu/drm/xe/regs/xe_gt_regs.h | 11 +++++++
> > drivers/gpu/drm/xe/xe_page_reclaim.c | 43 ++++++++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_page_reclaim.h | 3 ++
> > drivers/gpu/drm/xe/xe_pat.c | 9 +-----
> > drivers/gpu/drm/xe/xe_pt.c | 3 +-
> > 5 files changed, 60 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > index 917a088c28f2..a18a2d59153e 100644
> > --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > @@ -99,6 +99,14 @@
> > #define VE1_AUX_INV XE_REG(0x42b8)
> > #define AUX_INV REG_BIT(0)
> >
> > +#define _PAT_PTA 0x4820
> > +#define XE2_NO_PROMOTE REG_BIT(10)
> > +#define XE2_COMP_EN REG_BIT(9)
> > +#define XE2_L3_CLOS REG_GENMASK(7, 6)
> > +#define XE2_L3_POLICY REG_GENMASK(5, 4)
> > +#define XE2_L4_POLICY REG_GENMASK(3, 2)
> > +#define XE2_COH_MODE REG_GENMASK(1, 0)
> > +
> > #define XE2_LMEM_CFG XE_REG(0x48b0)
> >
> > #define XEHP_FLAT_CCS_BASE_ADDR XE_REG_MCR(0x4910)
> > @@ -429,6 +437,9 @@
> >
> > #define XE2_GLOBAL_INVAL XE_REG(0xb404)
> >
> > +#define LTISEQCHK XE_REG(0xb49c)
> > +#define XE3P_MEDIA_IS_ON REG_BIT(2)
> > +
> > #define XE2LPM_L3SQCREG2 XE_REG_MCR(0xb604)
> >
> > #define XE2LPM_L3SQCREG3 XE_REG_MCR(0xb608)
> > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > index 801a7f1731c0..2f0e7547732c 100644
> > --- a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > @@ -13,8 +13,51 @@
> > #include "regs/xe_gt_regs.h"
> > #include "xe_assert.h"
> > #include "xe_macros.h"
> > +#include "xe_mmio.h"
> > +#include "xe_pat.h"
> > #include "xe_sa.h"
> > #include "xe_tlb_inval_types.h"
> > +#include "xe_vm.h"
> > +
> > +/**
> > + * xe_page_reclaim_skip() - Decide whether PRL should be skipped for
> > +a VMA
> > + * @tile: Tile owning the VMA
> > + * @vma: VMA under consideration
> > + *
> > + * Xe3p and beyond can handle PPC flushing for specific PAT encodings.
> > + * Skip PPC flushing in both scenarios below.
> > + * - pat_index is transient display (1)
> > + * - pat_index is transient app (2) and Media is off
> > + *
> > + * Return: true when page reclamation is unnecessary, false otherwise.
> > + */
> > +bool xe_page_reclaim_skip(struct xe_tile *tile, struct xe_vma *vma) {
> > + struct xe_device *xe = xe_vma_vm(vma)->xe;
> > + struct xe_mmio *mmio = &tile->primary_gt->mmio;
> > + u16 pat_index = vma->attr.pat_index;
> > + u32 pat_value;
> > + u8 l3_policy;
> > + bool is_media_awake;
> > +
> > + /* Ensure called only with Xe3p due to associated PAT index */
> > + xe_assert(tile->xe, GRAPHICS_VER(tile->xe) >= 35);
> > + xe_assert(tile->xe, pat_index < xe->pat.n_entries);
> > +
> > + pat_value = xe->pat.table[pat_index].value;
> > + l3_policy = REG_FIELD_GET(XE2_L3_POLICY, pat_value);
>
> I think if we need something like this, it might make sense to create a helper in
> xe_pat and use that here? Not sure if want stuff outside of xe_pat looking at such
> internals.
>
Ahh, got it. I can move those changes into xe_pat. Perhaps after we
confirm whether this would still be needed.
> > + is_media_awake = xe_mmio_read32(mmio, LTISEQCHK) & XE3P_MEDIA_IS_ON;
>
> Do we need this? Whether media is off/on should be an internal detail for fw/hw,
> not KMD I think, and will influence whether fw/hw will only flush cachelines shared
> with CPU or whether to flush entire cache at various places, like end of
> submission. Also this seems racy, since Media can turn on/off after checking this?
>
Ohh, that makes sense. Agree on this being racy. This was part of a
conversation I had with Tejas about attempting to optimize some of the
redundant cacheline flushing possible in Xe3p. Since page reclamation is
issued by KMD, we could potentially be flushing cachelines handled
previously by HW with transient app pat index.
My thought was that in the transient app phase where a full L2$ flush is
still performed, i.e. when media is on, page reclamation could still
provide some benefit. But it may be better to remove this media check
entirely and just let the default TLB invalidation flow handle it (with
a full PPC flush)?
> > +
> > + /**
> > + * - l3_policy: 0=WB, 1=XD ("WB - Transient Display"),
>
> Why do we skip Transient Display? Can you share some more details or
> maybe add a comment here? AFAIK transient display just allows using the
> GPU caches for display surfaces, with the idea of then doing a targeted
> transient flush only when doing the actual scanout. On newer hw this
> flush is done by hw, I think, instead of KMD, but I assume it is only
> done when doing the scanout step? Or is that now handled differently?
>
My impression is that transient display flushing is handled by the HW
on newer HW, as you stated. We are skipping page reclamation in order
to avoid duplicate flushes on the same cachelines.
From Tejas's previous description, transient display flushes are optimized
by the HW, and thus the driver shouldn't issue a page reclamation
operation to flush those same cachelines again
(the same logic extends to transient app flushes).
Perhaps my ignorance & inexperience is showing here and I misunderstood
something...
@Upadhyay, Tejas could you provide your thoughts here as well? You may
be better equipped to answer when the flushes for the transient *
pat indices occur. Thanks.
> Concern here is that user does render copy to display surface with
> transient display PAT index but then never does an actual scanout, and
> then just deletes the memory. Where is the flush in that flow?
>
Again, this is probably more related to Tejas' transient flushing
feature, so we should rely on his answer. But if you're deleting the
memory, that implies an unbind of said memory, so you follow the
standard TLB invalidation job that this patch series targets. Previously
I had understood the feature as a blanket "the HW manages flushing of
all transient app/display pat indices", but that might not necessarily
be the case upon revisiting the spec.
So, let's see what Tejas thinks about it...
Brian
> > + * 2=XA ("WB - Transient App" for Xe3p), 3=UC
> > + * From Xe3p, transient display flush is taken care by HW, l3_policy = 1
> > + *
> > + * Also with Xe3p, pat_index=18/19 corresponds to transient app flushing
> > + * which is handled by HW when media is off.
> > + */
> > + return (l3_policy == 1 || (!is_media_awake && (pat_index == 18 || pat_index == 19)));
> > +}
> >
> > /**
> > * xe_page_reclaim_create_prl_bo() - Back a PRL with a suballocated GGTT BO
> > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h
> b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > index f82b4d0865e0..dafd4edd6f61 100644
> > --- a/drivers/gpu/drm/xe/xe_page_reclaim.h
> > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > @@ -17,6 +17,8 @@
> >
> > struct xe_tlb_inval;
> > struct xe_tlb_inval_fence;
> > +struct xe_tile;
> > +struct xe_vma;
> >
> > struct xe_guc_page_reclaim_entry {
> > u32 valid:1;
> > @@ -35,6 +37,7 @@ struct xe_page_reclaim_list {
> > #define XE_PAGE_RECLAIM_INVALID_LIST -1
> > };
> >
> > +bool xe_page_reclaim_skip(struct xe_tile *tile, struct xe_vma *vma);
> > int xe_page_reclaim_create_prl_bo(struct xe_tlb_inval *tlb_inval, struct xe_tlb_inval_fence *fence);
> > void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
> > int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
> > diff --git a/drivers/gpu/drm/xe/xe_pat.c b/drivers/gpu/drm/xe/xe_pat.c
> > index 1b4d5d3def0f..4783acd1f027 100644
> > --- a/drivers/gpu/drm/xe/xe_pat.c
> > +++ b/drivers/gpu/drm/xe/xe_pat.c
> > @@ -9,6 +9,7 @@
> >
> > #include <generated/xe_wa_oob.h>
> >
> > +#include "regs/xe_gt_regs.h"
> > #include "regs/xe_reg_defs.h"
> > #include "xe_assert.h"
> > #include "xe_device.h"
> > @@ -23,14 +24,6 @@
> > #define _PAT_INDEX(index) _PICK_EVEN_2RANGES(index, 8, \
> > 0x4800, 0x4804, \
> > 0x4848, 0x484c)
> > -#define _PAT_PTA 0x4820
> > -
> > -#define XE2_NO_PROMOTE REG_BIT(10)
> > -#define XE2_COMP_EN REG_BIT(9)
> > -#define XE2_L3_CLOS REG_GENMASK(7, 6)
> > -#define XE2_L3_POLICY REG_GENMASK(5, 4)
> > -#define XE2_L4_POLICY REG_GENMASK(3, 2)
> > -#define XE2_COH_MODE REG_GENMASK(1, 0)
> >
> > #define XELPG_L4_POLICY_MASK REG_GENMASK(3, 2)
> > #define XELPG_PAT_3_UC REG_FIELD_PREP(XELPG_L4_POLICY_MASK, 3)
> > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > index 03723c8d2601..8ccab39c2599 100644
> > --- a/drivers/gpu/drm/xe/xe_pt.c
> > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > @@ -2008,7 +2008,8 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > if (err < 0)
> > xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > }
> > - pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl : NULL;
> > + pt_op->prl = (pt_update_ops->prl.entries &&
> > + !xe_page_reclaim_skip(tile, vma)) ? &pt_update_ops->prl : NULL;
> >
> > err = vma_reserve_fences(tile_to_xe(tile), vma);
> > if (err)
^ permalink raw reply [flat|nested] 51+ messages in thread
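The skip decision debated above hinges on decoding the L3 policy field out of the cached PAT value and combining it with the media-engine state. A minimal userspace sketch of that logic, assuming the XE2_L3_POLICY bit positions from the patch; the helper names, the simplified `FIELD_GET`, and the standalone `page_reclaim_skip()` signature are illustrative, not the driver's actual API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's REG_GENMASK/REG_FIELD_GET. */
#define L3_POLICY_MASK 0x30u /* bits [5:4], mirroring XE2_L3_POLICY */
#define FIELD_GET(mask, val) (((val) & (mask)) >> __builtin_ctz(mask))

/* L3 policy encodings referenced in the patch comment. */
enum l3_policy { L3_WB = 0, L3_XD = 1, L3_XA = 2, L3_UC = 3 };

/*
 * Illustrative version of the skip decision: transient-display (XD)
 * ranges are always skipped; transient-app PAT indices (18/19 on Xe3p
 * per the patch) are skipped only when media is off.
 */
static bool page_reclaim_skip(uint32_t pat_value, uint16_t pat_index,
			      bool media_awake)
{
	uint8_t l3_policy = FIELD_GET(L3_POLICY_MASK, pat_value);

	return l3_policy == L3_XD ||
	       (!media_awake && (pat_index == 18 || pat_index == 19));
}
```

Note that, as the review points out, the media-awake half of this check is racy and may be dropped; only the PAT-value decode would then remain.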
* RE: [PATCH 02/11] drm/xe: Reset tlb fence timeout on invalid seqno received
2025-11-22 18:25 ` Matthew Brost
@ 2025-11-25 11:01 ` Nguyen, Brian3
0 siblings, 0 replies; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-25 11:01 UTC (permalink / raw)
To: Brost, Matthew
Cc: intel-xe@lists.freedesktop.org, Upadhyay, Tejas, Lin, Shuicheng,
Summers, Stuart
On Saturday, November 22, 2025 10:25 AM, Matthew Brost wrote:
> On Tue, Nov 18, 2025 at 05:05:43PM +0800, Brian Nguyen wrote:
> > TLB_INVALIDATION_SEQNO_INVALID is now used to indicate in-progress
> > multi-step TLB invalidations, so reset the TDR to ensure that action
> > won't trigger prematurely while G2H actions are still ongoing.
> >
>
> I think this patch makes sense but comments below.
>
> > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 2 ++
> > drivers/gpu/drm/xe/xe_tlb_inval.c | 16 ++++++++++++++++
> > drivers/gpu/drm/xe/xe_tlb_inval.h | 1 +
> > 3 files changed, 19 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > index f1fd2dd90742..cd126c53faab 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > @@ -238,6 +238,8 @@ int xe_guc_tlb_inval_done_handler(struct xe_guc
> > *guc, u32 *msg, u32 len)
> >
> > if (msg[0] != TLB_INVALIDATION_SEQNO_INVALID)
> > xe_tlb_inval_done_handler(>->tlb_inval, msg[0]);
> > + else
> > + xe_tlb_inval_reset_timeout(>->tlb_inval);
> >
> > return 0;
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c
> > b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > index 918a59e686ea..50f05d6b5672 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > @@ -199,6 +199,22 @@ void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval)
> > mutex_unlock(&tlb_inval->seqno_lock);
> > }
> >
> > +/**
> > + * xe_tlb_inval_reset_timeout() - Reset TLB inval fence timeout
> > + * @tlb_inval: TLB invalidation client
> > + *
> > + * Reset the TLB invalidation timeout timer.
> > + */
> > +void xe_tlb_inval_reset_timeout(struct xe_tlb_inval *tlb_inval) {
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&tlb_inval->pending_lock, flags);
> > + mod_delayed_work(system_wq, &tlb_inval->fence_tdr,
> > + tlb_inval->ops->timeout_delay(tlb_inval));
>
> You don't need a lock for this. It is done in xe_tlb_inval_done_handler under this
> lock as the pending list of TLB invalidations, which pending_lock protects, is tied to
> whether we want to reschedule the timeout or cancel it. So please drop the lock
> here and then also update xe_tlb_inval_done_handler to call this new helper.
>
> Matt
>
I have moved xe_tlb_inval_reset_timeout into xe_tlb_inval_done_handler
and converted it to a static func. This also implies that
xe_guc_tlb_inval_done_handler has been reverted to its original form
and the SEQNO_INVALID skip is done in the done_handler now.
Brian
> > + spin_unlock_irqrestore(&tlb_inval->pending_lock, flags); }
> > +
> > static bool xe_tlb_inval_seqno_past(struct xe_tlb_inval *tlb_inval,
> > int seqno) {
> > int seqno_recv = READ_ONCE(tlb_inval->seqno_recv); diff --git
> > a/drivers/gpu/drm/xe/xe_tlb_inval.h
> > b/drivers/gpu/drm/xe/xe_tlb_inval.h
> > index 05614915463a..9dbddc310eb9 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval.h
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.h
> > @@ -17,6 +17,7 @@ struct xe_vm;
> > int xe_gt_tlb_inval_init_early(struct xe_gt *gt);
> >
> > void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval);
> > +void xe_tlb_inval_reset_timeout(struct xe_tlb_inval *tlb_inval);
> > int xe_tlb_inval_all(struct xe_tlb_inval *tlb_inval,
> > struct xe_tlb_inval_fence *fence); int
> > xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval);
> > --
> > 2.51.2
> >
^ permalink raw reply [flat|nested] 51+ messages in thread
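The agreed-upon shape from this exchange — treat an invalid seqno as "multi-step invalidation still in flight" and only rearm the timeout, completing fences otherwise — can be sketched in plain C. This is a simplified model: the struct, the `timer_resets` counter standing in for `mod_delayed_work`, and the function name are all illustrative, and the real handler runs under `pending_lock`:

```c
#include <assert.h>
#include <stdint.h>

#define SEQNO_INVALID 0u /* stand-in for TLB_INVALIDATION_SEQNO_INVALID */

struct tlb_inval {
	uint32_t seqno_recv;    /* last completed seqno */
	unsigned timer_resets;  /* counts TDR rearms; models mod_delayed_work */
};

/*
 * Sketch of the done handler after the review feedback: an invalid
 * seqno means a multi-step invalidation is still in progress, so only
 * the timeout is rearmed; a valid seqno additionally completes fences
 * up to that seqno.
 */
static void tlb_inval_done_handler(struct tlb_inval *ti, uint32_t seqno)
{
	ti->timer_resets++;             /* rearm the TDR in both cases */
	if (seqno != SEQNO_INVALID)
		ti->seqno_recv = seqno; /* complete pending work */
}
```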
* RE: [PATCH 03/11] drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush
2025-11-22 19:32 ` Matthew Brost
@ 2025-11-25 11:07 ` Nguyen, Brian3
0 siblings, 0 replies; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-25 11:07 UTC (permalink / raw)
To: Brost, Matthew
Cc: intel-xe@lists.freedesktop.org, Upadhyay, Tejas, Lin, Shuicheng,
Summers, Stuart
On Saturday, November 22, 2025 11:33 AM, Matthew Brost wrote:
> On Tue, Nov 18, 2025 at 05:05:44PM +0800, Brian Nguyen wrote:
> > Allow the TLB invalidation interface to configure whether the driver
> > flushes the Private Physical Cache (PPC) as part of the TLB
> > invalidation process.
> >
> > Default behavior is still to always flush the PPC but driver now has
> > the option to disable it.
> >
> > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 11 +++++++----
> > drivers/gpu/drm/xe/xe_tlb_inval.c | 21 ++++++++++++++++++---
> > drivers/gpu/drm/xe/xe_tlb_inval.h | 5 +++--
> > drivers/gpu/drm/xe/xe_tlb_inval_job.c | 2 +-
> > drivers/gpu/drm/xe/xe_tlb_inval_types.h | 5 ++++-
> > drivers/gpu/drm/xe/xe_vm.c | 4 ++--
> > 6 files changed, 35 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > index cd126c53faab..c05709a5bc98 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > @@ -34,9 +34,12 @@ static int send_tlb_inval(struct xe_guc *guc, const u32
> *action, int len)
> > G2H_LEN_DW_TLB_INVALIDATE, 1); }
> >
> > -#define MAKE_INVAL_OP(type) ((type <<
> XE_GUC_TLB_INVAL_TYPE_SHIFT) | \
> > +#define MAKE_INVAL_OP_FLUSH(type, flush_cache) ((type <<
> XE_GUC_TLB_INVAL_TYPE_SHIFT) | \
> > XE_GUC_TLB_INVAL_MODE_HEAVY <<
> XE_GUC_TLB_INVAL_MODE_SHIFT | \
> > - XE_GUC_TLB_INVAL_FLUSH_CACHE)
> > + (flush_cache ? \
> > + XE_GUC_TLB_INVAL_FLUSH_CACHE : 0))
> > +
> > +#define MAKE_INVAL_OP(type) MAKE_INVAL_OP_FLUSH(type, true)
> >
> > static int send_tlb_inval_all(struct xe_tlb_inval *tlb_inval, u32
> > seqno) { @@ -100,7 +103,7 @@ static int send_tlb_inval_ggtt(struct
> > xe_tlb_inval *tlb_inval, u32 seqno) #define
> > MAX_RANGE_TLB_INVALIDATION_LENGTH
> (rounddown_pow_of_two(ULONG_MAX))
> >
> > static int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
> > - u64 start, u64 end, u32 asid)
> > + u64 start, u64 end, u32 asid, bool flush_cache)
>
> Later in the series a drm_suballoc is passed in as an argument here.
> Isn't that enough to know if we need to flush the cache?
>
My original intention here was to account for the scenario where we may
not be using page reclaim, but some other feature is handling the HW
flush, such as the HW transient display/app flushing. However, it may
not be necessary anymore. I'll remove it for now.
> > {
> > #define MAX_TLB_INVALIDATION_LEN 7
> > struct xe_guc *guc = tlb_inval->private; @@ -154,7 +157,7 @@ static
> > int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
> > ilog2(SZ_2M) + 1)));
> > xe_gt_assert(gt, IS_ALIGNED(start, length));
> >
> > - action[len++] =
> MAKE_INVAL_OP(XE_GUC_TLB_INVAL_PAGE_SELECTIVE);
> > + action[len++] =
> > +MAKE_INVAL_OP_FLUSH(XE_GUC_TLB_INVAL_PAGE_SELECTIVE, flush_cache);
> > action[len++] = asid;
> > action[len++] = lower_32_bits(start);
> > action[len++] = upper_32_bits(start); diff --git
> > a/drivers/gpu/drm/xe/xe_tlb_inval.c
> > b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > index 50f05d6b5672..de275759743c 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > @@ -324,10 +324,10 @@ int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval)
> > */
> > int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
> > struct xe_tlb_inval_fence *fence, u64 start, u64 end,
> > - u32 asid)
> > + u32 asid, bool flush_cache)
>
> Then here, later in the series PRL is attached to the fence but can we change that
> to an argument here?
>
Sure, I will pass the prl_sa here later in the series and do the allocation
in tlb_inval_job. I'll remove the flush_cache argument throughout and
treat the prl_sa as the primary indicator of whether to flush the PPC.
> > {
> > return xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->ppgtt,
> > - start, end, asid);
> > + start, end, asid, flush_cache);
> > }
> >
> > /**
> > @@ -343,7 +343,7 @@ void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval,
> struct xe_vm *vm)
> > u64 range = 1ull << vm->xe->info.va_bits;
> >
> > xe_tlb_inval_fence_init(tlb_inval, &fence, true);
> > - xe_tlb_inval_range(tlb_inval, &fence, 0, range, vm->usm.asid);
> > + xe_tlb_inval_range(tlb_inval, &fence, 0, range, vm->usm.asid, true);
> > xe_tlb_inval_fence_wait(&fence);
> > }
> >
> > @@ -420,6 +420,20 @@ static const struct dma_fence_ops inval_fence_ops = {
> > .get_timeline_name = xe_inval_fence_get_timeline_name, };
> >
> > +/**
> > + * xe_tlb_inval_fence_flush_cache - Control PPC flush at invalidation
> > + * @fence: TLB inval fence
> > + * @flush_cache: whether to perform PPC cache flush
> > + *
> > + * Helper function to modify the tlb_inval fence to control the PPC flush.
> > + * Other components shouldn't modify fence directly.
> > + */
> > +void xe_tlb_inval_fence_flush_cache(struct xe_tlb_inval_fence *fence,
> > + bool flush_cache)
> > +{
> > + fence->flush_cache = flush_cache;
> > +}
> > +
> > /**
> > * xe_tlb_inval_fence_init() - Initialize TLB invalidation fence
> > * @tlb_inval: TLB invalidation client @@ -446,4 +460,5 @@ void
> > xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
> > else
> > dma_fence_get(&fence->base);
> > fence->tlb_inval = tlb_inval;
> > + fence->flush_cache = true;
>
> I don't think we want PRL (later in the series) or flush_cache stored in the fence
> (i.e., don't modify the fence structure in this series) rather store the PRL in the job
> and pass into xe_tlb_inval_range as argument, NULL implictly implies flush the
> cache.
>
> Matt
>
Got it. I'll transition over to using the PRL pointer to indicate whether
to flush, and drop these flush_cache inputs here.
Brian
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.h
> > b/drivers/gpu/drm/xe/xe_tlb_inval.h
> > index 9dbddc310eb9..b84ce3e6f294 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval.h
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.h
> > @@ -24,8 +24,9 @@ int xe_tlb_inval_ggtt(struct xe_tlb_inval
> > *tlb_inval); void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval,
> > struct xe_vm *vm); int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
> > struct xe_tlb_inval_fence *fence,
> > - u64 start, u64 end, u32 asid);
> > -
> > + u64 start, u64 end, u32 asid, bool flush_cache); void
> > +xe_tlb_inval_fence_flush_cache(struct xe_tlb_inval_fence *fence,
> > + bool flush_cache);
> > void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
> > struct xe_tlb_inval_fence *fence,
> > bool stack);
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> > b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> > index 1ae0dec2cf31..6248f90323a9 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval_job.c
> > @@ -49,7 +49,7 @@ static struct dma_fence *xe_tlb_inval_job_run(struct
> xe_dep_job *dep_job)
> > container_of(job->fence, typeof(*ifence), base);
> >
> > xe_tlb_inval_range(job->tlb_inval, ifence, job->start,
> > - job->end, job->vm->usm.asid);
> > + job->end, job->vm->usm.asid, ifence->flush_cache);
> >
> > return job->fence;
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > index 7a6967ce3b76..c3c3943fb07e 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > @@ -40,12 +40,13 @@ struct xe_tlb_inval_ops {
> > * @start: Start address
> > * @end: End address
> > * @asid: Address space ID
> > + * @flush_cache: PPC flush control
> > *
> > * Return 0 on success, -ECANCELED if backend is mid-reset, error on
> > * failure
> > */
> > int (*ppgtt)(struct xe_tlb_inval *tlb_inval, u32 seqno, u64 start,
> > - u64 end, u32 asid);
> > + u64 end, u32 asid, bool flush_cache);
> >
> > /**
> > * @initialized: Backend is initialized @@ -126,6 +127,8 @@ struct
> > xe_tlb_inval_fence {
> > int seqno;
> > /** @inval_time: time of TLB invalidation */
> > ktime_t inval_time;
> > + /** @flush_cache: bool for PPC flush, default is true */
> > + bool flush_cache;
> > };
> >
> > #endif
> > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > index 7cac646bdf1c..5fb5226574c5 100644
> > --- a/drivers/gpu/drm/xe/xe_vm.c
> > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > @@ -3907,7 +3907,7 @@ int xe_vm_range_tilemask_tlb_inval(struct xe_vm
> > *vm, u64 start,
> >
> > err = xe_tlb_inval_range(&tile->primary_gt->tlb_inval,
> > &fence[fence_id], start, end,
> > - vm->usm.asid);
> > + vm->usm.asid, true);
> > if (err)
> > goto wait;
> > ++fence_id;
> > @@ -3920,7 +3920,7 @@ int xe_vm_range_tilemask_tlb_inval(struct xe_vm
> > *vm, u64 start,
> >
> > err = xe_tlb_inval_range(&tile->media_gt->tlb_inval,
> > &fence[fence_id], start, end,
> > - vm->usm.asid);
> > + vm->usm.asid, true);
> > if (err)
> > goto wait;
> > ++fence_id;
> > --
> > 2.51.2
> >
^ permalink raw reply [flat|nested] 51+ messages in thread
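Matt's suggestion in this thread — drop the explicit flush_cache plumbing and let a NULL PRL implicitly mean "do the full PPC flush" — boils down to deriving the FLUSH_CACHE bit of the invalidation opcode from whether a reclaim list is attached. A hedged sketch of that encoding; the bit positions and names here are illustrative, not the real GuC ABI values:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative encodings; the real values live in the GuC ABI headers. */
#define INVAL_TYPE_SHIFT  24
#define INVAL_MODE_HEAVY  (1u << 16)
#define INVAL_FLUSH_CACHE (1u << 0)

struct page_reclaim_list; /* opaque here; contents don't matter */

/*
 * NULL prl => no selective reclaim available => request the full PPC
 * flush alongside the TLB invalidation. Non-NULL prl => hardware will
 * evict the listed pages selectively, so the full flush is skipped.
 */
static uint32_t make_inval_op(uint32_t type,
			      const struct page_reclaim_list *prl)
{
	return (type << INVAL_TYPE_SHIFT) | INVAL_MODE_HEAVY |
	       (prl ? 0 : INVAL_FLUSH_CACHE);
}
```

This keeps a single source of truth (the PRL pointer) instead of carrying a separate boolean through every layer, which is what the series moved to after this review.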
* RE: [PATCH 05/11] drm/xe/guc: Add page reclamation interface to GuC
2025-11-22 18:39 ` Matthew Brost
@ 2025-11-25 11:13 ` Nguyen, Brian3
0 siblings, 0 replies; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-25 11:13 UTC (permalink / raw)
To: Brost, Matthew
Cc: Lin, Shuicheng, intel-xe@lists.freedesktop.org, Upadhyay, Tejas,
Summers, Stuart
On Saturday, November 22, 2025 10:40 AM, Brost, Matthew wrote:
> On Fri, Nov 21, 2025 at 06:56:27PM -0700, Nguyen, Brian3 wrote:
> >
> > On Friday, November 21, 2025 10:33 AM Lin, Shuicheng wrote:
> > > On Tue, Nov 18, 2025 1:06 AM Brian3 Nguyen wrote:
> > > > Add page reclamation related changes to GuC interface, handlers,
> > > > and senders to support page reclamation.
> > > >
> > > > Currently TLB invalidations perform a full PPC flush in order
> > > > to prevent stale memory accesses to noncoherent system memory.
> > > > Page reclamation is an extension of the typical TLB invalidation
> > > > workflow, allowing the full PPC flush to be disabled in favor of
> > > > selective PPC flushing. Selective flushing is driven by a list of
> > > > pages whose addresses are passed to the GuC at the time of the
> > > > action.
> > > >
> > > > Page reclamation interfaces require at least GuC FW ver 70.31.0.
> > >
> > > Should driver disable this feature if the running FW is < 70.31.0?
> >
The default FW version is above this at the time of patchset submission,
so I had assumed it would not be a problem, since the danger is a user
forcibly using a bad FW, which already has unpredictable results.
> >
> > However, in hindsight, it is easy enough to skip if FW version is
> > lower, and we can safely fallback to default TLB invalidation, so I'll
> > proceed with adding a check within the later patches that'll disable
> > page reclamation within the xe_guc_tlb_inval.c unless there are any
> > objections.
> >
>
> I would just flip 'xe->info.has_page_reclaim_hw_assist' to false very early in driver
> load once we have the GuC version if GuC version doesn't support it.
>
Makes much more sense to do... will add in xe_guc_init for now. Thanks!
> > > What will happen if driver send this action while GuC doesn't support it yet?
> > >
> > > Shuicheng
> > >
> >
AFAIK, if the action is sent before the correct FW version, it'll report
GUC_HXG_TYPE_RESPONSE_FAILURE in G2H due to an illegal operation,
eventually triggering a reset.
> >
> > Brian
> >
> > > >
> > > > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > > > ---
> > > > drivers/gpu/drm/xe/abi/guc_actions_abi.h | 2 ++
> > > > drivers/gpu/drm/xe/xe_guc_ct.c | 4 ++++
> > > > drivers/gpu/drm/xe/xe_guc_fwif.h | 1 +
> > > > drivers/gpu/drm/xe/xe_guc_tlb_inval.c | 14 ++++++++++++++
> > > > 4 files changed, 21 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > > > b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > > > index 47756e4674a1..11de3bdf69b5 100644
> > > > --- a/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > > > +++ b/drivers/gpu/drm/xe/abi/guc_actions_abi.h
> > > > @@ -151,6 +151,8 @@ enum xe_guc_action {
> > > > XE_GUC_ACTION_TLB_INVALIDATION = 0x7000,
> > > > XE_GUC_ACTION_TLB_INVALIDATION_DONE = 0x7001,
> > > > XE_GUC_ACTION_TLB_INVALIDATION_ALL = 0x7002,
> > > > + XE_GUC_ACTION_PAGE_RECLAMATION = 0x7003,
> > > > + XE_GUC_ACTION_PAGE_RECLAMATION_DONE = 0x7004,
> > > > XE_GUC_ACTION_STATE_CAPTURE_NOTIFICATION = 0x8002,
> > > > XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE = 0x8003,
> > > > XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED = 0x8004, diff --git
> > > > a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> > > > index
> > > > 2697d711adb2..e13704e61032 100644
> > > > --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> > > > +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> > > > @@ -1311,6 +1311,7 @@ static int parse_g2h_event(struct xe_guc_ct
> > > > *ct,
> > > > u32 *msg, u32 len)
> > > > case XE_GUC_ACTION_DEREGISTER_CONTEXT_DONE:
> > > > case XE_GUC_ACTION_SCHED_ENGINE_MODE_DONE:
> > > > case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > > > + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > > > g2h_release_space(ct, len);
> > > > }
> > > >
> > > > @@ -1546,6 +1547,7 @@ static int process_g2h_msg(struct xe_guc_ct
> > > > *ct,
> > > > u32 *msg, u32 len)
> > > > ret = xe_guc_pagefault_handler(guc, payload, adj_len);
> > > > break;
> > > > case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > > > + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > > > ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
>
> I get what is happening here - page reclaim G2H just uses a shared seqno with TLB
> invalidations - but this looks very odd to a reader of the code who hasn't worked
> on this and doesn't understand why a shared handler is called for both G2H
> messages. Can you add a comment here explaining why it is ok to use a single G2H
> handler for both TLB invalidations and page reclaim?
>
> Matt
>
Sure! Will add a comment here explaining the relationship between the two.
Brian
> > > > break;
> > > > case XE_GUC_ACTION_GUC2PF_RELAY_FROM_VF:
> > > > @@ -1711,6 +1713,7 @@ static int g2h_read(struct xe_guc_ct *ct,
> > > > u32 *msg, bool fast_path)
> > > > switch (action) {
> > > > case XE_GUC_ACTION_REPORT_PAGE_FAULT_REQ_DESC:
> > > > case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > > > + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > > > break; /* Process these in fast-path */
> > > > default:
> > > > return 0;
> > > > @@ -1747,6 +1750,7 @@ static void g2h_fast_path(struct xe_guc_ct
> > > > *ct,
> > > > u32 *msg, u32 len)
> > > > ret = xe_guc_pagefault_handler(guc, payload, adj_len);
> > > > break;
> > > > case XE_GUC_ACTION_TLB_INVALIDATION_DONE:
> > > > + case XE_GUC_ACTION_PAGE_RECLAMATION_DONE:
> > > > __g2h_release_space(ct, len);
> > > > ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
> > > > break;
> > > > diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > > b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > > index c90dd266e9cf..34d74a71c4f0 100644
> > > > --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > > +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> > > > @@ -16,6 +16,7 @@
> > > > #define G2H_LEN_DW_DEREGISTER_CONTEXT 3
> > > > #define G2H_LEN_DW_TLB_INVALIDATE 3
> > > > #define G2H_LEN_DW_G2G_NOTIFY_MIN 3
> > > > +#define G2H_LEN_DW_PAGE_RECLAMATION 3
> > > >
> > > > #define GUC_ID_MAX 65535
> > > > #define GUC_ID_UNKNOWN 0xffffffff
> > > > diff --git a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > > b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > > index c05709a5bc98..3185f8dc00c4 100644
> > > > --- a/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > > +++ b/drivers/gpu/drm/xe/xe_guc_tlb_inval.c
> > > > @@ -95,6 +95,20 @@ static int send_tlb_inval_ggtt(struct
> > > > xe_tlb_inval *tlb_inval, u32 seqno)
> > > > return -ECANCELED;
> > > > }
> > > >
> > > > +static int send_page_reclaim(struct xe_guc *guc, u32 seqno,
> > > > + u64 gpu_addr)
> > > > +{
> > > > + u32 action[] = {
> > > > + XE_GUC_ACTION_PAGE_RECLAMATION,
> > > > + seqno,
> > > > + lower_32_bits(gpu_addr),
> > > > + upper_32_bits(gpu_addr),
> > > > + };
> > > > +
> > > > + return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action),
> > > > + G2H_LEN_DW_PAGE_RECLAMATION, 1); }
> > > > +
> > > > /*
> > > > * Ensure that roundup_pow_of_two(length) doesn't overflow.
> > > > * Note that roundup_pow_of_two() operates on unsigned long,
> > > > --
> > > > 2.51.2
> >
^ permalink raw reply [flat|nested] 51+ messages in thread
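The send_page_reclaim() helper quoted above packs a 64-bit PRL GGTT address into two dwords after the action opcode and seqno. A small standalone sketch of that packing — the opcode value comes from the patch; the lower/upper helpers mirror the kernel idioms, and the function name here is illustrative:

```c
#include <assert.h>
#include <stdint.h>

#define XE_GUC_ACTION_PAGE_RECLAMATION 0x7003u

/* Userspace stand-ins for the kernel's lower_32_bits/upper_32_bits. */
#define lower_32_bits(v) ((uint32_t)((v) & 0xffffffffu))
#define upper_32_bits(v) ((uint32_t)((v) >> 32))

/* Fill the 4-dword H2G message the way send_page_reclaim() does. */
static void build_page_reclaim_action(uint32_t action[4], uint32_t seqno,
				      uint64_t gpu_addr)
{
	action[0] = XE_GUC_ACTION_PAGE_RECLAMATION;
	action[1] = seqno;
	action[2] = lower_32_bits(gpu_addr);
	action[3] = upper_32_bits(gpu_addr);
}
```

The seqno in dword 1 is what lets the PAGE_RECLAMATION_DONE G2H reuse the TLB invalidation done handler, as discussed in the thread.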
* RE: [PATCH 06/11] drm/xe: Create page reclaim list on unbind
2025-11-22 19:18 ` Matthew Brost
@ 2025-11-25 11:18 ` Nguyen, Brian3
2025-11-25 18:34 ` Matthew Brost
0 siblings, 1 reply; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-25 11:18 UTC (permalink / raw)
To: Brost, Matthew
Cc: intel-xe@lists.freedesktop.org, Upadhyay, Tejas, Lin, Shuicheng,
Summers, Stuart
On Saturday, November 22, 2025 11:18 AM, Matthew Brost wrote:
> On Tue, Nov 18, 2025 at 05:05:47PM +0800, Brian Nguyen wrote:
> > The page reclaim list (PRL) is preparation work for the page reclaim feature.
> > The PRL is initially owned by pt_update_ops, and all other page reclaim
> > operations point back to this PRL. The PRL's entries are generated
> > during the unbind page walk.
> >
> > This PRL is restricted to a 4K page, so 512 page entries at most.
> >
> > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > ---
> > drivers/gpu/drm/xe/Makefile | 1 +
> > drivers/gpu/drm/xe/regs/xe_gtt_defs.h | 1 +
> > drivers/gpu/drm/xe/xe_page_reclaim.c | 52 ++++++++++++
> > drivers/gpu/drm/xe/xe_page_reclaim.h | 49 ++++++++++++
> > drivers/gpu/drm/xe/xe_pt.c | 109 ++++++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_pt_types.h | 5 ++
> > 6 files changed, 217 insertions(+)
> > create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.c
> > create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.h
> >
> > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > index e4b273b025d2..048e6c93271c 100644
> > --- a/drivers/gpu/drm/xe/Makefile
> > +++ b/drivers/gpu/drm/xe/Makefile
> > @@ -95,6 +95,7 @@ xe-y += xe_bb.o \
> > xe_oa.o \
> > xe_observation.o \
> > xe_pagefault.o \
> > + xe_page_reclaim.o \
> > xe_pat.o \
> > xe_pci.o \
> > xe_pcode.o \
> > diff --git a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > index 4389e5a76f89..4d83461e538b 100644
> > --- a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > +++ b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > @@ -9,6 +9,7 @@
> > #define XELPG_GGTT_PTE_PAT0 BIT_ULL(52)
> > #define XELPG_GGTT_PTE_PAT1 BIT_ULL(53)
> >
> > +#define XE_PTE_ADDR_MASK GENMASK_ULL(51, 12)
> > #define GGTT_PTE_VFID GENMASK_ULL(11, 2)
> >
> > #define GUC_GGTT_TOP 0xFEE00000
> > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > new file mode 100644
> > index 000000000000..a0d15efff58c
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > @@ -0,0 +1,52 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright (c) 2025 Intel Corporation */
> > +
> > +#include <linux/bitfield.h>
> > +#include <linux/kref.h>
> > +#include <linux/mm.h>
> > +#include <linux/slab.h>
> > +
> > +#include "xe_page_reclaim.h"
> > +
> > +#include "regs/xe_gt_regs.h"
> > +#include "xe_assert.h"
> > +#include "xe_macros.h"
> > +
> > +/**
> > + * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
> > + * @prl: Page reclaim list to reset
> > + *
> > + * Clears the entries pointer and marks the list as invalid so
> > + * future users know the PRL is unusable. It is expected that the
> > + * entries have already been released.
> > + */
> > +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl)
> > +{
> > +	prl->entries = NULL;
> > +	prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> > +}
> > +
> > +/**
> > + * xe_page_reclaim_list_alloc_entries() - Allocate page reclaim list entries
> > + * @prl: Page reclaim list to allocate entries for
> > + *
> > + * Allocate one 4K page for the PRL entries. On failure, prl->entries is
> > + * left NULL and -ENOMEM is returned.
> > + */
> > +int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl)
> > +{
> > + struct page *page;
> > +
> > + XE_WARN_ON(prl->entries != NULL);
> > + if (prl->entries)
> > + return 0;
> > +
> > + page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> > + if (page) {
> > + prl->entries = page_address(page);
> > + prl->num_entries = 0;
> > + }
> > +
> > + return page ? 0 : -ENOMEM;
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h
> > b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > new file mode 100644
> > index 000000000000..d066d7d97f79
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > @@ -0,0 +1,49 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright (c) 2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_PAGE_RECLAIM_H_
> > +#define _XE_PAGE_RECLAIM_H_
> > +
> > +#include <linux/kref.h>
> > +#include <linux/mm.h>
> > +#include <linux/slab.h>
> > +#include <linux/types.h>
> > +#include <linux/workqueue.h>
> > +
> > +#define XE_PAGE_RECLAIM_MAX_ENTRIES 512
> > +#define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
> > +
> > +struct xe_guc_page_reclaim_entry {
> > + u32 valid:1;
> > + u32 reclamation_size:6;
> > + u32 reserved:5;
> > + u32 address_lo:20;
> > + u32 address_hi:20;
> > + u32 reserved1:12;
>
> This is a wire interface with the GuC. Bitfields can vary based on the endianness
> of the CPU. I know this is an iGPU feature for now, but it could possibly change in
> the future; with that in mind, to future-proof, can the layout of this be set up via
> defines / macros?
>
Sure, I moved over to the typical FIELD_PREP/GENMASK macros used elsewhere
for the GuC interfaces.
> > +} __packed;
> > +
> > +struct xe_page_reclaim_list {
> > + /** @entries: array of page reclaim entries, page allocated */
> > + struct xe_guc_page_reclaim_entry *entries;
> > + /** @num_entries: number of entries */
> > + int num_entries;
> > +#define XE_PAGE_RECLAIM_INVALID_LIST -1
> > +};
> > +
> > +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
> > +int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
> > +
> > +static inline void xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries)
> > +{
> > +	if (entries)
> > +		get_page(virt_to_page(entries));
> > +}
> > +
> > +static inline void xe_page_reclaim_entries_put(struct xe_guc_page_reclaim_entry *entries)
> > +{
> > + if (entries)
> > + put_page(virt_to_page(entries));
> > +}
>
> Kernel doc for static inlines.
>
Added.
> > +
> > +#endif /* _XE_PAGE_RECLAIM_H_ */
> > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > index 884127b4d97d..532a047676d4 100644
> > --- a/drivers/gpu/drm/xe/xe_pt.c
> > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > @@ -12,6 +12,7 @@
> > #include "xe_exec_queue.h"
> > #include "xe_gt.h"
> > #include "xe_migrate.h"
> > +#include "xe_page_reclaim.h"
> > #include "xe_pt_types.h"
> > #include "xe_pt_walk.h"
> > #include "xe_res_cursor.h"
> > @@ -1538,6 +1539,9 @@ struct xe_pt_stage_unbind_walk {
> > /* Output */
> > /* @wupd: Structure to track the page-table updates we're building */
> > struct xe_walk_update wupd;
> > +
> > + /** @prl: Backing pointer to page reclaim list in pt_update_ops */
> > + struct xe_page_reclaim_list *prl;
> > };
> >
> > /*
> > @@ -1572,6 +1576,69 @@ static bool xe_pt_check_kill(u64 addr, u64 next,
> unsigned int level,
> > return false;
> > }
> >
> > +/* Huge 2MB leaf lives directly in a level-1 table and has no children */
> > +static bool is_large_pte(struct xe_pt *pte)
> > +{
> > +	return pte->level == 1 && !pte->base.children;
> > +}
> > +
> > +/* page_size = 2^(reclamation_size + 12) */
> > +#define COMPUTE_RECLAIM_ADDRESS_MASK(page_size) \
> > +({ \
> > + BUILD_BUG_ON(!__builtin_constant_p(page_size)); \
> > + ilog2(page_size) - 12; \
>
> s/12/XE_PTE_SHIFT ?
>
Done.
> > +})
> > +
> > +static void generate_reclaim_entry(struct xe_tile *tile,
> > + struct xe_page_reclaim_list *prl,
> > + u64 pte,
> > + struct xe_pt *xe_child)
>
> Nit, xe_pt can be on the same line as 'u64 pte'.
>
Done.
> > +{
> > + struct xe_guc_page_reclaim_entry *reclaim_entries = prl->entries;
> > + u64 phys_addr = pte & XE_PTE_ADDR_MASK;
> > + const u64 field_mask = GENMASK_ULL(19, 0);
> > + u32 reclamation_size;
>
> Nit, I'd make the last variable declared on the stack for readability.
>
Ahh got it, reclamation_size moved to after num_entries.
> > + const uint max_entries = XE_PAGE_RECLAIM_MAX_ENTRIES;
> > + int num_entries = prl->num_entries;
> > +
> > + xe_tile_assert(tile, xe_child->level <= MAX_HUGEPTE_LEVEL);
> > + xe_tile_assert(tile, reclaim_entries);
> > +
> > + if (num_entries == XE_PAGE_RECLAIM_INVALID_LIST)
> > + return;
> > +
> > + /* Overflow: mark as invalid through num_entries */
> > + if (num_entries >= max_entries) {
> > + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> > + return;
> > + }
> > +
> > + /**
> > + * reclamation_size indicates the size of the page to be
> > + * invalidated and flushed from non-coherent cache.
> > + * Page size is computed as 2^(reclamation_size+12) bytes.
> > + * Only valid for these specific levels.
> > + */
> > +
> > + if (xe_child->level == 0 && !(pte & XE_PTE_PS64))
> > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_4K); /* reclamation_size = 0 */
> > + else if (xe_child->level == 0)
> > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_64K); /* reclamation_size = 1 */
> > + else if (is_large_pte(xe_child))
> > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_2M); /* reclamation_size = 2 */
>
> What happens if we have 1G page? That doesn't seem to be handled.
>
The page reclamation hardware does not support 1G pages; those should be
handled by falling back to the standard TLB invalidation with a full PPC
flush. The PRL format only supports reclaiming 4K, 64K, and 2M pages, so
I'll add a comment here mentioning that HW support is limited to those
page sizes and rename is_large_pte to is_2m_pte.
> > + else
> > + return;
> > +
> > + reclaim_entries[num_entries].valid = 1;
> > + reclaim_entries[num_entries].reclamation_size =
> > + reclamation_size;
> > + reclaim_entries[num_entries].address_lo =
> > + FIELD_GET(field_mask, phys_addr);
> > + reclaim_entries[num_entries].address_hi =
> > + FIELD_GET(field_mask, phys_addr >> 20);
>
> As suggested above, use macros/defines here to setup the entry.
>
Got it, moved over to using other standard define macros.
> > + prl->num_entries++;
> > +}
> > +
> > static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> > unsigned int level, u64 addr, u64 next,
> > struct xe_ptw **child,
> > @@ -1579,10 +1646,27 @@ static int xe_pt_stage_unbind_entry(struct xe_ptw
> *parent, pgoff_t offset,
> > struct xe_pt_walk *walk)
> > {
> > struct xe_pt *xe_child = container_of(*child, typeof(*xe_child),
> > base);
> > + struct xe_pt_stage_unbind_walk *xe_walk =
> > + container_of(walk, typeof(*xe_walk), base);
> > + struct xe_device *xe = tile_to_xe(xe_walk->tile);
> >
> > XE_WARN_ON(!*child);
> > XE_WARN_ON(!level);
> >
> > + /* 4K and 64K Pages are level 0, large pte needs additional handling. */
> > +	if (xe_walk->prl && (xe_child->level == 0 || is_large_pte(xe_child))) {
>
> And also here? 1G pages are unhandled? Please explain.
>
As stated above, page reclamation only supports 4K, 64K, and 2M pages.
1G page will have to fallback to the standard tlb invalidation with PPC flush.
> > + struct iosys_map *leaf_map = &xe_child->bo->vmap;
> > + pgoff_t first = xe_pt_offset(addr, 0, walk);
> > + pgoff_t count = xe_pt_num_entries(addr, next, 0, walk);
> > +
> > + for (pgoff_t i = 0; i < count; i++) {
> > + u64 pte = xe_map_rd(xe, leaf_map, (first + i) * sizeof(u64),
> u64);
> > +
> > + generate_reclaim_entry(xe_walk->tile, xe_walk->prl,
> > + pte, xe_child);
> > + }
> > + }
> > +
> > xe_pt_check_kill(addr, next, level - 1, xe_child, action, walk);
> >
> > return 0;
> > @@ -1654,6 +1738,8 @@ static unsigned int xe_pt_stage_unbind(struct
> > xe_tile *tile, {
> > u64 start = range ? xe_svm_range_start(range) : xe_vma_start(vma);
> > u64 end = range ? xe_svm_range_end(range) : xe_vma_end(vma);
> > + struct xe_vm_pgtable_update_op *pt_update_op =
> > + container_of(entries, struct xe_vm_pgtable_update_op,
> entries[0]);
> > struct xe_pt_stage_unbind_walk xe_walk = {
> > .base = {
> > .ops = &xe_pt_stage_unbind_ops,
> > @@ -1665,6 +1751,7 @@ static unsigned int xe_pt_stage_unbind(struct xe_tile
> *tile,
> > .modified_start = start,
> > .modified_end = end,
> > .wupd.entries = entries,
> > + .prl = pt_update_op->prl,
> > };
> > struct xe_pt *pt = vm->pt_root[tile->id];
> >
> > @@ -1897,6 +1984,7 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > struct xe_vm_pgtable_update_ops *pt_update_ops,
> > struct xe_vma *vma)
> > {
> > + struct xe_device *xe = tile_to_xe(tile);
> > u32 current_op = pt_update_ops->current_op;
> > struct xe_vm_pgtable_update_op *pt_op = &pt_update_ops-
> >ops[current_op];
> > int err;
> > @@ -1914,6 +2002,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > pt_op->vma = vma;
> > pt_op->bind = false;
> > pt_op->rebind = false;
> > +	/* Maintain one PRL located in pt_update_ops that all others in unbind op reference */
> > + if (xe->info.has_page_reclaim_hw_assist && !pt_update_ops->prl.entries) {
> > + err = xe_page_reclaim_list_alloc_entries(&pt_update_ops->prl);
> > + if (err < 0)
> > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
>
> I don't think you need to call xe_page_reclaim_list_invalidate, right?
> If xe_page_reclaim_list_alloc_entries fails the prl should be in the init state.
>
Yes. I'll drop this call for now then.
> > + }
> > +	pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl : NULL;
> >
> > err = vma_reserve_fences(tile_to_xe(tile), vma);
> > if (err)
> > @@ -1921,6 +2016,13 @@ static int unbind_op_prepare(struct xe_tile
> > *tile,
> >
> > pt_op->num_entries = xe_pt_stage_unbind(tile, xe_vma_vm(vma),
> > vma, NULL, pt_op->entries);
> > + /* Free PRL if list declared as invalid */
> > + if (pt_update_ops->prl.entries &&
> > + pt_update_ops->prl.num_entries == XE_PAGE_RECLAIM_INVALID_LIST) {
> > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > + pt_op->prl = NULL;
> > + pt_update_ops->prl.entries = NULL;
>
> Call xe_page_reclaim_list_invalidate for clarity?
>
Updated.
> > + }
> >
> > xe_vm_dbg_print_entries(tile_to_xe(tile), pt_op->entries,
> > pt_op->num_entries, false);
> > @@ -1979,6 +2081,7 @@ static int unbind_range_prepare(struct xe_vm *vm,
> > pt_op->vma = XE_INVALID_VMA;
> > pt_op->bind = false;
> > pt_op->rebind = false;
> > + pt_op->prl = NULL;
> >
> > pt_op->num_entries = xe_pt_stage_unbind(tile, vm, NULL, range,
> > pt_op->entries);
> > @@ -2096,6 +2199,7 @@ xe_pt_update_ops_init(struct
> xe_vm_pgtable_update_ops *pt_update_ops)
> > init_llist_head(&pt_update_ops->deferred);
> > pt_update_ops->start = ~0x0ull;
> > pt_update_ops->last = 0x0ull;
> > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
>
> Can we introduce a function called xe_page_reclaim_list_init for clarity? It might
> do the same thing as xe_page_reclaim_list_invalidate but it would make this a
> little more clear. Likewise later in the series when a job is created, you can call
> xe_page_reclaim_list_init there too.
>
Sure, I'll write another helper for this and modify both those PRL creation points.
> > }
> >
> > /**
> > @@ -2518,6 +2622,11 @@ void xe_pt_update_ops_fini(struct xe_tile *tile, struct xe_vma_ops *vops)
> > &vops->pt_update_ops[tile->id];
> > int i;
> >
> > + if (pt_update_ops->prl.entries) {
> > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > + }
> > +
> > lockdep_assert_held(&vops->vm->lock);
> > xe_vm_assert_held(vops->vm);
> >
> > diff --git a/drivers/gpu/drm/xe/xe_pt_types.h
> > b/drivers/gpu/drm/xe/xe_pt_types.h
> > index 881f01e14db8..26e5295f118e 100644
> > --- a/drivers/gpu/drm/xe/xe_pt_types.h
> > +++ b/drivers/gpu/drm/xe/xe_pt_types.h
> > @@ -8,6 +8,7 @@
> >
> > #include <linux/types.h>
> >
> > +#include "xe_page_reclaim.h"
> > #include "xe_pt_walk.h"
> >
> > struct xe_bo;
> > @@ -85,6 +86,8 @@ struct xe_vm_pgtable_update_op {
> > bool bind;
> > /** @rebind: is a rebind */
> > bool rebind;
> > + /** @prl: Backing pointer to page reclaim list of pt_update_ops */
> > + struct xe_page_reclaim_list *prl;
>
> Can you move this above the bools in the layout of xe_vm_pgtable_update_op,
> likely just below "struct xe_vma".
>
Ahh got it. Moved.
> > };
> >
> > /** struct xe_vm_pgtable_update_ops: page table update operations */
> > @@ -119,6 +122,8 @@ struct xe_vm_pgtable_update_ops {
> > * slots are idle.
> > */
> > bool wait_vm_kernel;
> > + /** @prl: embedded page reclaim list */
> > + struct xe_page_reclaim_list prl;
>
> Same thing here, move just below "struct xe_exec_queue".
>
> Matt
>
Moved.
Brian
> > };
> >
> > #endif
> > --
> > 2.51.2
> >
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH 07/11] drm/xe: Suballocate BO for page reclaim
2025-11-22 19:42 ` Matthew Brost
@ 2025-11-25 11:20 ` Nguyen, Brian3
0 siblings, 0 replies; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-25 11:20 UTC (permalink / raw)
To: Brost, Matthew
Cc: intel-xe@lists.freedesktop.org, Upadhyay, Tejas, Lin, Shuicheng,
Summers, Stuart
On Saturday, November 22, 2025 11:43 AM, Matthew Brost wrote:
> On Tue, Nov 18, 2025 at 05:05:48PM +0800, Brian Nguyen wrote:
> > The page reclamation feature needs the PRL to be suballocated into a
> > GGTT-mapped BO. On allocation failure, fall back to the default TLB
> > invalidation with a full PPC flush.
> >
> > The PRL's BO allocation is managed in a separate pool to ensure the 4K
> > alignment required for a proper GGTT address.
> >
> > With the BO in place, pass it into the TLB invalidation backend and
> > modify the fence to accommodate it accordingly.
> >
> > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > Suggested-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_device_types.h | 7 ++++++
> > drivers/gpu/drm/xe/xe_page_reclaim.c | 33 +++++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_page_reclaim.h | 4 +++
> > drivers/gpu/drm/xe/xe_tile.c | 5 ++++
> > drivers/gpu/drm/xe/xe_tlb_inval.c | 18 ++++++++++++--
> > drivers/gpu/drm/xe/xe_tlb_inval_types.h | 5 ++++
> > 6 files changed, 70 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_device_types.h
> > b/drivers/gpu/drm/xe/xe_device_types.h
> > index 268c8e28601a..057df3f9dc1d 100644
> > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > @@ -184,6 +184,13 @@ struct xe_tile {
> > * Media GT shares a pool with its primary GT.
> > */
> > struct xe_sa_manager *kernel_bb_pool;
> > +
> > + /**
> > + * @mem.reclaim_pool: Pool for PRLs allocated.
> > + *
> > + * Only main GT has page reclaim list allocations.
> > + */
> > + struct xe_sa_manager *reclaim_pool;
> > } mem;
> >
> > 	/** @sriov: tile level virtualization data */
> > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > index a0d15efff58c..801a7f1731c0 100644
> > --- a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > @@ -13,6 +13,39 @@
> > #include "regs/xe_gt_regs.h"
> > #include "xe_assert.h"
> > #include "xe_macros.h"
> > +#include "xe_sa.h"
> > +#include "xe_tlb_inval_types.h"
> > +
> > +/**
> > + * xe_page_reclaim_create_prl_bo() - Back a PRL with a suballocated GGTT BO
> > + * @tlb_inval: TLB invalidation frontend associated with the request
> > + * @fence: Fence carrying the PRL metadata
> > + *
> > + * Suballocates a 4K BO out of the tile reclaim pool, copies the PRL CPU
> > + * copy into the BO, and queues the buffer for release when @fence signals.
> > + *
> > + * Return: 0 on success or -ENOMEM if the suballocation fails.
> > + */
> > +int xe_page_reclaim_create_prl_bo(struct xe_tlb_inval *tlb_inval,
> > +				  struct xe_tlb_inval_fence *fence)
>
> As discussed here [1], let's try to avoid storing anything related to the PRL in
> "struct xe_tlb_inval_fence". So I think reclaim_entries + the number of entries
> should be arguments to this function, which should return a "struct
> drm_suballoc *" or an ERR_PTR here.
>
> [1] https://patchwork.freedesktop.org/patch/689042/?series=157698&rev=1#comment_1267062
>
Got it. Will remove the PRL from the tlb fence and adjust funcs accordingly.
> > +{
> > + struct xe_gt *gt = container_of(tlb_inval, struct xe_gt, tlb_inval);
> > + struct xe_tile *tile = gt_to_tile(gt);
> > +
> > + /* Maximum size of PRL is 1 4K-page */
> > +	fence->prl_sa = __xe_sa_bo_new(tile->mem.reclaim_pool,
> > +				       XE_PAGE_RECLAIM_LIST_MAX_SIZE, GFP_ATOMIC);
>
> Any reason we can't pass in the number of entries for better suballocation? Or
> does PRL in GuC interface need to be page aligned?
>
I looked at the spec again: for the PRL, a fully zeroed page_reclaim_entry
is enough to indicate the end of the list, so I will adjust to a
(num_entries + 1) allocation.
The PRL address in the GuC interface must be 4K aligned, but I believe that
is taken care of by the BO manager of reclaim_pool.
> > + if (IS_ERR(fence->prl_sa))
> > + return -ENOMEM;
> > +
> > + memcpy(xe_sa_bo_cpu_addr(fence->prl_sa), fence->reclaim_entries,
> > + XE_PAGE_RECLAIM_LIST_MAX_SIZE);
>
> If we had the number of entries we could save a few instructions on the memory
> copy too.
>
Agreed, updating now to the same (num_entries + 1) change.
> > + xe_sa_bo_flush_write(fence->prl_sa);
> > +
> > + /* Queue up sa_bo_free on fence signal */
> > + xe_sa_bo_free(fence->prl_sa, &fence->base);
> > +
> > + return 0;
> > +}
> >
> > /**
> >   * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
> > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > index d066d7d97f79..f82b4d0865e0 100644
> > --- a/drivers/gpu/drm/xe/xe_page_reclaim.h
> > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > @@ -15,6 +15,9 @@
> > #define XE_PAGE_RECLAIM_MAX_ENTRIES 512
> > #define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
> >
> > +struct xe_tlb_inval;
> > +struct xe_tlb_inval_fence;
> > +
> > struct xe_guc_page_reclaim_entry {
> > u32 valid:1;
> > u32 reclamation_size:6;
> > @@ -32,6 +35,7 @@ struct xe_page_reclaim_list {
> > #define XE_PAGE_RECLAIM_INVALID_LIST -1
> > };
> >
> > +int xe_page_reclaim_create_prl_bo(struct xe_tlb_inval *tlb_inval,
> > +				  struct xe_tlb_inval_fence *fence);
> >  void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
> >  int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
> >  static inline void xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries)
> > diff --git a/drivers/gpu/drm/xe/xe_tile.c
> > b/drivers/gpu/drm/xe/xe_tile.c index 4f4f9a5c43af..63c060c2ea5c 100644
> > --- a/drivers/gpu/drm/xe/xe_tile.c
> > +++ b/drivers/gpu/drm/xe/xe_tile.c
> > @@ -209,6 +209,11 @@ int xe_tile_init(struct xe_tile *tile)
> > if (IS_ERR(tile->mem.kernel_bb_pool))
> > return PTR_ERR(tile->mem.kernel_bb_pool);
> >
> > + /* Optimistically anticipate at most 256 TLB fences with PRL */
> > + tile->mem.reclaim_pool = xe_sa_bo_manager_init(tile, SZ_1M, XE_PAGE_RECLAIM_LIST_MAX_SIZE);
> > + if (IS_ERR(tile->mem.reclaim_pool))
> > + return PTR_ERR(tile->mem.reclaim_pool);
> > +
> > return 0;
> > }
> > void xe_tile_migrate_wait(struct xe_tile *tile)
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > index de275759743c..67a047521165 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > @@ -15,6 +15,7 @@
> > #include "xe_guc_ct.h"
> > #include "xe_guc_tlb_inval.h"
> > #include "xe_mmio.h"
> > +#include "xe_page_reclaim.h"
> > #include "xe_pm.h"
> > #include "xe_tlb_inval.h"
> > #include "xe_trace.h"
> > @@ -326,8 +327,19 @@ int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
> > struct xe_tlb_inval_fence *fence, u64 start, u64 end,
> > u32 asid, bool flush_cache)
> > {
> > - return xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->ppgtt,
> > - start, end, asid, flush_cache);
> > + int err;
> > +
> > + if (fence->reclaim_entries) {
> > + err = xe_page_reclaim_create_prl_bo(tlb_inval, fence);
> > + if (err) {
> > + flush_cache = true;
> > + fence->prl_sa = NULL;
> > + }
> > + }
>
> Should we do the above step in run_job of the TLB invalidation job? I think that
> might be cleaner wrt to layering and make it clear only TLB invalidation jobs can
> use PRL. I don't see an easy way to implement non-job based TLB invalidations
> with a PRL as those are typically in the path of reclaim (no memory allocations).
>
Sure! I was previously considering other TLB invalidation flows but it does seem
like only the tlb_inval_jobs will make use of this feature. I'll move the PRL
allocations to the tlb_inval_job related functions.
> > + err = xe_tlb_inval_issue(tlb_inval, fence, tlb_inval->ops->ppgtt,
> > + start, end, asid, flush_cache);
> > +
> > + return err;
> > }
> >
> > /**
> > @@ -461,4 +473,6 @@ void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval,
> > dma_fence_get(&fence->base);
> > fence->tlb_inval = tlb_inval;
> > fence->flush_cache = true;
> > + fence->reclaim_entries = NULL;
> > + fence->prl_sa = NULL;
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > index c3c3943fb07e..7cf741e6a0c7 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > @@ -9,6 +9,7 @@
> > #include <linux/workqueue.h>
> > #include <linux/dma-fence.h>
> >
> > +struct xe_guc_page_reclaim_entry;
> > struct xe_tlb_inval;
> >
> > /** struct xe_tlb_inval_ops - TLB invalidation ops (backend) */ @@
> > -129,6 +130,10 @@ struct xe_tlb_inval_fence {
> > ktime_t inval_time;
> > /** @flush_cache: bool for PPC flush, default is true */
> > bool flush_cache;
> > + /** @reclaim_entries: list of pages to reclaim */
> > + struct xe_guc_page_reclaim_entry *reclaim_entries;
> > + /** @prl_sa: BO allocation for page reclaim list */
> > + struct drm_suballoc *prl_sa;
>
> Again, let's try to hard move all of these things out the fence (store them in the
> job if needed).
>
> Matt
>
Got it, as stated in above reply, I'll remove these modifications to the fence, and
continue accordingly.
Brian
> > };
> >
> > #endif
> > --
> > 2.51.2
> >
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH 08/11] drm/xe: Prep page reclaim in tlb inval job
2025-11-22 13:52 ` Michal Wajdeczko
@ 2025-11-25 11:20 ` Nguyen, Brian3
0 siblings, 0 replies; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-25 11:20 UTC (permalink / raw)
To: Wajdeczko, Michal, intel-xe@lists.freedesktop.org
Cc: Upadhyay, Tejas, Brost, Matthew, Lin, Shuicheng, Summers, Stuart
On Saturday, November 22, 2025 5:52 AM, Michal Wajdeczko wrote:
> On 11/18/2025 10:05 AM, Brian Nguyen wrote:
> > Use the page reclaim list as an indicator that a page reclaim action is
> > desired and pass it to the TLB inval fence to handle.
> >
> > The job needs to maintain its own embedded copy to ensure the PRL's
> > lifetime extends until the job has run.
> >
> > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > ---
>
> ...
>
> >
> > +/**
> > + * xe_tlb_inval_job_add_page_reclaim() - Embed PRL into a TLB job
> > + * @job: TLB invalidation job that may trigger reclamation
> > + * @prl: Page reclaim list populated during unbind
> > + *
> > + * Copies @prl into the job and takes an extra reference to the entry
> > +page so
> > + * ownership can transfer to the TLB fence when the job is pushed.
> > + */
> > +void xe_tlb_inval_job_add_page_reclaim(struct xe_tlb_inval_job *job,
> > +				       struct xe_page_reclaim_list *prl)
> > +{
> > + struct xe_device *xe = gt_to_xe(job->q->gt);
> > +
> > + WARN_ON(!xe->info.has_page_reclaim_hw_assist);
>
> you can use here debug-only:
>
> xe_gt_assert(job->q->gt, xe->info.has_page_reclaim_hw_assist);
>
> or if you want keep it in production builds:
>
> xe_gt_WARN_ON(...
>
I'll transition to xe_gt_WARN_ON usage, thanks! The WARN_ON here was used
mainly because of debugfs: if debugfs toggles this flag off after a PRL has
already been allocated but while the TLB invalidation job is still issuing,
we could potentially trigger this, which is why I want to keep the WARN_ON.
In practice this should never occur outside debugfs; it is just worthwhile
to warn users that page reclamation is still ongoing even after the flag
has been modified.
Brian
> > + job->prl = *prl;
> > + /* Pair with put after bo creation */
> > + xe_page_reclaim_entries_get(job->prl.entries);
> > +}
> > +
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH 11/11] drm/xe: Add debugfs support for page reclamation
2025-11-22 14:18 ` Michal Wajdeczko
@ 2025-11-25 11:21 ` Nguyen, Brian3
0 siblings, 0 replies; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-25 11:21 UTC (permalink / raw)
To: Wajdeczko, Michal, intel-xe@lists.freedesktop.org
Cc: Upadhyay, Tejas, Brost, Matthew, Lin, Shuicheng, Summers, Stuart
On Saturday, November 22, 2025 6:18 AM, Michal Wajdeczko wrote:
> On 11/18/2025 10:05 AM, Brian Nguyen wrote:
> > Allow runtime modification of the page reclamation feature through a
> > debugfs configuration. This parameter only takes effect if the platform
> > supports the page reclamation feature by default.
> >
> > Move xe_match_desc to a common header so debugfs can read the xe
> > driver's default device values for the current platform.
>
> this seems to be unnecessary, see below
>
> >
> > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_configfs.c | 11 +-------
> > drivers/gpu/drm/xe/xe_debugfs.c | 47 ++++++++++++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_device.c | 10 +++++++
> > drivers/gpu/drm/xe/xe_device.h | 2 ++
> > 4 files changed, 60 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_configfs.c
> > b/drivers/gpu/drm/xe/xe_configfs.c
> > index 9f6251b1008b..efc6d0690b27 100644
> > --- a/drivers/gpu/drm/xe/xe_configfs.c
> > +++ b/drivers/gpu/drm/xe/xe_configfs.c
> > @@ -15,6 +15,7 @@
> >
> > #include "instructions/xe_mi_commands.h"
> > #include "xe_configfs.h"
> > +#include "xe_device.h"
> > #include "xe_gt_types.h"
> > #include "xe_hw_engine_types.h"
> > #include "xe_module.h"
> > @@ -925,16 +926,6 @@ static const struct config_item_type
> xe_config_sriov_type = {
> > .ct_attrs = xe_config_sriov_attrs,
> > };
> >
> > -static const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev)
> > -{
> > - struct device_driver *driver = driver_find("xe", &pci_bus_type);
> > - struct pci_driver *drv = to_pci_driver(driver);
> > - const struct pci_device_id *ids = drv ? drv->id_table : NULL;
> > - const struct pci_device_id *found = pci_match_id(ids, pdev);
> > -
> > - return found ? (const void *)found->driver_data : NULL;
> > -}
> > -
> > static struct pci_dev *get_physfn_instead(struct pci_dev *virtfn)
> > {
> > 	struct pci_dev *physfn = pci_physfn(virtfn);
> > diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> > index e91da9589c5f..572c61ee1e29 100644
> > --- a/drivers/gpu/drm/xe/xe_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> > @@ -19,6 +19,7 @@
> > #include "xe_gt_printk.h"
> > #include "xe_guc_ads.h"
> > #include "xe_mmio.h"
> > +#include "xe_pci_types.h"
> > #include "xe_pm.h"
> > #include "xe_psmi.h"
> > #include "xe_pxp_debugfs.h"
> > @@ -297,6 +298,49 @@ static const struct file_operations wedged_mode_fops
> = {
> > .write = wedged_mode_set,
> > };
> >
> > +static ssize_t page_reclaim_hw_assist_show(struct file *f, char __user *ubuf,
> > + size_t size, loff_t *pos)
> > +{
> > + struct xe_device *xe = file_inode(f)->i_private;
> > + char buf[8];
> > + int len;
> > +
> > +	len = scnprintf(buf, sizeof(buf), "%d\n", xe->info.has_page_reclaim_hw_assist);
> > +	return simple_read_from_buffer(ubuf, size, pos, buf, len);
> > +}
> > +
> > +static ssize_t page_reclaim_hw_assist_set(struct file *f, const char __user
> *ubuf,
> > + size_t size, loff_t *pos)
> > +{
> > + struct xe_device *xe = file_inode(f)->i_private;
> > + struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> > + const struct xe_device_desc *desc = xe_match_desc(pdev);
> > + unsigned int val;
> > + ssize_t ret;
> > +
> > + ret = kstrtouint_from_user(ubuf, size, 0, &val);
>
> kstrtobool_from_user
>
Changed.
> > + if (ret)
> > + return ret;
> > +
> > +	/*
> > +	 * Don't allow modification if page reclamation isn't
> > +	 * supported by the HW in the first place.
> > +	 */
> > +
> > + if (!desc || !desc->has_page_reclaim_hw_assist)
> > + return -ENODEV;
>
> instead of checking desc->has_page_reclaim_hw_assist capability here
>
> > +
> > + xe->info.has_page_reclaim_hw_assist = !!val;
> > +
> > + return size;
> > +}
> > +
> > +static const struct file_operations page_reclaim_hw_assist_fops = {
> > + .owner = THIS_MODULE,
> > + .read = page_reclaim_hw_assist_show,
> > +	.write = page_reclaim_hw_assist_set,
> > +};
> > +
> > static ssize_t atomic_svm_timeslice_ms_show(struct file *f, char __user *ubuf,
> > size_t size, loff_t *pos)
> > {
> > @@ -403,6 +447,9 @@ void xe_debugfs_register(struct xe_device *xe)
> > debugfs_create_file("disable_late_binding", 0600, root, xe,
> > &disable_late_binding_fops);
> >
>
> better to expose "page_reclaim_hw_assist" file *only* if required capability is
> present and we can get that flag directly from the xe:
>
> if (xe->info.has_page_reclaim_hw_assist)
>
Ohh, got it! That makes sense, much preferable. Will add that revision.
Thanks!!
> > + debugfs_create_file("page_reclaim_hw_assist", 0600, root, xe,
> > + &page_reclaim_hw_assist_fops);
> > +
> > for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1;
> ++mem_type) {
> > man = ttm_manager_type(bdev, mem_type);
> >
> > diff --git a/drivers/gpu/drm/xe/xe_device.c
> > b/drivers/gpu/drm/xe/xe_device.c index c7d373c70f0f..16afddc5e35e
> > 100644
> > --- a/drivers/gpu/drm/xe/xe_device.c
> > +++ b/drivers/gpu/drm/xe/xe_device.c
> > @@ -1295,3 +1295,13 @@ void xe_device_declare_wedged(struct xe_device
> *xe)
> > drm_dev_wedged_event(&xe->drm, xe->wedged.method, NULL);
> > }
> > }
> > +
> > +const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev)
>
> note that this function was specific to the configfs case, where we might not have
> the xe device, hence the manual lookup was needed
>
> if in the future for some reason we would like to get access to the desc from the
> xe, then we should rather consider adding a const pointer to it
>
Understood. With your suggested change above, this xe_match_desc is
no longer necessary and will be removed.
Brian
> > +{
> > + struct device_driver *driver = driver_find("xe", &pci_bus_type);
> > + struct pci_driver *drv = to_pci_driver(driver);
> > + const struct pci_device_id *ids = drv ? drv->id_table : NULL;
> > + const struct pci_device_id *found = pci_match_id(ids, pdev);
> > +
> > +	return found ? (const void *)found->driver_data : NULL;
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_device.h
> > b/drivers/gpu/drm/xe/xe_device.h index 32cc6323b7f6..a66e8e4b3e01
> > 100644
> > --- a/drivers/gpu/drm/xe/xe_device.h
> > +++ b/drivers/gpu/drm/xe/xe_device.h
> > @@ -193,6 +193,8 @@ void xe_device_declare_wedged(struct xe_device
> > *xe); struct xe_file *xe_file_get(struct xe_file *xef); void
> > xe_file_put(struct xe_file *xef);
> >
> > +const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev);
> > +
> > int xe_is_injection_active(void);
> >
> > /*
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH 10/11] drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim
2025-11-24 12:29 ` Matthew Auld
2025-11-25 6:12 ` Nguyen, Brian3
@ 2025-11-25 11:48 ` Upadhyay, Tejas
2025-11-25 13:05 ` Upadhyay, Tejas
1 sibling, 1 reply; 51+ messages in thread
From: Upadhyay, Tejas @ 2025-11-25 11:48 UTC (permalink / raw)
To: Auld, Matthew, Nguyen, Brian3, intel-xe@lists.freedesktop.org
Cc: Brost, Matthew, Lin, Shuicheng, Summers, Stuart
> -----Original Message-----
> From: Auld, Matthew <matthew.auld@intel.com>
> Sent: 24 November 2025 18:00
> To: Nguyen, Brian3 <brian3.nguyen@intel.com>; intel-
> xe@lists.freedesktop.org
> Cc: Upadhyay, Tejas <tejas.upadhyay@intel.com>; Brost, Matthew
> <matthew.brost@intel.com>; Lin, Shuicheng <shuicheng.lin@intel.com>;
> Summers, Stuart <stuart.summers@intel.com>
> Subject: Re: [PATCH 10/11] drm/xe: Optimize flushing of L2$ by skipping
> unnecessary page reclaim
>
> On 18/11/2025 09:05, Brian Nguyen wrote:
> > In Xe3p and beyond, there is additional hardware-managed L2$ flushing
> > for buffers deemed transient display and transient app. In those
> > scenarios, page reclamation is unnecessary, resulting in redundant
> > cacheline flushes, so skip over those corresponding ranges.
> >
> > Add chicken bit to determine media engine status to help facilitate
> > decision making in L2$ flush skipping.
> >
> > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
> > ---
> > drivers/gpu/drm/xe/regs/xe_gt_regs.h | 11 +++++++
> > drivers/gpu/drm/xe/xe_page_reclaim.c | 43
> ++++++++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_page_reclaim.h | 3 ++
> > drivers/gpu/drm/xe/xe_pat.c | 9 +-----
> > drivers/gpu/drm/xe/xe_pt.c | 3 +-
> > 5 files changed, 60 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > index 917a088c28f2..a18a2d59153e 100644
> > --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > @@ -99,6 +99,14 @@
> > #define VE1_AUX_INV XE_REG(0x42b8)
> > #define AUX_INV REG_BIT(0)
> >
> > +#define _PAT_PTA 0x4820
> > +#define XE2_NO_PROMOTE REG_BIT(10)
> > +#define XE2_COMP_EN REG_BIT(9)
> > +#define XE2_L3_CLOS REG_GENMASK(7, 6)
> > +#define XE2_L3_POLICY REG_GENMASK(5, 4)
> > +#define XE2_L4_POLICY REG_GENMASK(3, 2)
> > +#define XE2_COH_MODE REG_GENMASK(1, 0)
> > +
> > #define XE2_LMEM_CFG XE_REG(0x48b0)
> >
> > #define XEHP_FLAT_CCS_BASE_ADDR
> XE_REG_MCR(0x4910)
> > @@ -429,6 +437,9 @@
> >
> > #define XE2_GLOBAL_INVAL XE_REG(0xb404)
> >
> > +#define LTISEQCHK XE_REG(0xb49c)
> > +#define XE3P_MEDIA_IS_ON REG_BIT(2)
> > +
> > #define XE2LPM_L3SQCREG2
> XE_REG_MCR(0xb604)
> >
> > #define XE2LPM_L3SQCREG3
> XE_REG_MCR(0xb608)
> > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > index 801a7f1731c0..2f0e7547732c 100644
> > --- a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > @@ -13,8 +13,51 @@
> > #include "regs/xe_gt_regs.h"
> > #include "xe_assert.h"
> > #include "xe_macros.h"
> > +#include "xe_mmio.h"
> > +#include "xe_pat.h"
> > #include "xe_sa.h"
> > #include "xe_tlb_inval_types.h"
> > +#include "xe_vm.h"
> > +
> > +/**
> > + * xe_page_reclaim_skip() - Decide whether PRL should be skipped for
> > +a VMA
> > + * @tile: Tile owning the VMA
> > + * @vma: VMA under consideration
> > + *
> > + * Xe3p and beyond can handle PPC flushing for specific PAT encodings.
> > + * Skip PPC flushing in both scenarios below.
> > + * - pat_index is transient display (1)
> > + * - pat_index is transient app (2) and Media is off
> > + *
> > + * Return: true when page reclamation is unnecessary, false otherwise.
> > + */
> > +bool xe_page_reclaim_skip(struct xe_tile *tile, struct xe_vma *vma) {
> > + struct xe_device *xe = xe_vma_vm(vma)->xe;
> > + struct xe_mmio *mmio = &tile->primary_gt->mmio;
> > + u16 pat_index = vma->attr.pat_index;
> > + u32 pat_value;
> > + u8 l3_policy;
> > + bool is_media_awake;
> > +
> > + /* Ensure called only with Xe3p due to associated PAT index */
> > + xe_assert(tile->xe, GRAPHICS_VER(tile->xe) >= 35);
> > + xe_assert(tile->xe, pat_index < xe->pat.n_entries);
> > +
> > + pat_value = xe->pat.table[pat_index].value;
> > + l3_policy = REG_FIELD_GET(XE2_L3_POLICY, pat_value);
>
> I think if we need something like this, it might make sense to create a helper in
> xe_pat and use that here? Not sure if want stuff outside of xe_pat looking at
> such internals.
>
> > + is_media_awake = xe_mmio_read32(mmio, LTISEQCHK) &
> XE3P_MEDIA_IS_ON;
>
> Do we need this? Whether media is off/on should be an internal detail for
> fw/hw, not KMD I think, and will influence whether fw/hw will only flush
> cachelines shared with CPU or whether to flush entire cache at various places,
> like end of submission. Also this seems racy, since Media can turn on/off after
> checking this?
>
> > +
> > + /**
> > + * - l3_policy: 0=WB, 1=XD ("WB - Transient Display"),
>
> Why do we skip Transient Display? Can you share some more details or
> maybe add a comment here? AFAIK transient display just allows using the
> GPU caches for display surfaces, with the idea of then doing a targeted
> transient flush only when doing the actual scanout. On newer hw this
> flush is done by hw, I think, instead of KMD, but I assume it is only
> done when doing the scanout step? Or is that now handled differently?
>
> Concern here is that user does render copy to display surface with
> transient display PAT index but then never does an actual scanout, and
> then just deletes the memory. Where is the flush in that flow?
I think this is a valid point, but the thinking was that a TD flush skipped on scanout would intentionally be taken care of by HW at the next flush sync point. But when we do page reclamation, the reclamation could execute before we hit any nearby sync point and the next HW flush. We may need to check with HW whether flushing happens on scanout only or at every sync point!
Tejas
>
> > + * 2=XA ("WB - Transient App" for Xe3p), 3=UC
> > + * From Xe3p, transient display flush is taken care by HW, l3_policy = 1
> > + *
> > + * Also with Xe3p, pat_index=18/19 corresponds to transient app
> flushing
> > + * which is handled by HW when media is off.
> > + */
> > + return (l3_policy == 1 || (!is_media_awake && (pat_index == 18 ||
> pat_index == 19)));
> > +}
> >
> > /**
> > * xe_page_reclaim_create_prl_bo() - Back a PRL with a suballocated GGTT
> BO
> > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h
> b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > index f82b4d0865e0..dafd4edd6f61 100644
> > --- a/drivers/gpu/drm/xe/xe_page_reclaim.h
> > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > @@ -17,6 +17,8 @@
> >
> > struct xe_tlb_inval;
> > struct xe_tlb_inval_fence;
> > +struct xe_tile;
> > +struct xe_vma;
> >
> > struct xe_guc_page_reclaim_entry {
> > u32 valid:1;
> > @@ -35,6 +37,7 @@ struct xe_page_reclaim_list {
> > #define XE_PAGE_RECLAIM_INVALID_LIST -1
> > };
> >
> > +bool xe_page_reclaim_skip(struct xe_tile *tile, struct xe_vma *vma);
> > int xe_page_reclaim_create_prl_bo(struct xe_tlb_inval *tlb_inval, struct
> xe_tlb_inval_fence *fence);
> > void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
> > int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
> > diff --git a/drivers/gpu/drm/xe/xe_pat.c b/drivers/gpu/drm/xe/xe_pat.c
> > index 1b4d5d3def0f..4783acd1f027 100644
> > --- a/drivers/gpu/drm/xe/xe_pat.c
> > +++ b/drivers/gpu/drm/xe/xe_pat.c
> > @@ -9,6 +9,7 @@
> >
> > #include <generated/xe_wa_oob.h>
> >
> > +#include "regs/xe_gt_regs.h"
> > #include "regs/xe_reg_defs.h"
> > #include "xe_assert.h"
> > #include "xe_device.h"
> > @@ -23,14 +24,6 @@
> > #define _PAT_INDEX(index) _PICK_EVEN_2RANGES(index,
> 8, \
> > 0x4800,
> 0x4804, \
> > 0x4848,
> 0x484c)
> > -#define _PAT_PTA 0x4820
> > -
> > -#define XE2_NO_PROMOTE REG_BIT(10)
> > -#define XE2_COMP_EN REG_BIT(9)
> > -#define XE2_L3_CLOS REG_GENMASK(7, 6)
> > -#define XE2_L3_POLICY REG_GENMASK(5, 4)
> > -#define XE2_L4_POLICY REG_GENMASK(3, 2)
> > -#define XE2_COH_MODE REG_GENMASK(1, 0)
> >
> > #define XELPG_L4_POLICY_MASK REG_GENMASK(3, 2)
> > #define XELPG_PAT_3_UC
> REG_FIELD_PREP(XELPG_L4_POLICY_MASK, 3)
> > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > index 03723c8d2601..8ccab39c2599 100644
> > --- a/drivers/gpu/drm/xe/xe_pt.c
> > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > @@ -2008,7 +2008,8 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > if (err < 0)
> > xe_page_reclaim_list_invalidate(&pt_update_ops-
> >prl);
> > }
> > - pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl :
> NULL;
> > + pt_op->prl = (pt_update_ops->prl.entries &&
> > + !xe_page_reclaim_skip(tile, vma)) ? &pt_update_ops->prl :
> NULL;
> >
> > err = vma_reserve_fences(tile_to_xe(tile), vma);
> > if (err)
* RE: [PATCH 10/11] drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim
2025-11-25 11:48 ` Upadhyay, Tejas
@ 2025-11-25 13:05 ` Upadhyay, Tejas
0 siblings, 0 replies; 51+ messages in thread
From: Upadhyay, Tejas @ 2025-11-25 13:05 UTC (permalink / raw)
To: Upadhyay, Tejas, Auld, Matthew, Nguyen, Brian3,
intel-xe@lists.freedesktop.org, S, Jayakrishna
Cc: Brost, Matthew, Lin, Shuicheng, Summers, Stuart
> -----Original Message-----
> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of
> Upadhyay, Tejas
> Sent: 25 November 2025 17:19
> To: Auld, Matthew <matthew.auld@intel.com>; Nguyen, Brian3
> <brian3.nguyen@intel.com>; intel-xe@lists.freedesktop.org
> Cc: Brost, Matthew <matthew.brost@intel.com>; Lin, Shuicheng
> <shuicheng.lin@intel.com>; Summers, Stuart <stuart.summers@intel.com>
> Subject: RE: [PATCH 10/11] drm/xe: Optimize flushing of L2$ by skipping
> unnecessary page reclaim
>
>
>
> > -----Original Message-----
> > From: Auld, Matthew <matthew.auld@intel.com>
> > Sent: 24 November 2025 18:00
> > To: Nguyen, Brian3 <brian3.nguyen@intel.com>; intel-
> > xe@lists.freedesktop.org
> > Cc: Upadhyay, Tejas <tejas.upadhyay@intel.com>; Brost, Matthew
> > <matthew.brost@intel.com>; Lin, Shuicheng <shuicheng.lin@intel.com>;
> > Summers, Stuart <stuart.summers@intel.com>
> > Subject: Re: [PATCH 10/11] drm/xe: Optimize flushing of L2$ by
> > skipping unnecessary page reclaim
> >
> > On 18/11/2025 09:05, Brian Nguyen wrote:
> > > In Xe3p and beyond, there is additional hardware-managed L2$
> > > flushing for buffers deemed transient display and transient app.
> > > In those scenarios, page reclamation is unnecessary, resulting in
> > > redundant cacheline flushes, so skip over those corresponding ranges.
> > >
> > > Add chicken bit to determine media engine status to help facilitate
> > > decision making in L2$ flush skipping.
> > >
> > > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > > Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
> > > ---
> > > drivers/gpu/drm/xe/regs/xe_gt_regs.h | 11 +++++++
> > > drivers/gpu/drm/xe/xe_page_reclaim.c | 43
> > ++++++++++++++++++++++++++++
> > > drivers/gpu/drm/xe/xe_page_reclaim.h | 3 ++
> > > drivers/gpu/drm/xe/xe_pat.c | 9 +-----
> > > drivers/gpu/drm/xe/xe_pt.c | 3 +-
> > > 5 files changed, 60 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > > b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > > index 917a088c28f2..a18a2d59153e 100644
> > > --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > > +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > > @@ -99,6 +99,14 @@
> > > #define VE1_AUX_INV XE_REG(0x42b8)
> > > #define AUX_INV REG_BIT(0)
> > >
> > > +#define _PAT_PTA 0x4820
> > > +#define XE2_NO_PROMOTE REG_BIT(10)
> > > +#define XE2_COMP_EN REG_BIT(9)
> > > +#define XE2_L3_CLOS REG_GENMASK(7, 6)
> > > +#define XE2_L3_POLICY REG_GENMASK(5, 4)
> > > +#define XE2_L4_POLICY REG_GENMASK(3, 2)
> > > +#define XE2_COH_MODE REG_GENMASK(1, 0)
> > > +
> > > #define XE2_LMEM_CFG XE_REG(0x48b0)
> > >
> > > #define XEHP_FLAT_CCS_BASE_ADDR
> > XE_REG_MCR(0x4910)
> > > @@ -429,6 +437,9 @@
> > >
> > > #define XE2_GLOBAL_INVAL XE_REG(0xb404)
> > >
> > > +#define LTISEQCHK XE_REG(0xb49c)
> > > +#define XE3P_MEDIA_IS_ON REG_BIT(2)
> > > +
> > > #define XE2LPM_L3SQCREG2
> > XE_REG_MCR(0xb604)
> > >
> > > #define XE2LPM_L3SQCREG3
> > XE_REG_MCR(0xb608)
> > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > index 801a7f1731c0..2f0e7547732c 100644
> > > --- a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > @@ -13,8 +13,51 @@
> > > #include "regs/xe_gt_regs.h"
> > > #include "xe_assert.h"
> > > #include "xe_macros.h"
> > > +#include "xe_mmio.h"
> > > +#include "xe_pat.h"
> > > #include "xe_sa.h"
> > > #include "xe_tlb_inval_types.h"
> > > +#include "xe_vm.h"
> > > +
> > > +/**
> > > + * xe_page_reclaim_skip() - Decide whether PRL should be skipped
> > > +for a VMA
> > > + * @tile: Tile owning the VMA
> > > + * @vma: VMA under consideration
> > > + *
> > > + * Xe3p and beyond can handle PPC flushing for specific PAT encodings.
> > > + * Skip PPC flushing in both scenarios below.
> > > + * - pat_index is transient display (1)
> > > + * - pat_index is transient app (2) and Media is off
> > > + *
> > > + * Return: true when page reclamation is unnecessary, false otherwise.
> > > + */
> > > +bool xe_page_reclaim_skip(struct xe_tile *tile, struct xe_vma *vma) {
> > > + struct xe_device *xe = xe_vma_vm(vma)->xe;
> > > + struct xe_mmio *mmio = &tile->primary_gt->mmio;
> > > + u16 pat_index = vma->attr.pat_index;
> > > + u32 pat_value;
> > > + u8 l3_policy;
> > > + bool is_media_awake;
> > > +
> > > + /* Ensure called only with Xe3p due to associated PAT index */
> > > + xe_assert(tile->xe, GRAPHICS_VER(tile->xe) >= 35);
> > > + xe_assert(tile->xe, pat_index < xe->pat.n_entries);
> > > +
> > > + pat_value = xe->pat.table[pat_index].value;
> > > + l3_policy = REG_FIELD_GET(XE2_L3_POLICY, pat_value);
> >
> > I think if we need something like this, it might make sense to create
> > a helper in xe_pat and use that here? Not sure if want stuff outside
> > of xe_pat looking at such internals.
> >
> > > + is_media_awake = xe_mmio_read32(mmio, LTISEQCHK) &
> > XE3P_MEDIA_IS_ON;
> >
> > Do we need this? Whether media is off/on should be an internal detail
> > for fw/hw, not KMD I think, and will influence whether fw/hw will only
> > flush cachelines shared with CPU or whether to flush entire cache at
> > various places, like end of submission. Also this seems racy, since
> > Media can turn on/off after checking this?
> >
> > > +
> > > + /**
> > > + * - l3_policy: 0=WB, 1=XD ("WB - Transient Display"),
> >
> > Why do we skip Transient Display? Can you share some more details or
> > maybe add a comment here? AFAIK transient display just allows using
> > the GPU caches for display surfaces, with the idea of then doing a
> > targeted transient flush only when doing the actual scanout. On newer
> > hw this flush is done by hw, I think, instead of KMD, but I assume it
> > is only done when doing the scanout step? Or is that now handled
> differently?
> >
> > Concern here is that user does render copy to display surface with
> > transient display PAT index but then never does an actual scanout, and
> > then just deletes the memory. Where is the flush in that flow?
>
> I think this is a valid point, but the thinking was that a TD flush skipped on
> scanout would intentionally be taken care of by HW at the next flush sync
> point. But when we do page reclamation, the reclamation could execute before
> we hit any nearby sync point and the next HW flush. We may need to check
> with HW whether flushing happens on scanout only or at every sync point!
>
> Tejas
Just had a discussion with hardware engineer @S, Jayakrishna: HW takes care of the TD flush along with the transient-app flush, and HW sequences these flushes on a page reclamation request as well.
Tejas
> >
> > > + * 2=XA ("WB - Transient App" for Xe3p), 3=UC
> > > + * From Xe3p, transient display flush is taken care by HW, l3_policy = 1
> > > + *
> > > + * Also with Xe3p, pat_index=18/19 corresponds to transient app
> > flushing
> > > + * which is handled by HW when media is off.
> > > + */
> > > + return (l3_policy == 1 || (!is_media_awake && (pat_index == 18 ||
> > pat_index == 19)));
> > > +}
> > >
> > > /**
> > > * xe_page_reclaim_create_prl_bo() - Back a PRL with a
> > > suballocated GGTT
> > BO
> > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h
> > b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > index f82b4d0865e0..dafd4edd6f61 100644
> > > --- a/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > @@ -17,6 +17,8 @@
> > >
> > > struct xe_tlb_inval;
> > > struct xe_tlb_inval_fence;
> > > +struct xe_tile;
> > > +struct xe_vma;
> > >
> > > struct xe_guc_page_reclaim_entry {
> > > u32 valid:1;
> > > @@ -35,6 +37,7 @@ struct xe_page_reclaim_list {
> > > #define XE_PAGE_RECLAIM_INVALID_LIST -1
> > > };
> > >
> > > +bool xe_page_reclaim_skip(struct xe_tile *tile, struct xe_vma
> > > +*vma);
> > > int xe_page_reclaim_create_prl_bo(struct xe_tlb_inval *tlb_inval,
> > > struct
> > xe_tlb_inval_fence *fence);
> > > void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
> > > int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list
> > > *prl); diff --git a/drivers/gpu/drm/xe/xe_pat.c
> > > b/drivers/gpu/drm/xe/xe_pat.c index 1b4d5d3def0f..4783acd1f027
> > > 100644
> > > --- a/drivers/gpu/drm/xe/xe_pat.c
> > > +++ b/drivers/gpu/drm/xe/xe_pat.c
> > > @@ -9,6 +9,7 @@
> > >
> > > #include <generated/xe_wa_oob.h>
> > >
> > > +#include "regs/xe_gt_regs.h"
> > > #include "regs/xe_reg_defs.h"
> > > #include "xe_assert.h"
> > > #include "xe_device.h"
> > > @@ -23,14 +24,6 @@
> > > #define _PAT_INDEX(index) _PICK_EVEN_2RANGES(index,
> > 8, \
> > > 0x4800,
> > 0x4804, \
> > > 0x4848,
> > 0x484c)
> > > -#define _PAT_PTA 0x4820
> > > -
> > > -#define XE2_NO_PROMOTE REG_BIT(10)
> > > -#define XE2_COMP_EN REG_BIT(9)
> > > -#define XE2_L3_CLOS REG_GENMASK(7, 6)
> > > -#define XE2_L3_POLICY REG_GENMASK(5, 4)
> > > -#define XE2_L4_POLICY REG_GENMASK(3, 2)
> > > -#define XE2_COH_MODE REG_GENMASK(1, 0)
> > >
> > > #define XELPG_L4_POLICY_MASK REG_GENMASK(3, 2)
> > > #define XELPG_PAT_3_UC
> > REG_FIELD_PREP(XELPG_L4_POLICY_MASK, 3)
> > > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > > index 03723c8d2601..8ccab39c2599 100644
> > > --- a/drivers/gpu/drm/xe/xe_pt.c
> > > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > > @@ -2008,7 +2008,8 @@ static int unbind_op_prepare(struct xe_tile
> *tile,
> > > if (err < 0)
> > > xe_page_reclaim_list_invalidate(&pt_update_ops-
> > >prl);
> > > }
> > > - pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl :
> > NULL;
> > > + pt_op->prl = (pt_update_ops->prl.entries &&
> > > + !xe_page_reclaim_skip(tile, vma)) ? &pt_update_ops->prl :
> > NULL;
> > >
> > > err = vma_reserve_fences(tile_to_xe(tile), vma);
> > > if (err)
* Re: [PATCH 06/11] drm/xe: Create page reclaim list on unbind
2025-11-25 11:18 ` Nguyen, Brian3
@ 2025-11-25 18:34 ` Matthew Brost
2025-11-25 19:01 ` Nguyen, Brian3
0 siblings, 1 reply; 51+ messages in thread
From: Matthew Brost @ 2025-11-25 18:34 UTC (permalink / raw)
To: Nguyen, Brian3
Cc: intel-xe@lists.freedesktop.org, Upadhyay, Tejas, Lin, Shuicheng,
Summers, Stuart
On Tue, Nov 25, 2025 at 04:18:19AM -0700, Nguyen, Brian3 wrote:
> On Saturday, November 22, 2025 11:18 AM, Matthew Brost wrote:
> > On Tue, Nov 18, 2025 at 05:05:47PM +0800, Brian Nguyen wrote:
> > > Page reclaim list (PRL) is preparation work for the page reclaim feature.
> > > The PRL is initially owned by pt_update_ops and all other page reclaim
> > > operations point back to this PRL. The PRL entries are generated during
> > > the unbind page walk.
> > >
> > > This PRL is restricted to a 4K page, so 512 page entries at most.
> > >
> > > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > > ---
> > > drivers/gpu/drm/xe/Makefile | 1 +
> > > drivers/gpu/drm/xe/regs/xe_gtt_defs.h | 1 +
> > > drivers/gpu/drm/xe/xe_page_reclaim.c | 52 ++++++++++++
> > > drivers/gpu/drm/xe/xe_page_reclaim.h | 49 ++++++++++++
> > > drivers/gpu/drm/xe/xe_pt.c | 109 ++++++++++++++++++++++++++
> > > drivers/gpu/drm/xe/xe_pt_types.h | 5 ++
> > > 6 files changed, 217 insertions(+)
> > > create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.c
> > > create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.h
> > >
> > > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > > index e4b273b025d2..048e6c93271c 100644
> > > --- a/drivers/gpu/drm/xe/Makefile
> > > +++ b/drivers/gpu/drm/xe/Makefile
> > > @@ -95,6 +95,7 @@ xe-y += xe_bb.o \
> > > xe_oa.o \
> > > xe_observation.o \
> > > xe_pagefault.o \
> > > + xe_page_reclaim.o \
> > > xe_pat.o \
> > > xe_pci.o \
> > > xe_pcode.o \
> > > diff --git a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > index 4389e5a76f89..4d83461e538b 100644
> > > --- a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > +++ b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > @@ -9,6 +9,7 @@
> > > #define XELPG_GGTT_PTE_PAT0 BIT_ULL(52)
> > > #define XELPG_GGTT_PTE_PAT1 BIT_ULL(53)
> > >
> > > +#define XE_PTE_ADDR_MASK GENMASK_ULL(51, 12)
> > > #define GGTT_PTE_VFID GENMASK_ULL(11, 2)
> > >
> > > #define GUC_GGTT_TOP 0xFEE00000
> > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > new file mode 100644
> > > index 000000000000..a0d15efff58c
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > @@ -0,0 +1,52 @@
> > > +// SPDX-License-Identifier: MIT
> > > +/*
> > > + * Copyright (c) 2025 Intel Corporation */
> > > +
> > > +#include <linux/bitfield.h>
> > > +#include <linux/kref.h>
> > > +#include <linux/mm.h>
> > > +#include <linux/slab.h>
> > > +
> > > +#include "xe_page_reclaim.h"
> > > +
> > > +#include "regs/xe_gt_regs.h"
> > > +#include "xe_assert.h"
> > > +#include "xe_macros.h"
> > > +
> > > +/**
> > > + * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
> > > + * @prl: Page reclaim list to reset
> > > + *
> > > + * Clears the entries pointer and marks the list as invalid so
> > > + * future users know the PRL is unusable. It is expected that the entries
> > > + * have already been released.
> > > + */
> > > +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list
> > > +*prl) {
> > > + prl->entries = NULL;
> > > + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST; }
> > > +
> > > +/**
> > > + * xe_page_reclaim_list_alloc_entries() - Allocate page reclaim list
> > > +entries
> > > + * @prl: Page reclaim list to allocate entries for
> > > + *
> > > + * Allocate one 4K page for the PRL entries, otherwise assign prl->entries to NULL.
> > > + */
> > > +int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list
> > > +*prl) {
> > > + struct page *page;
> > > +
> > > + XE_WARN_ON(prl->entries != NULL);
> > > + if (prl->entries)
> > > + return 0;
> > > +
> > > + page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> > > + if (page) {
> > > + prl->entries = page_address(page);
> > > + prl->num_entries = 0;
> > > + }
> > > +
> > > + return page ? 0 : -ENOMEM;
> > > +}
> > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > new file mode 100644
> > > index 000000000000..d066d7d97f79
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > @@ -0,0 +1,49 @@
> > > +/* SPDX-License-Identifier: MIT */
> > > +/*
> > > + * Copyright (c) 2025 Intel Corporation */
> > > +
> > > +#ifndef _XE_PAGE_RECLAIM_H_
> > > +#define _XE_PAGE_RECLAIM_H_
> > > +
> > > +#include <linux/kref.h>
> > > +#include <linux/mm.h>
> > > +#include <linux/slab.h>
> > > +#include <linux/types.h>
> > > +#include <linux/workqueue.h>
> > > +
> > > +#define XE_PAGE_RECLAIM_MAX_ENTRIES 512
> > > +#define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
> > > +
> > > +struct xe_guc_page_reclaim_entry {
> > > + u32 valid:1;
> > > + u32 reclamation_size:6;
> > > + u32 reserved:5;
> > > + u32 address_lo:20;
> > > + u32 address_hi:20;
> > > + u32 reserved1:12;
> >
> > This is wire interface with the GuC. Bitfields can based on endianess of the CPU. I
> > know this is a iGPU feature for now but it could possibly change in the future, with
> > that, to future proof can the layout of this be setup via defines / macros?
> >
>
> Sure, I moved over to the typical FIELD_PREP/GENMASK macros used elsewhere
> for the guc interfaces.
>
> > > +} __packed;
> > > +
> > > +struct xe_page_reclaim_list {
> > > + /** @entries: array of page reclaim entries, page allocated */
> > > + struct xe_guc_page_reclaim_entry *entries;
> > > + /** @num_entries: number of entries */
> > > + int num_entries;
> > > +#define XE_PAGE_RECLAIM_INVALID_LIST -1
> > > +};
> > > +
> > > +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list
> > > +*prl); int xe_page_reclaim_list_alloc_entries(struct
> > > +xe_page_reclaim_list *prl); static inline void
> > > +xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries) {
> > > + if (entries)
> > > + get_page(virt_to_page(entries));
> > > +}
> > > +
> > > +static inline void xe_page_reclaim_entries_put(struct
> > > +xe_guc_page_reclaim_entry *entries) {
> > > + if (entries)
> > > + put_page(virt_to_page(entries));
> > > +}
> >
> > Kernel doc for static inlines.
> >
>
> Added.
>
> > > +
> > > +#endif /* _XE_PAGE_RECLAIM_H_ */
> > > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > > index 884127b4d97d..532a047676d4 100644
> > > --- a/drivers/gpu/drm/xe/xe_pt.c
> > > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > > @@ -12,6 +12,7 @@
> > > #include "xe_exec_queue.h"
> > > #include "xe_gt.h"
> > > #include "xe_migrate.h"
> > > +#include "xe_page_reclaim.h"
> > > #include "xe_pt_types.h"
> > > #include "xe_pt_walk.h"
> > > #include "xe_res_cursor.h"
> > > @@ -1538,6 +1539,9 @@ struct xe_pt_stage_unbind_walk {
> > > /* Output */
> > > /* @wupd: Structure to track the page-table updates we're building */
> > > struct xe_walk_update wupd;
> > > +
> > > + /** @prl: Backing pointer to page reclaim list in pt_update_ops */
> > > + struct xe_page_reclaim_list *prl;
> > > };
> > >
> > > /*
> > > @@ -1572,6 +1576,69 @@ static bool xe_pt_check_kill(u64 addr, u64 next,
> > unsigned int level,
> > > return false;
> > > }
> > >
> > > +/* Huge 2MB leaf lives directly in a level-1 table and has no
> > > +children */ static bool is_large_pte(struct xe_pt *pte) {
> > > + return pte->level == 1 && !pte->base.children; }
> > > +
> > > +/* page_size = 2^(reclamation_size + 12) */
> > > +#define COMPUTE_RECLAIM_ADDRESS_MASK(page_size)
> > \
> > > +({ \
> > > + BUILD_BUG_ON(!__builtin_constant_p(page_size)); \
> > > + ilog2(page_size) - 12; \
> >
> > s/12/XE_PTE_SHIFT ?
> >
>
> Done.
>
> > > +})
> > > +
> > > +static void generate_reclaim_entry(struct xe_tile *tile,
> > > + struct xe_page_reclaim_list *prl,
> > > + u64 pte,
> > > + struct xe_pt *xe_child)
> >
> > Nit, xe_pt can be on the same line as 'u64 pte'.
> >
>
> Done.
>
> > > +{
> > > + struct xe_guc_page_reclaim_entry *reclaim_entries = prl->entries;
> > > + u64 phys_addr = pte & XE_PTE_ADDR_MASK;
> > > + const u64 field_mask = GENMASK_ULL(19, 0);
> > > + u32 reclamation_size;
> >
> > Nit, I'd make the last variable declared on the stack for readability.
> >
>
> Ahh got it, reclamation_size moved to after num_entries.
>
> > > + const uint max_entries = XE_PAGE_RECLAIM_MAX_ENTRIES;
> > > + int num_entries = prl->num_entries;
> > > +
> > > + xe_tile_assert(tile, xe_child->level <= MAX_HUGEPTE_LEVEL);
> > > + xe_tile_assert(tile, reclaim_entries);
> > > +
> > > + if (num_entries == XE_PAGE_RECLAIM_INVALID_LIST)
> > > + return;
> > > +
> > > + /* Overflow: mark as invalid through num_entries */
> > > + if (num_entries >= max_entries) {
> > > + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> > > + return;
> > > + }
> > > +
> > > + /**
> > > + * reclamation_size indicates the size of the page to be
> > > + * invalidated and flushed from non-coherent cache.
> > > + * Page size is computed as 2^(reclamation_size+12) bytes.
> > > + * Only valid for these specific levels.
> > > + */
> > > +
> > > + if (xe_child->level == 0 && !(pte & XE_PTE_PS64))
> > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_4K); /* reclamation_size = 0 */
> > > + else if (xe_child->level == 0)
> > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_64K); /* reclamation_size = 1 */
> > > + else if (is_large_pte(xe_child))
> > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_2M); /* reclamation_size = 2 */
> >
> > What happens if we have 1G page? That doesn't seem to be handled.
> >
>
> Page reclamation hardware does not support 1G pages. This should be
> handled by falling back to the standard TLB invalidation PPC flush. I can add
Makes sense that we fall back. I am, however, not seeing where this fallback occurs.
> a comment somewhere discussing this but the format for PRL only
> supports 4K, 64K, and 2M pages to reclaim. I'll add a comment here
> mentioning the HW support being limited to these pages and rename the
> is_large_pte to is_2m_pte.
>
> > > + else
> > > + return;
I would think for the fallback, we'd set prl->num_entries to
XE_PAGE_RECLAIM_INVALID_LIST here.
Maybe I'm missing something?
Matt
> > > +
> > > + reclaim_entries[num_entries].valid = 1;
> > > + reclaim_entries[num_entries].reclamation_size =
> > > + reclamation_size;
> > > + reclaim_entries[num_entries].address_lo =
> > > + FIELD_GET(field_mask, phys_addr);
> > > + reclaim_entries[num_entries].address_hi =
> > > + FIELD_GET(field_mask, phys_addr >> 20);
> >
> > As suggested above, use macros/defines here to setup the entry.
> >
>
> Got it, moved over to using other standard define macros.
>
> > > + prl->num_entries++;
> > > +}
> > > +
> > > static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> > > unsigned int level, u64 addr, u64 next,
> > > struct xe_ptw **child,
> > > @@ -1579,10 +1646,27 @@ static int xe_pt_stage_unbind_entry(struct xe_ptw
> > *parent, pgoff_t offset,
> > > struct xe_pt_walk *walk)
> > > {
> > > struct xe_pt *xe_child = container_of(*child, typeof(*xe_child),
> > > base);
> > > + struct xe_pt_stage_unbind_walk *xe_walk =
> > > + container_of(walk, typeof(*xe_walk), base);
> > > + struct xe_device *xe = tile_to_xe(xe_walk->tile);
> > >
> > > XE_WARN_ON(!*child);
> > > XE_WARN_ON(!level);
> > >
> > > + /* 4K and 64K Pages are level 0, large pte needs additional handling. */
> > > + if (xe_walk->prl && (xe_child->level == 0 ||
> > > +is_large_pte(xe_child))) {
> >
> > And also here? 1G pages are unhandled? Please explain.
> >
>
> As stated above, page reclamation only supports 4K, 64K, and 2M pages.
> A 1G page will have to fall back to the standard TLB invalidation with PPC flush.
>
> > > + struct iosys_map *leaf_map = &xe_child->bo->vmap;
> > > + pgoff_t first = xe_pt_offset(addr, 0, walk);
> > > + pgoff_t count = xe_pt_num_entries(addr, next, 0, walk);
> > > +
> > > + for (pgoff_t i = 0; i < count; i++) {
> > > + u64 pte = xe_map_rd(xe, leaf_map, (first + i) * sizeof(u64),
> > u64);
> > > +
> > > + generate_reclaim_entry(xe_walk->tile, xe_walk->prl,
> > > + pte, xe_child);
> > > + }
> > > + }
> > > +
> > > xe_pt_check_kill(addr, next, level - 1, xe_child, action, walk);
> > >
> > > return 0;
> > > @@ -1654,6 +1738,8 @@ static unsigned int xe_pt_stage_unbind(struct
> > > xe_tile *tile, {
> > > u64 start = range ? xe_svm_range_start(range) : xe_vma_start(vma);
> > > u64 end = range ? xe_svm_range_end(range) : xe_vma_end(vma);
> > > + struct xe_vm_pgtable_update_op *pt_update_op =
> > > + container_of(entries, struct xe_vm_pgtable_update_op,
> > entries[0]);
> > > struct xe_pt_stage_unbind_walk xe_walk = {
> > > .base = {
> > > .ops = &xe_pt_stage_unbind_ops,
> > > @@ -1665,6 +1751,7 @@ static unsigned int xe_pt_stage_unbind(struct xe_tile
> > *tile,
> > > .modified_start = start,
> > > .modified_end = end,
> > > .wupd.entries = entries,
> > > + .prl = pt_update_op->prl,
> > > };
> > > struct xe_pt *pt = vm->pt_root[tile->id];
> > >
> > > @@ -1897,6 +1984,7 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > > struct xe_vm_pgtable_update_ops *pt_update_ops,
> > > struct xe_vma *vma)
> > > {
> > > + struct xe_device *xe = tile_to_xe(tile);
> > > u32 current_op = pt_update_ops->current_op;
> > > struct xe_vm_pgtable_update_op *pt_op = &pt_update_ops-
> > >ops[current_op];
> > > int err;
> > > @@ -1914,6 +2002,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > > pt_op->vma = vma;
> > > pt_op->bind = false;
> > > pt_op->rebind = false;
> > > + /* Maintain one PRL located in pt_update_ops that all others in unbind op
> > reference */
> > > + if (xe->info.has_page_reclaim_hw_assist && !pt_update_ops->prl.entries) {
> > > + err = xe_page_reclaim_list_alloc_entries(&pt_update_ops->prl);
> > > + if (err < 0)
> > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> >
> > I don't think you need to call xe_page_reclaim_list_invalidate, right?
> > If xe_page_reclaim_list_alloc_entries fails the prl should be in the init state.
> >
>
> Yes. I'll drop this call for now then.
>
> > > + }
> > > + pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl :
> > > +NULL;
> > >
> > > err = vma_reserve_fences(tile_to_xe(tile), vma);
> > > if (err)
> > > @@ -1921,6 +2016,13 @@ static int unbind_op_prepare(struct xe_tile
> > > *tile,
> > >
> > > pt_op->num_entries = xe_pt_stage_unbind(tile, xe_vma_vm(vma),
> > > vma, NULL, pt_op->entries);
> > > + /* Free PRL if list declared as invalid */
> > > + if (pt_update_ops->prl.entries &&
> > > + pt_update_ops->prl.num_entries == XE_PAGE_RECLAIM_INVALID_LIST) {
> > > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > > + pt_op->prl = NULL;
> > > + pt_update_ops->prl.entries = NULL;
> >
> > Call xe_page_reclaim_list_invalidate for clarity?
> >
>
> Updated.
>
> > > + }
> > >
> > > xe_vm_dbg_print_entries(tile_to_xe(tile), pt_op->entries,
> > > pt_op->num_entries, false);
> > > @@ -1979,6 +2081,7 @@ static int unbind_range_prepare(struct xe_vm *vm,
> > > pt_op->vma = XE_INVALID_VMA;
> > > pt_op->bind = false;
> > > pt_op->rebind = false;
> > > + pt_op->prl = NULL;
> > >
> > > pt_op->num_entries = xe_pt_stage_unbind(tile, vm, NULL, range,
> > > pt_op->entries);
> > > @@ -2096,6 +2199,7 @@ xe_pt_update_ops_init(struct
> > xe_vm_pgtable_update_ops *pt_update_ops)
> > > init_llist_head(&pt_update_ops->deferred);
> > > pt_update_ops->start = ~0x0ull;
> > > pt_update_ops->last = 0x0ull;
> > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> >
> > Can we introduce a function called xe_page_reclaim_list_init for clarity? It might
> > do the same thing as xe_page_reclaim_list_invalidate but it would make this a
> > little more clear. Likewise later in the series when a job is created, you can call
> > xe_page_reclaim_list_init there too.
> >
>
> Sure, I'll write another helper for this and modify both those PRL creation points.
>
> > > }
> > >
> > > /**
> > > @@ -2518,6 +2622,11 @@ void xe_pt_update_ops_fini(struct xe_tile *tile, struct xe_vma_ops *vops)
> > > &vops->pt_update_ops[tile->id];
> > > int i;
> > >
> > > + if (pt_update_ops->prl.entries) {
> > > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > + }
> > > +
> > > lockdep_assert_held(&vops->vm->lock);
> > > xe_vm_assert_held(vops->vm);
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_pt_types.h
> > > b/drivers/gpu/drm/xe/xe_pt_types.h
> > > index 881f01e14db8..26e5295f118e 100644
> > > --- a/drivers/gpu/drm/xe/xe_pt_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_pt_types.h
> > > @@ -8,6 +8,7 @@
> > >
> > > #include <linux/types.h>
> > >
> > > +#include "xe_page_reclaim.h"
> > > #include "xe_pt_walk.h"
> > >
> > > struct xe_bo;
> > > @@ -85,6 +86,8 @@ struct xe_vm_pgtable_update_op {
> > > bool bind;
> > > /** @rebind: is a rebind */
> > > bool rebind;
> > > + /** @prl: Backing pointer to page reclaim list of pt_update_ops */
> > > + struct xe_page_reclaim_list *prl;
> >
> > Can you move this above the bools in the layout of xe_vm_pgtable_update_op,
> > likely just below "struct xe_vma".
> >
>
> Ahh got it. Moved.
>
> > > };
> > >
> > > /** struct xe_vm_pgtable_update_ops: page table update operations */
> > > @@ -119,6 +122,8 @@ struct xe_vm_pgtable_update_ops {
> > > * slots are idle.
> > > */
> > > bool wait_vm_kernel;
> > > + /** @prl: embedded page reclaim list */
> > > + struct xe_page_reclaim_list prl;
> >
> > Same thing here, move just below "struct xe_exec_queue".
> >
> > Matt
> >
>
> Moved.
>
> Brian
>
> > > };
> > >
> > > #endif
> > > --
> > > 2.51.2
> > >
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH 06/11] drm/xe: Create page reclaim list on unbind
2025-11-25 18:34 ` Matthew Brost
@ 2025-11-25 19:01 ` Nguyen, Brian3
2025-11-25 19:07 ` Matthew Brost
0 siblings, 1 reply; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-25 19:01 UTC (permalink / raw)
To: Brost, Matthew
Cc: intel-xe@lists.freedesktop.org, Upadhyay, Tejas, Lin, Shuicheng,
Summers, Stuart
On Tuesday, November 25, 2025 10:34 AM, Matthew Brost wrote:
> On Tue, Nov 25, 2025 at 04:18:19AM -0700, Nguyen, Brian3 wrote:
> > On Saturday, November 22, 2025 11:18 AM, Matthew Brost wrote:
> > > On Tue, Nov 18, 2025 at 05:05:47PM +0800, Brian Nguyen wrote:
> > > > Page reclaim list (PRL) is preparation work for the page reclaim feature.
> > > > The PRL is firstly owned by pt_update_ops and all other page
> > > > reclaim operations will point back to this PRL. PRL generates its
> > > > entries during the unbind page walker, updating the PRL.
> > > >
> > > > This PRL is restricted to a 4K page, so 512 page entries at most.
> > > >
> > > > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > > > ---
> > > > drivers/gpu/drm/xe/Makefile | 1 +
> > > > drivers/gpu/drm/xe/regs/xe_gtt_defs.h | 1 +
> > > > drivers/gpu/drm/xe/xe_page_reclaim.c | 52 ++++++++++++
> > > > drivers/gpu/drm/xe/xe_page_reclaim.h | 49 ++++++++++++
> > > > drivers/gpu/drm/xe/xe_pt.c | 109 ++++++++++++++++++++++++++
> > > > drivers/gpu/drm/xe/xe_pt_types.h | 5 ++
> > > > 6 files changed, 217 insertions(+) create mode 100644
> > > > drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.h
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/Makefile
> > > > b/drivers/gpu/drm/xe/Makefile index e4b273b025d2..048e6c93271c
> > > > 100644
> > > > --- a/drivers/gpu/drm/xe/Makefile
> > > > +++ b/drivers/gpu/drm/xe/Makefile
> > > > @@ -95,6 +95,7 @@ xe-y += xe_bb.o \
> > > > xe_oa.o \
> > > > xe_observation.o \
> > > > xe_pagefault.o \
> > > > + xe_page_reclaim.o \
> > > > xe_pat.o \
> > > > xe_pci.o \
> > > > xe_pcode.o \
> > > > diff --git a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > index 4389e5a76f89..4d83461e538b 100644
> > > > --- a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > +++ b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > @@ -9,6 +9,7 @@
> > > > #define XELPG_GGTT_PTE_PAT0 BIT_ULL(52)
> > > > #define XELPG_GGTT_PTE_PAT1 BIT_ULL(53)
> > > >
> > > > +#define XE_PTE_ADDR_MASK GENMASK_ULL(51, 12)
> > > > #define GGTT_PTE_VFID GENMASK_ULL(11, 2)
> > > >
> > > > #define GUC_GGTT_TOP 0xFEE00000
> > > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > new file mode 100644
> > > > index 000000000000..a0d15efff58c
> > > > --- /dev/null
> > > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > @@ -0,0 +1,52 @@
> > > > +// SPDX-License-Identifier: MIT
> > > > +/*
> > > > + * Copyright (c) 2025 Intel Corporation */
> > > > +
> > > > +#include <linux/bitfield.h>
> > > > +#include <linux/kref.h>
> > > > +#include <linux/mm.h>
> > > > +#include <linux/slab.h>
> > > > +
> > > > +#include "xe_page_reclaim.h"
> > > > +
> > > > +#include "regs/xe_gt_regs.h"
> > > > +#include "xe_assert.h"
> > > > +#include "xe_macros.h"
> > > > +
> > > > +/**
> > > > + * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
> > > > + * @prl: Page reclaim list to reset
> > > > + *
> > > > + * Clears the entries pointer and marks the list as invalid so
> > > > + * future use know PRL is unusable. It is expected that the
> > > > +entries
> > > > + * have already been released.
> > > > + */
> > > > +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list
> > > > +*prl) {
> > > > + prl->entries = NULL;
> > > > + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST; }
> > > > +
> > > > +/**
> > > > + * xe_page_reclaim_list_alloc_entries() - Allocate page reclaim
> > > > +list entries
> > > > + * @prl: Page reclaim list to allocate entries for
> > > > + *
> > > > + * Allocate one 4K page for the PRL entries, otherwise assign prl->entries to NULL.
> > > > + */
> > > > +int xe_page_reclaim_list_alloc_entries(struct
> > > > +xe_page_reclaim_list
> > > > +*prl) {
> > > > + struct page *page;
> > > > +
> > > > + XE_WARN_ON(prl->entries != NULL);
> > > > + if (prl->entries)
> > > > + return 0;
> > > > +
> > > > + page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> > > > + if (page) {
> > > > + prl->entries = page_address(page);
> > > > + prl->num_entries = 0;
> > > > + }
> > > > +
> > > > + return page ? 0 : -ENOMEM;
> > > > +}
> > > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > new file mode 100644
> > > > index 000000000000..d066d7d97f79
> > > > --- /dev/null
> > > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > @@ -0,0 +1,49 @@
> > > > +/* SPDX-License-Identifier: MIT */
> > > > +/*
> > > > + * Copyright (c) 2025 Intel Corporation */
> > > > +
> > > > +#ifndef _XE_PAGE_RECLAIM_H_
> > > > +#define _XE_PAGE_RECLAIM_H_
> > > > +
> > > > +#include <linux/kref.h>
> > > > +#include <linux/mm.h>
> > > > +#include <linux/slab.h>
> > > > +#include <linux/types.h>
> > > > +#include <linux/workqueue.h>
> > > > +
> > > > +#define XE_PAGE_RECLAIM_MAX_ENTRIES 512
> > > > +#define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
> > > > +
> > > > +struct xe_guc_page_reclaim_entry {
> > > > + u32 valid:1;
> > > > + u32 reclamation_size:6;
> > > > + u32 reserved:5;
> > > > + u32 address_lo:20;
> > > > + u32 address_hi:20;
> > > > + u32 reserved1:12;
> > >
> > > This is wire interface with the GuC. Bitfields can based on
> > > endianess of the CPU. I know this is a iGPU feature for now but it
> > > could possibly change in the future, with that, to future proof can the layout of this be setup via defines / macros?
> > >
> >
> > Sure, I moved over to the typical FIELD_PREP/GENMASK macros used
> > elsewhere for the guc interfaces.
> >
> > > > +} __packed;
> > > > +
> > > > +struct xe_page_reclaim_list {
> > > > + /** @entries: array of page reclaim entries, page allocated */
> > > > + struct xe_guc_page_reclaim_entry *entries;
> > > > + /** @num_entries: number of entries */
> > > > + int num_entries;
> > > > +#define XE_PAGE_RECLAIM_INVALID_LIST -1
> > > > +};
> > > > +
> > > > +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list
> > > > +*prl); int xe_page_reclaim_list_alloc_entries(struct
> > > > +xe_page_reclaim_list *prl); static inline void
> > > > +xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries) {
> > > > + if (entries)
> > > > + get_page(virt_to_page(entries)); }
> > > > +
> > > > +static inline void xe_page_reclaim_entries_put(struct
> > > > +xe_guc_page_reclaim_entry *entries) {
> > > > + if (entries)
> > > > + put_page(virt_to_page(entries)); }
> > >
> > > Kernel doc for static inlines.
> > >
> >
> > Added.
> >
> > > > +
> > > > +#endif /* _XE_PAGE_RECLAIM_H_ */
> > > > diff --git a/drivers/gpu/drm/xe/xe_pt.c
> > > > b/drivers/gpu/drm/xe/xe_pt.c index 884127b4d97d..532a047676d4
> > > > 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pt.c
> > > > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > > > @@ -12,6 +12,7 @@
> > > > #include "xe_exec_queue.h"
> > > > #include "xe_gt.h"
> > > > #include "xe_migrate.h"
> > > > +#include "xe_page_reclaim.h"
> > > > #include "xe_pt_types.h"
> > > > #include "xe_pt_walk.h"
> > > > #include "xe_res_cursor.h"
> > > > @@ -1538,6 +1539,9 @@ struct xe_pt_stage_unbind_walk {
> > > > /* Output */
> > > > /* @wupd: Structure to track the page-table updates we're building */
> > > > struct xe_walk_update wupd;
> > > > +
> > > > + /** @prl: Backing pointer to page reclaim list in pt_update_ops */
> > > > + struct xe_page_reclaim_list *prl;
> > > > };
> > > >
> > > > /*
> > > > @@ -1572,6 +1576,69 @@ static bool xe_pt_check_kill(u64 addr, u64
> > > > next,
> > > unsigned int level,
> > > > return false;
> > > > }
> > > >
> > > > +/* Huge 2MB leaf lives directly in a level-1 table and has no
> > > > +children */ static bool is_large_pte(struct xe_pt *pte) {
> > > > + return pte->level == 1 && !pte->base.children; }
> > > > +
> > > > +/* page_size = 2^(reclamation_size + 12) */ #define
> > > > +COMPUTE_RECLAIM_ADDRESS_MASK(page_size)
> > > \
> > > > +({ \
> > > > + BUILD_BUG_ON(!__builtin_constant_p(page_size)); \
> > > > + ilog2(page_size) - 12; \
> > >
> > > s/12/XE_PTE_SHIFT ?
> > >
> >
> > Done.
> >
> > > > +})
> > > > +
> > > > +static void generate_reclaim_entry(struct xe_tile *tile,
> > > > + struct xe_page_reclaim_list *prl,
> > > > + u64 pte,
> > > > + struct xe_pt *xe_child)
> > >
> > > Nit, xe_pt can be on the same line as 'u64 pte'.
> > >
> >
> > Done.
> >
> > > > +{
> > > > + struct xe_guc_page_reclaim_entry *reclaim_entries = prl->entries;
> > > > + u64 phys_addr = pte & XE_PTE_ADDR_MASK;
> > > > + const u64 field_mask = GENMASK_ULL(19, 0);
> > > > + u32 reclamation_size;
> > >
> > > Nit, I'd make the last variable declared on the stack for readability.
> > >
> >
> > Ahh got it, reclamation_size moved to after num_entries.
> >
> > > > + const uint max_entries = XE_PAGE_RECLAIM_MAX_ENTRIES;
> > > > + int num_entries = prl->num_entries;
> > > > +
> > > > + xe_tile_assert(tile, xe_child->level <= MAX_HUGEPTE_LEVEL);
> > > > + xe_tile_assert(tile, reclaim_entries);
> > > > +
> > > > + if (num_entries == XE_PAGE_RECLAIM_INVALID_LIST)
> > > > + return;
> > > > +
> > > > + /* Overflow: mark as invalid through num_entries */
> > > > + if (num_entries >= max_entries) {
> > > > + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> > > > + return;
> > > > + }
> > > > +
> > > > + /**
> > > > + * reclamation_size indicates the size of the page to be
> > > > + * invalidated and flushed from non-coherent cache.
> > > > + * Page size is computed as 2^(reclamation_size+12) bytes.
> > > > + * Only valid for these specific levels.
> > > > + */
> > > > +
> > > > + if (xe_child->level == 0 && !(pte & XE_PTE_PS64))
> > > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_4K); /* reclamation_size = 0 */
> > > > + else if (xe_child->level == 0)
> > > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_64K); /* reclamation_size = 1 */
> > > > + else if (is_large_pte(xe_child))
> > > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_2M); /*
> > > > +reclamation_size = 2 */
> > >
> > > What happens if we have 1G page? That doesn't seem to be handled.
> > >
> >
> > Page reclamation hardware does not support 1G page. This should be
> > handled and fallback to standard TLB invalidation PPC flush. I can add
>
> Make sense that we fallback. I am however not seeing where this fallback occurs.
>
Ohh, I see it now: I was silently dropping the 1G pages... My bad. I'll follow the new
changes suggested below.
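Concretely, the walker-side check could treat any childless entry as a leaf instead of whitelisting levels, so 1G leaves still reach generate_reclaim_entry() and get rejected there rather than skipped silently. A minimal userspace sketch of that predicate (struct and names are hypothetical stand-ins, not the real xe_pt):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct xe_pt: only the fields the check needs. */
struct pt_sketch {
	unsigned int level;	/* 0 = lowest page-table level (4K/64K PTEs) */
	bool has_children;	/* false = leaf PTE at this level */
};

/* Accept every leaf; unsupported sizes are rejected later, not skipped here. */
static bool is_reclaim_leaf(const struct pt_sketch *pt)
{
	return pt->level == 0 || !pt->has_children;
}
```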
> > a comment somewhere discussing this but the format for PRL only
> > supports 4K, 64K, and 2M pages to reclaim. I'll add a comment here
> > mentioning the HW support being limited to these pages and rename the
> > is_large_pte to is_2m_pte.
> >
> > > > + else
> > > > + return;
>
> I would think for the fallback, we'd set prl->num_entries to XE_PAGE_RECLAIM_INVALID_LIST here.
>
> Maybe I'm missing something?
>
> Matt
>
Given the 1G page case, I'll follow this idea: invalidate the PRL here, and change the if statement in
the generate_reclaim_entry() caller to accept all PTEs so unsupported sizes are invalidated in this function.
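A userspace sketch of that plan, assuming the documented encoding page_size = 2^(reclamation_size + 12) (the inline /* reclamation_size = 1 */ style comments in the patch suggest a denser 0/1/2 encoding, so the exact field values here are an assumption). The helper names and constants are hypothetical; the point is that anything other than 4K/64K/2M invalidates the list so the TLB invalidation falls back to a full PPC flush:

```c
#include <assert.h>
#include <stdint.h>

#define XE_PTE_SHIFT 12
#define PRL_INVALID  (-1)

struct prl_sketch {
	int num_entries;	/* PRL_INVALID once the list is unusable */
};

/* page_size = 2^(reclamation_size + XE_PTE_SHIFT); only 4K/64K/2M supported. */
static int reclamation_size_for(uint64_t page_size)
{
	switch (page_size) {
	case 1ULL << 12:	/* 4K */
	case 1ULL << 16:	/* 64K */
	case 1ULL << 21:	/* 2M */
		return __builtin_ctzll(page_size) - XE_PTE_SHIFT;
	default:
		return PRL_INVALID;	/* e.g. 1G */
	}
}

/* Mirror of the proposed flow: an unsupported leaf invalidates the whole list. */
static void add_reclaim_entry(struct prl_sketch *prl, uint64_t page_size)
{
	int rs = reclamation_size_for(page_size);

	if (prl->num_entries == PRL_INVALID)
		return;
	if (rs == PRL_INVALID) {
		prl->num_entries = PRL_INVALID;	/* force full PPC flush fallback */
		return;
	}
	prl->num_entries++;
}
```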
> > > > +
> > > > + reclaim_entries[num_entries].valid = 1;
> > > > + reclaim_entries[num_entries].reclamation_size =
> > > > + reclamation_size;
> > > > + reclaim_entries[num_entries].address_lo =
> > > > + FIELD_GET(field_mask, phys_addr);
> > > > + reclaim_entries[num_entries].address_hi =
> > > > + FIELD_GET(field_mask, phys_addr >> 20);
> > >
> > > As suggested above, use macros/defines here to setup the entry.
> > >
> >
> > Got it, moved over to using other standard define macros.
> >
> > > > + prl->num_entries++;
> > > > +}
> > > > +
> > > > static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> > > > unsigned int level, u64 addr, u64 next,
> > > > struct xe_ptw **child,
> > > > @@ -1579,10 +1646,27 @@ static int xe_pt_stage_unbind_entry(struct
> > > > xe_ptw
> > > *parent, pgoff_t offset,
> > > > struct xe_pt_walk *walk)
> > > > {
> > > > struct xe_pt *xe_child = container_of(*child, typeof(*xe_child),
> > > > base);
> > > > + struct xe_pt_stage_unbind_walk *xe_walk =
> > > > + container_of(walk, typeof(*xe_walk), base);
> > > > + struct xe_device *xe = tile_to_xe(xe_walk->tile);
> > > >
> > > > XE_WARN_ON(!*child);
> > > > XE_WARN_ON(!level);
> > > >
> > > > + /* 4K and 64K Pages are level 0, large pte needs additional handling. */
> > > > + if (xe_walk->prl && (xe_child->level == 0 ||
> > > > +is_large_pte(xe_child))) {
So right here, I'll make the change to accept all the leaves of the walker and handle
the 1G case in generate_reclaim_entry().
Brian
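As an aside, the 20-bit address_lo/address_hi split from the entry format earlier in the thread can be sanity-checked in plain C. The masks below mirror XE_PTE_ADDR_MASK (bits 51:12) and GENMASK_ULL(19, 0) from the patch, but this is a standalone round-trip sketch, not the actual GuC wire layout; note that two 20-bit halves only cover physical addresses below 2^40:

```c
#include <assert.h>
#include <stdint.h>

#define PTE_ADDR_MASK	(((1ULL << 52) - 1) & ~((1ULL << 12) - 1))	/* bits 51:12 */
#define FIELD20		((1ULL << 20) - 1)				/* GENMASK_ULL(19, 0) */

struct entry_sketch {
	uint32_t address_lo;	/* phys_addr[19:0]  */
	uint32_t address_hi;	/* phys_addr[39:20] */
};

static struct entry_sketch pack_addr(uint64_t pte)
{
	uint64_t phys = pte & PTE_ADDR_MASK;	/* strip PTE flag bits */
	struct entry_sketch e = {
		.address_lo = phys & FIELD20,
		.address_hi = (phys >> 20) & FIELD20,
	};
	return e;
}

static uint64_t unpack_addr(struct entry_sketch e)
{
	return ((uint64_t)e.address_hi << 20) | e.address_lo;
}
```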
> > >
> > > And also here? 1G pages are unhandled? Please explain.
> > >
> >
> > As stated above, page reclamation only supports 4K, 64K, and 2M pages.
> > 1G page will have to fallback to the standard tlb invalidation with PPC flush.
> >
> > > > + struct iosys_map *leaf_map = &xe_child->bo->vmap;
> > > > + pgoff_t first = xe_pt_offset(addr, 0, walk);
> > > > + pgoff_t count = xe_pt_num_entries(addr, next, 0, walk);
> > > > +
> > > > + for (pgoff_t i = 0; i < count; i++) {
> > > > + u64 pte = xe_map_rd(xe, leaf_map, (first + i) * sizeof(u64),
> > > u64);
> > > > +
> > > > + generate_reclaim_entry(xe_walk->tile, xe_walk->prl,
> > > > + pte, xe_child);
> > > > + }
> > > > + }
> > > > +
> > > > xe_pt_check_kill(addr, next, level - 1, xe_child, action, walk);
> > > >
> > > > return 0;
> > > > @@ -1654,6 +1738,8 @@ static unsigned int
> > > > xe_pt_stage_unbind(struct xe_tile *tile, {
> > > > u64 start = range ? xe_svm_range_start(range) : xe_vma_start(vma);
> > > > u64 end = range ? xe_svm_range_end(range) : xe_vma_end(vma);
> > > > + struct xe_vm_pgtable_update_op *pt_update_op =
> > > > + container_of(entries, struct xe_vm_pgtable_update_op,
> > > entries[0]);
> > > > struct xe_pt_stage_unbind_walk xe_walk = {
> > > > .base = {
> > > > .ops = &xe_pt_stage_unbind_ops, @@ -1665,6 +1751,7 @@ static
> > > > unsigned int xe_pt_stage_unbind(struct xe_tile
> > > *tile,
> > > > .modified_start = start,
> > > > .modified_end = end,
> > > > .wupd.entries = entries,
> > > > + .prl = pt_update_op->prl,
> > > > };
> > > > struct xe_pt *pt = vm->pt_root[tile->id];
> > > >
> > > > @@ -1897,6 +1984,7 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > > > struct xe_vm_pgtable_update_ops *pt_update_ops,
> > > > struct xe_vma *vma)
> > > > {
> > > > + struct xe_device *xe = tile_to_xe(tile);
> > > > u32 current_op = pt_update_ops->current_op;
> > > > struct xe_vm_pgtable_update_op *pt_op = &pt_update_ops-
> > > >ops[current_op];
> > > > int err;
> > > > @@ -1914,6 +2002,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > > > pt_op->vma = vma;
> > > > pt_op->bind = false;
> > > > pt_op->rebind = false;
> > > > + /* Maintain one PRL located in pt_update_ops that all others in
> > > > +unbind op
> > > reference */
> > > > + if (xe->info.has_page_reclaim_hw_assist && !pt_update_ops->prl.entries) {
> > > > + err = xe_page_reclaim_list_alloc_entries(&pt_update_ops->prl);
> > > > + if (err < 0)
> > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > >
> > > I don't think you need to call xe_page_reclaim_list_invalidate, right?
> > > If xe_page_reclaim_list_alloc_entries fails the prl should be in the init state.
> > >
> >
> > Yes. I'll drop this call for now then.
> >
> > > > + }
> > > > + pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl :
> > > > +NULL;
> > > >
> > > > err = vma_reserve_fences(tile_to_xe(tile), vma);
> > > > if (err)
> > > > @@ -1921,6 +2016,13 @@ static int unbind_op_prepare(struct xe_tile
> > > > *tile,
> > > >
> > > > pt_op->num_entries = xe_pt_stage_unbind(tile, xe_vma_vm(vma),
> > > > vma, NULL, pt_op->entries);
> > > > + /* Free PRL if list declared as invalid */
> > > > + if (pt_update_ops->prl.entries &&
> > > > + pt_update_ops->prl.num_entries == XE_PAGE_RECLAIM_INVALID_LIST) {
> > > > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > > > + pt_op->prl = NULL;
> > > > + pt_update_ops->prl.entries = NULL;
> > >
> > > Call xe_page_reclaim_list_invalidate for clarity?
> > >
> >
> > Updated.
> >
> > > > + }
> > > >
> > > > xe_vm_dbg_print_entries(tile_to_xe(tile), pt_op->entries,
> > > > pt_op->num_entries, false);
> > > > @@ -1979,6 +2081,7 @@ static int unbind_range_prepare(struct xe_vm *vm,
> > > > pt_op->vma = XE_INVALID_VMA;
> > > > pt_op->bind = false;
> > > > pt_op->rebind = false;
> > > > + pt_op->prl = NULL;
> > > >
> > > > pt_op->num_entries = xe_pt_stage_unbind(tile, vm, NULL, range,
> > > > pt_op->entries);
> > > > @@ -2096,6 +2199,7 @@ xe_pt_update_ops_init(struct
> > > xe_vm_pgtable_update_ops *pt_update_ops)
> > > > init_llist_head(&pt_update_ops->deferred);
> > > > pt_update_ops->start = ~0x0ull;
> > > > pt_update_ops->last = 0x0ull;
> > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > >
> > > Can we introduce a function called xe_page_reclaim_list_init for
> > > clarity? It might do the same thing as
> > > xe_page_reclaim_list_invalidate but it would make this a little more
> > > clear. Likewise later in the series when a job is created, you can call xe_page_reclaim_list_init there too.
> > >
> >
> > Sure, I'll write another helper for this and modify both those PRL creation points.
> >
> > > > }
> > > >
> > > > /**
> > > > @@ -2518,6 +2622,11 @@ void xe_pt_update_ops_fini(struct xe_tile *tile, struct xe_vma_ops *vops)
> > > > &vops->pt_update_ops[tile->id];
> > > > int i;
> > > >
> > > > + if (pt_update_ops->prl.entries) {
> > > > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > > + }
> > > > +
> > > > lockdep_assert_held(&vops->vm->lock);
> > > > xe_vm_assert_held(vops->vm);
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_pt_types.h
> > > > b/drivers/gpu/drm/xe/xe_pt_types.h
> > > > index 881f01e14db8..26e5295f118e 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pt_types.h
> > > > +++ b/drivers/gpu/drm/xe/xe_pt_types.h
> > > > @@ -8,6 +8,7 @@
> > > >
> > > > #include <linux/types.h>
> > > >
> > > > +#include "xe_page_reclaim.h"
> > > > #include "xe_pt_walk.h"
> > > >
> > > > struct xe_bo;
> > > > @@ -85,6 +86,8 @@ struct xe_vm_pgtable_update_op {
> > > > bool bind;
> > > > /** @rebind: is a rebind */
> > > > bool rebind;
> > > > + /** @prl: Backing pointer to page reclaim list of pt_update_ops */
> > > > + struct xe_page_reclaim_list *prl;
> > >
> > > Can you move this above the bools in the layout of
> > > xe_vm_pgtable_update_op, likely just below "struct xe_vma".
> > >
> >
> > Ahh got it. Moved.
> >
> > > > };
> > > >
> > > > /** struct xe_vm_pgtable_update_ops: page table update operations
> > > > */ @@ -119,6 +122,8 @@ struct xe_vm_pgtable_update_ops {
> > > > * slots are idle.
> > > > */
> > > > bool wait_vm_kernel;
> > > > + /** @prl: embedded page reclaim list */
> > > > + struct xe_page_reclaim_list prl;
> > >
> > > Same thing here, move just below "struct xe_exec_queue".
> > >
> > > Matt
> > >
> >
> > Moved.
> >
> > Brian
> >
> > > > };
> > > >
> > > > #endif
> > > > --
> > > > 2.51.2
> > > >
* Re: [PATCH 06/11] drm/xe: Create page reclaim list on unbind
2025-11-25 19:01 ` Nguyen, Brian3
@ 2025-11-25 19:07 ` Matthew Brost
2025-11-25 19:46 ` Nguyen, Brian3
0 siblings, 1 reply; 51+ messages in thread
From: Matthew Brost @ 2025-11-25 19:07 UTC (permalink / raw)
To: Nguyen, Brian3
Cc: intel-xe@lists.freedesktop.org, Upadhyay, Tejas, Lin, Shuicheng,
Summers, Stuart
On Tue, Nov 25, 2025 at 12:01:25PM -0700, Nguyen, Brian3 wrote:
> On Tuesday, November 25, 2025 10:34 AM, Matthew Brost wrote:
> > On Tue, Nov 25, 2025 at 04:18:19AM -0700, Nguyen, Brian3 wrote:
> > > On Saturday, November 22, 2025 11:18 AM, Matthew Brost wrote:
> > > > On Tue, Nov 18, 2025 at 05:05:47PM +0800, Brian Nguyen wrote:
> > > > > Page reclaim list (PRL) is preparation work for the page reclaim feature.
> > > > > The PRL is firstly owned by pt_update_ops and all other page
> > > > > reclaim operations will point back to this PRL. PRL generates its
> > > > > entries during the unbind page walker, updating the PRL.
> > > > >
> > > > > This PRL is restricted to a 4K page, so 512 page entries at most.
> > > > >
> > > > > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > > > > ---
> > > > > drivers/gpu/drm/xe/Makefile | 1 +
> > > > > drivers/gpu/drm/xe/regs/xe_gtt_defs.h | 1 +
> > > > > drivers/gpu/drm/xe/xe_page_reclaim.c | 52 ++++++++++++
> > > > > drivers/gpu/drm/xe/xe_page_reclaim.h | 49 ++++++++++++
> > > > > drivers/gpu/drm/xe/xe_pt.c | 109 ++++++++++++++++++++++++++
> > > > > drivers/gpu/drm/xe/xe_pt_types.h | 5 ++
> > > > > 6 files changed, 217 insertions(+) create mode 100644
> > > > > drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/Makefile
> > > > > b/drivers/gpu/drm/xe/Makefile index e4b273b025d2..048e6c93271c
> > > > > 100644
> > > > > --- a/drivers/gpu/drm/xe/Makefile
> > > > > +++ b/drivers/gpu/drm/xe/Makefile
> > > > > @@ -95,6 +95,7 @@ xe-y += xe_bb.o \
> > > > > xe_oa.o \
> > > > > xe_observation.o \
> > > > > xe_pagefault.o \
> > > > > + xe_page_reclaim.o \
> > > > > xe_pat.o \
> > > > > xe_pci.o \
> > > > > xe_pcode.o \
> > > > > diff --git a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > index 4389e5a76f89..4d83461e538b 100644
> > > > > --- a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > +++ b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > @@ -9,6 +9,7 @@
> > > > > #define XELPG_GGTT_PTE_PAT0 BIT_ULL(52)
> > > > > #define XELPG_GGTT_PTE_PAT1 BIT_ULL(53)
> > > > >
> > > > > +#define XE_PTE_ADDR_MASK GENMASK_ULL(51, 12)
> > > > > #define GGTT_PTE_VFID GENMASK_ULL(11, 2)
> > > > >
> > > > > #define GUC_GGTT_TOP 0xFEE00000
> > > > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > new file mode 100644
> > > > > index 000000000000..a0d15efff58c
> > > > > --- /dev/null
> > > > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > @@ -0,0 +1,52 @@
> > > > > +// SPDX-License-Identifier: MIT
> > > > > +/*
> > > > > + * Copyright (c) 2025 Intel Corporation */
> > > > > +
> > > > > +#include <linux/bitfield.h>
> > > > > +#include <linux/kref.h>
> > > > > +#include <linux/mm.h>
> > > > > +#include <linux/slab.h>
> > > > > +
> > > > > +#include "xe_page_reclaim.h"
> > > > > +
> > > > > +#include "regs/xe_gt_regs.h"
> > > > > +#include "xe_assert.h"
> > > > > +#include "xe_macros.h"
> > > > > +
> > > > > +/**
> > > > > + * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
> > > > > + * @prl: Page reclaim list to reset
> > > > > + *
> > > > > + * Clears the entries pointer and marks the list as invalid so
> > > > > + * future use know PRL is unusable. It is expected that the
> > > > > +entries
> > > > > + * have already been released.
> > > > > + */
> > > > > +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list
> > > > > +*prl) {
> > > > > + prl->entries = NULL;
> > > > > + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST; }
> > > > > +
> > > > > +/**
> > > > > + * xe_page_reclaim_list_alloc_entries() - Allocate page reclaim
> > > > > +list entries
> > > > > + * @prl: Page reclaim list to allocate entries for
> > > > > + *
> > > > > + * Allocate one 4K page for the PRL entries, otherwise assign prl->entries to NULL.
> > > > > + */
> > > > > +int xe_page_reclaim_list_alloc_entries(struct
> > > > > +xe_page_reclaim_list
> > > > > +*prl) {
> > > > > + struct page *page;
> > > > > +
> > > > > + XE_WARN_ON(prl->entries != NULL);
> > > > > + if (prl->entries)
> > > > > + return 0;
> > > > > +
> > > > > + page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> > > > > + if (page) {
> > > > > + prl->entries = page_address(page);
> > > > > + prl->num_entries = 0;
> > > > > + }
> > > > > +
> > > > > + return page ? 0 : -ENOMEM;
> > > > > +}
> > > > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > new file mode 100644
> > > > > index 000000000000..d066d7d97f79
> > > > > --- /dev/null
> > > > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > @@ -0,0 +1,49 @@
> > > > > +/* SPDX-License-Identifier: MIT */
> > > > > +/*
> > > > > + * Copyright (c) 2025 Intel Corporation */
> > > > > +
> > > > > +#ifndef _XE_PAGE_RECLAIM_H_
> > > > > +#define _XE_PAGE_RECLAIM_H_
> > > > > +
> > > > > +#include <linux/kref.h>
> > > > > +#include <linux/mm.h>
> > > > > +#include <linux/slab.h>
> > > > > +#include <linux/types.h>
> > > > > +#include <linux/workqueue.h>
> > > > > +
> > > > > +#define XE_PAGE_RECLAIM_MAX_ENTRIES 512
> > > > > +#define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
> > > > > +
> > > > > +struct xe_guc_page_reclaim_entry {
> > > > > + u32 valid:1;
> > > > > + u32 reclamation_size:6;
> > > > > + u32 reserved:5;
> > > > > + u32 address_lo:20;
> > > > > + u32 address_hi:20;
> > > > > + u32 reserved1:12;
> > > >
> > > > This is wire interface with the GuC. Bitfields can based on
> > > > endianess of the CPU. I know this is a iGPU feature for now but it
> > > > could possibly change in the future, with that, to future proof can the layout of this be setup via defines / macros?
> > > >
> > >
> > > Sure, I moved over to the typical FIELD_PREP/GENMASK macros used
> > > elsewhere for the guc interfaces.
> > >
> > > > > +} __packed;
> > > > > +
> > > > > +struct xe_page_reclaim_list {
> > > > > + /** @entries: array of page reclaim entries, page allocated */
> > > > > + struct xe_guc_page_reclaim_entry *entries;
> > > > > + /** @num_entries: number of entries */
> > > > > + int num_entries;
> > > > > +#define XE_PAGE_RECLAIM_INVALID_LIST -1
> > > > > +};
> > > > > +
> > > > > +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list
> > > > > +*prl); int xe_page_reclaim_list_alloc_entries(struct
> > > > > +xe_page_reclaim_list *prl); static inline void
> > > > > +xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries) {
> > > > > + if (entries)
> > > > > + get_page(virt_to_page(entries)); }
> > > > > +
> > > > > +static inline void xe_page_reclaim_entries_put(struct
> > > > > +xe_guc_page_reclaim_entry *entries) {
> > > > > + if (entries)
> > > > > + put_page(virt_to_page(entries)); }
> > > >
> > > > Kernel doc for static inlines.
> > > >
> > >
> > > Added.
> > >
> > > > > +
> > > > > +#endif /* _XE_PAGE_RECLAIM_H_ */
> > > > > diff --git a/drivers/gpu/drm/xe/xe_pt.c
> > > > > b/drivers/gpu/drm/xe/xe_pt.c index 884127b4d97d..532a047676d4
> > > > > 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_pt.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > > > > @@ -12,6 +12,7 @@
> > > > > #include "xe_exec_queue.h"
> > > > > #include "xe_gt.h"
> > > > > #include "xe_migrate.h"
> > > > > +#include "xe_page_reclaim.h"
> > > > > #include "xe_pt_types.h"
> > > > > #include "xe_pt_walk.h"
> > > > > #include "xe_res_cursor.h"
> > > > > @@ -1538,6 +1539,9 @@ struct xe_pt_stage_unbind_walk {
> > > > > /* Output */
> > > > > /* @wupd: Structure to track the page-table updates we're building */
> > > > > struct xe_walk_update wupd;
> > > > > +
> > > > > + /** @prl: Backing pointer to page reclaim list in pt_update_ops */
> > > > > + struct xe_page_reclaim_list *prl;
> > > > > };
> > > > >
> > > > > /*
> > > > > @@ -1572,6 +1576,69 @@ static bool xe_pt_check_kill(u64 addr, u64
> > > > > next,
> > > > unsigned int level,
> > > > > return false;
> > > > > }
> > > > >
> > > > > +/* Huge 2MB leaf lives directly in a level-1 table and has no children */
> > > > > +static bool is_large_pte(struct xe_pt *pte)
> > > > > +{
> > > > > +	return pte->level == 1 && !pte->base.children;
> > > > > +}
> > > > > +
> > > > > +/* page_size = 2^(reclamation_size + 12) */
> > > > > +#define COMPUTE_RECLAIM_ADDRESS_MASK(page_size)	\
> > > > > +({ \
> > > > > + BUILD_BUG_ON(!__builtin_constant_p(page_size)); \
> > > > > + ilog2(page_size) - 12; \
> > > >
> > > > s/12/XE_PTE_SHIFT ?
> > > >
> > >
> > > Done.
> > >
> > > > > +})
> > > > > +
> > > > > +static void generate_reclaim_entry(struct xe_tile *tile,
> > > > > + struct xe_page_reclaim_list *prl,
> > > > > + u64 pte,
> > > > > + struct xe_pt *xe_child)
> > > >
> > > > Nit, xe_pt can be on the same line as 'u64 pte'.
> > > >
> > >
> > > Done.
> > >
> > > > > +{
> > > > > + struct xe_guc_page_reclaim_entry *reclaim_entries = prl->entries;
> > > > > + u64 phys_addr = pte & XE_PTE_ADDR_MASK;
> > > > > + const u64 field_mask = GENMASK_ULL(19, 0);
> > > > > + u32 reclamation_size;
> > > >
> > > > Nit, I'd make the last variable declared on the stack for readability.
> > > >
> > >
> > > Ahh got it, reclamation_size moved to after num_entries.
> > >
> > > > > + const uint max_entries = XE_PAGE_RECLAIM_MAX_ENTRIES;
> > > > > + int num_entries = prl->num_entries;
> > > > > +
> > > > > + xe_tile_assert(tile, xe_child->level <= MAX_HUGEPTE_LEVEL);
> > > > > + xe_tile_assert(tile, reclaim_entries);
> > > > > +
> > > > > + if (num_entries == XE_PAGE_RECLAIM_INVALID_LIST)
> > > > > + return;
> > > > > +
> > > > > + /* Overflow: mark as invalid through num_entries */
> > > > > + if (num_entries >= max_entries) {
> > > > > + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> > > > > + return;
> > > > > + }
> > > > > +
> > > > > + /**
> > > > > + * reclamation_size indicates the size of the page to be
> > > > > + * invalidated and flushed from non-coherent cache.
> > > > > + * Page size is computed as 2^(reclamation_size+12) bytes.
> > > > > + * Only valid for these specific levels.
> > > > > + */
> > > > > +
> > > > > + if (xe_child->level == 0 && !(pte & XE_PTE_PS64))
> > > > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_4K); /* reclamation_size = 0 */
> > > > > + else if (xe_child->level == 0)
> > > > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_64K); /* reclamation_size = 1 */
> > > > > +	else if (is_large_pte(xe_child))
> > > > > +		reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_2M); /* reclamation_size = 2 */
> > > >
> > > > What happens if we have 1G page? That doesn't seem to be handled.
> > > >
> > >
> > > Page reclamation hardware does not support 1G page. This should be
> > > handled and fallback to standard TLB invalidation PPC flush. I can add
> >
> > Makes sense that we fall back. I am, however, not seeing where this fallback occurs.
> >
>
> !! Ohh I got it now, I silently dropped the 1G pages... My bad. I'll follow the new
> changes suggested below.
>
> > > a comment somewhere discussing this but the format for PRL only
> > > supports 4K, 64K, and 2M pages to reclaim. I'll add a comment here
> > > mentioning the HW support being limited to these pages and rename the
> > > is_large_pte to is_2m_pte.
> > >
> > > > > + else
> > > > > + return;
> >
> > I would think for the fallback, we'd set prl->num_entries to XE_PAGE_RECLAIM_INVALID_LIST here.
> >
> > Maybe I'm missing something?
> >
> > Matt
> >
>
> Given the 1G page, I'll follow this idea: invalidate the PRL, and then change the if statement in the
> generate_reclaim_entry() caller to accept all PTEs and invalidate the list in this function above.
>
> > > > > +
> > > > > + reclaim_entries[num_entries].valid = 1;
> > > > > + reclaim_entries[num_entries].reclamation_size =
> > > > > + reclamation_size;
> > > > > + reclaim_entries[num_entries].address_lo =
> > > > > + FIELD_GET(field_mask, phys_addr);
> > > > > + reclaim_entries[num_entries].address_hi =
> > > > > + FIELD_GET(field_mask, phys_addr >> 20);
> > > >
> > > > As suggested above, use macros/defines here to setup the entry.
> > > >
> > >
> > > Got it, moved over to using other standard define macros.
> > >
> > > > > + prl->num_entries++;
> > > > > +}
> > > > > +
> > > > > static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> > > > > unsigned int level, u64 addr, u64 next,
> > > > > struct xe_ptw **child,
> > > > > @@ -1579,10 +1646,27 @@ static int xe_pt_stage_unbind_entry(struct
> > > > > xe_ptw
> > > > *parent, pgoff_t offset,
> > > > > struct xe_pt_walk *walk)
> > > > > {
> > > > > struct xe_pt *xe_child = container_of(*child, typeof(*xe_child),
> > > > > base);
> > > > > + struct xe_pt_stage_unbind_walk *xe_walk =
> > > > > + container_of(walk, typeof(*xe_walk), base);
> > > > > + struct xe_device *xe = tile_to_xe(xe_walk->tile);
> > > > >
> > > > > XE_WARN_ON(!*child);
> > > > > XE_WARN_ON(!level);
> > > > >
> > > > > + /* 4K and 64K Pages are level 0, large pte needs additional handling. */
> > > > > +	if (xe_walk->prl && (xe_child->level == 0 || is_large_pte(xe_child))) {
>
> So right here, I'll make the change to accept all the leafs of the walker and handle
> the 1G case in generate_reclaim_entry().
>
It is possible we are even higher up the page table tree too (e.g., with
57-bit VAs there are two levels above 1G; with 48-bit VAs, one level). We
need to handle those cases as fallbacks to cache-flushing TLB
invalidations too.
Matt
> Brian
>
> > > >
> > > > And also here? 1G pages are unhandled? Please explain.
> > > >
> > >
> > > As stated above, page reclamation only supports 4K, 64K, and 2M pages.
> > > 1G page will have to fallback to the standard tlb invalidation with PPC flush.
> > >
> > > > > + struct iosys_map *leaf_map = &xe_child->bo->vmap;
> > > > > + pgoff_t first = xe_pt_offset(addr, 0, walk);
> > > > > + pgoff_t count = xe_pt_num_entries(addr, next, 0, walk);
> > > > > +
> > > > > + for (pgoff_t i = 0; i < count; i++) {
> > > > > + u64 pte = xe_map_rd(xe, leaf_map, (first + i) * sizeof(u64),
> > > > u64);
> > > > > +
> > > > > + generate_reclaim_entry(xe_walk->tile, xe_walk->prl,
> > > > > + pte, xe_child);
> > > > > + }
> > > > > + }
> > > > > +
> > > > > xe_pt_check_kill(addr, next, level - 1, xe_child, action, walk);
> > > > >
> > > > > return 0;
> > > > > @@ -1654,6 +1738,8 @@ static unsigned int
> > > > > xe_pt_stage_unbind(struct xe_tile *tile, {
> > > > > u64 start = range ? xe_svm_range_start(range) : xe_vma_start(vma);
> > > > > u64 end = range ? xe_svm_range_end(range) : xe_vma_end(vma);
> > > > > + struct xe_vm_pgtable_update_op *pt_update_op =
> > > > > + container_of(entries, struct xe_vm_pgtable_update_op,
> > > > entries[0]);
> > > > > struct xe_pt_stage_unbind_walk xe_walk = {
> > > > > .base = {
> > > > > .ops = &xe_pt_stage_unbind_ops, @@ -1665,6 +1751,7 @@ static
> > > > > unsigned int xe_pt_stage_unbind(struct xe_tile
> > > > *tile,
> > > > > .modified_start = start,
> > > > > .modified_end = end,
> > > > > .wupd.entries = entries,
> > > > > + .prl = pt_update_op->prl,
> > > > > };
> > > > > struct xe_pt *pt = vm->pt_root[tile->id];
> > > > >
> > > > > @@ -1897,6 +1984,7 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > > > > struct xe_vm_pgtable_update_ops *pt_update_ops,
> > > > > struct xe_vma *vma)
> > > > > {
> > > > > + struct xe_device *xe = tile_to_xe(tile);
> > > > > u32 current_op = pt_update_ops->current_op;
> > > > > struct xe_vm_pgtable_update_op *pt_op = &pt_update_ops-
> > > > >ops[current_op];
> > > > > int err;
> > > > > @@ -1914,6 +2002,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > > > > pt_op->vma = vma;
> > > > > pt_op->bind = false;
> > > > > pt_op->rebind = false;
> > > > > +	/* Maintain one PRL located in pt_update_ops that all others in unbind op reference */
> > > > > + if (xe->info.has_page_reclaim_hw_assist && !pt_update_ops->prl.entries) {
> > > > > + err = xe_page_reclaim_list_alloc_entries(&pt_update_ops->prl);
> > > > > + if (err < 0)
> > > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > >
> > > > I don't think you need to call xe_page_reclaim_list_invalidate, right?
> > > > If xe_page_reclaim_list_alloc_entries fails the prl should be in the init state.
> > > >
> > >
> > > Yes. I'll drop this call for now then.
> > >
> > > > > + }
> > > > > +	pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl : NULL;
> > > > >
> > > > > err = vma_reserve_fences(tile_to_xe(tile), vma);
> > > > > if (err)
> > > > > @@ -1921,6 +2016,13 @@ static int unbind_op_prepare(struct xe_tile
> > > > > *tile,
> > > > >
> > > > > pt_op->num_entries = xe_pt_stage_unbind(tile, xe_vma_vm(vma),
> > > > > vma, NULL, pt_op->entries);
> > > > > + /* Free PRL if list declared as invalid */
> > > > > + if (pt_update_ops->prl.entries &&
> > > > > + pt_update_ops->prl.num_entries == XE_PAGE_RECLAIM_INVALID_LIST) {
> > > > > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > > > > + pt_op->prl = NULL;
> > > > > + pt_update_ops->prl.entries = NULL;
> > > >
> > > > Call xe_page_reclaim_list_invalidate for clarity?
> > > >
> > >
> > > Updated.
> > >
> > > > > + }
> > > > >
> > > > > xe_vm_dbg_print_entries(tile_to_xe(tile), pt_op->entries,
> > > > > pt_op->num_entries, false);
> > > > > @@ -1979,6 +2081,7 @@ static int unbind_range_prepare(struct xe_vm *vm,
> > > > > pt_op->vma = XE_INVALID_VMA;
> > > > > pt_op->bind = false;
> > > > > pt_op->rebind = false;
> > > > > + pt_op->prl = NULL;
> > > > >
> > > > > pt_op->num_entries = xe_pt_stage_unbind(tile, vm, NULL, range,
> > > > > pt_op->entries);
> > > > > @@ -2096,6 +2199,7 @@ xe_pt_update_ops_init(struct
> > > > xe_vm_pgtable_update_ops *pt_update_ops)
> > > > > init_llist_head(&pt_update_ops->deferred);
> > > > > pt_update_ops->start = ~0x0ull;
> > > > > pt_update_ops->last = 0x0ull;
> > > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > >
> > > > Can we introduce a function called xe_page_reclaim_list_init for
> > > > clarity? It might do the same thing as
> > > > xe_page_reclaim_list_invalidate but it would make this a little more
> > > > clear. Likewise later in the series when a job is created, you can call xe_page_reclaim_list_init there too.
> > > >
> > >
> > > Sure, I'll write another helper for this and modify both those PRL creation points.
> > >
> > > > > }
> > > > >
> > > > > /**
> > > > > @@ -2518,6 +2622,11 @@ void xe_pt_update_ops_fini(struct xe_tile *tile, struct xe_vma_ops *vops)
> > > > > &vops->pt_update_ops[tile->id];
> > > > > int i;
> > > > >
> > > > > + if (pt_update_ops->prl.entries) {
> > > > > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > > > + }
> > > > > +
> > > > > lockdep_assert_held(&vops->vm->lock);
> > > > > xe_vm_assert_held(vops->vm);
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > b/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > index 881f01e14db8..26e5295f118e 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > +++ b/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > @@ -8,6 +8,7 @@
> > > > >
> > > > > #include <linux/types.h>
> > > > >
> > > > > +#include "xe_page_reclaim.h"
> > > > > #include "xe_pt_walk.h"
> > > > >
> > > > > struct xe_bo;
> > > > > @@ -85,6 +86,8 @@ struct xe_vm_pgtable_update_op {
> > > > > bool bind;
> > > > > /** @rebind: is a rebind */
> > > > > bool rebind;
> > > > > + /** @prl: Backing pointer to page reclaim list of pt_update_ops */
> > > > > + struct xe_page_reclaim_list *prl;
> > > >
> > > > Can you move this above the bools in the layout of
> > > > xe_vm_pgtable_update_op, likely just below "struct xe_vma".
> > > >
> > >
> > > Ahh got it. Moved.
> > >
> > > > > };
> > > > >
> > > > > /** struct xe_vm_pgtable_update_ops: page table update operations
> > > > > */ @@ -119,6 +122,8 @@ struct xe_vm_pgtable_update_ops {
> > > > > * slots are idle.
> > > > > */
> > > > > bool wait_vm_kernel;
> > > > > + /** @prl: embedded page reclaim list */
> > > > > + struct xe_page_reclaim_list prl;
> > > >
> > > > Same thing here, move just below "struct xe_exec_queue".
> > > >
> > > > Matt
> > > >
> > >
> > > Moved.
> > >
> > > Brian
> > >
> > > > > };
> > > > >
> > > > > #endif
> > > > > --
> > > > > 2.51.2
> > > > >
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH 06/11] drm/xe: Create page reclaim list on unbind
2025-11-25 19:07 ` Matthew Brost
@ 2025-11-25 19:46 ` Nguyen, Brian3
2025-11-25 22:35 ` Matthew Brost
0 siblings, 1 reply; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-25 19:46 UTC (permalink / raw)
To: Brost, Matthew
Cc: intel-xe@lists.freedesktop.org, Upadhyay, Tejas, Lin, Shuicheng,
Summers, Stuart
On Tuesday, November 25, 2025 11:07 AM, Matthew Brost wrote:
> On Tue, Nov 25, 2025 at 12:01:25PM -0700, Nguyen, Brian3 wrote:
> > On Tuesday, November 25, 2025 10:34 AM, Matthew Brost wrote:
> > > On Tue, Nov 25, 2025 at 04:18:19AM -0700, Nguyen, Brian3 wrote:
> > > > On Saturday, November 22, 2025 11:18 AM, Matthew Brost wrote:
> > > > > On Tue, Nov 18, 2025 at 05:05:47PM +0800, Brian Nguyen wrote:
> > > > > > Page reclaim list (PRL) is preparation work for the page reclaim feature.
> > > > > > The PRL is firstly owned by pt_update_ops and all other page
> > > > > > reclaim operations will point back to this PRL. PRL generates
> > > > > > its entries during the unbind page walker, updating the PRL.
> > > > > >
> > > > > > This PRL is restricted to a 4K page, so 512 page entries at most.
> > > > > >
> > > > > > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > > > > > ---
> > > > > > drivers/gpu/drm/xe/Makefile | 1 +
> > > > > > drivers/gpu/drm/xe/regs/xe_gtt_defs.h | 1 +
> > > > > > drivers/gpu/drm/xe/xe_page_reclaim.c | 52 ++++++++++++
> > > > > > drivers/gpu/drm/xe/xe_page_reclaim.h | 49 ++++++++++++
> > > > > > drivers/gpu/drm/xe/xe_pt.c | 109 ++++++++++++++++++++++++++
> > > > > > drivers/gpu/drm/xe/xe_pt_types.h | 5 ++
> > > > > > 6 files changed, 217 insertions(+) create mode 100644
> > > > > > drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > > create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/xe/Makefile
> > > > > > b/drivers/gpu/drm/xe/Makefile index e4b273b025d2..048e6c93271c
> > > > > > 100644
> > > > > > --- a/drivers/gpu/drm/xe/Makefile
> > > > > > +++ b/drivers/gpu/drm/xe/Makefile
> > > > > > @@ -95,6 +95,7 @@ xe-y += xe_bb.o \
> > > > > > xe_oa.o \
> > > > > > xe_observation.o \
> > > > > > xe_pagefault.o \
> > > > > > + xe_page_reclaim.o \
> > > > > > xe_pat.o \
> > > > > > xe_pci.o \
> > > > > > xe_pcode.o \
> > > > > > diff --git a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > > b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > > index 4389e5a76f89..4d83461e538b 100644
> > > > > > --- a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > > +++ b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > > @@ -9,6 +9,7 @@
> > > > > > #define XELPG_GGTT_PTE_PAT0 BIT_ULL(52)
> > > > > > #define XELPG_GGTT_PTE_PAT1 BIT_ULL(53)
> > > > > >
> > > > > > +#define XE_PTE_ADDR_MASK GENMASK_ULL(51, 12)
> > > > > > #define GGTT_PTE_VFID GENMASK_ULL(11, 2)
> > > > > >
> > > > > > #define GUC_GGTT_TOP 0xFEE00000
> > > > > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > > b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > > new file mode 100644
> > > > > > index 000000000000..a0d15efff58c
> > > > > > --- /dev/null
> > > > > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > > @@ -0,0 +1,52 @@
> > > > > > +// SPDX-License-Identifier: MIT
> > > > > > +/*
> > > > > > + * Copyright (c) 2025 Intel Corporation */
> > > > > > +
> > > > > > +#include <linux/bitfield.h>
> > > > > > +#include <linux/kref.h>
> > > > > > +#include <linux/mm.h>
> > > > > > +#include <linux/slab.h>
> > > > > > +
> > > > > > +#include "xe_page_reclaim.h"
> > > > > > +
> > > > > > +#include "regs/xe_gt_regs.h"
> > > > > > +#include "xe_assert.h"
> > > > > > +#include "xe_macros.h"
> > > > > > +
> > > > > > +/**
> > > > > > + * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
> > > > > > + * @prl: Page reclaim list to reset
> > > > > > + *
> > > > > > + * Clears the entries pointer and marks the list as invalid so
> > > > > > + * future users know the PRL is unusable. It is expected that
> > > > > > + * the entries have already been released.
> > > > > > + */
> > > > > > +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl)
> > > > > > +{
> > > > > > +	prl->entries = NULL;
> > > > > > +	prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> > > > > > +}
> > > > > > +
> > > > > > +/**
> > > > > > + * xe_page_reclaim_list_alloc_entries() - Allocate page
> > > > > > +reclaim list entries
> > > > > > + * @prl: Page reclaim list to allocate entries for
> > > > > > + *
> > > > > > + * Allocate one 4K page for the PRL entries, otherwise assign prl->entries to NULL.
> > > > > > + */
> > > > > > +int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl)
> > > > > > +{
> > > > > > + struct page *page;
> > > > > > +
> > > > > > + XE_WARN_ON(prl->entries != NULL);
> > > > > > + if (prl->entries)
> > > > > > + return 0;
> > > > > > +
> > > > > > + page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> > > > > > + if (page) {
> > > > > > + prl->entries = page_address(page);
> > > > > > + prl->num_entries = 0;
> > > > > > + }
> > > > > > +
> > > > > > + return page ? 0 : -ENOMEM;
> > > > > > +}
> > > > > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > > b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > > new file mode 100644
> > > > > > index 000000000000..d066d7d97f79
> > > > > > --- /dev/null
> > > > > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > > @@ -0,0 +1,49 @@
> > > > > > +/* SPDX-License-Identifier: MIT */
> > > > > > +/*
> > > > > > + * Copyright (c) 2025 Intel Corporation */
> > > > > > +
> > > > > > +#ifndef _XE_PAGE_RECLAIM_H_
> > > > > > +#define _XE_PAGE_RECLAIM_H_
> > > > > > +
> > > > > > +#include <linux/kref.h>
> > > > > > +#include <linux/mm.h>
> > > > > > +#include <linux/slab.h>
> > > > > > +#include <linux/types.h>
> > > > > > +#include <linux/workqueue.h>
> > > > > > +
> > > > > > +#define XE_PAGE_RECLAIM_MAX_ENTRIES 512
> > > > > > +#define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
> > > > > > +
> > > > > > +struct xe_guc_page_reclaim_entry {
> > > > > > + u32 valid:1;
> > > > > > + u32 reclamation_size:6;
> > > > > > + u32 reserved:5;
> > > > > > + u32 address_lo:20;
> > > > > > + u32 address_hi:20;
> > > > > > + u32 reserved1:12;
> > > > >
> > > > > This is a wire interface with the GuC. Bitfield layout can vary with
> > > > > the endianness of the CPU. I know this is an iGPU feature for now,
> > > > > but that could change in the future; to future-proof it, can the layout of this be set up via defines / macros?
> > > > >
> > > >
> > > > Sure, I moved over to the typical FIELD_PREP/GENMASK macros used
> > > > elsewhere for the guc interfaces.
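A minimal standalone sketch of the mask-based encoding Brian describes switching to, mirroring the bitfield layout quoted above (dw0: valid at bit 0, reclamation_size at bits 6:1, reserved bits 11:7, address_lo at bits 31:12; dw1: address_hi at bits 19:0). The PRL_* names are illustrative rather than the final kernel defines, and plain shifts stand in for FIELD_PREP()/GENMASK() so the sketch compiles on its own:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field helpers; positions follow the quoted bitfield struct. */
#define PRL_ENTRY_VALID			(1u << 0)
#define PRL_ENTRY_RECLAIM_SIZE(sz)	(((uint32_t)(sz) & 0x3fu) << 1)
#define PRL_ENTRY_ADDR_LO(pa)		(((uint32_t)((pa) & 0xfffffu)) << 12)
#define PRL_ENTRY_ADDR_HI(pa)		((uint32_t)(((pa) >> 20) & 0xfffffu))

/* Encode one 64-bit PRL entry as two explicit dwords, independent of
 * host endianness or compiler bitfield layout. */
static void prl_encode(uint32_t dw[2], uint64_t phys_addr, uint32_t size)
{
	dw[0] = PRL_ENTRY_VALID |
		PRL_ENTRY_RECLAIM_SIZE(size) |
		PRL_ENTRY_ADDR_LO(phys_addr);
	dw[1] = PRL_ENTRY_ADDR_HI(phys_addr);
}
```

In the kernel the same thing would be expressed with GENMASK()/FIELD_PREP(), which also gives compile-time checking that values fit their fields.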
> > > >
> > > > > > +} __packed;
> > > > > > +
> > > > > > +struct xe_page_reclaim_list {
> > > > > > + /** @entries: array of page reclaim entries, page allocated */
> > > > > > + struct xe_guc_page_reclaim_entry *entries;
> > > > > > + /** @num_entries: number of entries */
> > > > > > + int num_entries;
> > > > > > +#define XE_PAGE_RECLAIM_INVALID_LIST -1
> > > > > > +};
> > > > > > +
> > > > > > +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
> > > > > > +int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
> > > > > > +
> > > > > > +static inline void xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries)
> > > > > > +{
> > > > > > +	if (entries)
> > > > > > +		get_page(virt_to_page(entries));
> > > > > > +}
> > > > > > +
> > > > > > +static inline void xe_page_reclaim_entries_put(struct xe_guc_page_reclaim_entry *entries)
> > > > > > +{
> > > > > > +	if (entries)
> > > > > > +		put_page(virt_to_page(entries));
> > > > > > +}
> > > > >
> > > > > Kernel doc for static inlines.
> > > > >
> > > >
> > > > Added.
> > > >
> > > > > > +
> > > > > > +#endif /* _XE_PAGE_RECLAIM_H_ */
> > > > > > diff --git a/drivers/gpu/drm/xe/xe_pt.c
> > > > > > b/drivers/gpu/drm/xe/xe_pt.c index 884127b4d97d..532a047676d4
> > > > > > 100644
> > > > > > --- a/drivers/gpu/drm/xe/xe_pt.c
> > > > > > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > > > > > @@ -12,6 +12,7 @@
> > > > > > #include "xe_exec_queue.h"
> > > > > > #include "xe_gt.h"
> > > > > > #include "xe_migrate.h"
> > > > > > +#include "xe_page_reclaim.h"
> > > > > > #include "xe_pt_types.h"
> > > > > > #include "xe_pt_walk.h"
> > > > > > #include "xe_res_cursor.h"
> > > > > > @@ -1538,6 +1539,9 @@ struct xe_pt_stage_unbind_walk {
> > > > > > /* Output */
> > > > > > /* @wupd: Structure to track the page-table updates we're building */
> > > > > > struct xe_walk_update wupd;
> > > > > > +
> > > > > > + /** @prl: Backing pointer to page reclaim list in pt_update_ops */
> > > > > > + struct xe_page_reclaim_list *prl;
> > > > > > };
> > > > > >
> > > > > > /*
> > > > > > @@ -1572,6 +1576,69 @@ static bool xe_pt_check_kill(u64 addr,
> > > > > > u64 next,
> > > > > unsigned int level,
> > > > > > return false;
> > > > > > }
> > > > > >
> > > > > > +/* Huge 2MB leaf lives directly in a level-1 table and has no children */
> > > > > > +static bool is_large_pte(struct xe_pt *pte)
> > > > > > +{
> > > > > > +	return pte->level == 1 && !pte->base.children;
> > > > > > +}
> > > > > > +
> > > > > > +/* page_size = 2^(reclamation_size + 12) */
> > > > > > +#define COMPUTE_RECLAIM_ADDRESS_MASK(page_size)	\
> > > > > > +({ \
> > > > > > + BUILD_BUG_ON(!__builtin_constant_p(page_size)); \
> > > > > > + ilog2(page_size) - 12; \
> > > > >
> > > > > s/12/XE_PTE_SHIFT ?
> > > > >
> > > >
> > > > Done.
> > > >
> > > > > > +})
> > > > > > +
> > > > > > +static void generate_reclaim_entry(struct xe_tile *tile,
> > > > > > + struct xe_page_reclaim_list *prl,
> > > > > > + u64 pte,
> > > > > > + struct xe_pt *xe_child)
> > > > >
> > > > > Nit, xe_pt can be on the same line as 'u64 pte'.
> > > > >
> > > >
> > > > Done.
> > > >
> > > > > > +{
> > > > > > + struct xe_guc_page_reclaim_entry *reclaim_entries = prl->entries;
> > > > > > + u64 phys_addr = pte & XE_PTE_ADDR_MASK;
> > > > > > + const u64 field_mask = GENMASK_ULL(19, 0);
> > > > > > + u32 reclamation_size;
> > > > >
> > > > > Nit, I'd make the last variable declared on the stack for readability.
> > > > >
> > > >
> > > > Ahh got it, reclamation_size moved to after num_entries.
> > > >
> > > > > > + const uint max_entries = XE_PAGE_RECLAIM_MAX_ENTRIES;
> > > > > > + int num_entries = prl->num_entries;
> > > > > > +
> > > > > > + xe_tile_assert(tile, xe_child->level <= MAX_HUGEPTE_LEVEL);
> > > > > > + xe_tile_assert(tile, reclaim_entries);
> > > > > > +
> > > > > > + if (num_entries == XE_PAGE_RECLAIM_INVALID_LIST)
> > > > > > + return;
> > > > > > +
> > > > > > + /* Overflow: mark as invalid through num_entries */
> > > > > > + if (num_entries >= max_entries) {
> > > > > > + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> > > > > > + return;
> > > > > > + }
> > > > > > +
> > > > > > + /**
> > > > > > + * reclamation_size indicates the size of the page to be
> > > > > > + * invalidated and flushed from non-coherent cache.
> > > > > > + * Page size is computed as 2^(reclamation_size+12) bytes.
> > > > > > + * Only valid for these specific levels.
> > > > > > + */
> > > > > > +
> > > > > > + if (xe_child->level == 0 && !(pte & XE_PTE_PS64))
> > > > > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_4K); /* reclamation_size = 0 */
> > > > > > + else if (xe_child->level == 0)
> > > > > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_64K); /* reclamation_size = 1 */
> > > > > > +	else if (is_large_pte(xe_child))
> > > > > > +		reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_2M); /* reclamation_size = 2 */
> > > > >
> > > > > What happens if we have 1G page? That doesn't seem to be handled.
> > > > >
> > > >
> > > > Page reclamation hardware does not support 1G page. This should be
> > > > handled and fallback to standard TLB invalidation PPC flush. I can
> > > > add
> > >
> > > Makes sense that we fall back. I am, however, not seeing where this fallback occurs.
> > >
> >
> > !! Ohh I got it now, I silently dropped the 1G pages... My bad. I'll
> > follow the new changes suggested below.
> >
> > > > a comment somewhere discussing this but the format for PRL only
> > > > supports 4K, 64K, and 2M pages to reclaim. I'll add a comment here
> > > > mentioning the HW support being limited to these pages and rename
> > > > the is_large_pte to is_2m_pte.
> > > >
> > > > > > + else
> > > > > > + return;
> > >
> > > I would think for the fallback, we'd set prl->num_entries to XE_PAGE_RECLAIM_INVALID_LIST here.
> > >
> > > Maybe I'm missing something?
> > >
> > > Matt
> > >
> >
> > Given the 1G page, I'll follow this idea: invalidate the PRL, and then
> > change the if statement in the
> > generate_reclaim_entry() caller to accept all PTEs and invalidate the list in this function above.
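A small standalone model of that agreed fallback: the unbind walker hands every leaf to the generator, and any page size the PRL format cannot describe (anything other than 4K/64K/2M) invalidates the whole list, forcing the full-PPC-flush path. Names, the -1 sentinel, and the 0/1/2 size encodings are taken from the patch's inline comments; this is a sketch, not the kernel implementation:

```c
#include <assert.h>

#define PRL_INVALID_LIST	-1
#define PRL_MAX_ENTRIES		512

struct prl {
	int num_entries;	/* -1 once the list is invalidated */
};

/* Returns the reclamation_size encoding, or -1 for unsupported sizes. */
static int prl_reclaim_size(int level, int ps64, int has_children)
{
	if (level == 0)
		return ps64 ? 1 : 0;		/* 64K : 4K */
	if (level == 1 && !has_children)
		return 2;			/* 2M huge leaf */
	return -1;				/* 1G and above: unsupported */
}

/* Accept every leaf; invalidate the list on unsupported size or overflow. */
static void prl_add_leaf(struct prl *prl, int level, int ps64, int has_children)
{
	if (prl->num_entries == PRL_INVALID_LIST)
		return;
	if (prl_reclaim_size(level, ps64, has_children) < 0 ||
	    prl->num_entries >= PRL_MAX_ENTRIES) {
		/* Fall back to full PPC flush with the TLB invalidation */
		prl->num_entries = PRL_INVALID_LIST;
		return;
	}
	prl->num_entries++;
}
```

Once invalidated, the list stays invalid for the rest of the walk, so later supported leaves cannot resurrect it.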
> >
> > > > > > +
> > > > > > + reclaim_entries[num_entries].valid = 1;
> > > > > > + reclaim_entries[num_entries].reclamation_size =
> > > > > > + reclamation_size;
> > > > > > + reclaim_entries[num_entries].address_lo =
> > > > > > + FIELD_GET(field_mask, phys_addr);
> > > > > > + reclaim_entries[num_entries].address_hi =
> > > > > > + FIELD_GET(field_mask, phys_addr >> 20);
> > > > >
> > > > > As suggested above, use macros/defines here to setup the entry.
> > > > >
> > > >
> > > > Got it, moved over to using other standard define macros.
> > > >
> > > > > > + prl->num_entries++;
> > > > > > +}
> > > > > > +
> > > > > > static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> > > > > > unsigned int level, u64 addr, u64 next,
> > > > > > struct xe_ptw **child, @@ -1579,10 +1646,27 @@ static
> > > > > > int xe_pt_stage_unbind_entry(struct xe_ptw
> > > > > *parent, pgoff_t offset,
> > > > > > struct xe_pt_walk *walk) {
> > > > > > struct xe_pt *xe_child = container_of(*child,
> > > > > > typeof(*xe_child), base);
> > > > > > + struct xe_pt_stage_unbind_walk *xe_walk =
> > > > > > + container_of(walk, typeof(*xe_walk), base);
> > > > > > + struct xe_device *xe = tile_to_xe(xe_walk->tile);
> > > > > >
> > > > > > XE_WARN_ON(!*child);
> > > > > > XE_WARN_ON(!level);
> > > > > >
> > > > > > + /* 4K and 64K Pages are level 0, large pte needs additional handling. */
> > > > > > +	if (xe_walk->prl && (xe_child->level == 0 || is_large_pte(xe_child))) {
> >
> > So right here, I'll make the change to accept all the leafs of the
> > walker and handle the 1G case in generate_reclaim_entry().
> >
> > Brian
> >
>
> It is possible we are even higher up the page table tree too (e.g., with 57-bit VAs there are two levels above 1G; with 48-bit VAs, one level). We need to
> handle those cases as fallbacks to cache-flushing TLB invalidations too.
>
> Matt
>
I was planning on just making everything that is not a 4K, 64K, or 2M page default to
invalidating the PRL. I believe that will handle these other levels as well? I am assuming
those higher levels with 48-bit and 57-bit VAs will still just look like leaf PTEs with no
children, so I can use a simple (!xe_child->base.children) check?
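A standalone sketch of the leaf test proposed here: a node with no children is a leaf regardless of level, so a single (!children) check covers 1G leaves and the extra levels present with 48/57-bit VAs, while only level-0 and level-1 leaves are hardware-reclaimable. The struct is a simplified stand-in for struct xe_pt, not the real type:

```c
#include <assert.h>
#include <stddef.h>

struct pt_node {
	unsigned int level;
	void *children;		/* NULL for a leaf at any level */
};

/* Leaf test independent of level, as proposed for the walker. */
static int pt_is_leaf(const struct pt_node *pt)
{
	return !pt->children;
}

/* Only 4K/64K (level 0) and 2M (level 1) leaves fit the PRL format. */
static int pt_leaf_is_reclaimable(const struct pt_node *pt)
{
	return pt_is_leaf(pt) && pt->level <= 1;
}
```

With this split, the walker can pass every leaf through and let the reclaimability check decide between generating an entry and invalidating the PRL.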
> > > > >
> > > > > And also here? 1G pages are unhandled? Please explain.
> > > > >
> > > >
> > > > As stated above, page reclamation only supports 4K, 64K, and 2M pages.
> > > > 1G page will have to fallback to the standard tlb invalidation with PPC flush.
> > > >
> > > > > > + struct iosys_map *leaf_map = &xe_child->bo->vmap;
> > > > > > + pgoff_t first = xe_pt_offset(addr, 0, walk);
> > > > > > + pgoff_t count = xe_pt_num_entries(addr, next, 0, walk);
> > > > > > +
> > > > > > + for (pgoff_t i = 0; i < count; i++) {
> > > > > > + u64 pte = xe_map_rd(xe, leaf_map, (first + i) *
> > > > > > +sizeof(u64),
> > > > > u64);
> > > > > > +
> > > > > > + generate_reclaim_entry(xe_walk->tile, xe_walk->prl,
> > > > > > + pte, xe_child);
> > > > > > + }
> > > > > > + }
> > > > > > +
> > > > > > xe_pt_check_kill(addr, next, level - 1, xe_child, action,
> > > > > > walk);
Since we're on the topic of this section as well, how will xe_pt_check_kill() affect the page
walk here? Do we need to handle some case where the whole directory is killed before we
look at the child PTE? In that case, is it worthwhile to just invalidate the PRL, or should we still attempt to walk it?
Brian
> > > > > >
> > > > > > return 0;
> > > > > > @@ -1654,6 +1738,8 @@ static unsigned int
> > > > > > xe_pt_stage_unbind(struct xe_tile *tile, {
> > > > > > u64 start = range ? xe_svm_range_start(range) : xe_vma_start(vma);
> > > > > > u64 end = range ? xe_svm_range_end(range) : xe_vma_end(vma);
> > > > > > + struct xe_vm_pgtable_update_op *pt_update_op =
> > > > > > + container_of(entries, struct xe_vm_pgtable_update_op,
> > > > > entries[0]);
> > > > > > struct xe_pt_stage_unbind_walk xe_walk = {
> > > > > > .base = {
> > > > > > .ops = &xe_pt_stage_unbind_ops, @@ -1665,6 +1751,7 @@
> > > > > > static unsigned int xe_pt_stage_unbind(struct xe_tile
> > > > > *tile,
> > > > > > .modified_start = start,
> > > > > > .modified_end = end,
> > > > > > .wupd.entries = entries,
> > > > > > + .prl = pt_update_op->prl,
> > > > > > };
> > > > > > struct xe_pt *pt = vm->pt_root[tile->id];
> > > > > >
> > > > > > @@ -1897,6 +1984,7 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > > > > > struct xe_vm_pgtable_update_ops *pt_update_ops,
> > > > > > struct xe_vma *vma)
> > > > > > {
> > > > > > + struct xe_device *xe = tile_to_xe(tile);
> > > > > > u32 current_op = pt_update_ops->current_op;
> > > > > > struct xe_vm_pgtable_update_op *pt_op = &pt_update_ops-
> > > > > >ops[current_op];
> > > > > > int err;
> > > > > > @@ -1914,6 +2002,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > > > > > pt_op->vma = vma;
> > > > > > pt_op->bind = false;
> > > > > > pt_op->rebind = false;
> > > > > > +	/* Maintain one PRL located in pt_update_ops that all others in unbind op reference */
> > > > > > + if (xe->info.has_page_reclaim_hw_assist && !pt_update_ops->prl.entries) {
> > > > > > + err = xe_page_reclaim_list_alloc_entries(&pt_update_ops->prl);
> > > > > > + if (err < 0)
> > > > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > > >
> > > > > I don't think you need to call xe_page_reclaim_list_invalidate, right?
> > > > > If xe_page_reclaim_list_alloc_entries fails the prl should be in the init state.
> > > > >
> > > >
> > > > Yes. I'll drop this call for now then.
> > > >
> > > > > > + }
> > > > > > +	pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl : NULL;
> > > > > >
> > > > > > err = vma_reserve_fences(tile_to_xe(tile), vma);
> > > > > > if (err)
> > > > > > @@ -1921,6 +2016,13 @@ static int unbind_op_prepare(struct
> > > > > > xe_tile *tile,
> > > > > >
> > > > > > pt_op->num_entries = xe_pt_stage_unbind(tile, xe_vma_vm(vma),
> > > > > > vma, NULL, pt_op->entries);
> > > > > > + /* Free PRL if list declared as invalid */
> > > > > > + if (pt_update_ops->prl.entries &&
> > > > > > + pt_update_ops->prl.num_entries == XE_PAGE_RECLAIM_INVALID_LIST) {
> > > > > > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > > > > > + pt_op->prl = NULL;
> > > > > > + pt_update_ops->prl.entries = NULL;
> > > > >
> > > > > Call xe_page_reclaim_list_invalidate for clarity?
> > > > >
> > > >
> > > > Updated.
> > > >
> > > > > > + }
> > > > > >
> > > > > > xe_vm_dbg_print_entries(tile_to_xe(tile), pt_op->entries,
> > > > > > pt_op->num_entries, false); @@ -1979,6 +2081,7 @@ static
> > > > > > int unbind_range_prepare(struct xe_vm *vm,
> > > > > > pt_op->vma = XE_INVALID_VMA;
> > > > > > pt_op->bind = false;
> > > > > > pt_op->rebind = false;
> > > > > > + pt_op->prl = NULL;
> > > > > >
> > > > > > pt_op->num_entries = xe_pt_stage_unbind(tile, vm, NULL, range,
> > > > > > pt_op->entries);
> > > > > > @@ -2096,6 +2199,7 @@ xe_pt_update_ops_init(struct
> > > > > xe_vm_pgtable_update_ops *pt_update_ops)
> > > > > > init_llist_head(&pt_update_ops->deferred);
> > > > > > pt_update_ops->start = ~0x0ull;
> > > > > > pt_update_ops->last = 0x0ull;
> > > > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > > >
> > > > > Can we introduce a function called xe_page_reclaim_list_init for
> > > > > clarity? It might do the same thing as
> > > > > xe_page_reclaim_list_invalidate but it would make this a little
> > > > > more clear. Likewise later in the series when a job is created, you can call xe_page_reclaim_list_init there too.
> > > > >
> > > >
> > > > Sure, I'll write another helper for this and modify both those PRL creation points.
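As a sketch, the proposed helper pair might look like the following. This is a hypothetical userspace rendering of the two functions discussed (not the actual driver code), assuming xe_page_reclaim_list_init simply shares the reset logic with xe_page_reclaim_list_invalidate:

```c
#include <assert.h>
#include <stddef.h>

#define XE_PAGE_RECLAIM_INVALID_LIST -1

struct xe_guc_page_reclaim_entry;

struct xe_page_reclaim_list {
	struct xe_guc_page_reclaim_entry *entries;
	int num_entries;
};

/* Reset to the "no list" state; doubles as the invalidation path. */
static void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl)
{
	prl->entries = NULL;
	prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
}

/* Proposed alias so the PRL creation sites read naturally. */
static void xe_page_reclaim_list_init(struct xe_page_reclaim_list *prl)
{
	xe_page_reclaim_list_invalidate(prl);
}
```

Both entry points end up in the same state, so later code can test prl->entries / num_entries without caring which path initialized the list.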
> > > >
> > > > > > }
> > > > > >
> > > > > > /**
> > > > > > @@ -2518,6 +2622,11 @@ void xe_pt_update_ops_fini(struct xe_tile *tile, struct xe_vma_ops *vops)
> > > > > > &vops->pt_update_ops[tile->id];
> > > > > > int i;
> > > > > >
> > > > > > + if (pt_update_ops->prl.entries) {
> > > > > > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > > > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > > > > + }
> > > > > > +
> > > > > > lockdep_assert_held(&vops->vm->lock);
> > > > > > xe_vm_assert_held(vops->vm);
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > > b/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > > index 881f01e14db8..26e5295f118e 100644
> > > > > > --- a/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > > +++ b/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > > @@ -8,6 +8,7 @@
> > > > > >
> > > > > > #include <linux/types.h>
> > > > > >
> > > > > > +#include "xe_page_reclaim.h"
> > > > > > #include "xe_pt_walk.h"
> > > > > >
> > > > > > struct xe_bo;
> > > > > > @@ -85,6 +86,8 @@ struct xe_vm_pgtable_update_op {
> > > > > > bool bind;
> > > > > > /** @rebind: is a rebind */
> > > > > > bool rebind;
> > > > > > + /** @prl: Backing pointer to page reclaim list of pt_update_ops */
> > > > > > + struct xe_page_reclaim_list *prl;
> > > > >
> > > > > Can you move this above the bools in the layout of
> > > > > xe_vm_pgtable_update_op, likely just below "struct xe_vma".
> > > > >
> > > >
> > > > Ahh got it. Moved.
> > > >
> > > > > > };
> > > > > >
> > > > > > /** struct xe_vm_pgtable_update_ops: page table update
> > > > > > operations */ @@ -119,6 +122,8 @@ struct xe_vm_pgtable_update_ops {
> > > > > > * slots are idle.
> > > > > > */
> > > > > > bool wait_vm_kernel;
> > > > > > + /** @prl: embedded page reclaim list */
> > > > > > + struct xe_page_reclaim_list prl;
> > > > >
> > > > > Same thing here, move just below "struct xe_exec_queue".
> > > > >
> > > > > Matt
> > > > >
> > > >
> > > > Moved.
> > > >
> > > > Brian
> > > >
> > > > > > };
> > > > > >
> > > > > > #endif
> > > > > > --
> > > > > > 2.51.2
> > > > > >
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 06/11] drm/xe: Create page reclaim list on unbind
2025-11-25 19:46 ` Nguyen, Brian3
@ 2025-11-25 22:35 ` Matthew Brost
2025-11-26 2:33 ` Nguyen, Brian3
0 siblings, 1 reply; 51+ messages in thread
From: Matthew Brost @ 2025-11-25 22:35 UTC (permalink / raw)
To: Nguyen, Brian3
Cc: intel-xe@lists.freedesktop.org, Upadhyay, Tejas, Lin, Shuicheng,
Summers, Stuart
On Tue, Nov 25, 2025 at 12:46:20PM -0700, Nguyen, Brian3 wrote:
> On Tuesday, November 25, 2025 11:07 AM, Matthew Brost wrote:
> > On Tue, Nov 25, 2025 at 12:01:25PM -0700, Nguyen, Brian3 wrote:
> > > On Tuesday, November 25, 2025 10:34 AM, Matthew Brost wrote:
> > > > On Tue, Nov 25, 2025 at 04:18:19AM -0700, Nguyen, Brian3 wrote:
> > > > > On Saturday, November 22, 2025 11:18 AM, Matthew Brost wrote:
> > > > > > On Tue, Nov 18, 2025 at 05:05:47PM +0800, Brian Nguyen wrote:
> > > > > > > Page reclaim list (PRL) is preparation work for the page reclaim feature.
> > > > > > > The PRL is initially owned by pt_update_ops, and all other page
> > > > > > > reclaim operations point back to this PRL. The PRL entries are
> > > > > > > generated during the unbind page walk.
> > > > > > >
> > > > > > > This PRL is restricted to a 4K page, so 512 page entries at most.
> > > > > > >
> > > > > > > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > > > > > > ---
> > > > > > > drivers/gpu/drm/xe/Makefile | 1 +
> > > > > > > drivers/gpu/drm/xe/regs/xe_gtt_defs.h | 1 +
> > > > > > > drivers/gpu/drm/xe/xe_page_reclaim.c | 52 ++++++++++++
> > > > > > > drivers/gpu/drm/xe/xe_page_reclaim.h | 49 ++++++++++++
> > > > > > > drivers/gpu/drm/xe/xe_pt.c | 109 ++++++++++++++++++++++++++
> > > > > > > drivers/gpu/drm/xe/xe_pt_types.h | 5 ++
> > > > > > > 6 files changed, 217 insertions(+) create mode 100644
> > > > > > > drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > > > create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/xe/Makefile
> > > > > > > b/drivers/gpu/drm/xe/Makefile index e4b273b025d2..048e6c93271c
> > > > > > > 100644
> > > > > > > --- a/drivers/gpu/drm/xe/Makefile
> > > > > > > +++ b/drivers/gpu/drm/xe/Makefile
> > > > > > > @@ -95,6 +95,7 @@ xe-y += xe_bb.o \
> > > > > > > xe_oa.o \
> > > > > > > xe_observation.o \
> > > > > > > xe_pagefault.o \
> > > > > > > + xe_page_reclaim.o \
> > > > > > > xe_pat.o \
> > > > > > > xe_pci.o \
> > > > > > > xe_pcode.o \
> > > > > > > diff --git a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > > > b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > > > index 4389e5a76f89..4d83461e538b 100644
> > > > > > > --- a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > > > +++ b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > > > @@ -9,6 +9,7 @@
> > > > > > > #define XELPG_GGTT_PTE_PAT0 BIT_ULL(52)
> > > > > > > #define XELPG_GGTT_PTE_PAT1 BIT_ULL(53)
> > > > > > >
> > > > > > > +#define XE_PTE_ADDR_MASK GENMASK_ULL(51, 12)
> > > > > > > #define GGTT_PTE_VFID GENMASK_ULL(11, 2)
> > > > > > >
> > > > > > > #define GUC_GGTT_TOP 0xFEE00000
> > > > > > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > > > b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > > > new file mode 100644
> > > > > > > index 000000000000..a0d15efff58c
> > > > > > > --- /dev/null
> > > > > > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > > > @@ -0,0 +1,52 @@
> > > > > > > +// SPDX-License-Identifier: MIT
> > > > > > > +/*
> > > > > > > + * Copyright (c) 2025 Intel Corporation */
> > > > > > > +
> > > > > > > +#include <linux/bitfield.h>
> > > > > > > +#include <linux/kref.h>
> > > > > > > +#include <linux/mm.h>
> > > > > > > +#include <linux/slab.h>
> > > > > > > +
> > > > > > > +#include "xe_page_reclaim.h"
> > > > > > > +
> > > > > > > +#include "regs/xe_gt_regs.h"
> > > > > > > +#include "xe_assert.h"
> > > > > > > +#include "xe_macros.h"
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
> > > > > > > + * @prl: Page reclaim list to reset
> > > > > > > + *
> > > > > > + * Clears the entries pointer and marks the list as invalid so
> > > > > > + * future users know the PRL is unusable. It is expected that the
> > > > > > + * entries have already been released.
> > > > > > > + */
> > > > > > > +void xe_page_reclaim_list_invalidate(struct
> > > > > > > +xe_page_reclaim_list
> > > > > > > +*prl) {
> > > > > > > + prl->entries = NULL;
> > > > > > > + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST; }
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * xe_page_reclaim_list_alloc_entries() - Allocate page
> > > > > > > +reclaim list entries
> > > > > > > + * @prl: Page reclaim list to allocate entries for
> > > > > > > + *
> > > > > > > + * Allocate one 4K page for the PRL entries; on failure, prl->entries is left NULL and -ENOMEM is returned.
> > > > > > > + */
> > > > > > > +int xe_page_reclaim_list_alloc_entries(struct
> > > > > > > +xe_page_reclaim_list
> > > > > > > +*prl) {
> > > > > > > + struct page *page;
> > > > > > > +
> > > > > > > + XE_WARN_ON(prl->entries != NULL);
> > > > > > > + if (prl->entries)
> > > > > > > + return 0;
> > > > > > > +
> > > > > > > + page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> > > > > > > + if (page) {
> > > > > > > + prl->entries = page_address(page);
> > > > > > > + prl->num_entries = 0;
> > > > > > > + }
> > > > > > > +
> > > > > > > + return page ? 0 : -ENOMEM;
> > > > > > > +}
> > > > > > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > > > b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > > > new file mode 100644
> > > > > > > index 000000000000..d066d7d97f79
> > > > > > > --- /dev/null
> > > > > > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > > > @@ -0,0 +1,49 @@
> > > > > > > +/* SPDX-License-Identifier: MIT */
> > > > > > > +/*
> > > > > > > + * Copyright (c) 2025 Intel Corporation */
> > > > > > > +
> > > > > > > +#ifndef _XE_PAGE_RECLAIM_H_
> > > > > > > +#define _XE_PAGE_RECLAIM_H_
> > > > > > > +
> > > > > > > +#include <linux/kref.h>
> > > > > > > +#include <linux/mm.h>
> > > > > > > +#include <linux/slab.h>
> > > > > > > +#include <linux/types.h>
> > > > > > > +#include <linux/workqueue.h>
> > > > > > > +
> > > > > > > +#define XE_PAGE_RECLAIM_MAX_ENTRIES 512
> > > > > > > +#define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
> > > > > > > +
> > > > > > > +struct xe_guc_page_reclaim_entry {
> > > > > > > + u32 valid:1;
> > > > > > > + u32 reclamation_size:6;
> > > > > > > + u32 reserved:5;
> > > > > > > + u32 address_lo:20;
> > > > > > > + u32 address_hi:20;
> > > > > > > + u32 reserved1:12;
> > > > > >
> > > > > > This is a wire interface with the GuC. Bitfield layout can vary based on
> > > > > > the endianness of the CPU. I know this is an iGPU feature for now, but
> > > > > > it could possibly change in the future; with that, to future-proof, can the layout of this be set up via defines / macros?
> > > > > >
> > > > >
> > > > > Sure, I moved over to the typical FIELD_PREP/GENMASK macros used
> > > > > elsewhere for the guc interfaces.
> > > > >
> > > > > > > +} __packed;
> > > > > > > +
> > > > > > > +struct xe_page_reclaim_list {
> > > > > > > + /** @entries: array of page reclaim entries, page allocated */
> > > > > > > + struct xe_guc_page_reclaim_entry *entries;
> > > > > > > + /** @num_entries: number of entries */
> > > > > > > + int num_entries;
> > > > > > > +#define XE_PAGE_RECLAIM_INVALID_LIST -1
> > > > > > > +};
> > > > > > > +
> > > > > > > +void xe_page_reclaim_list_invalidate(struct
> > > > > > > +xe_page_reclaim_list *prl); int
> > > > > > > +xe_page_reclaim_list_alloc_entries(struct
> > > > > > > +xe_page_reclaim_list *prl); static inline void
> > > > > > > +xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries) {
> > > > > > > + if (entries)
> > > > > > > + get_page(virt_to_page(entries)); }
> > > > > > > +
> > > > > > > +static inline void xe_page_reclaim_entries_put(struct
> > > > > > > +xe_guc_page_reclaim_entry *entries) {
> > > > > > > + if (entries)
> > > > > > > + put_page(virt_to_page(entries)); }
> > > > > >
> > > > > > Kernel doc for static inlines.
> > > > > >
> > > > >
> > > > > Added.
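The get/put helpers above share the single PRL page by bumping its struct page refcount (get_page()/put_page() via virt_to_page()), so each pt_op referencing pt_update_ops->prl keeps the entries alive until the last put. A userspace model of that lifetime scheme, with the page refcount replaced by a plain counter (names hypothetical, purely illustrative):

```c
#include <assert.h>
#include <stdlib.h>

/* Models the 4K backing page whose refcount get_page()/put_page() adjust. */
struct prl_page {
	int refcount;
	unsigned long long entries[512]; /* 512 x 8-byte reclaim entries = 4K */
};

static struct prl_page *prl_page_alloc(void)
{
	struct prl_page *p = calloc(1, sizeof(*p));

	if (p)
		p->refcount = 1; /* owning reference, as after alloc_page() */
	return p;
}

static void prl_page_get(struct prl_page *p)
{
	if (p)
		p->refcount++;
}

static void prl_page_put(struct prl_page *p)
{
	if (p && --p->refcount == 0)
		free(p); /* last reference frees the entries */
}
```

This mirrors why xe_pt_update_ops_fini can unconditionally put its reference: any pt_op still holding the entries pins the page until its own put.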
> > > > >
> > > > > > > +
> > > > > > > +#endif /* _XE_PAGE_RECLAIM_H_ */
> > > > > > > diff --git a/drivers/gpu/drm/xe/xe_pt.c
> > > > > > > b/drivers/gpu/drm/xe/xe_pt.c index 884127b4d97d..532a047676d4
> > > > > > > 100644
> > > > > > > --- a/drivers/gpu/drm/xe/xe_pt.c
> > > > > > > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > > > > > > @@ -12,6 +12,7 @@
> > > > > > > #include "xe_exec_queue.h"
> > > > > > > #include "xe_gt.h"
> > > > > > > #include "xe_migrate.h"
> > > > > > > +#include "xe_page_reclaim.h"
> > > > > > > #include "xe_pt_types.h"
> > > > > > > #include "xe_pt_walk.h"
> > > > > > > #include "xe_res_cursor.h"
> > > > > > > @@ -1538,6 +1539,9 @@ struct xe_pt_stage_unbind_walk {
> > > > > > > /* Output */
> > > > > > > /* @wupd: Structure to track the page-table updates we're building */
> > > > > > > struct xe_walk_update wupd;
> > > > > > > +
> > > > > > > + /** @prl: Backing pointer to page reclaim list in pt_update_ops */
> > > > > > > + struct xe_page_reclaim_list *prl;
> > > > > > > };
> > > > > > >
> > > > > > > /*
> > > > > > > @@ -1572,6 +1576,69 @@ static bool xe_pt_check_kill(u64 addr,
> > > > > > > u64 next,
> > > > > > unsigned int level,
> > > > > > > return false;
> > > > > > > }
> > > > > > >
> > > > > > > +/* Huge 2MB leaf lives directly in a level-1 table and has no children */
> > > > > > > +static bool is_large_pte(struct xe_pt *pte)
> > > > > > > +{
> > > > > > > + return pte->level == 1 && !pte->base.children;
> > > > > > > +}
> > > > > > > +
> > > > > > > +/* page_size = 2^(reclamation_size + 12) */
> > > > > > > +#define COMPUTE_RECLAIM_ADDRESS_MASK(page_size) \
> > > > > > > +({ \
> > > > > > > + BUILD_BUG_ON(!__builtin_constant_p(page_size)); \
> > > > > > > + ilog2(page_size) - 12; \
> > > > > >
> > > > > > s/12/XE_PTE_SHIFT ?
> > > > > >
> > > > >
> > > > > Done.
> > > > >
> > > > > > > +})
> > > > > > > +
> > > > > > > +static void generate_reclaim_entry(struct xe_tile *tile,
> > > > > > > + struct xe_page_reclaim_list *prl,
> > > > > > > + u64 pte,
> > > > > > > + struct xe_pt *xe_child)
> > > > > >
> > > > > > Nit, xe_pt can be on the same line as 'u64 pte'.
> > > > > >
> > > > >
> > > > > Done.
> > > > >
> > > > > > > +{
> > > > > > > + struct xe_guc_page_reclaim_entry *reclaim_entries = prl->entries;
> > > > > > > + u64 phys_addr = pte & XE_PTE_ADDR_MASK;
> > > > > > > + const u64 field_mask = GENMASK_ULL(19, 0);
> > > > > > > + u32 reclamation_size;
> > > > > >
> > > > > > Nit, I'd make the last variable declared on the stack for readability.
> > > > > >
> > > > >
> > > > > Ahh got it, reclamation_size moved to after num_entries.
> > > > >
> > > > > > > + const uint max_entries = XE_PAGE_RECLAIM_MAX_ENTRIES;
> > > > > > > + int num_entries = prl->num_entries;
> > > > > > > +
> > > > > > > + xe_tile_assert(tile, xe_child->level <= MAX_HUGEPTE_LEVEL);
> > > > > > > + xe_tile_assert(tile, reclaim_entries);
> > > > > > > +
> > > > > > > + if (num_entries == XE_PAGE_RECLAIM_INVALID_LIST)
> > > > > > > + return;
> > > > > > > +
> > > > > > > + /* Overflow: mark as invalid through num_entries */
> > > > > > > + if (num_entries >= max_entries) {
> > > > > > > + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> > > > > > > + return;
> > > > > > > + }
> > > > > > > +
> > > > > > > + /**
> > > > > > > + * reclamation_size indicates the size of the page to be
> > > > > > > + * invalidated and flushed from non-coherent cache.
> > > > > > > + * Page size is computed as 2^(reclamation_size+12) bytes.
> > > > > > > + * Only valid for these specific levels.
> > > > > > > + */
> > > > > > > +
> > > > > > > + if (xe_child->level == 0 && !(pte & XE_PTE_PS64))
> > > > > > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_4K); /* reclamation_size = 0 */
> > > > > > > + else if (xe_child->level == 0)
> > > > > > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_64K); /* reclamation_size = 1 */
> > > > > > > + else if (is_large_pte(xe_child))
> > > > > > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_2M); /* reclamation_size = 2 */
> > > > > >
> > > > > > What happens if we have 1G page? That doesn't seem to be handled.
> > > > > >
> > > > >
> > > > > Page reclamation hardware does not support 1G pages. This should be
> > > > > handled by falling back to the standard TLB invalidation with PPC
> > > > > flush. I can add
> > > >
> > > > Makes sense that we fall back. I am, however, not seeing where this fallback occurs.
> > > >
> > >
> > > !! Ohh I got it now, I silently dropped the 1G pages... My bad. I'll
> > > follow the new changes suggested below.
> > >
> > > > > a comment somewhere discussing this but the format for PRL only
> > > > > supports 4K, 64K, and 2M pages to reclaim. I'll add a comment here
> > > > > mentioning the HW support being limited to these pages and rename
> > > > > the is_large_pte to is_2m_pte.
> > > > >
> > > > > > > + else
> > > > > > > + return;
> > > >
> > > > I would think for the fallback, we'd set prl->num_entries to XE_PAGE_RECLAIM_INVALID_LIST here.
> > > >
> > > > Maybe I'm missing something?
> > > >
> > > > Matt
> > > >
> > >
> > > Given the 1G page, I'll follow this idea: invalidate the PRL, and then
> > > change the if statement in the generate_reclaim_entry() caller to accept
> > > all PTEs and invalidate the PRL in this function instead.
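Following that plan, the leaf handling in generate_reclaim_entry() might look like the sketch below: 4K, 64K and 2M leaves get an encoded size, and anything else (1G leaves, or higher-level leaves) invalidates the whole list so the caller falls back to a full PPC flush. The encoding here follows the 2^(reclamation_size + XE_PTE_SHIFT) formula that COMPUTE_RECLAIM_ADDRESS_MASK computes; treat the exact branch conditions and encoded values as assumptions, not the final driver code:

```c
#include <assert.h>

#define XE_PAGE_RECLAIM_INVALID_LIST -1
#define XE_PTE_SHIFT 12

struct prl { int num_entries; };

/* page_size = 2^(reclamation_size + XE_PTE_SHIFT), per the macro above. */
static int reclamation_size(unsigned long long page_size)
{
	return __builtin_ctzll(page_size) - XE_PTE_SHIFT;
}

/*
 * Hypothetical leaf handling after the rework: supported page sizes get
 * an encoded size; anything the reclaim hardware cannot express (e.g. a
 * 1G page) invalidates the list, forcing the full-PPC-flush fallback.
 */
static int leaf_reclamation_size(struct prl *prl, unsigned int level,
				 int is_64k_pte, int has_children)
{
	if (level == 0)
		return reclamation_size(is_64k_pte ? 0x10000ULL : 0x1000ULL);
	if (level == 1 && !has_children)
		return reclamation_size(0x200000ULL); /* 2M leaf */

	prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
	return -1;
}
```

Once num_entries is the invalid sentinel, generate_reclaim_entry() already returns early for every later PTE, so a single unsupported leaf poisons the list for the whole unbind, which is the desired fallback behavior.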
> > >
> > > > > > > +
> > > > > > > + reclaim_entries[num_entries].valid = 1;
> > > > > > > + reclaim_entries[num_entries].reclamation_size =
> > > > > > > + reclamation_size;
> > > > > > > + reclaim_entries[num_entries].address_lo =
> > > > > > > + FIELD_GET(field_mask, phys_addr);
> > > > > > > + reclaim_entries[num_entries].address_hi =
> > > > > > > + FIELD_GET(field_mask, phys_addr >> 20);
> > > > > >
> > > > > > As suggested above, use macros/defines here to setup the entry.
> > > > > >
> > > > >
> > > > > Got it, moved over to using other standard define macros.
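For reference, the FIELD_PREP/GENMASK style that reply refers to could look roughly like this. The bit positions are inferred from the bitfield order in the struct and the field names are placeholders; the kernel macros are re-implemented minimally so the sketch is self-contained:

```c
#include <assert.h>
#include <stdint.h>

/* Minimal stand-ins for the kernel's GENMASK_ULL()/FIELD_PREP()/FIELD_GET(). */
#define GENMASK_ULL(h, l)  ((~0ULL >> (63 - (h))) & (~0ULL << (l)))
#define FIELD_PREP(m, v)   (((uint64_t)(v) << __builtin_ctzll(m)) & (m))
#define FIELD_GET(m, r)    (((r) & (m)) >> __builtin_ctzll(m))

/* Placeholder field positions, following the bitfield order in the struct:
 * valid:1, reclamation_size:6, reserved:5, address_lo:20, address_hi:20.
 */
#define PRL_ENTRY_VALID         GENMASK_ULL(0, 0)
#define PRL_ENTRY_RECLAIM_SIZE  GENMASK_ULL(6, 1)
#define PRL_ENTRY_ADDR_LO       GENMASK_ULL(31, 12)
#define PRL_ENTRY_ADDR_HI       GENMASK_ULL(51, 32)

static uint64_t prl_entry_encode(uint64_t phys_addr, unsigned int reclaim_size)
{
	uint64_t pfn = phys_addr >> 12; /* PTE address field covers bits 51:12 */

	return FIELD_PREP(PRL_ENTRY_VALID, 1) |
	       FIELD_PREP(PRL_ENTRY_RECLAIM_SIZE, reclaim_size) |
	       FIELD_PREP(PRL_ENTRY_ADDR_LO, pfn & 0xfffff) |
	       FIELD_PREP(PRL_ENTRY_ADDR_HI, pfn >> 20);
}

/* Recombine the split address fields back into a physical address. */
static uint64_t prl_entry_addr(uint64_t entry)
{
	return (FIELD_GET(PRL_ENTRY_ADDR_HI, entry) << 20 |
		FIELD_GET(PRL_ENTRY_ADDR_LO, entry)) << 12;
}
```

Encoding this way fixes the field layout in the masks themselves, so the wire format no longer depends on how the compiler lays out C bitfields on a given CPU.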
> > > > >
> > > > > > > + prl->num_entries++;
> > > > > > > +}
> > > > > > > +
> > > > > > > static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> > > > > > > unsigned int level, u64 addr, u64 next,
> > > > > > > struct xe_ptw **child, @@ -1579,10 +1646,27 @@ static
> > > > > > > int xe_pt_stage_unbind_entry(struct xe_ptw
> > > > > > *parent, pgoff_t offset,
> > > > > > > struct xe_pt_walk *walk) {
> > > > > > > struct xe_pt *xe_child = container_of(*child,
> > > > > > > typeof(*xe_child), base);
> > > > > > > + struct xe_pt_stage_unbind_walk *xe_walk =
> > > > > > > + container_of(walk, typeof(*xe_walk), base);
> > > > > > > + struct xe_device *xe = tile_to_xe(xe_walk->tile);
> > > > > > >
> > > > > > > XE_WARN_ON(!*child);
> > > > > > > XE_WARN_ON(!level);
> > > > > > >
> > > > > > > + /* 4K and 64K Pages are level 0, large pte needs additional handling. */
> > > > > > > + if (xe_walk->prl && (xe_child->level == 0 || is_large_pte(xe_child))) {
> > >
> > > So right here, I'll make the change to accept all the leaves of the
> > > walker and handle the 1G case in generate_reclaim_entry().
> > >
> > > Brian
> > >
> >
> > It is possible we are even higher up the page table tree too (e.g. with 57-bit VAs there are two levels above 1G; with 48-bit VAs, one level). We need to
> > handle those cases as fallbacks to cache-flushing TLB invalidations too.
> >
> > Matt
> >
>
> Was planning on just making everything that isn't a 4K, 64K, or 2M page default to invalidating the PRL.
> I believe that will handle these other levels as well? I am assuming these other levels with
> 48-bit and 57-bit VAs will still just look like leaf PTEs with no children, so I can use a simple
> (!xe_child->base.children)?
>
> > > > > >
> > > > > > And also here? 1G pages are unhandled? Please explain.
> > > > > >
> > > > >
> > > > > As stated above, page reclamation only supports 4K, 64K, and 2M pages.
> > > > > 1G page will have to fallback to the standard tlb invalidation with PPC flush.
> > > > >
> > > > > > > + struct iosys_map *leaf_map = &xe_child->bo->vmap;
> > > > > > > + pgoff_t first = xe_pt_offset(addr, 0, walk);
> > > > > > > + pgoff_t count = xe_pt_num_entries(addr, next, 0, walk);
> > > > > > > +
> > > > > > > + for (pgoff_t i = 0; i < count; i++) {
> > > > > > > + u64 pte = xe_map_rd(xe, leaf_map, (first + i) * sizeof(u64), u64);
> > > > > > > +
> > > > > > > + generate_reclaim_entry(xe_walk->tile, xe_walk->prl,
> > > > > > > + pte, xe_child);
> > > > > > > + }
> > > > > > > + }
> > > > > > > +
> > > > > > > xe_pt_check_kill(addr, next, level - 1, xe_child, action,
> > > > > > > walk);
>
> Since we're on the topic of this section as well, how will xe_pt_check_kill() affect the page
> walk here? Do we need to handle some case where the whole directory is killed before we
> look at the child PTE? In that case, is it worthwhile to just invalidate the PRL or attempt to walk it?
>
This is what aborts the walk at higher levels. So I think here if we
abort the walk at anything above level 1, we'd need to invalidate the
PRL. So I believe if xe_pt_check_kill returns true and level > 1, we
invalidate the PRL, but perhaps that isn't needed as num_entries would be zero.
Matt
> Brian
>
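The rule Matt describes could be captured at the abort site roughly as below. This is a hypothetical helper, not driver code, and whether the invalidation is actually needed when num_entries is still zero is left open in the thread:

```c
#include <assert.h>

#define XE_PAGE_RECLAIM_INVALID_LIST -1

struct prl { int num_entries; };

/*
 * If xe_pt_check_kill() aborts the walk above level 1, mappings are torn
 * down without their leaf PTEs ever being visited, so the PRL can no
 * longer describe the unbind and must be invalidated.
 */
static void prl_note_walk_kill(struct prl *prl, int killed, unsigned int level)
{
	if (killed && level > 1)
		prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
}
```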
> > > > > > >
> > > > > > > return 0;
> > > > > > > @@ -1654,6 +1738,8 @@ static unsigned int
> > > > > > > xe_pt_stage_unbind(struct xe_tile *tile, {
> > > > > > > u64 start = range ? xe_svm_range_start(range) : xe_vma_start(vma);
> > > > > > > u64 end = range ? xe_svm_range_end(range) : xe_vma_end(vma);
> > > > > > > + struct xe_vm_pgtable_update_op *pt_update_op =
> > > > > > > + container_of(entries, struct xe_vm_pgtable_update_op,
> > > > > > entries[0]);
> > > > > > > struct xe_pt_stage_unbind_walk xe_walk = {
> > > > > > > .base = {
> > > > > > > .ops = &xe_pt_stage_unbind_ops, @@ -1665,6 +1751,7 @@
> > > > > > > static unsigned int xe_pt_stage_unbind(struct xe_tile
> > > > > > *tile,
> > > > > > > .modified_start = start,
> > > > > > > .modified_end = end,
> > > > > > > .wupd.entries = entries,
> > > > > > > + .prl = pt_update_op->prl,
> > > > > > > };
> > > > > > > struct xe_pt *pt = vm->pt_root[tile->id];
> > > > > > >
> > > > > > > @@ -1897,6 +1984,7 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > > > > > > struct xe_vm_pgtable_update_ops *pt_update_ops,
> > > > > > > struct xe_vma *vma)
> > > > > > > {
> > > > > > > + struct xe_device *xe = tile_to_xe(tile);
> > > > > > > u32 current_op = pt_update_ops->current_op;
> > > > > > > struct xe_vm_pgtable_update_op *pt_op = &pt_update_ops-
> > > > > > >ops[current_op];
> > > > > > > int err;
> > > > > > > @@ -1914,6 +2002,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > > > > > > pt_op->vma = vma;
> > > > > > > pt_op->bind = false;
> > > > > > > pt_op->rebind = false;
> > > > > > > + /* Maintain one PRL located in pt_update_ops that all others
> > > > > > > +in unbind op
> > > > > > reference */
> > > > > > > + if (xe->info.has_page_reclaim_hw_assist && !pt_update_ops->prl.entries) {
> > > > > > > + err = xe_page_reclaim_list_alloc_entries(&pt_update_ops->prl);
> > > > > > > + if (err < 0)
> > > > > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > > > >
> > > > > > I don't think you need to call xe_page_reclaim_list_invalidate, right?
> > > > > > If xe_page_reclaim_list_alloc_entries fails the prl should be in the init state.
> > > > > >
> > > > >
> > > > > Yes. I'll drop this call for now then.
> > > > >
> > > > > > > + }
> > > > > > > + pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl :
> > > > > > > +NULL;
> > > > > > >
> > > > > > > err = vma_reserve_fences(tile_to_xe(tile), vma);
> > > > > > > if (err)
> > > > > > > @@ -1921,6 +2016,13 @@ static int unbind_op_prepare(struct
> > > > > > > xe_tile *tile,
> > > > > > >
> > > > > > > pt_op->num_entries = xe_pt_stage_unbind(tile, xe_vma_vm(vma),
> > > > > > > vma, NULL, pt_op->entries);
> > > > > > > + /* Free PRL if list declared as invalid */
> > > > > > > + if (pt_update_ops->prl.entries &&
> > > > > > > + pt_update_ops->prl.num_entries == XE_PAGE_RECLAIM_INVALID_LIST) {
> > > > > > > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > > > > > > + pt_op->prl = NULL;
> > > > > > > + pt_update_ops->prl.entries = NULL;
> > > > > >
> > > > > > Call xe_page_reclaim_list_invalidate for clarity?
> > > > > >
> > > > >
> > > > > Updated.
> > > > >
> > > > > > > + }
> > > > > > >
> > > > > > > xe_vm_dbg_print_entries(tile_to_xe(tile), pt_op->entries,
> > > > > > > pt_op->num_entries, false); @@ -1979,6 +2081,7 @@ static
> > > > > > > int unbind_range_prepare(struct xe_vm *vm,
> > > > > > > pt_op->vma = XE_INVALID_VMA;
> > > > > > > pt_op->bind = false;
> > > > > > > pt_op->rebind = false;
> > > > > > > + pt_op->prl = NULL;
> > > > > > >
> > > > > > > pt_op->num_entries = xe_pt_stage_unbind(tile, vm, NULL, range,
> > > > > > > pt_op->entries);
> > > > > > > @@ -2096,6 +2199,7 @@ xe_pt_update_ops_init(struct
> > > > > > xe_vm_pgtable_update_ops *pt_update_ops)
> > > > > > > init_llist_head(&pt_update_ops->deferred);
> > > > > > > pt_update_ops->start = ~0x0ull;
> > > > > > > pt_update_ops->last = 0x0ull;
> > > > > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > > > >
> > > > > > Can we introduce a function called xe_page_reclaim_list_init for
> > > > > > clarity? It might do the same thing as
> > > > > > xe_page_reclaim_list_invalidate but it would make this a little
> > > > > > more clear. Likewise later in the series when a job is created, you can call xe_page_reclaim_list_init there too.
> > > > > >
> > > > >
> > > > > Sure, I'll write another helper for this and modify both those PRL creation points.
> > > > >
> > > > > > > }
> > > > > > >
> > > > > > > /**
> > > > > > > @@ -2518,6 +2622,11 @@ void xe_pt_update_ops_fini(struct xe_tile *tile, struct xe_vma_ops *vops)
> > > > > > > &vops->pt_update_ops[tile->id];
> > > > > > > int i;
> > > > > > >
> > > > > > > + if (pt_update_ops->prl.entries) {
> > > > > > > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > > > > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > > > > > + }
> > > > > > > +
> > > > > > > lockdep_assert_held(&vops->vm->lock);
> > > > > > > xe_vm_assert_held(vops->vm);
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > > > b/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > > > index 881f01e14db8..26e5295f118e 100644
> > > > > > > --- a/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > > > +++ b/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > > > @@ -8,6 +8,7 @@
> > > > > > >
> > > > > > > #include <linux/types.h>
> > > > > > >
> > > > > > > +#include "xe_page_reclaim.h"
> > > > > > > #include "xe_pt_walk.h"
> > > > > > >
> > > > > > > struct xe_bo;
> > > > > > > @@ -85,6 +86,8 @@ struct xe_vm_pgtable_update_op {
> > > > > > > bool bind;
> > > > > > > /** @rebind: is a rebind */
> > > > > > > bool rebind;
> > > > > > > + /** @prl: Backing pointer to page reclaim list of pt_update_ops */
> > > > > > > + struct xe_page_reclaim_list *prl;
> > > > > >
> > > > > > Can you move this above the bools in the layout of
> > > > > > xe_vm_pgtable_update_op, likely just below "struct xe_vma".
> > > > > >
> > > > >
> > > > > Ahh got it. Moved.
> > > > >
> > > > > > > };
> > > > > > >
> > > > > > > /** struct xe_vm_pgtable_update_ops: page table update
> > > > > > > operations */ @@ -119,6 +122,8 @@ struct xe_vm_pgtable_update_ops {
> > > > > > > * slots are idle.
> > > > > > > */
> > > > > > > bool wait_vm_kernel;
> > > > > > > + /** @prl: embedded page reclaim list */
> > > > > > > + struct xe_page_reclaim_list prl;
> > > > > >
> > > > > > Same thing here, move just below "struct xe_exec_queue".
> > > > > >
> > > > > > Matt
> > > > > >
> > > > >
> > > > > Moved.
> > > > >
> > > > > Brian
> > > > >
> > > > > > > };
> > > > > > >
> > > > > > > #endif
> > > > > > > --
> > > > > > > 2.51.2
> > > > > > >
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH 06/11] drm/xe: Create page reclaim list on unbind
2025-11-25 22:35 ` Matthew Brost
@ 2025-11-26 2:33 ` Nguyen, Brian3
0 siblings, 0 replies; 51+ messages in thread
From: Nguyen, Brian3 @ 2025-11-26 2:33 UTC (permalink / raw)
To: Brost, Matthew
Cc: intel-xe@lists.freedesktop.org, Upadhyay, Tejas, Lin, Shuicheng,
Summers, Stuart
On Tuesday, November 25, 2025 2:35 PM, Matthew Brost wrote:
> On Tue, Nov 25, 2025 at 12:46:20PM -0700, Nguyen, Brian3 wrote:
> > On Tuesday, November 25, 2025 11:07 AM, Matthew Brost wrote:
> > > On Tue, Nov 25, 2025 at 12:01:25PM -0700, Nguyen, Brian3 wrote:
> > > > On Tuesday, November 25, 2025 10:34 AM, Matthew Brost wrote:
> > > > > On Tue, Nov 25, 2025 at 04:18:19AM -0700, Nguyen, Brian3 wrote:
> > > > > > On Saturday, November 22, 2025 11:18 AM, Matthew Brost wrote:
> > > > > > > On Tue, Nov 18, 2025 at 05:05:47PM +0800, Brian Nguyen wrote:
> > > > > > > > Page reclaim list (PRL) is preparation work for the page reclaim feature.
> > > > > > > > The PRL is initially owned by pt_update_ops, and all other
> > > > > > > > page reclaim operations point back to this PRL. The PRL
> > > > > > > > entries are generated during the unbind page walk.
> > > > > > > >
> > > > > > > > This PRL is restricted to a 4K page, so 512 page entries at most.
> > > > > > > >
> > > > > > > > Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
> > > > > > > > ---
> > > > > > > > drivers/gpu/drm/xe/Makefile | 1 +
> > > > > > > > drivers/gpu/drm/xe/regs/xe_gtt_defs.h | 1 +
> > > > > > > > drivers/gpu/drm/xe/xe_page_reclaim.c | 52 ++++++++++++
> > > > > > > > drivers/gpu/drm/xe/xe_page_reclaim.h | 49 ++++++++++++
> > > > > > > > drivers/gpu/drm/xe/xe_pt.c | 109 ++++++++++++++++++++++++++
> > > > > > > > drivers/gpu/drm/xe/xe_pt_types.h | 5 ++
> > > > > > > > 6 files changed, 217 insertions(+) create mode 100644
> > > > > > > > drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > > > > create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > > > >
> > > > > > > > diff --git a/drivers/gpu/drm/xe/Makefile
> > > > > > > > b/drivers/gpu/drm/xe/Makefile index
> > > > > > > > e4b273b025d2..048e6c93271c
> > > > > > > > 100644
> > > > > > > > --- a/drivers/gpu/drm/xe/Makefile
> > > > > > > > +++ b/drivers/gpu/drm/xe/Makefile
> > > > > > > > @@ -95,6 +95,7 @@ xe-y += xe_bb.o \
> > > > > > > > xe_oa.o \
> > > > > > > > xe_observation.o \
> > > > > > > > xe_pagefault.o \
> > > > > > > > + xe_page_reclaim.o \
> > > > > > > > xe_pat.o \
> > > > > > > > xe_pci.o \
> > > > > > > > xe_pcode.o \
> > > > > > > > diff --git a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > > > > b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > > > > index 4389e5a76f89..4d83461e538b 100644
> > > > > > > > --- a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > > > > +++ b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > > > > @@ -9,6 +9,7 @@
> > > > > > > > #define XELPG_GGTT_PTE_PAT0 BIT_ULL(52)
> > > > > > > > #define XELPG_GGTT_PTE_PAT1 BIT_ULL(53)
> > > > > > > >
> > > > > > > > +#define XE_PTE_ADDR_MASK GENMASK_ULL(51, 12)
> > > > > > > > #define GGTT_PTE_VFID GENMASK_ULL(11, 2)
> > > > > > > >
> > > > > > > > #define GUC_GGTT_TOP 0xFEE00000
> > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > > > > b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > > > > new file mode 100644
> > > > > > > > index 000000000000..a0d15efff58c
> > > > > > > > --- /dev/null
> > > > > > > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
> > > > > > > > @@ -0,0 +1,52 @@
> > > > > > > > +// SPDX-License-Identifier: MIT
> > > > > > > > +/*
> > > > > > > > + * Copyright (c) 2025 Intel Corporation */
> > > > > > > > +
> > > > > > > > +#include <linux/bitfield.h> #include <linux/kref.h>
> > > > > > > > +#include <linux/mm.h> #include <linux/slab.h>
> > > > > > > > +
> > > > > > > > +#include "xe_page_reclaim.h"
> > > > > > > > +
> > > > > > > > +#include "regs/xe_gt_regs.h"
> > > > > > > > +#include "xe_assert.h"
> > > > > > > > +#include "xe_macros.h"
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
> > > > > > > > + * @prl: Page reclaim list to reset
> > > > > > > > + *
> > > > > > > > + * Clears the entries pointer and marks the list as invalid so
> > > > > > > > + * future users know the PRL is unusable. It is expected that the
> > > > > > > > + * entries have already been released.
> > > > > > > > + */
> > > > > > > > +void xe_page_reclaim_list_invalidate(struct
> > > > > > > > +xe_page_reclaim_list
> > > > > > > > +*prl) {
> > > > > > > > + prl->entries = NULL;
> > > > > > > > + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST; }
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * xe_page_reclaim_list_alloc_entries() - Allocate page reclaim list entries
> > > > > > > > + * @prl: Page reclaim list to allocate entries for
> > > > > > > > + *
> > > > > > > > + * Allocate one zeroed 4K page for the PRL entries.
> > > > > > > > + *
> > > > > > > > + * Return: 0 on success, -ENOMEM on allocation failure
> > > > > > > > + * (prl->entries is left untouched).
> > > > > > > > + */
> > > > > > > > +int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl)
> > > > > > > > +{
> > > > > > > > + struct page *page;
> > > > > > > > +
> > > > > > > > + XE_WARN_ON(prl->entries != NULL);
> > > > > > > > + if (prl->entries)
> > > > > > > > + return 0;
> > > > > > > > +
> > > > > > > > + page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> > > > > > > > + if (page) {
> > > > > > > > + prl->entries = page_address(page);
> > > > > > > > + prl->num_entries = 0;
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > + return page ? 0 : -ENOMEM;
> > > > > > > > +}
> > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > > > > new file mode 100644
> > > > > > > > index 000000000000..d066d7d97f79
> > > > > > > > --- /dev/null
> > > > > > > > +++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
> > > > > > > > @@ -0,0 +1,49 @@
> > > > > > > > +/* SPDX-License-Identifier: MIT */
> > > > > > > > +/*
> > > > > > > > + * Copyright (c) 2025 Intel Corporation
> > > > > > > > + */
> > > > > > > > +
> > > > > > > > +#ifndef _XE_PAGE_RECLAIM_H_
> > > > > > > > +#define _XE_PAGE_RECLAIM_H_
> > > > > > > > +
> > > > > > > > +#include <linux/kref.h>
> > > > > > > > +#include <linux/mm.h>
> > > > > > > > +#include <linux/slab.h>
> > > > > > > > +#include <linux/types.h>
> > > > > > > > +#include <linux/workqueue.h>
> > > > > > > > +
> > > > > > > > +#define XE_PAGE_RECLAIM_MAX_ENTRIES 512
> > > > > > > > +#define XE_PAGE_RECLAIM_LIST_MAX_SIZE SZ_4K
> > > > > > > > +
> > > > > > > > +struct xe_guc_page_reclaim_entry {
> > > > > > > > + u32 valid:1;
> > > > > > > > + u32 reclamation_size:6;
> > > > > > > > + u32 reserved:5;
> > > > > > > > + u32 address_lo:20;
> > > > > > > > + u32 address_hi:20;
> > > > > > > > + u32 reserved1:12;
> > > > > > >
> > > > > > > This is a wire interface with the GuC. Bitfield layout can vary
> > > > > > > with the endianness of the CPU. I know this is an iGPU feature
> > > > > > > for now, but it could possibly change in the future; with that, to
> > > > > > > future-proof it, can the layout of this be set up via defines / macros?
> > > > > > >
> > > > > >
> > > > > > Sure, I moved over to the typical FIELD_PREP/GENMASK macros
> > > > > > used elsewhere for the guc interfaces.
> > > > > >
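For illustration, a userspace sketch of what the GENMASK/FIELD_PREP-style packing could look like. The dword split, field positions, and helper names here are assumptions mirroring the bitfield layout quoted above (little-endian), not the final GuC ABI, and the `*_U32` helpers are stand-ins for the kernel macros:

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's GENMASK/FIELD_PREP helpers */
#define GENMASK_U32(h, l) (((~0u) >> (31 - (h))) & ((~0u) << (l)))
#define FIELD_PREP_U32(mask, val) (((uint32_t)(val) << __builtin_ctz(mask)) & (mask))

/* Hypothetical dword layout mirroring the __packed bitfields above:
 * dword0 = valid:1, reclamation_size:6, reserved:5, address_lo:20
 * dword1 = address_hi:20, reserved:12
 */
#define PRL_ENTRY_VALID        GENMASK_U32(0, 0)
#define PRL_ENTRY_RECLAIM_SIZE GENMASK_U32(6, 1)
#define PRL_ENTRY_ADDR_LO      GENMASK_U32(31, 12)
#define PRL_ENTRY_ADDR_HI      GENMASK_U32(19, 0)

/* Pack a 4K-aligned physical address (pfn bits 39:0, i.e. address bits
 * 51:12) into one two-dword entry. */
static void prl_entry_pack(uint32_t dw[2], uint64_t phys, uint32_t reclaim_size)
{
	uint64_t pfn = (phys >> 12) & ((1ull << 40) - 1);

	dw[0] = FIELD_PREP_U32(PRL_ENTRY_VALID, 1) |
		FIELD_PREP_U32(PRL_ENTRY_RECLAIM_SIZE, reclaim_size) |
		FIELD_PREP_U32(PRL_ENTRY_ADDR_LO, pfn & 0xfffff);
	dw[1] = FIELD_PREP_U32(PRL_ENTRY_ADDR_HI, pfn >> 20);
}
```

Unlike C bitfields, the shift-and-mask form fixes the bit positions regardless of the compiler's bitfield ordering, which is the point of the review comment.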
> > > > > > > > +} __packed;
> > > > > > > > +
> > > > > > > > +struct xe_page_reclaim_list {
> > > > > > > > + /** @entries: array of page reclaim entries, page allocated */
> > > > > > > > + struct xe_guc_page_reclaim_entry *entries;
> > > > > > > > + /** @num_entries: number of entries */
> > > > > > > > + int num_entries;
> > > > > > > > +#define XE_PAGE_RECLAIM_INVALID_LIST -1
> > > > > > > > +};
> > > > > > > > +
> > > > > > > > +void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
> > > > > > > > +int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
> > > > > > > > +
> > > > > > > > +static inline void xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries)
> > > > > > > > +{
> > > > > > > > + if (entries)
> > > > > > > > + get_page(virt_to_page(entries));
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static inline void xe_page_reclaim_entries_put(struct xe_guc_page_reclaim_entry *entries)
> > > > > > > > +{
> > > > > > > > + if (entries)
> > > > > > > > + put_page(virt_to_page(entries));
> > > > > > > > +}
> > > > > > >
> > > > > > > Kernel doc for static inlines.
> > > > > > >
> > > > > >
> > > > > > Added.
> > > > > >
> > > > > > > > +
> > > > > > > > +#endif /* _XE_PAGE_RECLAIM_H_ */
> > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > > > > > > > index 884127b4d97d..532a047676d4 100644
> > > > > > > > --- a/drivers/gpu/drm/xe/xe_pt.c
> > > > > > > > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > > > > > > > @@ -12,6 +12,7 @@
> > > > > > > > #include "xe_exec_queue.h"
> > > > > > > > #include "xe_gt.h"
> > > > > > > > #include "xe_migrate.h"
> > > > > > > > +#include "xe_page_reclaim.h"
> > > > > > > > #include "xe_pt_types.h"
> > > > > > > > #include "xe_pt_walk.h"
> > > > > > > > #include "xe_res_cursor.h"
> > > > > > > > @@ -1538,6 +1539,9 @@ struct xe_pt_stage_unbind_walk {
> > > > > > > > /* Output */
> > > > > > > > /* @wupd: Structure to track the page-table updates we're building */
> > > > > > > > struct xe_walk_update wupd;
> > > > > > > > +
> > > > > > > > + /** @prl: Backing pointer to page reclaim list in pt_update_ops */
> > > > > > > > + struct xe_page_reclaim_list *prl;
> > > > > > > > };
> > > > > > > >
> > > > > > > > /*
> > > > > > > > @@ -1572,6 +1576,69 @@ static bool xe_pt_check_kill(u64 addr, u64 next, unsigned int level,
> > > > > > > > return false;
> > > > > > > > }
> > > > > > > >
> > > > > > > > +/* Huge 2MB leaf lives directly in a level-1 table and has no children */
> > > > > > > > +static bool is_large_pte(struct xe_pt *pte)
> > > > > > > > +{
> > > > > > > > + return pte->level == 1 && !pte->base.children;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +/* page_size = 2^(reclamation_size + 12) */
> > > > > > > > +#define COMPUTE_RECLAIM_ADDRESS_MASK(page_size) \
> > > > > > > > +({ \
> > > > > > > > + BUILD_BUG_ON(!__builtin_constant_p(page_size)); \
> > > > > > > > + ilog2(page_size) - 12; \
> > > > > > >
> > > > > > > s/12/XE_PTE_SHIFT ?
> > > > > > >
> > > > > >
> > > > > > Done.
> > > > > >
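The macro's arithmetic can be checked with a quick userspace sketch; `reclaim_size_for` is a hypothetical stand-in for COMPUTE_RECLAIM_ADDRESS_MASK (runtime `log2` instead of the kernel's compile-time `ilog2`):

```c
#include <assert.h>
#include <stdint.h>

#define XE_PTE_SHIFT 12 /* 4K base page shift, per the s/12/XE_PTE_SHIFT suggestion */

/* page_size = 2^(reclamation_size + XE_PTE_SHIFT), so the encoding is
 * simply log2(page_size) - XE_PTE_SHIFT. */
static inline uint32_t reclaim_size_for(uint64_t page_size)
{
	return (uint32_t)(63 - __builtin_clzll(page_size)) - XE_PTE_SHIFT;
}
```

Note this yields 0 for 4K, 4 for 64K, and 9 for 2M under the stated formula.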
> > > > > > > > +})
> > > > > > > > +
> > > > > > > > +static void generate_reclaim_entry(struct xe_tile *tile,
> > > > > > > > + struct xe_page_reclaim_list *prl,
> > > > > > > > + u64 pte,
> > > > > > > > + struct xe_pt *xe_child)
> > > > > > >
> > > > > > > Nit, xe_pt can be on the same line as 'u64 pte'.
> > > > > > >
> > > > > >
> > > > > > Done.
> > > > > >
> > > > > > > > +{
> > > > > > > > + struct xe_guc_page_reclaim_entry *reclaim_entries = prl->entries;
> > > > > > > > + u64 phys_addr = pte & XE_PTE_ADDR_MASK;
> > > > > > > > + const u64 field_mask = GENMASK_ULL(19, 0);
> > > > > > > > + u32 reclamation_size;
> > > > > > >
> > > > > > > Nit, I'd make the last variable declared on the stack for readability.
> > > > > > >
> > > > > >
> > > > > > Ahh got it, reclamation_size moved to after num_entries.
> > > > > >
> > > > > > > > + const uint max_entries = XE_PAGE_RECLAIM_MAX_ENTRIES;
> > > > > > > > + int num_entries = prl->num_entries;
> > > > > > > > +
> > > > > > > > + xe_tile_assert(tile, xe_child->level <= MAX_HUGEPTE_LEVEL);
> > > > > > > > + xe_tile_assert(tile, reclaim_entries);
> > > > > > > > +
> > > > > > > > + if (num_entries == XE_PAGE_RECLAIM_INVALID_LIST)
> > > > > > > > + return;
> > > > > > > > +
> > > > > > > > + /* Overflow: mark as invalid through num_entries */
> > > > > > > > + if (num_entries >= max_entries) {
> > > > > > > > + prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
> > > > > > > > + return;
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > + /*
> > > > > > > > + * reclamation_size indicates the size of the page to be
> > > > > > > > + * invalidated and flushed from non-coherent cache.
> > > > > > > > + * Page size is computed as 2^(reclamation_size+12) bytes.
> > > > > > > > + * Only valid for these specific levels.
> > > > > > > > + */
> > > > > > > > +
> > > > > > > > + if (xe_child->level == 0 && !(pte & XE_PTE_PS64))
> > > > > > > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_4K); /* reclamation_size = 0 */
> > > > > > > > + else if (xe_child->level == 0)
> > > > > > > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_64K); /* reclamation_size = 4 */
> > > > > > > > + else if (is_large_pte(xe_child))
> > > > > > > > + reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_2M); /* reclamation_size = 9 */
> > > > > > >
> > > > > > > What happens if we have 1G page? That doesn't seem to be handled.
> > > > > > >
> > > > > >
> > > > > > Page reclamation hardware does not support 1G pages. This should
> > > > > > be handled by falling back to the standard TLB invalidation with
> > > > > > PPC flush. I can add
> > > > >
> > > > > Makes sense that we fall back. I am however not seeing where this fallback occurs.
> > > > >
> > > >
> > > > !! Ohh I got it now, I silently dropped the 1G pages... My bad.
> > > > I'll follow the new changes suggested below.
> > > >
> > > > > > a comment somewhere discussing this but the format for PRL
> > > > > > only supports 4K, 64K, and 2M pages to reclaim. I'll add a
> > > > > > comment here mentioning the HW support being limited to these
> > > > > > pages and rename the is_large_pte to is_2m_pte.
> > > > > >
> > > > > > > > + else
> > > > > > > > + return;
> > > > >
> > > > > I would think for the fallback, we'd set prl->num_entries to XE_PAGE_RECLAIM_INVALID_LIST here.
> > > > >
> > > > > Maybe I'm missing something?
> > > > >
> > > > > Matt
> > > > >
> > > >
> > > > Given the 1G page, I'll follow this idea. Invalidate the PRL, and
> > > > then change the if statement in the generate_reclaim_entry() caller
> > > > to accept all PTEs and invalidate the PRL in this function above.
> > > >
> > > > > > > > +
> > > > > > > > + reclaim_entries[num_entries].valid = 1;
> > > > > > > > + reclaim_entries[num_entries].reclamation_size = reclamation_size;
> > > > > > > > + reclaim_entries[num_entries].address_lo = FIELD_GET(field_mask, phys_addr);
> > > > > > > > + reclaim_entries[num_entries].address_hi = FIELD_GET(field_mask, phys_addr >> 20);
> > > > > > >
> > > > > > > As suggested above, use macros/defines here to setup the entry.
> > > > > > >
> > > > > >
> > > > > > Got it, moved over to using other standard define macros.
> > > > > >
> > > > > > > > + prl->num_entries++;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> > > > > > > > unsigned int level, u64 addr, u64 next,
> > > > > > > > struct xe_ptw **child,
> > > > > > > > @@ -1579,10 +1646,27 @@ static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
> > > > > > > > struct xe_pt_walk *walk)
> > > > > > > > {
> > > > > > > > struct xe_pt *xe_child = container_of(*child, typeof(*xe_child), base);
> > > > > > > > + struct xe_pt_stage_unbind_walk *xe_walk =
> > > > > > > > + container_of(walk, typeof(*xe_walk), base);
> > > > > > > > + struct xe_device *xe = tile_to_xe(xe_walk->tile);
> > > > > > > >
> > > > > > > > XE_WARN_ON(!*child);
> > > > > > > > XE_WARN_ON(!level);
> > > > > > > >
> > > > > > > > + /* 4K and 64K Pages are level 0, large pte needs additional handling. */
> > > > > > > > + if (xe_walk->prl && (xe_child->level == 0 || is_large_pte(xe_child))) {
> > > >
> > > > So right here, I'll make the change to accept all the leafs of the
> > > > walker and handle the 1G case in generate_reclaim_entry().
> > > >
> > > > Brian
> > > >
> > >
> > > It is possible we are even higher up the page table tree too (e.g. with
> > > 57-bit VAs there are two levels above 1G, with 48-bit VAs one level). We need
> > > to handle those cases as fallbacks to cache-flushing TLB invalidations too.
> > >
> > > Matt
> > >
> >
> > Was planning on just making everything that is not a 4K, 64K, or 2M page
> > default to invalidating the PRL. I believe that will handle these other
> > levels as well? I am assuming these other levels with 48-bit and 57-bit VAs
> > will still just look like leaf PTEs with no children, so I can use a simple
> > (!xe_child->base.children)?
> >
> > > > > > >
> > > > > > > And also here? 1G pages are unhandled? Please explain.
> > > > > > >
> > > > > >
> > > > > > As stated above, page reclamation only supports 4K, 64K, and 2M pages.
> > > > > > 1G page will have to fallback to the standard tlb invalidation with PPC flush.
> > > > > >
> > > > > > > > + struct iosys_map *leaf_map = &xe_child->bo->vmap;
> > > > > > > > + pgoff_t first = xe_pt_offset(addr, 0, walk);
> > > > > > > > + pgoff_t count = xe_pt_num_entries(addr, next, 0, walk);
> > > > > > > > +
> > > > > > > > + for (pgoff_t i = 0; i < count; i++) {
> > > > > > > > + u64 pte = xe_map_rd(xe, leaf_map, (first + i) * sizeof(u64), u64);
> > > > > > > > +
> > > > > > > > + generate_reclaim_entry(xe_walk->tile, xe_walk->prl,
> > > > > > > > + pte, xe_child);
> > > > > > > > + }
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > xe_pt_check_kill(addr, next, level - 1, xe_child, action, walk);
> >
> > Since we're on the topic of this section as well, how will
> > xe_pt_check_kill() affect the page walk here? Do we need to handle
> > some case where the whole directory is killed before we look at the child
> > PTEs? In that case, is it worthwhile to just invalidate the PRL or attempt
> > to walk it?
> >
> > Brian
> >
>
> This is what aborts the walk at higher levels. So I think here, if we abort the walk at anything above level 1, we'd need to invalidate the PRL.
> So I believe if xe_pt_check_kill() returns true and level > 1, we invalidate the PRL, though perhaps that isn't needed as num_entries would be
> zero.
>
> Matt
>
Okay got it. num_entries I'm assuming here is the xe_child->num_live and not the PRL num_entries.
For the level > 1 check here, isn't it better to check whether the one being killed has any
children and, if so, invalidate the PRL? I'll go with adding both checks for now:
(level > 1 && xe_child->base.children).
Thanks.
Brian
> > > > > > > >
> > > > > > > > return 0;
> > > > > > > > @@ -1654,6 +1738,8 @@ static unsigned int xe_pt_stage_unbind(struct xe_tile *tile,
> > > > > > > > {
> > > > > > > > u64 start = range ? xe_svm_range_start(range) : xe_vma_start(vma);
> > > > > > > > u64 end = range ? xe_svm_range_end(range) :
> > > > > > > > xe_vma_end(vma);
> > > > > > > > + struct xe_vm_pgtable_update_op *pt_update_op =
> > > > > > > > + container_of(entries, struct xe_vm_pgtable_update_op, entries[0]);
> > > > > > > > struct xe_pt_stage_unbind_walk xe_walk = {
> > > > > > > > .base = {
> > > > > > > > .ops = &xe_pt_stage_unbind_ops,
> > > > > > > > @@ -1665,6 +1751,7 @@ static unsigned int xe_pt_stage_unbind(struct xe_tile *tile,
> > > > > > > > .modified_start = start,
> > > > > > > > .modified_end = end,
> > > > > > > > .wupd.entries = entries,
> > > > > > > > + .prl = pt_update_op->prl,
> > > > > > > > };
> > > > > > > > struct xe_pt *pt = vm->pt_root[tile->id];
> > > > > > > >
> > > > > > > > @@ -1897,6 +1984,7 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > > > > > > > struct xe_vm_pgtable_update_ops *pt_update_ops,
> > > > > > > > struct xe_vma *vma)
> > > > > > > > {
> > > > > > > > + struct xe_device *xe = tile_to_xe(tile);
> > > > > > > > u32 current_op = pt_update_ops->current_op;
> > > > > > > > struct xe_vm_pgtable_update_op *pt_op = &pt_update_ops->ops[current_op];
> > > > > > > > int err;
> > > > > > > > @@ -1914,6 +2002,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > > > > > > > pt_op->vma = vma;
> > > > > > > > pt_op->bind = false;
> > > > > > > > pt_op->rebind = false;
> > > > > > > > + /* Maintain one PRL in pt_update_ops that all others in the unbind op reference */
> > > > > > > > + if (xe->info.has_page_reclaim_hw_assist && !pt_update_ops->prl.entries) {
> > > > > > > > + err = xe_page_reclaim_list_alloc_entries(&pt_update_ops->prl);
> > > > > > > > + if (err < 0)
> > > > > > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > > > > >
> > > > > > > I don't think you need to call xe_page_reclaim_list_invalidate, right?
> > > > > > > If xe_page_reclaim_list_alloc_entries fails the prl should be in the init state.
> > > > > > >
> > > > > >
> > > > > > Yes. I'll drop this call for now then.
> > > > > >
> > > > > > > > + }
> > > > > > > > + pt_op->prl = (pt_update_ops->prl.entries) ? &pt_update_ops->prl : NULL;
> > > > > > > >
> > > > > > > > err = vma_reserve_fences(tile_to_xe(tile), vma);
> > > > > > > > if (err)
> > > > > > > > @@ -1921,6 +2016,13 @@ static int unbind_op_prepare(struct xe_tile *tile,
> > > > > > > >
> > > > > > > > pt_op->num_entries = xe_pt_stage_unbind(tile, xe_vma_vm(vma),
> > > > > > > > vma, NULL, pt_op->entries);
> > > > > > > > + /* Free PRL if list declared as invalid */
> > > > > > > > + if (pt_update_ops->prl.entries &&
> > > > > > > > + pt_update_ops->prl.num_entries == XE_PAGE_RECLAIM_INVALID_LIST) {
> > > > > > > > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > > > > > > > + pt_op->prl = NULL;
> > > > > > > > + pt_update_ops->prl.entries = NULL;
> > > > > > >
> > > > > > > Call xe_page_reclaim_list_invalidate for clarity?
> > > > > > >
> > > > > >
> > > > > > Updated.
> > > > > >
> > > > > > > > + }
> > > > > > > >
> > > > > > > > xe_vm_dbg_print_entries(tile_to_xe(tile), pt_op->entries,
> > > > > > > > pt_op->num_entries, false);
> > > > > > > > @@ -1979,6 +2081,7 @@ static int unbind_range_prepare(struct xe_vm *vm,
> > > > > > > > pt_op->vma = XE_INVALID_VMA;
> > > > > > > > pt_op->bind = false;
> > > > > > > > pt_op->rebind = false;
> > > > > > > > + pt_op->prl = NULL;
> > > > > > > >
> > > > > > > > pt_op->num_entries = xe_pt_stage_unbind(tile, vm, NULL, range,
> > > > > > > > pt_op->entries);
> > > > > > > > @@ -2096,6 +2199,7 @@ xe_pt_update_ops_init(struct xe_vm_pgtable_update_ops *pt_update_ops)
> > > > > > > > init_llist_head(&pt_update_ops->deferred);
> > > > > > > > pt_update_ops->start = ~0x0ull;
> > > > > > > > pt_update_ops->last = 0x0ull;
> > > > > > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > > > > >
> > > > > > > Can we introduce a function called xe_page_reclaim_list_init
> > > > > > > for clarity? It might do the same thing as
> > > > > > > xe_page_reclaim_list_invalidate but it would make this a
> > > > > > > little more clear. Likewise later in the series when a job is created, you can call xe_page_reclaim_list_init there too.
> > > > > > >
> > > > > >
> > > > > > Sure, I'll write another helper for this and modify both those PRL creation points.
> > > > > >
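For what it's worth, such an init helper would be trivial; a hypothetical sketch with the struct and names simplified from the patch (the real helper would operate on struct xe_page_reclaim_list):

```c
#include <assert.h>
#include <stddef.h>

#define PRL_INVALID_LIST (-1)

struct prl {
	void *entries;
	int num_entries;
};

/* Hypothetical xe_page_reclaim_list_init(): same effect as
 * xe_page_reclaim_list_invalidate(), but the name documents intent at
 * call sites that bring a PRL to its known starting state. */
static void prl_init(struct prl *prl)
{
	prl->entries = NULL;
	prl->num_entries = PRL_INVALID_LIST;
}
```

Having both names aliasing one implementation keeps the "fresh list" and "abandon list" call sites self-documenting without duplicating logic.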
> > > > > > > > }
> > > > > > > >
> > > > > > > > /**
> > > > > > > > @@ -2518,6 +2622,11 @@ void xe_pt_update_ops_fini(struct xe_tile *tile, struct xe_vma_ops *vops)
> > > > > > > > &vops->pt_update_ops[tile->id];
> > > > > > > > int i;
> > > > > > > >
> > > > > > > > + if (pt_update_ops->prl.entries) {
> > > > > > > > + xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
> > > > > > > > + xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > lockdep_assert_held(&vops->vm->lock);
> > > > > > > > xe_vm_assert_held(vops->vm);
> > > > > > > >
> > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_pt_types.h b/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > > > > index 881f01e14db8..26e5295f118e 100644
> > > > > > > > --- a/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > > > > +++ b/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > > > > @@ -8,6 +8,7 @@
> > > > > > > >
> > > > > > > > #include <linux/types.h>
> > > > > > > >
> > > > > > > > +#include "xe_page_reclaim.h"
> > > > > > > > #include "xe_pt_walk.h"
> > > > > > > >
> > > > > > > > struct xe_bo;
> > > > > > > > @@ -85,6 +86,8 @@ struct xe_vm_pgtable_update_op {
> > > > > > > > bool bind;
> > > > > > > > /** @rebind: is a rebind */
> > > > > > > > bool rebind;
> > > > > > > > + /** @prl: Backing pointer to page reclaim list of pt_update_ops */
> > > > > > > > + struct xe_page_reclaim_list *prl;
> > > > > > >
> > > > > > > Can you move this above the bools in the layout of
> > > > > > > xe_vm_pgtable_update_op, likely just below "struct xe_vma".
> > > > > > >
> > > > > >
> > > > > > Ahh got it. Moved.
> > > > > >
> > > > > > > > };
> > > > > > > >
> > > > > > > > /** struct xe_vm_pgtable_update_ops: page table update operations */
> > > > > > > > @@ -119,6 +122,8 @@ struct xe_vm_pgtable_update_ops {
> > > > > > > > * slots are idle.
> > > > > > > > */
> > > > > > > > bool wait_vm_kernel;
> > > > > > > > + /** @prl: embedded page reclaim list */
> > > > > > > > + struct xe_page_reclaim_list prl;
> > > > > > >
> > > > > > > Same thing here, move just below "struct xe_exec_queue".
> > > > > > >
> > > > > > > Matt
> > > > > > >
> > > > > >
> > > > > > Moved.
> > > > > >
> > > > > > Brian
> > > > > >
> > > > > > > > };
> > > > > > > >
> > > > > > > > #endif
> > > > > > > > --
> > > > > > > > 2.51.2
> > > > > > > >
end of thread, other threads:[~2025-11-26 2:34 UTC | newest]
Thread overview: 51+ messages
2025-11-18 9:05 [PATCH 00/11] Page Reclamation Support for Xe3p Platforms Brian Nguyen
2025-11-18 9:05 ` [PATCH 01/11] [DO, NOT, REVIEW] drm/xe: Do not forward invalid TLB invalidation seqnos to upper layers Brian Nguyen
2025-11-18 9:05 ` [PATCH 02/11] drm/xe: Reset tlb fence timeout on invalid seqno received Brian Nguyen
2025-11-21 17:23 ` Lin, Shuicheng
2025-11-22 1:53 ` Nguyen, Brian3
2025-11-22 18:25 ` Matthew Brost
2025-11-25 11:01 ` Nguyen, Brian3
2025-11-18 9:05 ` [PATCH 03/11] drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush Brian Nguyen
2025-11-21 18:02 ` Lin, Shuicheng
2025-11-22 1:54 ` Nguyen, Brian3
2025-11-22 19:32 ` Matthew Brost
2025-11-25 11:07 ` Nguyen, Brian3
2025-11-18 9:05 ` [PATCH 04/11] drm/xe: Add page reclamation info to device info Brian Nguyen
2025-11-21 18:15 ` Lin, Shuicheng
2025-11-22 18:31 ` Matthew Brost
2025-11-18 9:05 ` [PATCH 05/11] drm/xe/guc: Add page reclamation interface to GuC Brian Nguyen
2025-11-21 18:32 ` Lin, Shuicheng
2025-11-22 1:56 ` Nguyen, Brian3
2025-11-22 18:39 ` Matthew Brost
2025-11-25 11:13 ` Nguyen, Brian3
2025-11-18 9:05 ` [PATCH 06/11] drm/xe: Create page reclaim list on unbind Brian Nguyen
2025-11-21 21:29 ` Lin, Shuicheng
2025-11-22 1:57 ` Nguyen, Brian3
2025-11-22 19:18 ` Matthew Brost
2025-11-25 11:18 ` Nguyen, Brian3
2025-11-25 18:34 ` Matthew Brost
2025-11-25 19:01 ` Nguyen, Brian3
2025-11-25 19:07 ` Matthew Brost
2025-11-25 19:46 ` Nguyen, Brian3
2025-11-25 22:35 ` Matthew Brost
2025-11-26 2:33 ` Nguyen, Brian3
2025-11-18 9:05 ` [PATCH 07/11] drm/xe: Suballocate BO for page reclaim Brian Nguyen
2025-11-22 19:42 ` Matthew Brost
2025-11-25 11:20 ` Nguyen, Brian3
2025-11-18 9:05 ` [PATCH 08/11] drm/xe: Prep page reclaim in tlb inval job Brian Nguyen
2025-11-22 13:52 ` Michal Wajdeczko
2025-11-25 11:20 ` Nguyen, Brian3
2025-11-18 9:05 ` [PATCH 09/11] drm/xe: Append page reclamation action to tlb inval Brian Nguyen
2025-11-18 9:05 ` [PATCH 10/11] drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim Brian Nguyen
2025-11-24 12:29 ` Matthew Auld
2025-11-25 6:12 ` Nguyen, Brian3
2025-11-25 11:48 ` Upadhyay, Tejas
2025-11-25 13:05 ` Upadhyay, Tejas
2025-11-18 9:05 ` [PATCH 11/11] drm/xe: Add debugfs support for page reclamation Brian Nguyen
2025-11-21 22:32 ` Lin, Shuicheng
2025-11-22 1:57 ` Nguyen, Brian3
2025-11-22 14:18 ` Michal Wajdeczko
2025-11-25 11:21 ` Nguyen, Brian3
2025-11-18 9:52 ` ✗ CI.checkpatch: warning for Page Reclamation Support for Xe3p Platforms Patchwork
2025-11-18 9:53 ` ✓ CI.KUnit: success " Patchwork
2025-11-18 13:02 ` ✗ Xe.CI.Full: failure " Patchwork