* [PATCH v3 0/7] Fix DRM scheduler layering violations in Xe
@ 2025-10-16 20:48 Matthew Brost
2025-10-16 20:48 ` [PATCH v3 1/7] drm/sched: Add pending job list iterator Matthew Brost
` (10 more replies)
0 siblings, 11 replies; 31+ messages in thread
From: Matthew Brost @ 2025-10-16 20:48 UTC (permalink / raw)
To: intel-xe, dri-devel; +Cc: christian.koenig, pstanner, dakr
At XDC, we agreed that drivers should stop accessing DRM scheduler
internals, stop misusing DRM scheduler locks, and instead adopt a
well-defined pending job list iterator. This series proposes the
necessary changes to the DRM scheduler to bring Xe in line with that
agreement and updates Xe to use the new DRM scheduler API.
While here, clean up LR queue handling in Xe as well.
v2:
- Fix checkpatch / naming issues
v3:
- Only allow pending job list iterator to be called on stopped schedulers
- Clean up LR queue handling / fix a few miscellaneous Xe scheduler issues
Matt
Matthew Brost (7):
drm/sched: Add pending job list iterator
drm/sched: Add several job helpers to avoid drivers touching scheduler
state
drm/xe: Add dedicated message lock
drm/xe: Stop abusing DRM scheduler internals
drm/xe: Do not deregister queues in TDR
drm/xe: Remove special casing for LR queues in submission
drm/xe: Only toggle scheduling in TDR if GuC is running
drivers/gpu/drm/scheduler/sched_main.c | 4 +-
drivers/gpu/drm/xe/xe_gpu_scheduler.c | 9 +-
drivers/gpu/drm/xe/xe_gpu_scheduler.h | 38 +--
drivers/gpu/drm/xe/xe_gpu_scheduler_types.h | 2 +
drivers/gpu/drm/xe/xe_guc_exec_queue_types.h | 2 -
drivers/gpu/drm/xe/xe_guc_submit.c | 252 ++-----------------
drivers/gpu/drm/xe/xe_guc_submit_types.h | 11 -
drivers/gpu/drm/xe/xe_hw_fence.c | 16 --
drivers/gpu/drm/xe/xe_hw_fence.h | 2 -
include/drm/gpu_scheduler.h | 80 ++++++
10 files changed, 117 insertions(+), 299 deletions(-)
--
2.34.1
* [PATCH v3 1/7] drm/sched: Add pending job list iterator
From: Matthew Brost @ 2025-10-16 20:48 UTC (permalink / raw)
To: intel-xe, dri-devel; +Cc: christian.koenig, pstanner, dakr
Stop open coding pending job list walks in drivers. Add a pending job
list iterator which safely walks the DRM scheduler's pending list,
asserting that the scheduler is stopped.
v2:
- Fix checkpatch (CI)
v3:
- Drop locked version (Christian)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
include/drm/gpu_scheduler.h | 52 +++++++++++++++++++++++++++++++++++++
1 file changed, 52 insertions(+)
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index fb88301b3c45..7f31eba3bd61 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -698,4 +698,56 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
struct drm_gpu_scheduler **sched_list,
unsigned int num_sched_list);
+/* Inlines */
+
+/**
+ * struct drm_sched_pending_job_iter - DRM scheduler pending job iterator state
+ * @sched: DRM scheduler associated with pending job iterator
+ */
+struct drm_sched_pending_job_iter {
+ struct drm_gpu_scheduler *sched;
+};
+
+/* Drivers should never call this directly */
+static inline struct drm_sched_pending_job_iter
+__drm_sched_pending_job_iter_begin(struct drm_gpu_scheduler *sched)
+{
+ struct drm_sched_pending_job_iter iter = {
+ .sched = sched,
+ };
+
+ WARN_ON(!READ_ONCE(sched->pause_submit));
+ return iter;
+}
+
+/* Drivers should never call this directly */
+static inline void
+__drm_sched_pending_job_iter_end(const struct drm_sched_pending_job_iter iter)
+{
+ WARN_ON(!READ_ONCE(iter.sched->pause_submit));
+}
+
+DEFINE_CLASS(drm_sched_pending_job_iter, struct drm_sched_pending_job_iter,
+ __drm_sched_pending_job_iter_end(_T),
+ __drm_sched_pending_job_iter_begin(__sched),
+ struct drm_gpu_scheduler *__sched);
+static inline void *
+class_drm_sched_pending_job_iter_lock_ptr(class_drm_sched_pending_job_iter_t *_T)
+{ return _T; }
+#define class_drm_sched_pending_job_iter_is_conditional false
+
+/**
+ * drm_sched_for_each_pending_job() - Iterator for each pending job in scheduler
+ * @__job: Current pending job being iterated over
+ * @__sched: DRM scheduler to iterate over pending jobs
+ * @__entity: DRM scheduler entity to filter jobs, NULL indicates no filter
+ *
+ * Iterator for each pending job in the scheduler, filtering on an entity, and
+ * enforcing that the scheduler is fully stopped
+ */
+#define drm_sched_for_each_pending_job(__job, __sched, __entity) \
+ scoped_guard(drm_sched_pending_job_iter, (__sched)) \
+ list_for_each_entry((__job), &(__sched)->pending_list, list) \
+ for_each_if(!(__entity) || (__job)->entity == (__entity))
+
#endif
--
2.34.1
* [PATCH v3 2/7] drm/sched: Add several job helpers to avoid drivers touching scheduler state
From: Matthew Brost @ 2025-10-16 20:48 UTC (permalink / raw)
To: intel-xe, dri-devel; +Cc: christian.koenig, pstanner, dakr
Add helpers to check whether a scheduler is stopped and whether a job is
signaled. These are expected to be used driver side in recovery and
debug flows.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
include/drm/gpu_scheduler.h | 32 ++++++++++++++++++++++++--
2 files changed, 32 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 46119aacb809..69bd6e482268 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -344,7 +344,7 @@ drm_sched_rq_select_entity_fifo(struct drm_gpu_scheduler *sched,
*/
static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
{
- if (!READ_ONCE(sched->pause_submit))
+ if (!drm_sched_is_stopped(sched))
queue_work(sched->submit_wq, &sched->work_run_job);
}
@@ -354,7 +354,7 @@ static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
*/
static void drm_sched_run_free_queue(struct drm_gpu_scheduler *sched)
{
- if (!READ_ONCE(sched->pause_submit))
+ if (!drm_sched_is_stopped(sched))
queue_work(sched->submit_wq, &sched->work_free_job);
}
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 7f31eba3bd61..d1a2d7f61c1d 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -700,6 +700,17 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
/* Inlines */
+/**
+ * drm_sched_is_stopped() - Check if DRM scheduler is stopped
+ * @sched: DRM scheduler
+ *
+ * Return: True if sched is stopped, False otherwise
+ */
+static inline bool drm_sched_is_stopped(struct drm_gpu_scheduler *sched)
+{
+ return READ_ONCE(sched->pause_submit);
+}
+
/**
* struct drm_sched_pending_job_iter - DRM scheduler pending job iterator state
* @sched: DRM scheduler associated with pending job iterator
@@ -716,7 +727,7 @@ __drm_sched_pending_job_iter_begin(struct drm_gpu_scheduler *sched)
.sched = sched,
};
- WARN_ON(!READ_ONCE(sched->pause_submit));
+ WARN_ON(!drm_sched_is_stopped(sched));
return iter;
}
@@ -724,7 +735,7 @@ __drm_sched_pending_job_iter_begin(struct drm_gpu_scheduler *sched)
static inline void
__drm_sched_pending_job_iter_end(const struct drm_sched_pending_job_iter iter)
{
- WARN_ON(!READ_ONCE(iter.sched->pause_submit));
+ WARN_ON(!drm_sched_is_stopped(iter.sched));
}
DEFINE_CLASS(drm_sched_pending_job_iter, struct drm_sched_pending_job_iter,
@@ -750,4 +761,21 @@ class_drm_sched_pending_job_iter_lock_ptr(class_drm_sched_pending_job_iter_t *_T
list_for_each_entry((__job), &(__sched)->pending_list, list) \
for_each_if(!(__entity) || (__job)->entity == (__entity))
+/**
+ * drm_sched_job_is_signaled() - DRM scheduler job is signaled
+ * @job: DRM scheduler job
+ *
+ * Determine if DRM scheduler job is signaled. DRM scheduler should be stopped
+ * to obtain a stable snapshot of state.
+ *
+ * Return: True if job is signaled, False otherwise
+ */
+static inline bool drm_sched_job_is_signaled(struct drm_sched_job *job)
+{
+ struct drm_sched_fence *s_fence = job->s_fence;
+
+ WARN_ON(!drm_sched_is_stopped(job->sched));
+ return dma_fence_is_signaled(&s_fence->finished);
+}
+
#endif
--
2.34.1
* [PATCH v3 3/7] drm/xe: Add dedicated message lock
From: Matthew Brost @ 2025-10-16 20:48 UTC (permalink / raw)
To: intel-xe, dri-devel; +Cc: christian.koenig, pstanner, dakr
Stop abusing the DRM scheduler job list lock for messages; add a
dedicated message lock.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/xe/xe_gpu_scheduler.c | 5 +++--
drivers/gpu/drm/xe/xe_gpu_scheduler.h | 4 ++--
drivers/gpu/drm/xe/xe_gpu_scheduler_types.h | 2 ++
3 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
index f91e06d03511..f4f23317191f 100644
--- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
@@ -77,6 +77,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
};
sched->ops = xe_ops;
+ spin_lock_init(&sched->msg_lock);
INIT_LIST_HEAD(&sched->msgs);
INIT_WORK(&sched->work_process_msg, xe_sched_process_msg_work);
@@ -117,7 +118,7 @@ void xe_sched_add_msg(struct xe_gpu_scheduler *sched,
void xe_sched_add_msg_locked(struct xe_gpu_scheduler *sched,
struct xe_sched_msg *msg)
{
- lockdep_assert_held(&sched->base.job_list_lock);
+ lockdep_assert_held(&sched->msg_lock);
list_add_tail(&msg->link, &sched->msgs);
xe_sched_process_msg_queue(sched);
@@ -131,7 +132,7 @@ void xe_sched_add_msg_locked(struct xe_gpu_scheduler *sched,
void xe_sched_add_msg_head(struct xe_gpu_scheduler *sched,
struct xe_sched_msg *msg)
{
- lockdep_assert_held(&sched->base.job_list_lock);
+ lockdep_assert_held(&sched->msg_lock);
list_add(&msg->link, &sched->msgs);
xe_sched_process_msg_queue(sched);
diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
index 9955397aaaa9..b971b6b69419 100644
--- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
@@ -33,12 +33,12 @@ void xe_sched_add_msg_head(struct xe_gpu_scheduler *sched,
static inline void xe_sched_msg_lock(struct xe_gpu_scheduler *sched)
{
- spin_lock(&sched->base.job_list_lock);
+ spin_lock(&sched->msg_lock);
}
static inline void xe_sched_msg_unlock(struct xe_gpu_scheduler *sched)
{
- spin_unlock(&sched->base.job_list_lock);
+ spin_unlock(&sched->msg_lock);
}
static inline void xe_sched_stop(struct xe_gpu_scheduler *sched)
diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h b/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
index 6731b13da8bb..63d9bf92583c 100644
--- a/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
@@ -47,6 +47,8 @@ struct xe_gpu_scheduler {
const struct xe_sched_backend_ops *ops;
/** @msgs: list of messages to be processed in @work_process_msg */
struct list_head msgs;
+ /** @msg_lock: Message lock */
+ spinlock_t msg_lock;
/** @work_process_msg: processes messages */
struct work_struct work_process_msg;
};
--
2.34.1
* [PATCH v3 4/7] drm/xe: Stop abusing DRM scheduler internals
From: Matthew Brost @ 2025-10-16 20:48 UTC (permalink / raw)
To: intel-xe, dri-devel; +Cc: christian.koenig, pstanner, dakr
Use new pending job list iterator and new helper functions in Xe to
avoid reaching into DRM scheduler internals.
Part of this change involves removing pending jobs debug information
from debugfs and devcoredump. As agreed, the pending job list should
only be accessed when the scheduler is stopped. However, it's not
straightforward to determine whether the scheduler is stopped from the
shared debugfs/devcoredump code path. Additionally, the pending job list
provides little useful information, as pending jobs can be inferred from
seqnos and ring head/tail positions. Therefore, this debug information
is being removed.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/xe/xe_gpu_scheduler.c | 4 +-
drivers/gpu/drm/xe/xe_gpu_scheduler.h | 34 +++--------
drivers/gpu/drm/xe/xe_guc_submit.c | 74 ++++--------------------
drivers/gpu/drm/xe/xe_guc_submit_types.h | 11 ----
drivers/gpu/drm/xe/xe_hw_fence.c | 16 -----
drivers/gpu/drm/xe/xe_hw_fence.h | 2 -
6 files changed, 20 insertions(+), 121 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
index f4f23317191f..9c8004d5dd91 100644
--- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
@@ -7,7 +7,7 @@
static void xe_sched_process_msg_queue(struct xe_gpu_scheduler *sched)
{
- if (!READ_ONCE(sched->base.pause_submit))
+ if (!drm_sched_is_stopped(&sched->base))
queue_work(sched->base.submit_wq, &sched->work_process_msg);
}
@@ -43,7 +43,7 @@ static void xe_sched_process_msg_work(struct work_struct *w)
container_of(w, struct xe_gpu_scheduler, work_process_msg);
struct xe_sched_msg *msg;
- if (READ_ONCE(sched->base.pause_submit))
+ if (drm_sched_is_stopped(&sched->base))
return;
msg = xe_sched_get_msg(sched);
diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
index b971b6b69419..583372a78140 100644
--- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
@@ -55,14 +55,10 @@ static inline void xe_sched_resubmit_jobs(struct xe_gpu_scheduler *sched)
{
struct drm_sched_job *s_job;
- list_for_each_entry(s_job, &sched->base.pending_list, list) {
- struct drm_sched_fence *s_fence = s_job->s_fence;
- struct dma_fence *hw_fence = s_fence->parent;
-
+ drm_sched_for_each_pending_job(s_job, &sched->base, NULL)
if (to_xe_sched_job(s_job)->skip_emit ||
- (hw_fence && !dma_fence_is_signaled(hw_fence)))
+ !drm_sched_job_is_signaled(s_job))
sched->base.ops->run_job(s_job);
- }
}
static inline bool
@@ -71,14 +67,6 @@ xe_sched_invalidate_job(struct xe_sched_job *job, int threshold)
return drm_sched_invalidate_job(&job->drm, threshold);
}
-static inline void xe_sched_add_pending_job(struct xe_gpu_scheduler *sched,
- struct xe_sched_job *job)
-{
- spin_lock(&sched->base.job_list_lock);
- list_add(&job->drm.list, &sched->base.pending_list);
- spin_unlock(&sched->base.job_list_lock);
-}
-
/**
* xe_sched_first_pending_job() - Find first pending job which is unsignaled
* @sched: Xe GPU scheduler
@@ -88,21 +76,13 @@ static inline void xe_sched_add_pending_job(struct xe_gpu_scheduler *sched,
static inline
struct xe_sched_job *xe_sched_first_pending_job(struct xe_gpu_scheduler *sched)
{
- struct xe_sched_job *job, *r_job = NULL;
-
- spin_lock(&sched->base.job_list_lock);
- list_for_each_entry(job, &sched->base.pending_list, drm.list) {
- struct drm_sched_fence *s_fence = job->drm.s_fence;
- struct dma_fence *hw_fence = s_fence->parent;
+ struct drm_sched_job *job;
- if (hw_fence && !dma_fence_is_signaled(hw_fence)) {
- r_job = job;
- break;
- }
- }
- spin_unlock(&sched->base.job_list_lock);
+ drm_sched_for_each_pending_job(job, &sched->base, NULL)
+ if (!drm_sched_job_is_signaled(job))
+ return to_xe_sched_job(job);
- return r_job;
+ return NULL;
}
static inline int
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 0ef67d3523a7..680696efc434 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -1032,7 +1032,7 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
struct xe_exec_queue *q = ge->q;
struct xe_guc *guc = exec_queue_to_guc(q);
struct xe_gpu_scheduler *sched = &ge->sched;
- struct xe_sched_job *job;
+ struct drm_sched_job *job;
bool wedged = false;
xe_gt_assert(guc_to_gt(guc), xe_exec_queue_is_lr(q));
@@ -1091,16 +1091,10 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
if (!exec_queue_killed(q) && !xe_lrc_ring_is_idle(q->lrc[0]))
xe_devcoredump(q, NULL, "LR job cleanup, guc_id=%d", q->guc->id);
- xe_hw_fence_irq_stop(q->fence_irq);
+ drm_sched_for_each_pending_job(job, &sched->base, NULL)
+ xe_sched_job_set_error(to_xe_sched_job(job), -ECANCELED);
xe_sched_submission_start(sched);
-
- spin_lock(&sched->base.job_list_lock);
- list_for_each_entry(job, &sched->base.pending_list, drm.list)
- xe_sched_job_set_error(job, -ECANCELED);
- spin_unlock(&sched->base.job_list_lock);
-
- xe_hw_fence_irq_start(q->fence_irq);
}
#define ADJUST_FIVE_PERCENT(__t) mul_u64_u32_div(__t, 105, 100)
@@ -1219,7 +1213,7 @@ static enum drm_gpu_sched_stat
guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
{
struct xe_sched_job *job = to_xe_sched_job(drm_job);
- struct xe_sched_job *tmp_job;
+ struct drm_sched_job *tmp_job;
struct xe_exec_queue *q = job->q;
struct xe_gpu_scheduler *sched = &q->guc->sched;
struct xe_guc *guc = exec_queue_to_guc(q);
@@ -1228,7 +1222,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
unsigned int fw_ref;
int err = -ETIME;
pid_t pid = -1;
- int i = 0;
bool wedged = false, skip_timeout_check;
xe_gt_assert(guc_to_gt(guc), !xe_exec_queue_is_lr(q));
@@ -1395,28 +1388,15 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
__deregister_exec_queue(guc, q);
}
- /* Stop fence signaling */
- xe_hw_fence_irq_stop(q->fence_irq);
+ /* Mark all outstanding jobs as bad, thus completing them */
+ xe_sched_job_set_error(job, err);
+ drm_sched_for_each_pending_job(tmp_job, &sched->base, NULL)
+ xe_sched_job_set_error(to_xe_sched_job(tmp_job), -ECANCELED);
- /*
- * Fence state now stable, stop / start scheduler which cleans up any
- * fences that are complete
- */
- xe_sched_add_pending_job(sched, job);
xe_sched_submission_start(sched);
-
xe_guc_exec_queue_trigger_cleanup(q);
- /* Mark all outstanding jobs as bad, thus completing them */
- spin_lock(&sched->base.job_list_lock);
- list_for_each_entry(tmp_job, &sched->base.pending_list, drm.list)
- xe_sched_job_set_error(tmp_job, !i++ ? err : -ECANCELED);
- spin_unlock(&sched->base.job_list_lock);
-
- /* Start fence signaling */
- xe_hw_fence_irq_start(q->fence_irq);
-
- return DRM_GPU_SCHED_STAT_RESET;
+ return DRM_GPU_SCHED_STAT_NO_HANG;
sched_enable:
set_exec_queue_pending_tdr_exit(q);
@@ -2244,7 +2224,7 @@ static void guc_exec_queue_unpause_prepare(struct xe_guc *guc,
struct drm_sched_job *s_job;
struct xe_sched_job *job = NULL;
- list_for_each_entry(s_job, &sched->base.pending_list, list) {
+ drm_sched_for_each_pending_job(s_job, &sched->base, NULL) {
job = to_xe_sched_job(s_job);
xe_gt_dbg(guc_to_gt(guc), "Replay JOB - guc_id=%d, seqno=%d",
@@ -2349,7 +2329,7 @@ void xe_guc_submit_unpause(struct xe_guc *guc)
* created after resfix done.
*/
if (q->guc->id != index ||
- !READ_ONCE(q->guc->sched.base.pause_submit))
+ !drm_sched_is_stopped(&q->guc->sched.base))
continue;
guc_exec_queue_unpause(guc, q);
@@ -2771,30 +2751,6 @@ xe_guc_exec_queue_snapshot_capture(struct xe_exec_queue *q)
if (snapshot->parallel_execution)
guc_exec_queue_wq_snapshot_capture(q, snapshot);
- spin_lock(&sched->base.job_list_lock);
- snapshot->pending_list_size = list_count_nodes(&sched->base.pending_list);
- snapshot->pending_list = kmalloc_array(snapshot->pending_list_size,
- sizeof(struct pending_list_snapshot),
- GFP_ATOMIC);
-
- if (snapshot->pending_list) {
- struct xe_sched_job *job_iter;
-
- i = 0;
- list_for_each_entry(job_iter, &sched->base.pending_list, drm.list) {
- snapshot->pending_list[i].seqno =
- xe_sched_job_seqno(job_iter);
- snapshot->pending_list[i].fence =
- dma_fence_is_signaled(job_iter->fence) ? 1 : 0;
- snapshot->pending_list[i].finished =
- dma_fence_is_signaled(&job_iter->drm.s_fence->finished)
- ? 1 : 0;
- i++;
- }
- }
-
- spin_unlock(&sched->base.job_list_lock);
-
return snapshot;
}
@@ -2852,13 +2808,6 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
if (snapshot->parallel_execution)
guc_exec_queue_wq_snapshot_print(snapshot, p);
-
- for (i = 0; snapshot->pending_list && i < snapshot->pending_list_size;
- i++)
- drm_printf(p, "\tJob: seqno=%d, fence=%d, finished=%d\n",
- snapshot->pending_list[i].seqno,
- snapshot->pending_list[i].fence,
- snapshot->pending_list[i].finished);
}
/**
@@ -2881,7 +2830,6 @@ void xe_guc_exec_queue_snapshot_free(struct xe_guc_submit_exec_queue_snapshot *s
xe_lrc_snapshot_free(snapshot->lrc[i]);
kfree(snapshot->lrc);
}
- kfree(snapshot->pending_list);
kfree(snapshot);
}
diff --git a/drivers/gpu/drm/xe/xe_guc_submit_types.h b/drivers/gpu/drm/xe/xe_guc_submit_types.h
index dc7456c34583..0b08c79cf3b9 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit_types.h
+++ b/drivers/gpu/drm/xe/xe_guc_submit_types.h
@@ -61,12 +61,6 @@ struct guc_submit_parallel_scratch {
u32 wq[WQ_SIZE / sizeof(u32)];
};
-struct pending_list_snapshot {
- u32 seqno;
- bool fence;
- bool finished;
-};
-
/**
* struct xe_guc_submit_exec_queue_snapshot - Snapshot for devcoredump
*/
@@ -134,11 +128,6 @@ struct xe_guc_submit_exec_queue_snapshot {
/** @wq: Workqueue Items */
u32 wq[WQ_SIZE / sizeof(u32)];
} parallel;
-
- /** @pending_list_size: Size of the pending list snapshot array */
- int pending_list_size;
- /** @pending_list: snapshot of the pending list info */
- struct pending_list_snapshot *pending_list;
};
#endif
diff --git a/drivers/gpu/drm/xe/xe_hw_fence.c b/drivers/gpu/drm/xe/xe_hw_fence.c
index b2a0c46dfcd4..e65dfcdfdbc5 100644
--- a/drivers/gpu/drm/xe/xe_hw_fence.c
+++ b/drivers/gpu/drm/xe/xe_hw_fence.c
@@ -110,22 +110,6 @@ void xe_hw_fence_irq_run(struct xe_hw_fence_irq *irq)
irq_work_queue(&irq->work);
}
-void xe_hw_fence_irq_stop(struct xe_hw_fence_irq *irq)
-{
- spin_lock_irq(&irq->lock);
- irq->enabled = false;
- spin_unlock_irq(&irq->lock);
-}
-
-void xe_hw_fence_irq_start(struct xe_hw_fence_irq *irq)
-{
- spin_lock_irq(&irq->lock);
- irq->enabled = true;
- spin_unlock_irq(&irq->lock);
-
- irq_work_queue(&irq->work);
-}
-
void xe_hw_fence_ctx_init(struct xe_hw_fence_ctx *ctx, struct xe_gt *gt,
struct xe_hw_fence_irq *irq, const char *name)
{
diff --git a/drivers/gpu/drm/xe/xe_hw_fence.h b/drivers/gpu/drm/xe/xe_hw_fence.h
index f13a1c4982c7..599492c13f80 100644
--- a/drivers/gpu/drm/xe/xe_hw_fence.h
+++ b/drivers/gpu/drm/xe/xe_hw_fence.h
@@ -17,8 +17,6 @@ void xe_hw_fence_module_exit(void);
void xe_hw_fence_irq_init(struct xe_hw_fence_irq *irq);
void xe_hw_fence_irq_finish(struct xe_hw_fence_irq *irq);
void xe_hw_fence_irq_run(struct xe_hw_fence_irq *irq);
-void xe_hw_fence_irq_stop(struct xe_hw_fence_irq *irq);
-void xe_hw_fence_irq_start(struct xe_hw_fence_irq *irq);
void xe_hw_fence_ctx_init(struct xe_hw_fence_ctx *ctx, struct xe_gt *gt,
struct xe_hw_fence_irq *irq, const char *name);
--
2.34.1
* [PATCH v3 5/7] drm/xe: Do not deregister queues in TDR
From: Matthew Brost @ 2025-10-16 20:48 UTC (permalink / raw)
To: intel-xe, dri-devel; +Cc: christian.koenig, pstanner, dakr
Deregistering queues in the TDR introduces unnecessary complexity,
requiring reference counting tricks to function correctly. All that's
needed in the TDR is to kick the queue off the hardware, which is
achieved by disabling scheduling. Queue deregistration should be handled
in a single, well-defined point in the cleanup path, tied to the queue's
reference count.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/xe/xe_guc_submit.c | 57 +++---------------------------
1 file changed, 5 insertions(+), 52 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 680696efc434..ab0f1a2d4871 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -69,9 +69,8 @@ exec_queue_to_guc(struct xe_exec_queue *q)
#define EXEC_QUEUE_STATE_WEDGED (1 << 8)
#define EXEC_QUEUE_STATE_BANNED (1 << 9)
#define EXEC_QUEUE_STATE_CHECK_TIMEOUT (1 << 10)
-#define EXEC_QUEUE_STATE_EXTRA_REF (1 << 11)
-#define EXEC_QUEUE_STATE_PENDING_RESUME (1 << 12)
-#define EXEC_QUEUE_STATE_PENDING_TDR_EXIT (1 << 13)
+#define EXEC_QUEUE_STATE_PENDING_RESUME (1 << 11)
+#define EXEC_QUEUE_STATE_PENDING_TDR_EXIT (1 << 12)
static bool exec_queue_registered(struct xe_exec_queue *q)
{
@@ -218,21 +217,6 @@ static void clear_exec_queue_check_timeout(struct xe_exec_queue *q)
atomic_and(~EXEC_QUEUE_STATE_CHECK_TIMEOUT, &q->guc->state);
}
-static bool exec_queue_extra_ref(struct xe_exec_queue *q)
-{
- return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_EXTRA_REF;
-}
-
-static void set_exec_queue_extra_ref(struct xe_exec_queue *q)
-{
- atomic_or(EXEC_QUEUE_STATE_EXTRA_REF, &q->guc->state);
-}
-
-static void clear_exec_queue_extra_ref(struct xe_exec_queue *q)
-{
- atomic_and(~EXEC_QUEUE_STATE_EXTRA_REF, &q->guc->state);
-}
-
static bool exec_queue_pending_resume(struct xe_exec_queue *q)
{
return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_PENDING_RESUME;
@@ -1190,25 +1174,6 @@ static void disable_scheduling(struct xe_exec_queue *q, bool immediate)
G2H_LEN_DW_SCHED_CONTEXT_MODE_SET, 1);
}
-static void __deregister_exec_queue(struct xe_guc *guc, struct xe_exec_queue *q)
-{
- u32 action[] = {
- XE_GUC_ACTION_DEREGISTER_CONTEXT,
- q->guc->id,
- };
-
- xe_gt_assert(guc_to_gt(guc), !exec_queue_destroyed(q));
- xe_gt_assert(guc_to_gt(guc), exec_queue_registered(q));
- xe_gt_assert(guc_to_gt(guc), !exec_queue_pending_enable(q));
- xe_gt_assert(guc_to_gt(guc), !exec_queue_pending_disable(q));
-
- set_exec_queue_destroyed(q);
- trace_xe_exec_queue_deregister(q);
-
- xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action),
- G2H_LEN_DW_DEREGISTER_CONTEXT, 1);
-}
-
static enum drm_gpu_sched_stat
guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
{
@@ -1326,8 +1291,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
xe_devcoredump(q, job,
"Schedule disable failed to respond, guc_id=%d, ret=%d, guc_read=%d",
q->guc->id, ret, xe_guc_read_stopped(guc));
- set_exec_queue_extra_ref(q);
- xe_exec_queue_get(q); /* GT reset owns this */
set_exec_queue_banned(q);
xe_gt_reset_async(q->gt);
xe_sched_tdr_queue_imm(sched);
@@ -1380,13 +1343,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
}
}
- /* Finish cleaning up exec queue via deregister */
set_exec_queue_banned(q);
- if (!wedged && exec_queue_registered(q) && !exec_queue_destroyed(q)) {
- set_exec_queue_extra_ref(q);
- xe_exec_queue_get(q);
- __deregister_exec_queue(guc, q);
- }
/* Mark all outstanding jobs as bad, thus completing them */
xe_sched_job_set_error(job, err);
@@ -1928,7 +1885,7 @@ static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
/* Clean up lost G2H + reset engine state */
if (exec_queue_registered(q)) {
- if (exec_queue_extra_ref(q) || xe_exec_queue_is_lr(q))
+ if (xe_exec_queue_is_lr(q))
xe_exec_queue_put(q);
else if (exec_queue_destroyed(q))
__guc_exec_queue_destroy(guc, q);
@@ -2062,11 +2019,7 @@ static void guc_exec_queue_revert_pending_state_change(struct xe_guc *guc,
if (exec_queue_destroyed(q) && exec_queue_registered(q)) {
clear_exec_queue_destroyed(q);
- if (exec_queue_extra_ref(q))
- xe_exec_queue_put(q);
- else
- q->guc->needs_cleanup = true;
- clear_exec_queue_extra_ref(q);
+ q->guc->needs_cleanup = true;
xe_gt_dbg(guc_to_gt(guc), "Replay CLEANUP - guc_id=%d",
q->guc->id);
}
@@ -2483,7 +2436,7 @@ static void handle_deregister_done(struct xe_guc *guc, struct xe_exec_queue *q)
clear_exec_queue_registered(q);
- if (exec_queue_extra_ref(q) || xe_exec_queue_is_lr(q))
+ if (xe_exec_queue_is_lr(q))
xe_exec_queue_put(q);
else
__guc_exec_queue_destroy(guc, q);
--
2.34.1
* [PATCH v3 6/7] drm/xe: Remove special casing for LR queues in submission
From: Matthew Brost @ 2025-10-16 20:48 UTC (permalink / raw)
To: intel-xe, dri-devel; +Cc: christian.koenig, pstanner, dakr
Now that LR jobs are tracked by the DRM scheduler, there's no longer a
need to special-case LR queues. This change removes all LR
queue-specific handling, including dedicated TDR logic, reference
counting schemes, and other related mechanisms.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/xe/xe_guc_exec_queue_types.h | 2 -
drivers/gpu/drm/xe/xe_guc_submit.c | 129 +------------------
2 files changed, 7 insertions(+), 124 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h b/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h
index a3b034e4b205..fd0915ed8eb1 100644
--- a/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h
+++ b/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h
@@ -33,8 +33,6 @@ struct xe_guc_exec_queue {
*/
#define MAX_STATIC_MSG_TYPE 3
struct xe_sched_msg static_msgs[MAX_STATIC_MSG_TYPE];
- /** @lr_tdr: long running TDR worker */
- struct work_struct lr_tdr;
/** @destroy_async: do final destroy async from this worker */
struct work_struct destroy_async;
/** @resume_time: time of last resume */
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index ab0f1a2d4871..bb1f2929441c 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -674,14 +674,6 @@ static void register_exec_queue(struct xe_exec_queue *q, int ctx_type)
parallel_write(xe, map, wq_desc.wq_status, WQ_STATUS_ACTIVE);
}
- /*
- * We must keep a reference for LR engines if engine is registered with
- * the GuC as jobs signal immediately and can't destroy an engine if the
- * GuC has a reference to it.
- */
- if (xe_exec_queue_is_lr(q))
- xe_exec_queue_get(q);
-
set_exec_queue_registered(q);
trace_xe_exec_queue_register(q);
if (xe_exec_queue_is_parallel(q))
@@ -854,7 +846,7 @@ guc_exec_queue_run_job(struct drm_sched_job *drm_job)
struct xe_sched_job *job = to_xe_sched_job(drm_job);
struct xe_exec_queue *q = job->q;
struct xe_guc *guc = exec_queue_to_guc(q);
- bool lr = xe_exec_queue_is_lr(q), killed_or_banned_or_wedged =
+ bool killed_or_banned_or_wedged =
exec_queue_killed_or_banned_or_wedged(q);
xe_gt_assert(guc_to_gt(guc), !(exec_queue_destroyed(q) || exec_queue_pending_disable(q)) ||
@@ -871,15 +863,6 @@ guc_exec_queue_run_job(struct drm_sched_job *drm_job)
job->skip_emit = false;
}
- /*
- * We don't care about job-fence ordering in LR VMs because these fences
- * are never exported; they are used solely to keep jobs on the pending
- * list. Once a queue enters an error state, there's no need to track
- * them.
- */
- if (killed_or_banned_or_wedged && lr)
- xe_sched_job_set_error(job, -ECANCELED);
-
return job->fence;
}
@@ -923,8 +906,7 @@ static void disable_scheduling_deregister(struct xe_guc *guc,
xe_gt_warn(q->gt, "Pending enable/disable failed to respond\n");
xe_sched_submission_start(sched);
xe_gt_reset_async(q->gt);
- if (!xe_exec_queue_is_lr(q))
- xe_sched_tdr_queue_imm(sched);
+ xe_sched_tdr_queue_imm(sched);
return;
}
@@ -950,10 +932,7 @@ static void xe_guc_exec_queue_trigger_cleanup(struct xe_exec_queue *q)
/** to wakeup xe_wait_user_fence ioctl if exec queue is reset */
wake_up_all(&xe->ufence_wq);
- if (xe_exec_queue_is_lr(q))
- queue_work(guc_to_gt(guc)->ordered_wq, &q->guc->lr_tdr);
- else
- xe_sched_tdr_queue_imm(&q->guc->sched);
+ xe_sched_tdr_queue_imm(&q->guc->sched);
}
/**
@@ -1009,78 +988,6 @@ static bool guc_submit_hint_wedged(struct xe_guc *guc)
return true;
}
-static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
-{
- struct xe_guc_exec_queue *ge =
- container_of(w, struct xe_guc_exec_queue, lr_tdr);
- struct xe_exec_queue *q = ge->q;
- struct xe_guc *guc = exec_queue_to_guc(q);
- struct xe_gpu_scheduler *sched = &ge->sched;
- struct drm_sched_job *job;
- bool wedged = false;
-
- xe_gt_assert(guc_to_gt(guc), xe_exec_queue_is_lr(q));
-
- if (vf_recovery(guc))
- return;
-
- trace_xe_exec_queue_lr_cleanup(q);
-
- if (!exec_queue_killed(q))
- wedged = guc_submit_hint_wedged(exec_queue_to_guc(q));
-
- /* Kill the run_job / process_msg entry points */
- xe_sched_submission_stop(sched);
-
- /*
- * Engine state now mostly stable, disable scheduling / deregister if
- * needed. This cleanup routine might be called multiple times, where
- * the actual async engine deregister drops the final engine ref.
- * Calling disable_scheduling_deregister will mark the engine as
- * destroyed and fire off the CT requests to disable scheduling /
- * deregister, which we only want to do once. We also don't want to mark
- * the engine as pending_disable again as this may race with the
- * xe_guc_deregister_done_handler() which treats it as an unexpected
- * state.
- */
- if (!wedged && exec_queue_registered(q) && !exec_queue_destroyed(q)) {
- struct xe_guc *guc = exec_queue_to_guc(q);
- int ret;
-
- set_exec_queue_banned(q);
- disable_scheduling_deregister(guc, q);
-
- /*
- * Must wait for scheduling to be disabled before signalling
- * any fences, if GT broken the GT reset code should signal us.
- */
- ret = wait_event_timeout(guc->ct.wq,
- !exec_queue_pending_disable(q) ||
- xe_guc_read_stopped(guc) ||
- vf_recovery(guc), HZ * 5);
- if (vf_recovery(guc))
- return;
-
- if (!ret) {
- xe_gt_warn(q->gt, "Schedule disable failed to respond, guc_id=%d\n",
- q->guc->id);
- xe_devcoredump(q, NULL, "Schedule disable failed to respond, guc_id=%d\n",
- q->guc->id);
- xe_sched_submission_start(sched);
- xe_gt_reset_async(q->gt);
- return;
- }
- }
-
- if (!exec_queue_killed(q) && !xe_lrc_ring_is_idle(q->lrc[0]))
- xe_devcoredump(q, NULL, "LR job cleanup, guc_id=%d", q->guc->id);
-
- drm_sched_for_each_pending_job(job, &sched->base, NULL)
- xe_sched_job_set_error(to_xe_sched_job(job), -ECANCELED);
-
- xe_sched_submission_start(sched);
-}
-
#define ADJUST_FIVE_PERCENT(__t) mul_u64_u32_div(__t, 105, 100)
static bool check_timeout(struct xe_exec_queue *q, struct xe_sched_job *job)
@@ -1150,8 +1057,7 @@ static void enable_scheduling(struct xe_exec_queue *q)
xe_gt_warn(guc_to_gt(guc), "Schedule enable failed to respond");
set_exec_queue_banned(q);
xe_gt_reset_async(q->gt);
- if (!xe_exec_queue_is_lr(q))
- xe_sched_tdr_queue_imm(&q->guc->sched);
+ xe_sched_tdr_queue_imm(&q->guc->sched);
}
}
@@ -1189,8 +1095,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
pid_t pid = -1;
bool wedged = false, skip_timeout_check;
- xe_gt_assert(guc_to_gt(guc), !xe_exec_queue_is_lr(q));
-
/*
* TDR has fired before free job worker. Common if exec queue
* immediately closed after last fence signaled. Add back to pending
@@ -1395,8 +1299,6 @@ static void __guc_exec_queue_destroy_async(struct work_struct *w)
xe_pm_runtime_get(guc_to_xe(guc));
trace_xe_exec_queue_destroy(q);
- if (xe_exec_queue_is_lr(q))
- cancel_work_sync(&ge->lr_tdr);
/* Confirm no work left behind accessing device structures */
cancel_delayed_work_sync(&ge->sched.base.work_tdr);
@@ -1629,9 +1531,6 @@ static int guc_exec_queue_init(struct xe_exec_queue *q)
if (err)
goto err_sched;
- if (xe_exec_queue_is_lr(q))
- INIT_WORK(&q->guc->lr_tdr, xe_guc_exec_queue_lr_cleanup);
-
mutex_lock(&guc->submission_state.lock);
err = alloc_guc_id(guc, q);
@@ -1885,9 +1784,7 @@ static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
/* Clean up lost G2H + reset engine state */
if (exec_queue_registered(q)) {
- if (xe_exec_queue_is_lr(q))
- xe_exec_queue_put(q);
- else if (exec_queue_destroyed(q))
+ if (exec_queue_destroyed(q))
__guc_exec_queue_destroy(guc, q);
}
if (q->guc->suspend_pending) {
@@ -1917,9 +1814,6 @@ static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
trace_xe_sched_job_ban(job);
ban = true;
}
- } else if (xe_exec_queue_is_lr(q) &&
- !xe_lrc_ring_is_idle(q->lrc[0])) {
- ban = true;
}
if (ban) {
@@ -2002,8 +1896,6 @@ static void guc_exec_queue_revert_pending_state_change(struct xe_guc *guc,
if (pending_enable && !pending_resume &&
!exec_queue_pending_tdr_exit(q)) {
clear_exec_queue_registered(q);
- if (xe_exec_queue_is_lr(q))
- xe_exec_queue_put(q);
xe_gt_dbg(guc_to_gt(guc), "Replay REGISTER - guc_id=%d",
q->guc->id);
}
@@ -2060,10 +1952,7 @@ static void guc_exec_queue_pause(struct xe_guc *guc, struct xe_exec_queue *q)
/* Stop scheduling + flush any DRM scheduler operations */
xe_sched_submission_stop(sched);
- if (xe_exec_queue_is_lr(q))
- cancel_work_sync(&q->guc->lr_tdr);
- else
- cancel_delayed_work_sync(&sched->base.work_tdr);
+ cancel_delayed_work_sync(&sched->base.work_tdr);
guc_exec_queue_revert_pending_state_change(guc, q);
@@ -2435,11 +2324,7 @@ static void handle_deregister_done(struct xe_guc *guc, struct xe_exec_queue *q)
trace_xe_exec_queue_deregister_done(q);
clear_exec_queue_registered(q);
-
- if (xe_exec_queue_is_lr(q))
- xe_exec_queue_put(q);
- else
- __guc_exec_queue_destroy(guc, q);
+ __guc_exec_queue_destroy(guc, q);
}
int xe_guc_deregister_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v3 7/7] drm/xe: Only toggle scheduling in TDR if GuC is running
2025-10-16 20:48 [PATCH v3 0/7] Fix DRM scheduler layering violations in Xe Matthew Brost
` (5 preceding siblings ...)
2025-10-16 20:48 ` [PATCH v3 6/7] drm/xe: Remove special casing for LR queues in submission Matthew Brost
@ 2025-10-16 20:48 ` Matthew Brost
2025-11-15 1:01 ` Niranjana Vishwanathapura
2025-10-16 20:55 ` ✗ CI.checkpatch: warning for Fix DRM scheduler layering violations in Xe (rev3) Patchwork
` (3 subsequent siblings)
10 siblings, 1 reply; 31+ messages in thread
From: Matthew Brost @ 2025-10-16 20:48 UTC (permalink / raw)
To: intel-xe, dri-devel; +Cc: christian.koenig, pstanner, dakr
If the firmware is not running during TDR (e.g., when the driver is
unloading), there's no need to toggle scheduling in the GuC. In such
cases, skip this step.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/xe/xe_guc_submit.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index bb1f2929441c..ea0cfd866981 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -1146,7 +1146,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
if (exec_queue_reset(q))
err = -EIO;
- if (!exec_queue_destroyed(q)) {
+ if (!exec_queue_destroyed(q) && xe_uc_fw_is_running(&guc->fw)) {
/*
* Wait for any pending G2H to flush out before
* modifying state
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* ✗ CI.checkpatch: warning for Fix DRM scheduler layering violations in Xe (rev3)
2025-10-16 20:48 [PATCH v3 0/7] Fix DRM scheduler layering violations in Xe Matthew Brost
` (6 preceding siblings ...)
2025-10-16 20:48 ` [PATCH v3 7/7] drm/xe: Only toggle scheduling in TDR if GuC is running Matthew Brost
@ 2025-10-16 20:55 ` Patchwork
2025-10-16 20:56 ` ✓ CI.KUnit: success " Patchwork
` (2 subsequent siblings)
10 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2025-10-16 20:55 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe
== Series Details ==
Series: Fix DRM scheduler layering violations in Xe (rev3)
URL : https://patchwork.freedesktop.org/series/155314/
State : warning
== Summary ==
+ KERNEL=/kernel
+ git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt
Cloning into 'mt'...
warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/
+ git -C mt rev-list -n1 origin/master
fbd08a78c3a3bb17964db2a326514c69c1dca660
+ cd /kernel
+ git config --global --add safe.directory /kernel
+ git log -n1
commit 7aafb4b1d67f2a1f3cf5ece9403ed23bd8525235
Author: Matthew Brost <matthew.brost@intel.com>
Date: Thu Oct 16 13:48:26 2025 -0700
drm/xe: Only toggle scheduling in TDR if GuC is running
If the firmware is not running during TDR (e.g., when the driver is
unloading), there's no need to toggle scheduling in the GuC. In such
cases, skip this step.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
+ /mt/dim checkpatch 1f32baee68e7a3010d6092c3303516354f7b2298 drm-intel
373c2a0263bd drm/sched: Add pending job list iterator
-:72: ERROR:COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses
#72: FILE: include/drm/gpu_scheduler.h:748:
+#define drm_sched_for_each_pending_job(__job, __sched, __entity) \
+ scoped_guard(drm_sched_pending_job_iter, (__sched)) \
+ list_for_each_entry((__job), &(__sched)->pending_list, list) \
+ for_each_if(!(__entity) || (__job)->entity == (__entity))
BUT SEE:
do {} while (0) advice is over-stated in a few situations:
The more obvious case is macros, like MODULE_PARM_DESC, invoked at
file-scope, where C disallows code (it must be in functions). See
$exceptions if you have one to add by name.
More troublesome is declarative macros used at top of new scope,
like DECLARE_PER_CPU. These might just compile with a do-while-0
wrapper, but would be incorrect. Most of these are handled by
detecting struct,union,etc declaration primitives in $exceptions.
Theres also macros called inside an if (block), which "return" an
expression. These cannot do-while, and need a ({}) wrapper.
Enjoy this qualification while we work to improve our heuristics.
-:72: CHECK:MACRO_ARG_REUSE: Macro argument reuse '__job' - possible side-effects?
#72: FILE: include/drm/gpu_scheduler.h:748:
+#define drm_sched_for_each_pending_job(__job, __sched, __entity) \
+ scoped_guard(drm_sched_pending_job_iter, (__sched)) \
+ list_for_each_entry((__job), &(__sched)->pending_list, list) \
+ for_each_if(!(__entity) || (__job)->entity == (__entity))
-:72: CHECK:MACRO_ARG_REUSE: Macro argument reuse '__sched' - possible side-effects?
#72: FILE: include/drm/gpu_scheduler.h:748:
+#define drm_sched_for_each_pending_job(__job, __sched, __entity) \
+ scoped_guard(drm_sched_pending_job_iter, (__sched)) \
+ list_for_each_entry((__job), &(__sched)->pending_list, list) \
+ for_each_if(!(__entity) || (__job)->entity == (__entity))
-:72: CHECK:MACRO_ARG_REUSE: Macro argument reuse '__entity' - possible side-effects?
#72: FILE: include/drm/gpu_scheduler.h:748:
+#define drm_sched_for_each_pending_job(__job, __sched, __entity) \
+ scoped_guard(drm_sched_pending_job_iter, (__sched)) \
+ list_for_each_entry((__job), &(__sched)->pending_list, list) \
+ for_each_if(!(__entity) || (__job)->entity == (__entity))
total: 1 errors, 0 warnings, 3 checks, 56 lines checked
38ce964a0a6a drm/sched: Add several job helpers to avoid drivers touching scheduler state
df6d14c1e2fc drm/xe: Add dedicated message lock
cd167e8c3001 drm/xe: Stop abusing DRM scheduler internals
a8cf1d3e7a32 drm/xe: Do not deregister queues in TDR
ab6b79ce99a0 drm/xe: Remove special casing for LR queues in submission
7aafb4b1d67f drm/xe: Only toggle scheduling in TDR if GuC is running
^ permalink raw reply [flat|nested] 31+ messages in thread
* ✓ CI.KUnit: success for Fix DRM scheduler layering violations in Xe (rev3)
2025-10-16 20:48 [PATCH v3 0/7] Fix DRM scheduler layering violations in Xe Matthew Brost
` (7 preceding siblings ...)
2025-10-16 20:55 ` ✗ CI.checkpatch: warning for Fix DRM scheduler layering violations in Xe (rev3) Patchwork
@ 2025-10-16 20:56 ` Patchwork
2025-10-16 21:36 ` ✓ Xe.CI.BAT: " Patchwork
2025-10-17 18:43 ` ✗ Xe.CI.Full: failure " Patchwork
10 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2025-10-16 20:56 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe
== Series Details ==
Series: Fix DRM scheduler layering violations in Xe (rev3)
URL : https://patchwork.freedesktop.org/series/155314/
State : success
== Summary ==
+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
[20:55:24] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[20:55:28] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[20:55:59] Starting KUnit Kernel (1/1)...
[20:55:59] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[20:55:59] ================== guc_buf (11 subtests) ===================
[20:55:59] [PASSED] test_smallest
[20:55:59] [PASSED] test_largest
[20:55:59] [PASSED] test_granular
[20:55:59] [PASSED] test_unique
[20:55:59] [PASSED] test_overlap
[20:55:59] [PASSED] test_reusable
[20:55:59] [PASSED] test_too_big
[20:55:59] [PASSED] test_flush
[20:55:59] [PASSED] test_lookup
[20:55:59] [PASSED] test_data
[20:55:59] [PASSED] test_class
[20:55:59] ===================== [PASSED] guc_buf =====================
[20:55:59] =================== guc_dbm (7 subtests) ===================
[20:55:59] [PASSED] test_empty
[20:55:59] [PASSED] test_default
[20:55:59] ======================== test_size ========================
[20:55:59] [PASSED] 4
[20:55:59] [PASSED] 8
[20:55:59] [PASSED] 32
[20:55:59] [PASSED] 256
[20:55:59] ==================== [PASSED] test_size ====================
[20:55:59] ======================= test_reuse ========================
[20:55:59] [PASSED] 4
[20:55:59] [PASSED] 8
[20:55:59] [PASSED] 32
[20:55:59] [PASSED] 256
[20:55:59] =================== [PASSED] test_reuse ====================
[20:55:59] =================== test_range_overlap ====================
[20:55:59] [PASSED] 4
[20:55:59] [PASSED] 8
[20:55:59] [PASSED] 32
[20:55:59] [PASSED] 256
[20:55:59] =============== [PASSED] test_range_overlap ================
[20:55:59] =================== test_range_compact ====================
[20:55:59] [PASSED] 4
[20:55:59] [PASSED] 8
[20:55:59] [PASSED] 32
[20:55:59] [PASSED] 256
[20:55:59] =============== [PASSED] test_range_compact ================
[20:55:59] ==================== test_range_spare =====================
[20:55:59] [PASSED] 4
[20:55:59] [PASSED] 8
[20:55:59] [PASSED] 32
[20:55:59] [PASSED] 256
[20:55:59] ================ [PASSED] test_range_spare =================
[20:55:59] ===================== [PASSED] guc_dbm =====================
[20:55:59] =================== guc_idm (6 subtests) ===================
[20:55:59] [PASSED] bad_init
[20:55:59] [PASSED] no_init
[20:55:59] [PASSED] init_fini
[20:55:59] [PASSED] check_used
[20:55:59] [PASSED] check_quota
[20:55:59] [PASSED] check_all
[20:55:59] ===================== [PASSED] guc_idm =====================
[20:55:59] ================== no_relay (3 subtests) ===================
[20:55:59] [PASSED] xe_drops_guc2pf_if_not_ready
[20:55:59] [PASSED] xe_drops_guc2vf_if_not_ready
[20:55:59] [PASSED] xe_rejects_send_if_not_ready
[20:55:59] ==================== [PASSED] no_relay =====================
[20:55:59] ================== pf_relay (14 subtests) ==================
[20:55:59] [PASSED] pf_rejects_guc2pf_too_short
[20:55:59] [PASSED] pf_rejects_guc2pf_too_long
[20:55:59] [PASSED] pf_rejects_guc2pf_no_payload
[20:55:59] [PASSED] pf_fails_no_payload
[20:55:59] [PASSED] pf_fails_bad_origin
[20:55:59] [PASSED] pf_fails_bad_type
[20:55:59] [PASSED] pf_txn_reports_error
[20:55:59] [PASSED] pf_txn_sends_pf2guc
[20:55:59] [PASSED] pf_sends_pf2guc
[20:55:59] [SKIPPED] pf_loopback_nop
[20:55:59] [SKIPPED] pf_loopback_echo
[20:55:59] [SKIPPED] pf_loopback_fail
[20:55:59] [SKIPPED] pf_loopback_busy
[20:55:59] [SKIPPED] pf_loopback_retry
[20:55:59] ==================== [PASSED] pf_relay =====================
[20:55:59] ================== vf_relay (3 subtests) ===================
[20:55:59] [PASSED] vf_rejects_guc2vf_too_short
[20:55:59] [PASSED] vf_rejects_guc2vf_too_long
[20:55:59] [PASSED] vf_rejects_guc2vf_no_payload
[20:55:59] ==================== [PASSED] vf_relay =====================
[20:55:59] ===================== lmtt (1 subtest) =====================
[20:55:59] ======================== test_ops =========================
[20:55:59] [PASSED] 2-level
[20:55:59] [PASSED] multi-level
[20:55:59] ==================== [PASSED] test_ops =====================
[20:55:59] ====================== [PASSED] lmtt =======================
[20:55:59] ================= pf_service (11 subtests) =================
[20:55:59] [PASSED] pf_negotiate_any
[20:55:59] [PASSED] pf_negotiate_base_match
[20:55:59] [PASSED] pf_negotiate_base_newer
[20:55:59] [PASSED] pf_negotiate_base_next
[20:55:59] [SKIPPED] pf_negotiate_base_older
[20:55:59] [PASSED] pf_negotiate_base_prev
[20:55:59] [PASSED] pf_negotiate_latest_match
[20:55:59] [PASSED] pf_negotiate_latest_newer
[20:55:59] [PASSED] pf_negotiate_latest_next
[20:55:59] [SKIPPED] pf_negotiate_latest_older
[20:55:59] [SKIPPED] pf_negotiate_latest_prev
[20:55:59] =================== [PASSED] pf_service ====================
[20:55:59] ================= xe_guc_g2g (2 subtests) ==================
[20:55:59] ============== xe_live_guc_g2g_kunit_default ==============
[20:55:59] ========= [SKIPPED] xe_live_guc_g2g_kunit_default ==========
[20:55:59] ============== xe_live_guc_g2g_kunit_allmem ===============
[20:55:59] ========== [SKIPPED] xe_live_guc_g2g_kunit_allmem ==========
[20:55:59] =================== [SKIPPED] xe_guc_g2g ===================
[20:55:59] =================== xe_mocs (2 subtests) ===================
[20:55:59] ================ xe_live_mocs_kernel_kunit ================
[20:55:59] =========== [SKIPPED] xe_live_mocs_kernel_kunit ============
[20:55:59] ================ xe_live_mocs_reset_kunit =================
[20:55:59] ============ [SKIPPED] xe_live_mocs_reset_kunit ============
[20:55:59] ==================== [SKIPPED] xe_mocs =====================
[20:55:59] ================= xe_migrate (2 subtests) ==================
[20:55:59] ================= xe_migrate_sanity_kunit =================
[20:55:59] ============ [SKIPPED] xe_migrate_sanity_kunit =============
[20:55:59] ================== xe_validate_ccs_kunit ==================
[20:55:59] ============= [SKIPPED] xe_validate_ccs_kunit ==============
[20:55:59] =================== [SKIPPED] xe_migrate ===================
[20:55:59] ================== xe_dma_buf (1 subtest) ==================
[20:55:59] ==================== xe_dma_buf_kunit =====================
[20:55:59] ================ [SKIPPED] xe_dma_buf_kunit ================
[20:55:59] =================== [SKIPPED] xe_dma_buf ===================
[20:55:59] ================= xe_bo_shrink (1 subtest) =================
[20:55:59] =================== xe_bo_shrink_kunit ====================
[20:55:59] =============== [SKIPPED] xe_bo_shrink_kunit ===============
[20:55:59] ================== [SKIPPED] xe_bo_shrink ==================
[20:55:59] ==================== xe_bo (2 subtests) ====================
[20:55:59] ================== xe_ccs_migrate_kunit ===================
[20:55:59] ============== [SKIPPED] xe_ccs_migrate_kunit ==============
[20:55:59] ==================== xe_bo_evict_kunit ====================
[20:55:59] =============== [SKIPPED] xe_bo_evict_kunit ================
[20:55:59] ===================== [SKIPPED] xe_bo ======================
[20:55:59] ==================== args (11 subtests) ====================
[20:55:59] [PASSED] count_args_test
[20:55:59] [PASSED] call_args_example
[20:55:59] [PASSED] call_args_test
[20:55:59] [PASSED] drop_first_arg_example
[20:55:59] [PASSED] drop_first_arg_test
[20:55:59] [PASSED] first_arg_example
[20:55:59] [PASSED] first_arg_test
[20:55:59] [PASSED] last_arg_example
[20:55:59] [PASSED] last_arg_test
[20:55:59] [PASSED] pick_arg_example
[20:55:59] [PASSED] sep_comma_example
[20:55:59] ====================== [PASSED] args =======================
[20:55:59] =================== xe_pci (3 subtests) ====================
[20:55:59] ==================== check_graphics_ip ====================
[20:55:59] [PASSED] 12.00 Xe_LP
[20:55:59] [PASSED] 12.10 Xe_LP+
[20:55:59] [PASSED] 12.55 Xe_HPG
[20:55:59] [PASSED] 12.60 Xe_HPC
[20:55:59] [PASSED] 12.70 Xe_LPG
[20:55:59] [PASSED] 12.71 Xe_LPG
[20:55:59] [PASSED] 12.74 Xe_LPG+
[20:55:59] [PASSED] 20.01 Xe2_HPG
[20:55:59] [PASSED] 20.02 Xe2_HPG
[20:55:59] [PASSED] 20.04 Xe2_LPG
[20:55:59] [PASSED] 30.00 Xe3_LPG
[20:55:59] [PASSED] 30.01 Xe3_LPG
[20:55:59] [PASSED] 30.03 Xe3_LPG
[20:55:59] ================ [PASSED] check_graphics_ip ================
[20:55:59] ===================== check_media_ip ======================
[20:55:59] [PASSED] 12.00 Xe_M
[20:55:59] [PASSED] 12.55 Xe_HPM
[20:55:59] [PASSED] 13.00 Xe_LPM+
[20:55:59] [PASSED] 13.01 Xe2_HPM
[20:55:59] [PASSED] 20.00 Xe2_LPM
[20:55:59] [PASSED] 30.00 Xe3_LPM
[20:55:59] [PASSED] 30.02 Xe3_LPM
[20:55:59] ================= [PASSED] check_media_ip ==================
[20:55:59] ================= check_platform_gt_count =================
[20:55:59] [PASSED] 0x9A60 (TIGERLAKE)
[20:55:59] [PASSED] 0x9A68 (TIGERLAKE)
[20:55:59] [PASSED] 0x9A70 (TIGERLAKE)
[20:55:59] [PASSED] 0x9A40 (TIGERLAKE)
[20:55:59] [PASSED] 0x9A49 (TIGERLAKE)
[20:55:59] [PASSED] 0x9A59 (TIGERLAKE)
[20:55:59] [PASSED] 0x9A78 (TIGERLAKE)
[20:55:59] [PASSED] 0x9AC0 (TIGERLAKE)
[20:55:59] [PASSED] 0x9AC9 (TIGERLAKE)
[20:55:59] [PASSED] 0x9AD9 (TIGERLAKE)
[20:55:59] [PASSED] 0x9AF8 (TIGERLAKE)
[20:55:59] [PASSED] 0x4C80 (ROCKETLAKE)
[20:55:59] [PASSED] 0x4C8A (ROCKETLAKE)
[20:55:59] [PASSED] 0x4C8B (ROCKETLAKE)
[20:55:59] [PASSED] 0x4C8C (ROCKETLAKE)
[20:55:59] [PASSED] 0x4C90 (ROCKETLAKE)
[20:55:59] [PASSED] 0x4C9A (ROCKETLAKE)
[20:55:59] [PASSED] 0x4680 (ALDERLAKE_S)
[20:55:59] [PASSED] 0x4682 (ALDERLAKE_S)
[20:55:59] [PASSED] 0x4688 (ALDERLAKE_S)
[20:55:59] [PASSED] 0x468A (ALDERLAKE_S)
[20:55:59] [PASSED] 0x468B (ALDERLAKE_S)
[20:55:59] [PASSED] 0x4690 (ALDERLAKE_S)
[20:55:59] [PASSED] 0x4692 (ALDERLAKE_S)
[20:55:59] [PASSED] 0x4693 (ALDERLAKE_S)
[20:55:59] [PASSED] 0x46A0 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46A1 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46A2 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46A3 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46A6 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46A8 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46AA (ALDERLAKE_P)
[20:55:59] [PASSED] 0x462A (ALDERLAKE_P)
[20:55:59] [PASSED] 0x4626 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x4628 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46B0 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46B1 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46B2 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46B3 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46C0 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46C1 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46C2 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46C3 (ALDERLAKE_P)
[20:55:59] [PASSED] 0x46D0 (ALDERLAKE_N)
[20:55:59] [PASSED] 0x46D1 (ALDERLAKE_N)
[20:55:59] [PASSED] 0x46D2 (ALDERLAKE_N)
[20:55:59] [PASSED] 0x46D3 (ALDERLAKE_N)
[20:55:59] [PASSED] 0x46D4 (ALDERLAKE_N)
[20:55:59] [PASSED] 0xA721 (ALDERLAKE_P)
[20:55:59] [PASSED] 0xA7A1 (ALDERLAKE_P)
[20:55:59] [PASSED] 0xA7A9 (ALDERLAKE_P)
[20:55:59] [PASSED] 0xA7AC (ALDERLAKE_P)
[20:55:59] [PASSED] 0xA7AD (ALDERLAKE_P)
[20:55:59] [PASSED] 0xA720 (ALDERLAKE_P)
[20:55:59] [PASSED] 0xA7A0 (ALDERLAKE_P)
[20:55:59] [PASSED] 0xA7A8 (ALDERLAKE_P)
[20:55:59] [PASSED] 0xA7AA (ALDERLAKE_P)
[20:55:59] [PASSED] 0xA7AB (ALDERLAKE_P)
[20:55:59] [PASSED] 0xA780 (ALDERLAKE_S)
[20:55:59] [PASSED] 0xA781 (ALDERLAKE_S)
[20:55:59] [PASSED] 0xA782 (ALDERLAKE_S)
[20:55:59] [PASSED] 0xA783 (ALDERLAKE_S)
[20:55:59] [PASSED] 0xA788 (ALDERLAKE_S)
[20:55:59] [PASSED] 0xA789 (ALDERLAKE_S)
[20:55:59] [PASSED] 0xA78A (ALDERLAKE_S)
[20:55:59] [PASSED] 0xA78B (ALDERLAKE_S)
[20:55:59] [PASSED] 0x4905 (DG1)
[20:55:59] [PASSED] 0x4906 (DG1)
[20:55:59] [PASSED] 0x4907 (DG1)
[20:55:59] [PASSED] 0x4908 (DG1)
[20:55:59] [PASSED] 0x4909 (DG1)
[20:55:59] [PASSED] 0x56C0 (DG2)
[20:55:59] [PASSED] 0x56C2 (DG2)
[20:55:59] [PASSED] 0x56C1 (DG2)
[20:55:59] [PASSED] 0x7D51 (METEORLAKE)
[20:55:59] [PASSED] 0x7DD1 (METEORLAKE)
[20:55:59] [PASSED] 0x7D41 (METEORLAKE)
[20:55:59] [PASSED] 0x7D67 (METEORLAKE)
[20:55:59] [PASSED] 0xB640 (METEORLAKE)
[20:55:59] [PASSED] 0x56A0 (DG2)
[20:55:59] [PASSED] 0x56A1 (DG2)
[20:55:59] [PASSED] 0x56A2 (DG2)
[20:55:59] [PASSED] 0x56BE (DG2)
[20:55:59] [PASSED] 0x56BF (DG2)
[20:55:59] [PASSED] 0x5690 (DG2)
[20:55:59] [PASSED] 0x5691 (DG2)
[20:55:59] [PASSED] 0x5692 (DG2)
[20:55:59] [PASSED] 0x56A5 (DG2)
[20:55:59] [PASSED] 0x56A6 (DG2)
[20:55:59] [PASSED] 0x56B0 (DG2)
[20:55:59] [PASSED] 0x56B1 (DG2)
[20:55:59] [PASSED] 0x56BA (DG2)
[20:55:59] [PASSED] 0x56BB (DG2)
[20:55:59] [PASSED] 0x56BC (DG2)
[20:55:59] [PASSED] 0x56BD (DG2)
[20:55:59] [PASSED] 0x5693 (DG2)
[20:55:59] [PASSED] 0x5694 (DG2)
[20:55:59] [PASSED] 0x5695 (DG2)
[20:55:59] [PASSED] 0x56A3 (DG2)
[20:55:59] [PASSED] 0x56A4 (DG2)
[20:55:59] [PASSED] 0x56B2 (DG2)
[20:55:59] [PASSED] 0x56B3 (DG2)
[20:55:59] [PASSED] 0x5696 (DG2)
[20:55:59] [PASSED] 0x5697 (DG2)
[20:55:59] [PASSED] 0xB69 (PVC)
[20:55:59] [PASSED] 0xB6E (PVC)
[20:55:59] [PASSED] 0xBD4 (PVC)
[20:55:59] [PASSED] 0xBD5 (PVC)
[20:55:59] [PASSED] 0xBD6 (PVC)
[20:55:59] [PASSED] 0xBD7 (PVC)
[20:55:59] [PASSED] 0xBD8 (PVC)
[20:55:59] [PASSED] 0xBD9 (PVC)
[20:55:59] [PASSED] 0xBDA (PVC)
[20:55:59] [PASSED] 0xBDB (PVC)
[20:55:59] [PASSED] 0xBE0 (PVC)
[20:55:59] [PASSED] 0xBE1 (PVC)
[20:55:59] [PASSED] 0xBE5 (PVC)
[20:55:59] [PASSED] 0x7D40 (METEORLAKE)
[20:55:59] [PASSED] 0x7D45 (METEORLAKE)
[20:55:59] [PASSED] 0x7D55 (METEORLAKE)
[20:55:59] [PASSED] 0x7D60 (METEORLAKE)
[20:55:59] [PASSED] 0x7DD5 (METEORLAKE)
[20:55:59] [PASSED] 0x6420 (LUNARLAKE)
[20:55:59] [PASSED] 0x64A0 (LUNARLAKE)
[20:55:59] [PASSED] 0x64B0 (LUNARLAKE)
[20:55:59] [PASSED] 0xE202 (BATTLEMAGE)
[20:55:59] [PASSED] 0xE209 (BATTLEMAGE)
[20:55:59] [PASSED] 0xE20B (BATTLEMAGE)
[20:55:59] [PASSED] 0xE20C (BATTLEMAGE)
[20:55:59] [PASSED] 0xE20D (BATTLEMAGE)
[20:55:59] [PASSED] 0xE210 (BATTLEMAGE)
[20:55:59] [PASSED] 0xE211 (BATTLEMAGE)
[20:55:59] [PASSED] 0xE212 (BATTLEMAGE)
[20:55:59] [PASSED] 0xE216 (BATTLEMAGE)
[20:55:59] [PASSED] 0xE220 (BATTLEMAGE)
[20:55:59] [PASSED] 0xE221 (BATTLEMAGE)
[20:55:59] [PASSED] 0xE222 (BATTLEMAGE)
[20:55:59] [PASSED] 0xE223 (BATTLEMAGE)
[20:55:59] [PASSED] 0xB080 (PANTHERLAKE)
[20:55:59] [PASSED] 0xB081 (PANTHERLAKE)
[20:55:59] [PASSED] 0xB082 (PANTHERLAKE)
[20:55:59] [PASSED] 0xB083 (PANTHERLAKE)
[20:55:59] [PASSED] 0xB084 (PANTHERLAKE)
[20:55:59] [PASSED] 0xB085 (PANTHERLAKE)
[20:55:59] [PASSED] 0xB086 (PANTHERLAKE)
[20:55:59] [PASSED] 0xB087 (PANTHERLAKE)
[20:55:59] [PASSED] 0xB08F (PANTHERLAKE)
[20:55:59] [PASSED] 0xB090 (PANTHERLAKE)
[20:55:59] [PASSED] 0xB0A0 (PANTHERLAKE)
[20:55:59] [PASSED] 0xB0B0 (PANTHERLAKE)
[20:55:59] [PASSED] 0xFD80 (PANTHERLAKE)
[20:55:59] [PASSED] 0xFD81 (PANTHERLAKE)
[20:55:59] ============= [PASSED] check_platform_gt_count =============
[20:55:59] ===================== [PASSED] xe_pci ======================
[20:55:59] =================== xe_rtp (2 subtests) ====================
[20:55:59] =============== xe_rtp_process_to_sr_tests ================
[20:55:59] [PASSED] coalesce-same-reg
[20:55:59] [PASSED] no-match-no-add
[20:55:59] [PASSED] match-or
[20:55:59] [PASSED] match-or-xfail
[20:55:59] [PASSED] no-match-no-add-multiple-rules
[20:55:59] [PASSED] two-regs-two-entries
[20:55:59] [PASSED] clr-one-set-other
[20:55:59] [PASSED] set-field
[20:55:59] [PASSED] conflict-duplicate
[20:55:59] [PASSED] conflict-not-disjoint
[20:55:59] [PASSED] conflict-reg-type
[20:55:59] =========== [PASSED] xe_rtp_process_to_sr_tests ============
[20:55:59] ================== xe_rtp_process_tests ===================
[20:55:59] [PASSED] active1
[20:55:59] [PASSED] active2
[20:55:59] [PASSED] active-inactive
[20:55:59] [PASSED] inactive-active
[20:55:59] [PASSED] inactive-1st_or_active-inactive
[20:55:59] [PASSED] inactive-2nd_or_active-inactive
[20:55:59] [PASSED] inactive-last_or_active-inactive
[20:55:59] [PASSED] inactive-no_or_active-inactive
[20:55:59] ============== [PASSED] xe_rtp_process_tests ===============
[20:55:59] ===================== [PASSED] xe_rtp ======================
[20:55:59] ==================== xe_wa (1 subtest) =====================
[20:55:59] ======================== xe_wa_gt =========================
[20:55:59] [PASSED] TIGERLAKE B0
[20:55:59] [PASSED] DG1 A0
[20:55:59] [PASSED] DG1 B0
[20:55:59] [PASSED] ALDERLAKE_S A0
[20:55:59] [PASSED] ALDERLAKE_S B0
[20:55:59] [PASSED] ALDERLAKE_S C0
[20:55:59] [PASSED] ALDERLAKE_S D0
[20:55:59] [PASSED] ALDERLAKE_P A0
[20:55:59] [PASSED] ALDERLAKE_P B0
[20:55:59] [PASSED] ALDERLAKE_P C0
[20:55:59] [PASSED] ALDERLAKE_S RPLS D0
[20:55:59] [PASSED] ALDERLAKE_P RPLU E0
[20:55:59] [PASSED] DG2 G10 C0
[20:55:59] [PASSED] DG2 G11 B1
[20:55:59] [PASSED] DG2 G12 A1
[20:55:59] [PASSED] METEORLAKE 12.70(Xe_LPG) A0 13.00(Xe_LPM+) A0
[20:55:59] [PASSED] METEORLAKE 12.71(Xe_LPG) A0 13.00(Xe_LPM+) A0
[20:55:59] [PASSED] METEORLAKE 12.74(Xe_LPG+) A0 13.00(Xe_LPM+) A0
[20:55:59] [PASSED] LUNARLAKE 20.04(Xe2_LPG) A0 20.00(Xe2_LPM) A0
[20:55:59] [PASSED] LUNARLAKE 20.04(Xe2_LPG) B0 20.00(Xe2_LPM) A0
[20:55:59] [PASSED] BATTLEMAGE 20.01(Xe2_HPG) A0 13.01(Xe2_HPM) A1
[20:55:59] [PASSED] PANTHERLAKE 30.00(Xe3_LPG) A0 30.00(Xe3_LPM) A0
[20:55:59] ==================== [PASSED] xe_wa_gt =====================
[20:55:59] ====================== [PASSED] xe_wa ======================
[20:55:59] ============================================================
[20:55:59] Testing complete. Ran 306 tests: passed: 288, skipped: 18
[20:55:59] Elapsed time: 34.864s total, 4.268s configuring, 30.229s building, 0.311s running
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/tests/.kunitconfig
[20:55:59] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[20:56:01] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[20:56:26] Starting KUnit Kernel (1/1)...
[20:56:26] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[20:56:26] ============ drm_test_pick_cmdline (2 subtests) ============
[20:56:26] [PASSED] drm_test_pick_cmdline_res_1920_1080_60
[20:56:26] =============== drm_test_pick_cmdline_named ===============
[20:56:26] [PASSED] NTSC
[20:56:26] [PASSED] NTSC-J
[20:56:26] [PASSED] PAL
[20:56:26] [PASSED] PAL-M
[20:56:26] =========== [PASSED] drm_test_pick_cmdline_named ===========
[20:56:26] ============== [PASSED] drm_test_pick_cmdline ==============
[20:56:26] == drm_test_atomic_get_connector_for_encoder (1 subtest) ===
[20:56:26] [PASSED] drm_test_drm_atomic_get_connector_for_encoder
[20:56:26] ==== [PASSED] drm_test_atomic_get_connector_for_encoder ====
[20:56:26] =========== drm_validate_clone_mode (2 subtests) ===========
[20:56:26] ============== drm_test_check_in_clone_mode ===============
[20:56:26] [PASSED] in_clone_mode
[20:56:26] [PASSED] not_in_clone_mode
[20:56:26] ========== [PASSED] drm_test_check_in_clone_mode ===========
[20:56:26] =============== drm_test_check_valid_clones ===============
[20:56:26] [PASSED] not_in_clone_mode
[20:56:26] [PASSED] valid_clone
[20:56:26] [PASSED] invalid_clone
[20:56:26] =========== [PASSED] drm_test_check_valid_clones ===========
[20:56:26] ============= [PASSED] drm_validate_clone_mode =============
[20:56:26] ============= drm_validate_modeset (1 subtest) =============
[20:56:26] [PASSED] drm_test_check_connector_changed_modeset
[20:56:26] ============== [PASSED] drm_validate_modeset ===============
[20:56:26] ====== drm_test_bridge_get_current_state (2 subtests) ======
[20:56:26] [PASSED] drm_test_drm_bridge_get_current_state_atomic
[20:56:26] [PASSED] drm_test_drm_bridge_get_current_state_legacy
[20:56:26] ======== [PASSED] drm_test_bridge_get_current_state ========
[20:56:26] ====== drm_test_bridge_helper_reset_crtc (3 subtests) ======
[20:56:26] [PASSED] drm_test_drm_bridge_helper_reset_crtc_atomic
[20:56:26] [PASSED] drm_test_drm_bridge_helper_reset_crtc_atomic_disabled
[20:56:26] [PASSED] drm_test_drm_bridge_helper_reset_crtc_legacy
[20:56:26] ======== [PASSED] drm_test_bridge_helper_reset_crtc ========
[20:56:26] ============== drm_bridge_alloc (2 subtests) ===============
[20:56:26] [PASSED] drm_test_drm_bridge_alloc_basic
[20:56:26] [PASSED] drm_test_drm_bridge_alloc_get_put
[20:56:26] ================ [PASSED] drm_bridge_alloc =================
[20:56:26] ================== drm_buddy (8 subtests) ==================
[20:56:26] [PASSED] drm_test_buddy_alloc_limit
[20:56:26] [PASSED] drm_test_buddy_alloc_optimistic
[20:56:26] [PASSED] drm_test_buddy_alloc_pessimistic
[20:56:26] [PASSED] drm_test_buddy_alloc_pathological
[20:56:26] [PASSED] drm_test_buddy_alloc_contiguous
[20:56:26] [PASSED] drm_test_buddy_alloc_clear
[20:56:26] [PASSED] drm_test_buddy_alloc_range_bias
[20:56:26] [PASSED] drm_test_buddy_fragmentation_performance
[20:56:26] ==================== [PASSED] drm_buddy ====================
[20:56:26] ============= drm_cmdline_parser (40 subtests) =============
[20:56:26] [PASSED] drm_test_cmdline_force_d_only
[20:56:26] [PASSED] drm_test_cmdline_force_D_only_dvi
[20:56:26] [PASSED] drm_test_cmdline_force_D_only_hdmi
[20:56:26] [PASSED] drm_test_cmdline_force_D_only_not_digital
[20:56:26] [PASSED] drm_test_cmdline_force_e_only
[20:56:26] [PASSED] drm_test_cmdline_res
[20:56:26] [PASSED] drm_test_cmdline_res_vesa
[20:56:26] [PASSED] drm_test_cmdline_res_vesa_rblank
[20:56:26] [PASSED] drm_test_cmdline_res_rblank
[20:56:26] [PASSED] drm_test_cmdline_res_bpp
[20:56:26] [PASSED] drm_test_cmdline_res_refresh
[20:56:26] [PASSED] drm_test_cmdline_res_bpp_refresh
[20:56:26] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced
[20:56:26] [PASSED] drm_test_cmdline_res_bpp_refresh_margins
[20:56:26] [PASSED] drm_test_cmdline_res_bpp_refresh_force_off
[20:56:26] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on
[20:56:26] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_analog
[20:56:26] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_digital
[20:56:26] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced_margins_force_on
[20:56:26] [PASSED] drm_test_cmdline_res_margins_force_on
[20:56:26] [PASSED] drm_test_cmdline_res_vesa_margins
[20:56:26] [PASSED] drm_test_cmdline_name
[20:56:26] [PASSED] drm_test_cmdline_name_bpp
[20:56:26] [PASSED] drm_test_cmdline_name_option
[20:56:26] [PASSED] drm_test_cmdline_name_bpp_option
[20:56:26] [PASSED] drm_test_cmdline_rotate_0
[20:56:26] [PASSED] drm_test_cmdline_rotate_90
[20:56:26] [PASSED] drm_test_cmdline_rotate_180
[20:56:26] [PASSED] drm_test_cmdline_rotate_270
[20:56:26] [PASSED] drm_test_cmdline_hmirror
[20:56:26] [PASSED] drm_test_cmdline_vmirror
[20:56:26] [PASSED] drm_test_cmdline_margin_options
[20:56:26] [PASSED] drm_test_cmdline_multiple_options
[20:56:26] [PASSED] drm_test_cmdline_bpp_extra_and_option
[20:56:26] [PASSED] drm_test_cmdline_extra_and_option
[20:56:26] [PASSED] drm_test_cmdline_freestanding_options
[20:56:26] [PASSED] drm_test_cmdline_freestanding_force_e_and_options
[20:56:26] [PASSED] drm_test_cmdline_panel_orientation
[20:56:26] ================ drm_test_cmdline_invalid =================
[20:56:26] [PASSED] margin_only
[20:56:26] [PASSED] interlace_only
[20:56:26] [PASSED] res_missing_x
[20:56:26] [PASSED] res_missing_y
[20:56:26] [PASSED] res_bad_y
[20:56:26] [PASSED] res_missing_y_bpp
[20:56:26] [PASSED] res_bad_bpp
[20:56:26] [PASSED] res_bad_refresh
[20:56:26] [PASSED] res_bpp_refresh_force_on_off
[20:56:26] [PASSED] res_invalid_mode
[20:56:26] [PASSED] res_bpp_wrong_place_mode
[20:56:26] [PASSED] name_bpp_refresh
[20:56:26] [PASSED] name_refresh
[20:56:26] [PASSED] name_refresh_wrong_mode
[20:56:26] [PASSED] name_refresh_invalid_mode
[20:56:26] [PASSED] rotate_multiple
[20:56:26] [PASSED] rotate_invalid_val
[20:56:26] [PASSED] rotate_truncated
[20:56:26] [PASSED] invalid_option
[20:56:26] [PASSED] invalid_tv_option
[20:56:26] [PASSED] truncated_tv_option
[20:56:26] ============ [PASSED] drm_test_cmdline_invalid =============
[20:56:26] =============== drm_test_cmdline_tv_options ===============
[20:56:26] [PASSED] NTSC
[20:56:26] [PASSED] NTSC_443
[20:56:26] [PASSED] NTSC_J
[20:56:26] [PASSED] PAL
[20:56:26] [PASSED] PAL_M
[20:56:26] [PASSED] PAL_N
[20:56:26] [PASSED] SECAM
[20:56:26] [PASSED] MONO_525
[20:56:26] [PASSED] MONO_625
[20:56:26] =========== [PASSED] drm_test_cmdline_tv_options ===========
[20:56:26] =============== [PASSED] drm_cmdline_parser ================
[20:56:26] ========== drmm_connector_hdmi_init (20 subtests) ==========
[20:56:26] [PASSED] drm_test_connector_hdmi_init_valid
[20:56:26] [PASSED] drm_test_connector_hdmi_init_bpc_8
[20:56:26] [PASSED] drm_test_connector_hdmi_init_bpc_10
[20:56:26] [PASSED] drm_test_connector_hdmi_init_bpc_12
[20:56:26] [PASSED] drm_test_connector_hdmi_init_bpc_invalid
[20:56:26] [PASSED] drm_test_connector_hdmi_init_bpc_null
[20:56:26] [PASSED] drm_test_connector_hdmi_init_formats_empty
[20:56:26] [PASSED] drm_test_connector_hdmi_init_formats_no_rgb
[20:56:26] === drm_test_connector_hdmi_init_formats_yuv420_allowed ===
[20:56:26] [PASSED] supported_formats=0x9 yuv420_allowed=1
[20:56:26] [PASSED] supported_formats=0x9 yuv420_allowed=0
[20:56:26] [PASSED] supported_formats=0x3 yuv420_allowed=1
[20:56:26] [PASSED] supported_formats=0x3 yuv420_allowed=0
[20:56:26] === [PASSED] drm_test_connector_hdmi_init_formats_yuv420_allowed ===
[20:56:26] [PASSED] drm_test_connector_hdmi_init_null_ddc
[20:56:26] [PASSED] drm_test_connector_hdmi_init_null_product
[20:56:26] [PASSED] drm_test_connector_hdmi_init_null_vendor
[20:56:26] [PASSED] drm_test_connector_hdmi_init_product_length_exact
[20:56:26] [PASSED] drm_test_connector_hdmi_init_product_length_too_long
[20:56:26] [PASSED] drm_test_connector_hdmi_init_product_valid
[20:56:26] [PASSED] drm_test_connector_hdmi_init_vendor_length_exact
[20:56:26] [PASSED] drm_test_connector_hdmi_init_vendor_length_too_long
[20:56:26] [PASSED] drm_test_connector_hdmi_init_vendor_valid
[20:56:26] ========= drm_test_connector_hdmi_init_type_valid =========
[20:56:26] [PASSED] HDMI-A
[20:56:26] [PASSED] HDMI-B
[20:56:26] ===== [PASSED] drm_test_connector_hdmi_init_type_valid =====
[20:56:26] ======== drm_test_connector_hdmi_init_type_invalid ========
[20:56:26] [PASSED] Unknown
[20:56:26] [PASSED] VGA
[20:56:26] [PASSED] DVI-I
[20:56:26] [PASSED] DVI-D
[20:56:26] [PASSED] DVI-A
[20:56:26] [PASSED] Composite
[20:56:26] [PASSED] SVIDEO
[20:56:26] [PASSED] LVDS
[20:56:26] [PASSED] Component
[20:56:26] [PASSED] DIN
[20:56:26] [PASSED] DP
[20:56:26] [PASSED] TV
[20:56:26] [PASSED] eDP
[20:56:26] [PASSED] Virtual
[20:56:26] [PASSED] DSI
[20:56:26] [PASSED] DPI
[20:56:26] [PASSED] Writeback
[20:56:26] [PASSED] SPI
[20:56:26] [PASSED] USB
[20:56:26] ==== [PASSED] drm_test_connector_hdmi_init_type_invalid ====
[20:56:26] ============ [PASSED] drmm_connector_hdmi_init =============
[20:56:26] ============= drmm_connector_init (3 subtests) =============
[20:56:26] [PASSED] drm_test_drmm_connector_init
[20:56:26] [PASSED] drm_test_drmm_connector_init_null_ddc
[20:56:26] ========= drm_test_drmm_connector_init_type_valid =========
[20:56:26] [PASSED] Unknown
[20:56:26] [PASSED] VGA
[20:56:26] [PASSED] DVI-I
[20:56:26] [PASSED] DVI-D
[20:56:26] [PASSED] DVI-A
[20:56:26] [PASSED] Composite
[20:56:26] [PASSED] SVIDEO
[20:56:26] [PASSED] LVDS
[20:56:26] [PASSED] Component
[20:56:26] [PASSED] DIN
[20:56:26] [PASSED] DP
[20:56:26] [PASSED] HDMI-A
[20:56:26] [PASSED] HDMI-B
[20:56:26] [PASSED] TV
[20:56:26] [PASSED] eDP
[20:56:26] [PASSED] Virtual
[20:56:26] [PASSED] DSI
[20:56:26] [PASSED] DPI
[20:56:26] [PASSED] Writeback
[20:56:26] [PASSED] SPI
[20:56:26] [PASSED] USB
[20:56:26] ===== [PASSED] drm_test_drmm_connector_init_type_valid =====
[20:56:26] =============== [PASSED] drmm_connector_init ===============
[20:56:26] ========= drm_connector_dynamic_init (6 subtests) ==========
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_init
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_init_null_ddc
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_init_not_added
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_init_properties
[20:56:26] ===== drm_test_drm_connector_dynamic_init_type_valid ======
[20:56:26] [PASSED] Unknown
[20:56:26] [PASSED] VGA
[20:56:26] [PASSED] DVI-I
[20:56:26] [PASSED] DVI-D
[20:56:26] [PASSED] DVI-A
[20:56:26] [PASSED] Composite
[20:56:26] [PASSED] SVIDEO
[20:56:26] [PASSED] LVDS
[20:56:26] [PASSED] Component
[20:56:26] [PASSED] DIN
[20:56:26] [PASSED] DP
[20:56:26] [PASSED] HDMI-A
[20:56:26] [PASSED] HDMI-B
[20:56:26] [PASSED] TV
[20:56:26] [PASSED] eDP
[20:56:26] [PASSED] Virtual
[20:56:26] [PASSED] DSI
[20:56:26] [PASSED] DPI
[20:56:26] [PASSED] Writeback
[20:56:26] [PASSED] SPI
[20:56:26] [PASSED] USB
[20:56:26] = [PASSED] drm_test_drm_connector_dynamic_init_type_valid ==
[20:56:26] ======== drm_test_drm_connector_dynamic_init_name =========
[20:56:26] [PASSED] Unknown
[20:56:26] [PASSED] VGA
[20:56:26] [PASSED] DVI-I
[20:56:26] [PASSED] DVI-D
[20:56:26] [PASSED] DVI-A
[20:56:26] [PASSED] Composite
[20:56:26] [PASSED] SVIDEO
[20:56:26] [PASSED] LVDS
[20:56:26] [PASSED] Component
[20:56:26] [PASSED] DIN
[20:56:26] [PASSED] DP
[20:56:26] [PASSED] HDMI-A
[20:56:26] [PASSED] HDMI-B
[20:56:26] [PASSED] TV
[20:56:26] [PASSED] eDP
[20:56:26] [PASSED] Virtual
[20:56:26] [PASSED] DSI
[20:56:26] [PASSED] DPI
[20:56:26] [PASSED] Writeback
[20:56:26] [PASSED] SPI
[20:56:26] [PASSED] USB
[20:56:26] ==== [PASSED] drm_test_drm_connector_dynamic_init_name =====
[20:56:26] =========== [PASSED] drm_connector_dynamic_init ============
[20:56:26] ==== drm_connector_dynamic_register_early (4 subtests) =====
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_register_early_on_list
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_register_early_defer
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_register_early_no_init
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_register_early_no_mode_object
[20:56:26] ====== [PASSED] drm_connector_dynamic_register_early =======
[20:56:26] ======= drm_connector_dynamic_register (7 subtests) ========
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_register_on_list
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_register_no_defer
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_register_no_init
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_register_mode_object
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_register_sysfs
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_register_sysfs_name
[20:56:26] [PASSED] drm_test_drm_connector_dynamic_register_debugfs
[20:56:26] ========= [PASSED] drm_connector_dynamic_register ==========
[20:56:26] = drm_connector_attach_broadcast_rgb_property (2 subtests) =
[20:56:26] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property
[20:56:26] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property_hdmi_connector
[20:56:26] === [PASSED] drm_connector_attach_broadcast_rgb_property ===
[20:56:26] ========== drm_get_tv_mode_from_name (2 subtests) ==========
[20:56:26] ========== drm_test_get_tv_mode_from_name_valid ===========
[20:56:26] [PASSED] NTSC
[20:56:26] [PASSED] NTSC-443
[20:56:26] [PASSED] NTSC-J
[20:56:26] [PASSED] PAL
[20:56:26] [PASSED] PAL-M
[20:56:26] [PASSED] PAL-N
[20:56:26] [PASSED] SECAM
[20:56:26] [PASSED] Mono
[20:56:26] ====== [PASSED] drm_test_get_tv_mode_from_name_valid =======
[20:56:26] [PASSED] drm_test_get_tv_mode_from_name_truncated
[20:56:26] ============ [PASSED] drm_get_tv_mode_from_name ============
[20:56:26] = drm_test_connector_hdmi_compute_mode_clock (12 subtests) =
[20:56:26] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb
[20:56:26] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc
[20:56:26] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc_vic_1
[20:56:26] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc
[20:56:26] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc_vic_1
[20:56:26] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_double
[20:56:26] = drm_test_connector_hdmi_compute_mode_clock_yuv420_valid =
[20:56:26] [PASSED] VIC 96
[20:56:26] [PASSED] VIC 97
[20:56:26] [PASSED] VIC 101
[20:56:26] [PASSED] VIC 102
[20:56:26] [PASSED] VIC 106
[20:56:26] [PASSED] VIC 107
[20:56:26] === [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_valid ===
[20:56:26] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_10_bpc
[20:56:26] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_12_bpc
[20:56:26] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_8_bpc
[20:56:26] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_10_bpc
[20:56:26] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_12_bpc
[20:56:26] === [PASSED] drm_test_connector_hdmi_compute_mode_clock ====
[20:56:26] == drm_hdmi_connector_get_broadcast_rgb_name (2 subtests) ==
[20:56:26] === drm_test_drm_hdmi_connector_get_broadcast_rgb_name ====
[20:56:26] [PASSED] Automatic
[20:56:26] [PASSED] Full
[20:56:26] [PASSED] Limited 16:235
[20:56:26] === [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name ===
[20:56:26] [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name_invalid
[20:56:26] ==== [PASSED] drm_hdmi_connector_get_broadcast_rgb_name ====
[20:56:26] == drm_hdmi_connector_get_output_format_name (2 subtests) ==
[20:56:26] === drm_test_drm_hdmi_connector_get_output_format_name ====
[20:56:26] [PASSED] RGB
[20:56:26] [PASSED] YUV 4:2:0
[20:56:26] [PASSED] YUV 4:2:2
[20:56:26] [PASSED] YUV 4:4:4
[20:56:26] === [PASSED] drm_test_drm_hdmi_connector_get_output_format_name ===
[20:56:26] [PASSED] drm_test_drm_hdmi_connector_get_output_format_name_invalid
[20:56:26] ==== [PASSED] drm_hdmi_connector_get_output_format_name ====
[20:56:26] ============= drm_damage_helper (21 subtests) ==============
[20:56:26] [PASSED] drm_test_damage_iter_no_damage
[20:56:26] [PASSED] drm_test_damage_iter_no_damage_fractional_src
[20:56:26] [PASSED] drm_test_damage_iter_no_damage_src_moved
[20:56:26] [PASSED] drm_test_damage_iter_no_damage_fractional_src_moved
[20:56:26] [PASSED] drm_test_damage_iter_no_damage_not_visible
[20:56:26] [PASSED] drm_test_damage_iter_no_damage_no_crtc
[20:56:26] [PASSED] drm_test_damage_iter_no_damage_no_fb
[20:56:26] [PASSED] drm_test_damage_iter_simple_damage
[20:56:26] [PASSED] drm_test_damage_iter_single_damage
[20:56:26] [PASSED] drm_test_damage_iter_single_damage_intersect_src
[20:56:26] [PASSED] drm_test_damage_iter_single_damage_outside_src
[20:56:26] [PASSED] drm_test_damage_iter_single_damage_fractional_src
[20:56:26] [PASSED] drm_test_damage_iter_single_damage_intersect_fractional_src
[20:56:26] [PASSED] drm_test_damage_iter_single_damage_outside_fractional_src
[20:56:26] [PASSED] drm_test_damage_iter_single_damage_src_moved
[20:56:26] [PASSED] drm_test_damage_iter_single_damage_fractional_src_moved
[20:56:26] [PASSED] drm_test_damage_iter_damage
[20:56:26] [PASSED] drm_test_damage_iter_damage_one_intersect
[20:56:26] [PASSED] drm_test_damage_iter_damage_one_outside
[20:56:26] [PASSED] drm_test_damage_iter_damage_src_moved
[20:56:26] [PASSED] drm_test_damage_iter_damage_not_visible
[20:56:26] ================ [PASSED] drm_damage_helper ================
[20:56:26] ============== drm_dp_mst_helper (3 subtests) ==============
[20:56:26] ============== drm_test_dp_mst_calc_pbn_mode ==============
[20:56:26] [PASSED] Clock 154000 BPP 30 DSC disabled
[20:56:26] [PASSED] Clock 234000 BPP 30 DSC disabled
[20:56:26] [PASSED] Clock 297000 BPP 24 DSC disabled
[20:56:26] [PASSED] Clock 332880 BPP 24 DSC enabled
[20:56:26] [PASSED] Clock 324540 BPP 24 DSC enabled
[20:56:26] ========== [PASSED] drm_test_dp_mst_calc_pbn_mode ==========
[20:56:26] ============== drm_test_dp_mst_calc_pbn_div ===============
[20:56:26] [PASSED] Link rate 2000000 lane count 4
[20:56:26] [PASSED] Link rate 2000000 lane count 2
[20:56:26] [PASSED] Link rate 2000000 lane count 1
[20:56:26] [PASSED] Link rate 1350000 lane count 4
[20:56:26] [PASSED] Link rate 1350000 lane count 2
[20:56:26] [PASSED] Link rate 1350000 lane count 1
[20:56:26] [PASSED] Link rate 1000000 lane count 4
[20:56:26] [PASSED] Link rate 1000000 lane count 2
[20:56:26] [PASSED] Link rate 1000000 lane count 1
[20:56:26] [PASSED] Link rate 810000 lane count 4
[20:56:26] [PASSED] Link rate 810000 lane count 2
[20:56:26] [PASSED] Link rate 810000 lane count 1
[20:56:26] [PASSED] Link rate 540000 lane count 4
[20:56:26] [PASSED] Link rate 540000 lane count 2
[20:56:26] [PASSED] Link rate 540000 lane count 1
[20:56:26] [PASSED] Link rate 270000 lane count 4
[20:56:26] [PASSED] Link rate 270000 lane count 2
[20:56:26] [PASSED] Link rate 270000 lane count 1
[20:56:26] [PASSED] Link rate 162000 lane count 4
[20:56:26] [PASSED] Link rate 162000 lane count 2
[20:56:26] [PASSED] Link rate 162000 lane count 1
[20:56:26] ========== [PASSED] drm_test_dp_mst_calc_pbn_div ===========
[20:56:26] ========= drm_test_dp_mst_sideband_msg_req_decode =========
[20:56:26] [PASSED] DP_ENUM_PATH_RESOURCES with port number
[20:56:26] [PASSED] DP_POWER_UP_PHY with port number
[20:56:26] [PASSED] DP_POWER_DOWN_PHY with port number
[20:56:26] [PASSED] DP_ALLOCATE_PAYLOAD with SDP stream sinks
[20:56:26] [PASSED] DP_ALLOCATE_PAYLOAD with port number
[20:56:26] [PASSED] DP_ALLOCATE_PAYLOAD with VCPI
[20:56:26] [PASSED] DP_ALLOCATE_PAYLOAD with PBN
[20:56:26] [PASSED] DP_QUERY_PAYLOAD with port number
[20:56:26] [PASSED] DP_QUERY_PAYLOAD with VCPI
[20:56:26] [PASSED] DP_REMOTE_DPCD_READ with port number
[20:56:26] [PASSED] DP_REMOTE_DPCD_READ with DPCD address
[20:56:26] [PASSED] DP_REMOTE_DPCD_READ with max number of bytes
[20:56:26] [PASSED] DP_REMOTE_DPCD_WRITE with port number
[20:56:26] [PASSED] DP_REMOTE_DPCD_WRITE with DPCD address
[20:56:26] [PASSED] DP_REMOTE_DPCD_WRITE with data array
[20:56:26] [PASSED] DP_REMOTE_I2C_READ with port number
[20:56:26] [PASSED] DP_REMOTE_I2C_READ with I2C device ID
[20:56:26] [PASSED] DP_REMOTE_I2C_READ with transactions array
[20:56:26] [PASSED] DP_REMOTE_I2C_WRITE with port number
[20:56:26] [PASSED] DP_REMOTE_I2C_WRITE with I2C device ID
[20:56:26] [PASSED] DP_REMOTE_I2C_WRITE with data array
[20:56:26] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream ID
[20:56:26] [PASSED] DP_QUERY_STREAM_ENC_STATUS with client ID
[20:56:26] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream event
[20:56:26] [PASSED] DP_QUERY_STREAM_ENC_STATUS with valid stream event
[20:56:26] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream behavior
[20:56:26] [PASSED] DP_QUERY_STREAM_ENC_STATUS with a valid stream behavior
[20:56:26] ===== [PASSED] drm_test_dp_mst_sideband_msg_req_decode =====
[20:56:26] ================ [PASSED] drm_dp_mst_helper ================
[20:56:26] ================== drm_exec (7 subtests) ===================
[20:56:26] [PASSED] sanitycheck
[20:56:26] [PASSED] test_lock
[20:56:26] [PASSED] test_lock_unlock
[20:56:26] [PASSED] test_duplicates
[20:56:26] [PASSED] test_prepare
[20:56:26] [PASSED] test_prepare_array
[20:56:26] [PASSED] test_multiple_loops
[20:56:26] ==================== [PASSED] drm_exec =====================
[20:56:26] =========== drm_format_helper_test (17 subtests) ===========
[20:56:26] ============== drm_test_fb_xrgb8888_to_gray8 ==============
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ========== [PASSED] drm_test_fb_xrgb8888_to_gray8 ==========
[20:56:26] ============= drm_test_fb_xrgb8888_to_rgb332 ==============
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb332 ==========
[20:56:26] ============= drm_test_fb_xrgb8888_to_rgb565 ==============
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb565 ==========
[20:56:26] ============ drm_test_fb_xrgb8888_to_xrgb1555 =============
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ======== [PASSED] drm_test_fb_xrgb8888_to_xrgb1555 =========
[20:56:26] ============ drm_test_fb_xrgb8888_to_argb1555 =============
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ======== [PASSED] drm_test_fb_xrgb8888_to_argb1555 =========
[20:56:26] ============ drm_test_fb_xrgb8888_to_rgba5551 =============
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ======== [PASSED] drm_test_fb_xrgb8888_to_rgba5551 =========
[20:56:26] ============= drm_test_fb_xrgb8888_to_rgb888 ==============
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb888 ==========
[20:56:26] ============= drm_test_fb_xrgb8888_to_bgr888 ==============
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ========= [PASSED] drm_test_fb_xrgb8888_to_bgr888 ==========
[20:56:26] ============ drm_test_fb_xrgb8888_to_argb8888 =============
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ======== [PASSED] drm_test_fb_xrgb8888_to_argb8888 =========
[20:56:26] =========== drm_test_fb_xrgb8888_to_xrgb2101010 ===========
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ======= [PASSED] drm_test_fb_xrgb8888_to_xrgb2101010 =======
[20:56:26] =========== drm_test_fb_xrgb8888_to_argb2101010 ===========
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ======= [PASSED] drm_test_fb_xrgb8888_to_argb2101010 =======
[20:56:26] ============== drm_test_fb_xrgb8888_to_mono ===============
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ========== [PASSED] drm_test_fb_xrgb8888_to_mono ===========
[20:56:26] ==================== drm_test_fb_swab =====================
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ================ [PASSED] drm_test_fb_swab =================
[20:56:26] ============ drm_test_fb_xrgb8888_to_xbgr8888 =============
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ======== [PASSED] drm_test_fb_xrgb8888_to_xbgr8888 =========
[20:56:26] ============ drm_test_fb_xrgb8888_to_abgr8888 =============
[20:56:26] [PASSED] single_pixel_source_buffer
[20:56:26] [PASSED] single_pixel_clip_rectangle
[20:56:26] [PASSED] well_known_colors
[20:56:26] [PASSED] destination_pitch
[20:56:26] ======== [PASSED] drm_test_fb_xrgb8888_to_abgr8888 =========
[20:56:26] ================= drm_test_fb_clip_offset =================
[20:56:26] [PASSED] pass through
[20:56:26] [PASSED] horizontal offset
[20:56:26] [PASSED] vertical offset
[20:56:26] [PASSED] horizontal and vertical offset
[20:56:26] [PASSED] horizontal offset (custom pitch)
[20:56:26] [PASSED] vertical offset (custom pitch)
[20:56:26] [PASSED] horizontal and vertical offset (custom pitch)
[20:56:26] ============= [PASSED] drm_test_fb_clip_offset =============
[20:56:26] =================== drm_test_fb_memcpy ====================
[20:56:26] [PASSED] single_pixel_source_buffer: XR24 little-endian (0x34325258)
[20:56:26] [PASSED] single_pixel_source_buffer: XRA8 little-endian (0x38415258)
[20:56:26] [PASSED] single_pixel_source_buffer: YU24 little-endian (0x34325559)
[20:56:26] [PASSED] single_pixel_clip_rectangle: XB24 little-endian (0x34324258)
[20:56:26] [PASSED] single_pixel_clip_rectangle: XRA8 little-endian (0x38415258)
[20:56:26] [PASSED] single_pixel_clip_rectangle: YU24 little-endian (0x34325559)
[20:56:26] [PASSED] well_known_colors: XB24 little-endian (0x34324258)
[20:56:26] [PASSED] well_known_colors: XRA8 little-endian (0x38415258)
[20:56:26] [PASSED] well_known_colors: YU24 little-endian (0x34325559)
[20:56:26] [PASSED] destination_pitch: XB24 little-endian (0x34324258)
[20:56:26] [PASSED] destination_pitch: XRA8 little-endian (0x38415258)
[20:56:26] [PASSED] destination_pitch: YU24 little-endian (0x34325559)
[20:56:26] =============== [PASSED] drm_test_fb_memcpy ================
[20:56:26] ============= [PASSED] drm_format_helper_test ==============
[20:56:26] ================= drm_format (18 subtests) =================
[20:56:26] [PASSED] drm_test_format_block_width_invalid
[20:56:26] [PASSED] drm_test_format_block_width_one_plane
[20:56:26] [PASSED] drm_test_format_block_width_two_plane
[20:56:26] [PASSED] drm_test_format_block_width_three_plane
[20:56:26] [PASSED] drm_test_format_block_width_tiled
[20:56:26] [PASSED] drm_test_format_block_height_invalid
[20:56:26] [PASSED] drm_test_format_block_height_one_plane
[20:56:26] [PASSED] drm_test_format_block_height_two_plane
[20:56:26] [PASSED] drm_test_format_block_height_three_plane
[20:56:26] [PASSED] drm_test_format_block_height_tiled
[20:56:26] [PASSED] drm_test_format_min_pitch_invalid
[20:56:26] [PASSED] drm_test_format_min_pitch_one_plane_8bpp
[20:56:26] [PASSED] drm_test_format_min_pitch_one_plane_16bpp
[20:56:26] [PASSED] drm_test_format_min_pitch_one_plane_24bpp
[20:56:26] [PASSED] drm_test_format_min_pitch_one_plane_32bpp
[20:56:26] [PASSED] drm_test_format_min_pitch_two_plane
[20:56:26] [PASSED] drm_test_format_min_pitch_three_plane_8bpp
[20:56:26] [PASSED] drm_test_format_min_pitch_tiled
[20:56:26] =================== [PASSED] drm_format ====================
[20:56:26] ============== drm_framebuffer (10 subtests) ===============
[20:56:26] ========== drm_test_framebuffer_check_src_coords ==========
[20:56:26] [PASSED] Success: source fits into fb
[20:56:26] [PASSED] Fail: overflowing fb with x-axis coordinate
[20:56:26] [PASSED] Fail: overflowing fb with y-axis coordinate
[20:56:26] [PASSED] Fail: overflowing fb with source width
[20:56:26] [PASSED] Fail: overflowing fb with source height
[20:56:26] ====== [PASSED] drm_test_framebuffer_check_src_coords ======
[20:56:26] [PASSED] drm_test_framebuffer_cleanup
[20:56:26] =============== drm_test_framebuffer_create ===============
[20:56:26] [PASSED] ABGR8888 normal sizes
[20:56:26] [PASSED] ABGR8888 max sizes
[20:56:26] [PASSED] ABGR8888 pitch greater than min required
[20:56:26] [PASSED] ABGR8888 pitch less than min required
[20:56:26] [PASSED] ABGR8888 Invalid width
[20:56:26] [PASSED] ABGR8888 Invalid buffer handle
[20:56:26] [PASSED] No pixel format
[20:56:26] [PASSED] ABGR8888 Width 0
[20:56:26] [PASSED] ABGR8888 Height 0
[20:56:26] [PASSED] ABGR8888 Out of bound height * pitch combination
[20:56:26] [PASSED] ABGR8888 Large buffer offset
[20:56:26] [PASSED] ABGR8888 Buffer offset for inexistent plane
[20:56:26] [PASSED] ABGR8888 Invalid flag
[20:56:26] [PASSED] ABGR8888 Set DRM_MODE_FB_MODIFIERS without modifiers
[20:56:26] [PASSED] ABGR8888 Valid buffer modifier
[20:56:26] [PASSED] ABGR8888 Invalid buffer modifier(DRM_FORMAT_MOD_SAMSUNG_64_32_TILE)
[20:56:26] [PASSED] ABGR8888 Extra pitches without DRM_MODE_FB_MODIFIERS
[20:56:26] [PASSED] ABGR8888 Extra pitches with DRM_MODE_FB_MODIFIERS
[20:56:26] [PASSED] NV12 Normal sizes
[20:56:26] [PASSED] NV12 Max sizes
[20:56:26] [PASSED] NV12 Invalid pitch
[20:56:26] [PASSED] NV12 Invalid modifier/missing DRM_MODE_FB_MODIFIERS flag
[20:56:26] [PASSED] NV12 different modifier per-plane
[20:56:26] [PASSED] NV12 with DRM_FORMAT_MOD_SAMSUNG_64_32_TILE
[20:56:26] [PASSED] NV12 Valid modifiers without DRM_MODE_FB_MODIFIERS
[20:56:26] [PASSED] NV12 Modifier for inexistent plane
[20:56:26] [PASSED] NV12 Handle for inexistent plane
[20:56:26] [PASSED] NV12 Handle for inexistent plane without DRM_MODE_FB_MODIFIERS
[20:56:26] [PASSED] YVU420 DRM_MODE_FB_MODIFIERS set without modifier
[20:56:26] [PASSED] YVU420 Normal sizes
[20:56:26] [PASSED] YVU420 Max sizes
[20:56:26] [PASSED] YVU420 Invalid pitch
[20:56:26] [PASSED] YVU420 Different pitches
[20:56:26] [PASSED] YVU420 Different buffer offsets/pitches
[20:56:26] [PASSED] YVU420 Modifier set just for plane 0, without DRM_MODE_FB_MODIFIERS
[20:56:26] [PASSED] YVU420 Modifier set just for planes 0, 1, without DRM_MODE_FB_MODIFIERS
[20:56:26] [PASSED] YVU420 Modifier set just for plane 0, 1, with DRM_MODE_FB_MODIFIERS
[20:56:26] [PASSED] YVU420 Valid modifier
[20:56:26] [PASSED] YVU420 Different modifiers per plane
[20:56:26] [PASSED] YVU420 Modifier for inexistent plane
[20:56:26] [PASSED] YUV420_10BIT Invalid modifier(DRM_FORMAT_MOD_LINEAR)
[20:56:26] [PASSED] X0L2 Normal sizes
[20:56:26] [PASSED] X0L2 Max sizes
[20:56:26] [PASSED] X0L2 Invalid pitch
[20:56:26] [PASSED] X0L2 Pitch greater than minimum required
[20:56:26] [PASSED] X0L2 Handle for inexistent plane
[20:56:26] [PASSED] X0L2 Offset for inexistent plane, without DRM_MODE_FB_MODIFIERS set
[20:56:26] [PASSED] X0L2 Modifier without DRM_MODE_FB_MODIFIERS set
[20:56:26] [PASSED] X0L2 Valid modifier
[20:56:26] [PASSED] X0L2 Modifier for inexistent plane
[20:56:26] =========== [PASSED] drm_test_framebuffer_create ===========
[20:56:26] [PASSED] drm_test_framebuffer_free
[20:56:26] [PASSED] drm_test_framebuffer_init
[20:56:26] [PASSED] drm_test_framebuffer_init_bad_format
[20:56:26] [PASSED] drm_test_framebuffer_init_dev_mismatch
[20:56:26] [PASSED] drm_test_framebuffer_lookup
[20:56:26] [PASSED] drm_test_framebuffer_lookup_inexistent
[20:56:26] [PASSED] drm_test_framebuffer_modifiers_not_supported
[20:56:26] ================= [PASSED] drm_framebuffer =================
[20:56:26] ================ drm_gem_shmem (8 subtests) ================
[20:56:26] [PASSED] drm_gem_shmem_test_obj_create
[20:56:26] [PASSED] drm_gem_shmem_test_obj_create_private
[20:56:26] [PASSED] drm_gem_shmem_test_pin_pages
[20:56:26] [PASSED] drm_gem_shmem_test_vmap
[20:56:26] [PASSED] drm_gem_shmem_test_get_pages_sgt
[20:56:26] [PASSED] drm_gem_shmem_test_get_sg_table
[20:56:26] [PASSED] drm_gem_shmem_test_madvise
[20:56:26] [PASSED] drm_gem_shmem_test_purge
[20:56:26] ================== [PASSED] drm_gem_shmem ==================
[20:56:26] === drm_atomic_helper_connector_hdmi_check (27 subtests) ===
[20:56:26] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode
[20:56:26] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode_vic_1
[20:56:26] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode
[20:56:26] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode_vic_1
[20:56:26] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode
[20:56:26] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode_vic_1
[20:56:26] ====== drm_test_check_broadcast_rgb_cea_mode_yuv420 =======
[20:56:26] [PASSED] Automatic
[20:56:26] [PASSED] Full
[20:56:26] [PASSED] Limited 16:235
[20:56:26] == [PASSED] drm_test_check_broadcast_rgb_cea_mode_yuv420 ===
[20:56:26] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_changed
[20:56:26] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_not_changed
[20:56:26] [PASSED] drm_test_check_disable_connector
[20:56:26] [PASSED] drm_test_check_hdmi_funcs_reject_rate
[20:56:26] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_rgb
[20:56:26] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_yuv420
[20:56:26] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv422
[20:56:26] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv420
[20:56:26] [PASSED] drm_test_check_driver_unsupported_fallback_yuv420
[20:56:26] [PASSED] drm_test_check_output_bpc_crtc_mode_changed
[20:56:26] [PASSED] drm_test_check_output_bpc_crtc_mode_not_changed
[20:56:26] [PASSED] drm_test_check_output_bpc_dvi
[20:56:26] [PASSED] drm_test_check_output_bpc_format_vic_1
[20:56:26] [PASSED] drm_test_check_output_bpc_format_display_8bpc_only
[20:56:26] [PASSED] drm_test_check_output_bpc_format_display_rgb_only
[20:56:26] [PASSED] drm_test_check_output_bpc_format_driver_8bpc_only
[20:56:26] [PASSED] drm_test_check_output_bpc_format_driver_rgb_only
[20:56:26] [PASSED] drm_test_check_tmds_char_rate_rgb_8bpc
[20:56:26] [PASSED] drm_test_check_tmds_char_rate_rgb_10bpc
[20:56:26] [PASSED] drm_test_check_tmds_char_rate_rgb_12bpc
[20:56:26] ===== [PASSED] drm_atomic_helper_connector_hdmi_check ======
[20:56:26] === drm_atomic_helper_connector_hdmi_reset (6 subtests) ====
[20:56:26] [PASSED] drm_test_check_broadcast_rgb_value
[20:56:26] [PASSED] drm_test_check_bpc_8_value
[20:56:26] [PASSED] drm_test_check_bpc_10_value
[20:56:26] [PASSED] drm_test_check_bpc_12_value
[20:56:26] [PASSED] drm_test_check_format_value
[20:56:26] [PASSED] drm_test_check_tmds_char_value
[20:56:26] ===== [PASSED] drm_atomic_helper_connector_hdmi_reset ======
[20:56:26] = drm_atomic_helper_connector_hdmi_mode_valid (4 subtests) =
[20:56:26] [PASSED] drm_test_check_mode_valid
[20:56:26] [PASSED] drm_test_check_mode_valid_reject
[20:56:26] [PASSED] drm_test_check_mode_valid_reject_rate
[20:56:26] [PASSED] drm_test_check_mode_valid_reject_max_clock
[20:56:26] === [PASSED] drm_atomic_helper_connector_hdmi_mode_valid ===
[20:56:26] ================= drm_managed (2 subtests) =================
[20:56:26] [PASSED] drm_test_managed_release_action
[20:56:26] [PASSED] drm_test_managed_run_action
[20:56:26] =================== [PASSED] drm_managed ===================
[20:56:26] =================== drm_mm (6 subtests) ====================
[20:56:26] [PASSED] drm_test_mm_init
[20:56:26] [PASSED] drm_test_mm_debug
[20:56:26] [PASSED] drm_test_mm_align32
[20:56:26] [PASSED] drm_test_mm_align64
[20:56:26] [PASSED] drm_test_mm_lowest
[20:56:26] [PASSED] drm_test_mm_highest
[20:56:26] ===================== [PASSED] drm_mm ======================
[20:56:26] ============= drm_modes_analog_tv (5 subtests) =============
[20:56:26] [PASSED] drm_test_modes_analog_tv_mono_576i
[20:56:26] [PASSED] drm_test_modes_analog_tv_ntsc_480i
[20:56:26] [PASSED] drm_test_modes_analog_tv_ntsc_480i_inlined
[20:56:26] [PASSED] drm_test_modes_analog_tv_pal_576i
[20:56:26] [PASSED] drm_test_modes_analog_tv_pal_576i_inlined
[20:56:26] =============== [PASSED] drm_modes_analog_tv ===============
[20:56:26] ============== drm_plane_helper (2 subtests) ===============
[20:56:26] =============== drm_test_check_plane_state ================
[20:56:26] [PASSED] clipping_simple
[20:56:26] [PASSED] clipping_rotate_reflect
[20:56:26] [PASSED] positioning_simple
[20:56:26] [PASSED] upscaling
[20:56:26] [PASSED] downscaling
[20:56:26] [PASSED] rounding1
[20:56:26] [PASSED] rounding2
[20:56:26] [PASSED] rounding3
[20:56:26] [PASSED] rounding4
[20:56:26] =========== [PASSED] drm_test_check_plane_state ============
[20:56:26] =========== drm_test_check_invalid_plane_state ============
[20:56:26] [PASSED] positioning_invalid
[20:56:26] [PASSED] upscaling_invalid
[20:56:26] [PASSED] downscaling_invalid
[20:56:26] ======= [PASSED] drm_test_check_invalid_plane_state ========
[20:56:26] ================ [PASSED] drm_plane_helper =================
[20:56:26] ====== drm_connector_helper_tv_get_modes (1 subtest) =======
[20:56:26] ====== drm_test_connector_helper_tv_get_modes_check =======
[20:56:26] [PASSED] None
[20:56:26] [PASSED] PAL
[20:56:26] [PASSED] NTSC
[20:56:26] [PASSED] Both, NTSC Default
[20:56:26] [PASSED] Both, PAL Default
[20:56:26] [PASSED] Both, NTSC Default, with PAL on command-line
[20:56:26] [PASSED] Both, PAL Default, with NTSC on command-line
[20:56:26] == [PASSED] drm_test_connector_helper_tv_get_modes_check ===
[20:56:26] ======== [PASSED] drm_connector_helper_tv_get_modes ========
[20:56:26] ================== drm_rect (9 subtests) ===================
[20:56:26] [PASSED] drm_test_rect_clip_scaled_div_by_zero
[20:56:26] [PASSED] drm_test_rect_clip_scaled_not_clipped
[20:56:26] [PASSED] drm_test_rect_clip_scaled_clipped
[20:56:26] [PASSED] drm_test_rect_clip_scaled_signed_vs_unsigned
[20:56:26] ================= drm_test_rect_intersect =================
[20:56:26] [PASSED] top-left x bottom-right: 2x2+1+1 x 2x2+0+0
[20:56:26] [PASSED] top-right x bottom-left: 2x2+0+0 x 2x2+1-1
[20:56:26] [PASSED] bottom-left x top-right: 2x2+1-1 x 2x2+0+0
[20:56:26] [PASSED] bottom-right x top-left: 2x2+0+0 x 2x2+1+1
[20:56:26] [PASSED] right x left: 2x1+0+0 x 3x1+1+0
[20:56:26] [PASSED] left x right: 3x1+1+0 x 2x1+0+0
[20:56:26] [PASSED] up x bottom: 1x2+0+0 x 1x3+0-1
[20:56:26] [PASSED] bottom x up: 1x3+0-1 x 1x2+0+0
[20:56:26] [PASSED] touching corner: 1x1+0+0 x 2x2+1+1
[20:56:26] [PASSED] touching side: 1x1+0+0 x 1x1+1+0
[20:56:26] [PASSED] equal rects: 2x2+0+0 x 2x2+0+0
[20:56:26] [PASSED] inside another: 2x2+0+0 x 1x1+1+1
[20:56:26] [PASSED] far away: 1x1+0+0 x 1x1+3+6
[20:56:26] [PASSED] points intersecting: 0x0+5+10 x 0x0+5+10
[20:56:26] [PASSED] points not intersecting: 0x0+0+0 x 0x0+5+10
[20:56:26] ============= [PASSED] drm_test_rect_intersect =============
[20:56:26] ================ drm_test_rect_calc_hscale ================
[20:56:26] [PASSED] normal use
[20:56:26] [PASSED] out of max range
[20:56:26] [PASSED] out of min range
[20:56:26] [PASSED] zero dst
[20:56:26] [PASSED] negative src
[20:56:26] [PASSED] negative dst
[20:56:26] ============ [PASSED] drm_test_rect_calc_hscale ============
[20:56:26] ================ drm_test_rect_calc_vscale ================
[20:56:26] [PASSED] normal use
[20:56:26] [PASSED] out of max range
[20:56:26] [PASSED] out of min range
[20:56:26] [PASSED] zero dst
[20:56:26] [PASSED] negative src
[20:56:26] [PASSED] negative dst
[20:56:26] ============ [PASSED] drm_test_rect_calc_vscale ============
[20:56:26] ================== drm_test_rect_rotate ===================
[20:56:26] [PASSED] reflect-x
[20:56:26] [PASSED] reflect-y
[20:56:26] [PASSED] rotate-0
[20:56:26] [PASSED] rotate-90
[20:56:26] [PASSED] rotate-180
[20:56:26] [PASSED] rotate-270
[20:56:26] ============== [PASSED] drm_test_rect_rotate ===============
[20:56:26] ================ drm_test_rect_rotate_inv =================
[20:56:26] [PASSED] reflect-x
[20:56:26] [PASSED] reflect-y
[20:56:26] [PASSED] rotate-0
[20:56:26] [PASSED] rotate-90
[20:56:26] [PASSED] rotate-180
[20:56:26] [PASSED] rotate-270
[20:56:26] ============ [PASSED] drm_test_rect_rotate_inv =============
[20:56:26] ==================== [PASSED] drm_rect =====================
[20:56:26] ============ drm_sysfb_modeset_test (1 subtest) ============
[20:56:26] ============ drm_test_sysfb_build_fourcc_list =============
[20:56:26] [PASSED] no native formats
[20:56:26] [PASSED] XRGB8888 as native format
[20:56:26] [PASSED] remove duplicates
[20:56:26] [PASSED] convert alpha formats
[20:56:26] [PASSED] random formats
[20:56:26] ======== [PASSED] drm_test_sysfb_build_fourcc_list =========
[20:56:26] ============= [PASSED] drm_sysfb_modeset_test ==============
[20:56:26] ============================================================
[20:56:26] Testing complete. Ran 622 tests: passed: 622
[20:56:26] Elapsed time: 27.011s total, 1.684s configuring, 24.906s building, 0.396s running
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/ttm/tests/.kunitconfig
[20:56:26] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[20:56:28] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[20:56:37] Starting KUnit Kernel (1/1)...
[20:56:37] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[20:56:37] ================= ttm_device (5 subtests) ==================
[20:56:37] [PASSED] ttm_device_init_basic
[20:56:37] [PASSED] ttm_device_init_multiple
[20:56:37] [PASSED] ttm_device_fini_basic
[20:56:37] [PASSED] ttm_device_init_no_vma_man
[20:56:37] ================== ttm_device_init_pools ==================
[20:56:37] [PASSED] No DMA allocations, no DMA32 required
[20:56:37] [PASSED] DMA allocations, DMA32 required
[20:56:37] [PASSED] No DMA allocations, DMA32 required
[20:56:37] [PASSED] DMA allocations, no DMA32 required
[20:56:37] ============== [PASSED] ttm_device_init_pools ==============
[20:56:37] =================== [PASSED] ttm_device ====================
[20:56:37] ================== ttm_pool (8 subtests) ===================
[20:56:37] ================== ttm_pool_alloc_basic ===================
[20:56:37] [PASSED] One page
[20:56:37] [PASSED] More than one page
[20:56:37] [PASSED] Above the allocation limit
[20:56:37] [PASSED] One page, with coherent DMA mappings enabled
[20:56:37] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[20:56:37] ============== [PASSED] ttm_pool_alloc_basic ===============
[20:56:37] ============== ttm_pool_alloc_basic_dma_addr ==============
[20:56:37] [PASSED] One page
[20:56:37] [PASSED] More than one page
[20:56:37] [PASSED] Above the allocation limit
[20:56:37] [PASSED] One page, with coherent DMA mappings enabled
[20:56:37] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[20:56:37] ========== [PASSED] ttm_pool_alloc_basic_dma_addr ==========
[20:56:37] [PASSED] ttm_pool_alloc_order_caching_match
[20:56:37] [PASSED] ttm_pool_alloc_caching_mismatch
[20:56:37] [PASSED] ttm_pool_alloc_order_mismatch
[20:56:37] [PASSED] ttm_pool_free_dma_alloc
[20:56:37] [PASSED] ttm_pool_free_no_dma_alloc
[20:56:37] [PASSED] ttm_pool_fini_basic
[20:56:37] ==================== [PASSED] ttm_pool =====================
[20:56:37] ================ ttm_resource (8 subtests) =================
[20:56:37] ================= ttm_resource_init_basic =================
[20:56:37] [PASSED] Init resource in TTM_PL_SYSTEM
[20:56:37] [PASSED] Init resource in TTM_PL_VRAM
[20:56:37] [PASSED] Init resource in a private placement
[20:56:37] [PASSED] Init resource in TTM_PL_SYSTEM, set placement flags
[20:56:37] ============= [PASSED] ttm_resource_init_basic =============
[20:56:37] [PASSED] ttm_resource_init_pinned
[20:56:37] [PASSED] ttm_resource_fini_basic
[20:56:37] [PASSED] ttm_resource_manager_init_basic
[20:56:37] [PASSED] ttm_resource_manager_usage_basic
[20:56:37] [PASSED] ttm_resource_manager_set_used_basic
[20:56:37] [PASSED] ttm_sys_man_alloc_basic
[20:56:37] [PASSED] ttm_sys_man_free_basic
[20:56:37] ================== [PASSED] ttm_resource ===================
[20:56:37] =================== ttm_tt (15 subtests) ===================
[20:56:37] ==================== ttm_tt_init_basic ====================
[20:56:37] [PASSED] Page-aligned size
[20:56:37] [PASSED] Extra pages requested
[20:56:37] ================ [PASSED] ttm_tt_init_basic ================
[20:56:37] [PASSED] ttm_tt_init_misaligned
[20:56:37] [PASSED] ttm_tt_fini_basic
[20:56:37] [PASSED] ttm_tt_fini_sg
[20:56:37] [PASSED] ttm_tt_fini_shmem
[20:56:37] [PASSED] ttm_tt_create_basic
[20:56:37] [PASSED] ttm_tt_create_invalid_bo_type
[20:56:37] [PASSED] ttm_tt_create_ttm_exists
[20:56:37] [PASSED] ttm_tt_create_failed
[20:56:37] [PASSED] ttm_tt_destroy_basic
[20:56:37] [PASSED] ttm_tt_populate_null_ttm
[20:56:37] [PASSED] ttm_tt_populate_populated_ttm
[20:56:37] [PASSED] ttm_tt_unpopulate_basic
[20:56:37] [PASSED] ttm_tt_unpopulate_empty_ttm
[20:56:37] [PASSED] ttm_tt_swapin_basic
[20:56:37] ===================== [PASSED] ttm_tt ======================
[20:56:37] =================== ttm_bo (14 subtests) ===================
[20:56:37] =========== ttm_bo_reserve_optimistic_no_ticket ===========
[20:56:37] [PASSED] Cannot be interrupted and sleeps
[20:56:37] [PASSED] Cannot be interrupted, locks straight away
[20:56:37] [PASSED] Can be interrupted, sleeps
[20:56:37] ======= [PASSED] ttm_bo_reserve_optimistic_no_ticket =======
[20:56:37] [PASSED] ttm_bo_reserve_locked_no_sleep
[20:56:37] [PASSED] ttm_bo_reserve_no_wait_ticket
[20:56:37] [PASSED] ttm_bo_reserve_double_resv
[20:56:37] [PASSED] ttm_bo_reserve_interrupted
[20:56:37] [PASSED] ttm_bo_reserve_deadlock
[20:56:37] [PASSED] ttm_bo_unreserve_basic
[20:56:37] [PASSED] ttm_bo_unreserve_pinned
[20:56:37] [PASSED] ttm_bo_unreserve_bulk
[20:56:37] [PASSED] ttm_bo_fini_basic
[20:56:37] [PASSED] ttm_bo_fini_shared_resv
[20:56:37] [PASSED] ttm_bo_pin_basic
[20:56:37] [PASSED] ttm_bo_pin_unpin_resource
[20:56:37] [PASSED] ttm_bo_multiple_pin_one_unpin
[20:56:37] ===================== [PASSED] ttm_bo ======================
[20:56:37] ============== ttm_bo_validate (21 subtests) ===============
[20:56:37] ============== ttm_bo_init_reserved_sys_man ===============
[20:56:37] [PASSED] Buffer object for userspace
[20:56:37] [PASSED] Kernel buffer object
[20:56:37] [PASSED] Shared buffer object
[20:56:37] ========== [PASSED] ttm_bo_init_reserved_sys_man ===========
[20:56:37] ============== ttm_bo_init_reserved_mock_man ==============
[20:56:37] [PASSED] Buffer object for userspace
[20:56:37] [PASSED] Kernel buffer object
[20:56:37] [PASSED] Shared buffer object
[20:56:37] ========== [PASSED] ttm_bo_init_reserved_mock_man ==========
[20:56:37] [PASSED] ttm_bo_init_reserved_resv
[20:56:37] ================== ttm_bo_validate_basic ==================
[20:56:37] [PASSED] Buffer object for userspace
[20:56:37] [PASSED] Kernel buffer object
[20:56:37] [PASSED] Shared buffer object
[20:56:37] ============== [PASSED] ttm_bo_validate_basic ==============
[20:56:37] [PASSED] ttm_bo_validate_invalid_placement
[20:56:37] ============= ttm_bo_validate_same_placement ==============
[20:56:37] [PASSED] System manager
[20:56:37] [PASSED] VRAM manager
[20:56:37] ========= [PASSED] ttm_bo_validate_same_placement ==========
[20:56:37] [PASSED] ttm_bo_validate_failed_alloc
[20:56:37] [PASSED] ttm_bo_validate_pinned
[20:56:37] [PASSED] ttm_bo_validate_busy_placement
[20:56:37] ================ ttm_bo_validate_multihop =================
[20:56:37] [PASSED] Buffer object for userspace
[20:56:37] [PASSED] Kernel buffer object
[20:56:37] [PASSED] Shared buffer object
[20:56:37] ============ [PASSED] ttm_bo_validate_multihop =============
[20:56:37] ========== ttm_bo_validate_no_placement_signaled ==========
[20:56:37] [PASSED] Buffer object in system domain, no page vector
[20:56:37] [PASSED] Buffer object in system domain with an existing page vector
[20:56:37] ====== [PASSED] ttm_bo_validate_no_placement_signaled ======
[20:56:37] ======== ttm_bo_validate_no_placement_not_signaled ========
[20:56:37] [PASSED] Buffer object for userspace
[20:56:37] [PASSED] Kernel buffer object
[20:56:37] [PASSED] Shared buffer object
[20:56:37] ==== [PASSED] ttm_bo_validate_no_placement_not_signaled ====
[20:56:37] [PASSED] ttm_bo_validate_move_fence_signaled
[20:56:37] ========= ttm_bo_validate_move_fence_not_signaled =========
[20:56:37] [PASSED] Waits for GPU
[20:56:37] [PASSED] Tries to lock straight away
[20:56:37] ===== [PASSED] ttm_bo_validate_move_fence_not_signaled =====
[20:56:37] [PASSED] ttm_bo_validate_happy_evict
[20:56:37] [PASSED] ttm_bo_validate_all_pinned_evict
[20:56:37] [PASSED] ttm_bo_validate_allowed_only_evict
[20:56:37] [PASSED] ttm_bo_validate_deleted_evict
[20:56:37] [PASSED] ttm_bo_validate_busy_domain_evict
[20:56:37] [PASSED] ttm_bo_validate_evict_gutting
[20:56:37] [PASSED] ttm_bo_validate_recrusive_evict
[20:56:37] ================= [PASSED] ttm_bo_validate =================
[20:56:37] ============================================================
[20:56:37] Testing complete. Ran 101 tests: passed: 101
[20:56:37] Elapsed time: 11.206s total, 1.702s configuring, 9.289s building, 0.183s running
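The two KUnit runs above print one `[PASSED]`/`[FAILED]` line per test case, plus banner lines that repeat the suite status between `=` padding. A quick way to cross-check the "Ran N tests: passed: N" totals is to tally the per-case markers from the captured console log. A minimal sketch, assuming the log format shown above (the sample lines are taken from this run):

```python
import re

# Matches per-case KUnit console lines such as:
#   [20:56:26] [PASSED] NV12 Normal sizes
RESULT_RE = re.compile(r"\[(PASSED|FAILED|SKIPPED)\]\s+(.+)$")

def tally_kunit_log(lines):
    """Count per-status results, ignoring suite banner lines like
    '=== [PASSED] drm_framebuffer ===' (which repeat the status)."""
    counts = {"PASSED": 0, "FAILED": 0, "SKIPPED": 0}
    for line in lines:
        if "=" in line:  # banner/summary lines are padded with '=' signs
            continue
        m = RESULT_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

log = [
    "[20:56:26] [PASSED] NV12 Normal sizes",
    "[20:56:26] =========== [PASSED] drm_test_framebuffer_create ===========",
    "[20:56:37] [PASSED] ttm_device_init_basic",
]
print(tally_kunit_log(log))  # the banner line is not counted
```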
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel
^ permalink raw reply [flat|nested] 31+ messages in thread
* ✓ Xe.CI.BAT: success for Fix DRM scheduler layering violations in Xe (rev3)
2025-10-16 20:48 [PATCH v3 0/7] Fix DRM scheduler layering violations in Xe Matthew Brost
` (8 preceding siblings ...)
2025-10-16 20:56 ` ✓ CI.KUnit: success " Patchwork
@ 2025-10-16 21:36 ` Patchwork
2025-10-17 18:43 ` ✗ Xe.CI.Full: failure " Patchwork
10 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2025-10-16 21:36 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe
[-- Attachment #1: Type: text/plain, Size: 1932 bytes --]
== Series Details ==
Series: Fix DRM scheduler layering violations in Xe (rev3)
URL : https://patchwork.freedesktop.org/series/155314/
State : success
== Summary ==
CI Bug Log - changes from xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad_BAT -> xe-pw-155314v3_BAT
====================================================
Summary
-------
**SUCCESS**
No regressions found.
Participating hosts (11 -> 11)
------------------------------
No changes in participating hosts
Known issues
------------
Here are the changes found in xe-pw-155314v3_BAT that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@kms_flip@basic-flip-vs-wf_vblank@d-edp1:
- bat-adlp-7: [PASS][1] -> [DMESG-WARN][2] ([Intel XE#4543]) +1 other test dmesg-warn
[1]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/bat-adlp-7/igt@kms_flip@basic-flip-vs-wf_vblank@d-edp1.html
[2]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/bat-adlp-7/igt@kms_flip@basic-flip-vs-wf_vblank@d-edp1.html
#### Possible fixes ####
* igt@kms_flip@basic-plain-flip@a-edp1:
- bat-adlp-7: [DMESG-WARN][3] ([Intel XE#4543]) -> [PASS][4] +1 other test pass
[3]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/bat-adlp-7/igt@kms_flip@basic-plain-flip@a-edp1.html
[4]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/bat-adlp-7/igt@kms_flip@basic-plain-flip@a-edp1.html
[Intel XE#4543]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4543
Build changes
-------------
* Linux: xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad -> xe-pw-155314v3
IGT_8588: 8588
xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad: bf6a6de7a240324a2d0bfca44a9760d9f9750bad
xe-pw-155314v3: 155314v3
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/index.html
[-- Attachment #2: Type: text/html, Size: 2595 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
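The Patchwork reports in this thread record per-test status transitions such as `[PASS] -> [SKIP]` or `NOTRUN -> [ABORT]`, and bucket them into "possible regressions", "possible fixes", and "known issues" (the latter whenever the transition carries an issue annotation like `[Intel XE#4543]`). A rough illustration of that triage rule, with the buckets inferred from the reports above rather than from Patchwork's actual implementation:

```python
# Status sets inferred from the CI summaries in this thread (not Patchwork's code).
GOOD = {"PASS", "NOTRUN"}
BAD = {"FAIL", "ABORT", "INCOMPLETE", "DMESG-WARN", "DMESG-FAIL", "SKIP"}

def classify(old, new, known_issue=False):
    """Bucket one status transition the way the summaries above group them."""
    if known_issue:
        return "known issue"          # annotated with a tracked issue
    if old in GOOD and new in BAD:
        return "possible regression"  # went from good to bad, unexplained
    if old in BAD and new == "PASS":
        return "possible fix"         # recovered without explanation
    return "unchanged/other"

print(classify("PASS", "ABORT"))                    # new unexplained failure
print(classify("DMESG-WARN", "PASS"))               # recovered
print(classify("PASS", "SKIP", known_issue=True))   # tracked, not a regression
```

This is why the Full run above is marked **FAILURE** even though BAT passed: it contains good-to-bad transitions (e.g. the `xe_module_load@reload` aborts) with no known-issue tag attached.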
* ✗ Xe.CI.Full: failure for Fix DRM scheduler layering violations in Xe (rev3)
2025-10-16 20:48 [PATCH v3 0/7] Fix DRM scheduler layering violations in Xe Matthew Brost
` (9 preceding siblings ...)
2025-10-16 21:36 ` ✓ Xe.CI.BAT: " Patchwork
@ 2025-10-17 18:43 ` Patchwork
10 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2025-10-17 18:43 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe
[-- Attachment #1: Type: text/plain, Size: 71983 bytes --]
== Series Details ==
Series: Fix DRM scheduler layering violations in Xe (rev3)
URL : https://patchwork.freedesktop.org/series/155314/
State : failure
== Summary ==
CI Bug Log - changes from xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad_FULL -> xe-pw-155314v3_FULL
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with xe-pw-155314v3_FULL absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in xe-pw-155314v3_FULL, please notify your bug team (I915-ci-infra@lists.freedesktop.org) to allow them
to document this new failure mode, which will reduce false positives in CI.
Participating hosts (4 -> 4)
------------------------------
No changes in participating hosts
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in xe-pw-155314v3_FULL:
### IGT changes ###
#### Possible regressions ####
* igt@kms_fbcon_fbt@fbc-suspend:
- shard-lnl: NOTRUN -> [ABORT][1]
[1]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@kms_fbcon_fbt@fbc-suspend.html
* igt@kms_pm_rpm@dpms-mode-unset-lpsp:
- shard-bmg: NOTRUN -> [SKIP][2]
[2]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_pm_rpm@dpms-mode-unset-lpsp.html
* igt@kms_pm_rpm@legacy-planes-dpms:
- shard-bmg: [PASS][3] -> [SKIP][4] +3 other tests skip
[3]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-8/igt@kms_pm_rpm@legacy-planes-dpms.html
[4]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-4/igt@kms_pm_rpm@legacy-planes-dpms.html
* igt@kms_pm_rpm@modeset-non-lpsp:
- shard-dg2-set2: NOTRUN -> [SKIP][5]
[5]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-466/igt@kms_pm_rpm@modeset-non-lpsp.html
* igt@kms_pm_rpm@universal-planes:
- shard-dg2-set2: [PASS][6] -> [SKIP][7] +3 other tests skip
[6]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-dg2-464/igt@kms_pm_rpm@universal-planes.html
[7]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-463/igt@kms_pm_rpm@universal-planes.html
- shard-lnl: [PASS][8] -> [SKIP][9] +2 other tests skip
[8]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-lnl-8/igt@kms_pm_rpm@universal-planes.html
[9]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-7/igt@kms_pm_rpm@universal-planes.html
* igt@kms_vblank@ts-continuation-suspend@pipe-c-edp-1:
- shard-lnl: [PASS][10] -> [DMESG-WARN][11] +2 other tests dmesg-warn
[10]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-lnl-3/igt@kms_vblank@ts-continuation-suspend@pipe-c-edp-1.html
[11]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-2/igt@kms_vblank@ts-continuation-suspend@pipe-c-edp-1.html
* igt@xe_fault_injection@probe-fail-guc-xe_guc_mmio_send_recv:
- shard-dg2-set2: NOTRUN -> [ABORT][12] +1 other test abort
[12]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-466/igt@xe_fault_injection@probe-fail-guc-xe_guc_mmio_send_recv.html
* igt@xe_module_load@many-reload:
- shard-adlp: [PASS][13] -> [ABORT][14] +3 other tests abort
[13]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-8/igt@xe_module_load@many-reload.html
[14]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@xe_module_load@many-reload.html
* igt@xe_module_load@reload:
- shard-dg2-set2: [PASS][15] -> [ABORT][16] +9 other tests abort
[15]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-dg2-435/igt@xe_module_load@reload.html
[16]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-466/igt@xe_module_load@reload.html
- shard-lnl: [PASS][17] -> [ABORT][18] +6 other tests abort
[17]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-lnl-4/igt@xe_module_load@reload.html
[18]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-5/igt@xe_module_load@reload.html
- shard-bmg: [PASS][19] -> [ABORT][20] +1 other test abort
[19]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-1/igt@xe_module_load@reload.html
[20]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-1/igt@xe_module_load@reload.html
#### Warnings ####
* igt@kms_pm_rpm@dpms-lpsp:
- shard-bmg: [SKIP][21] ([Intel XE#1439] / [Intel XE#3141] / [Intel XE#836]) -> [SKIP][22]
[21]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-6/igt@kms_pm_rpm@dpms-lpsp.html
[22]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_pm_rpm@dpms-lpsp.html
* igt@kms_pm_rpm@modeset-non-lpsp:
- shard-lnl: [SKIP][23] ([Intel XE#1439] / [Intel XE#3141]) -> [SKIP][24]
[23]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-lnl-1/igt@kms_pm_rpm@modeset-non-lpsp.html
[24]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-5/igt@kms_pm_rpm@modeset-non-lpsp.html
* igt@xe_configfs@survivability-mode:
- shard-lnl: [SKIP][25] ([Intel XE#6010]) -> [ABORT][26]
[25]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-lnl-7/igt@xe_configfs@survivability-mode.html
[26]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@xe_configfs@survivability-mode.html
* igt@xe_fault_injection@probe-fail-guc-xe_guc_mmio_send_recv:
- shard-lnl: [ABORT][27] ([Intel XE#4757]) -> [ABORT][28]
[27]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-lnl-1/igt@xe_fault_injection@probe-fail-guc-xe_guc_mmio_send_recv.html
[28]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-5/igt@xe_fault_injection@probe-fail-guc-xe_guc_mmio_send_recv.html
#### Suppressed ####
The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.
* {igt@xe_configfs@engines-allowed}:
- shard-lnl: [PASS][29] -> [ABORT][30]
[29]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-lnl-7/igt@xe_configfs@engines-allowed.html
[30]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-7/igt@xe_configfs@engines-allowed.html
Known issues
------------
Here are the changes found in xe-pw-155314v3_FULL that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@kms_big_fb@4-tiled-64bpp-rotate-270:
- shard-bmg: NOTRUN -> [SKIP][31] ([Intel XE#2327]) +2 other tests skip
[31]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@kms_big_fb@4-tiled-64bpp-rotate-270.html
* igt@kms_big_fb@4-tiled-addfb-size-overflow:
- shard-adlp: NOTRUN -> [SKIP][32] ([Intel XE#610])
[32]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_big_fb@4-tiled-addfb-size-overflow.html
* igt@kms_big_fb@4-tiled-max-hw-stride-64bpp-rotate-0:
- shard-adlp: NOTRUN -> [SKIP][33] ([Intel XE#1124]) +15 other tests skip
[33]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@kms_big_fb@4-tiled-max-hw-stride-64bpp-rotate-0.html
* igt@kms_big_fb@linear-8bpp-rotate-270:
- shard-adlp: NOTRUN -> [SKIP][34] ([Intel XE#316]) +3 other tests skip
[34]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@kms_big_fb@linear-8bpp-rotate-270.html
* igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-180-async-flip:
- shard-bmg: NOTRUN -> [SKIP][35] ([Intel XE#1124]) +1 other test skip
[35]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-180-async-flip.html
* igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-0-hflip:
- shard-adlp: NOTRUN -> [DMESG-FAIL][36] ([Intel XE#4543]) +6 other tests dmesg-fail
[36]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-0-hflip.html
* igt@kms_big_fb@yf-tiled-addfb-size-overflow:
- shard-lnl: NOTRUN -> [SKIP][37] ([Intel XE#1428])
[37]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@kms_big_fb@yf-tiled-addfb-size-overflow.html
* igt@kms_big_fb@yf-tiled-max-hw-stride-64bpp-rotate-180-async-flip:
- shard-dg2-set2: NOTRUN -> [SKIP][38] ([Intel XE#1124])
[38]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@kms_big_fb@yf-tiled-max-hw-stride-64bpp-rotate-180-async-flip.html
* igt@kms_bw@connected-linear-tiling-2-displays-3840x2160p:
- shard-adlp: NOTRUN -> [SKIP][39] ([Intel XE#2191]) +2 other tests skip
[39]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-9/igt@kms_bw@connected-linear-tiling-2-displays-3840x2160p.html
* igt@kms_bw@linear-tiling-1-displays-2560x1440p:
- shard-bmg: NOTRUN -> [SKIP][40] ([Intel XE#367])
[40]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@kms_bw@linear-tiling-1-displays-2560x1440p.html
* igt@kms_bw@linear-tiling-3-displays-2160x1440p:
- shard-dg2-set2: NOTRUN -> [SKIP][41] ([Intel XE#367])
[41]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@kms_bw@linear-tiling-3-displays-2160x1440p.html
* igt@kms_bw@linear-tiling-4-displays-1920x1080p:
- shard-adlp: NOTRUN -> [SKIP][42] ([Intel XE#367]) +2 other tests skip
[42]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@kms_bw@linear-tiling-4-displays-1920x1080p.html
* igt@kms_ccs@bad-aux-stride-4-tiled-mtl-rc-ccs-cc:
- shard-adlp: NOTRUN -> [SKIP][43] ([Intel XE#455] / [Intel XE#787]) +45 other tests skip
[43]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_ccs@bad-aux-stride-4-tiled-mtl-rc-ccs-cc.html
- shard-bmg: NOTRUN -> [SKIP][44] ([Intel XE#2887]) +6 other tests skip
[44]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_ccs@bad-aux-stride-4-tiled-mtl-rc-ccs-cc.html
* igt@kms_ccs@crc-primary-basic-4-tiled-mtl-rc-ccs-cc@pipe-d-dp-4:
- shard-dg2-set2: NOTRUN -> [SKIP][45] ([Intel XE#455] / [Intel XE#787]) +1 other test skip
[45]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@kms_ccs@crc-primary-basic-4-tiled-mtl-rc-ccs-cc@pipe-d-dp-4.html
* igt@kms_ccs@crc-primary-basic-4-tiled-mtl-rc-ccs-cc@pipe-d-hdmi-a-6:
- shard-dg2-set2: NOTRUN -> [SKIP][46] ([Intel XE#787]) +6 other tests skip
[46]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@kms_ccs@crc-primary-basic-4-tiled-mtl-rc-ccs-cc@pipe-d-hdmi-a-6.html
* igt@kms_ccs@crc-primary-rotation-180-4-tiled-lnl-ccs@pipe-b-dp-2:
- shard-bmg: NOTRUN -> [SKIP][47] ([Intel XE#2652] / [Intel XE#787]) +3 other tests skip
[47]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-5/igt@kms_ccs@crc-primary-rotation-180-4-tiled-lnl-ccs@pipe-b-dp-2.html
* igt@kms_ccs@crc-primary-rotation-180-y-tiled-gen12-mc-ccs:
- shard-lnl: NOTRUN -> [SKIP][48] ([Intel XE#2887]) +1 other test skip
[48]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@kms_ccs@crc-primary-rotation-180-y-tiled-gen12-mc-ccs.html
* igt@kms_ccs@crc-primary-rotation-180-yf-tiled-ccs@pipe-c-hdmi-a-1:
- shard-adlp: NOTRUN -> [SKIP][49] ([Intel XE#787]) +68 other tests skip
[49]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_ccs@crc-primary-rotation-180-yf-tiled-ccs@pipe-c-hdmi-a-1.html
* igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs:
- shard-bmg: [PASS][50] -> [INCOMPLETE][51] ([Intel XE#3862]) +1 other test incomplete
[50]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-8/igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs.html
[51]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-7/igt@kms_ccs@crc-primary-suspend-4-tiled-bmg-ccs.html
* igt@kms_ccs@crc-sprite-planes-basic-4-tiled-bmg-ccs:
- shard-adlp: NOTRUN -> [SKIP][52] ([Intel XE#2907])
[52]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@kms_ccs@crc-sprite-planes-basic-4-tiled-bmg-ccs.html
* igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs-cc:
- shard-dg2-set2: [PASS][53] -> [INCOMPLETE][54] ([Intel XE#1727] / [Intel XE#2705] / [Intel XE#3113] / [Intel XE#4212] / [Intel XE#4345] / [Intel XE#4522])
[53]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-dg2-435/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs-cc.html
[54]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-433/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs-cc.html
* igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs-cc@pipe-a-dp-4:
- shard-dg2-set2: [PASS][55] -> [INCOMPLETE][56] ([Intel XE#1727] / [Intel XE#2705] / [Intel XE#3113] / [Intel XE#4212] / [Intel XE#4522])
[55]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-dg2-435/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs-cc@pipe-a-dp-4.html
[56]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-433/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs-cc@pipe-a-dp-4.html
* igt@kms_cdclk@mode-transition-all-outputs:
- shard-adlp: NOTRUN -> [SKIP][57] ([Intel XE#4418])
[57]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@kms_cdclk@mode-transition-all-outputs.html
* igt@kms_cdclk@plane-scaling:
- shard-lnl: NOTRUN -> [SKIP][58] ([Intel XE#4416]) +3 other tests skip
[58]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@kms_cdclk@plane-scaling.html
* igt@kms_chamelium_color@ctm-negative:
- shard-dg2-set2: NOTRUN -> [SKIP][59] ([Intel XE#306])
[59]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@kms_chamelium_color@ctm-negative.html
* igt@kms_chamelium_color@degamma:
- shard-adlp: NOTRUN -> [SKIP][60] ([Intel XE#306]) +1 other test skip
[60]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@kms_chamelium_color@degamma.html
* igt@kms_chamelium_color@gamma:
- shard-lnl: NOTRUN -> [SKIP][61] ([Intel XE#306])
[61]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@kms_chamelium_color@gamma.html
* igt@kms_chamelium_edid@dp-edid-change-during-hibernate:
- shard-dg2-set2: NOTRUN -> [SKIP][62] ([Intel XE#373])
[62]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-433/igt@kms_chamelium_edid@dp-edid-change-during-hibernate.html
* igt@kms_chamelium_edid@hdmi-edid-change-during-hibernate:
- shard-adlp: NOTRUN -> [SKIP][63] ([Intel XE#373]) +10 other tests skip
[63]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@kms_chamelium_edid@hdmi-edid-change-during-hibernate.html
* igt@kms_chamelium_hpd@dp-hpd:
- shard-bmg: NOTRUN -> [SKIP][64] ([Intel XE#2252]) +3 other tests skip
[64]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@kms_chamelium_hpd@dp-hpd.html
* igt@kms_content_protection@legacy:
- shard-bmg: NOTRUN -> [FAIL][65] ([Intel XE#1178]) +1 other test fail
[65]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@kms_content_protection@legacy.html
* igt@kms_content_protection@srm@pipe-a-dp-4:
- shard-dg2-set2: NOTRUN -> [FAIL][66] ([Intel XE#1178]) +1 other test fail
[66]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@kms_content_protection@srm@pipe-a-dp-4.html
* igt@kms_cursor_crc@cursor-offscreen-256x85:
- shard-bmg: NOTRUN -> [SKIP][67] ([Intel XE#2320])
[67]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_cursor_crc@cursor-offscreen-256x85.html
* igt@kms_cursor_crc@cursor-offscreen-512x170:
- shard-bmg: NOTRUN -> [SKIP][68] ([Intel XE#2321])
[68]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@kms_cursor_crc@cursor-offscreen-512x170.html
* igt@kms_cursor_crc@cursor-random-32x32:
- shard-lnl: NOTRUN -> [SKIP][69] ([Intel XE#1424])
[69]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@kms_cursor_crc@cursor-random-32x32.html
* igt@kms_cursor_crc@cursor-sliding-512x170:
- shard-adlp: NOTRUN -> [SKIP][70] ([Intel XE#308]) +1 other test skip
[70]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_cursor_crc@cursor-sliding-512x170.html
* igt@kms_cursor_legacy@cursora-vs-flipb-legacy:
- shard-lnl: NOTRUN -> [SKIP][71] ([Intel XE#309])
[71]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@kms_cursor_legacy@cursora-vs-flipb-legacy.html
* igt@kms_cursor_legacy@cursorb-vs-flipa-atomic-transitions:
- shard-bmg: [PASS][72] -> [SKIP][73] ([Intel XE#2291])
[72]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-3/igt@kms_cursor_legacy@cursorb-vs-flipa-atomic-transitions.html
[73]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-6/igt@kms_cursor_legacy@cursorb-vs-flipa-atomic-transitions.html
* igt@kms_cursor_legacy@cursorb-vs-flipb-atomic-transitions-varying-size:
- shard-adlp: NOTRUN -> [SKIP][74] ([Intel XE#309]) +4 other tests skip
[74]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_cursor_legacy@cursorb-vs-flipb-atomic-transitions-varying-size.html
* igt@kms_cursor_legacy@short-busy-flip-before-cursor-atomic-transitions:
- shard-adlp: NOTRUN -> [SKIP][75] ([Intel XE#323])
[75]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_cursor_legacy@short-busy-flip-before-cursor-atomic-transitions.html
* igt@kms_display_modes@extended-mode-basic:
- shard-adlp: NOTRUN -> [SKIP][76] ([Intel XE#4302])
[76]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_display_modes@extended-mode-basic.html
* igt@kms_dp_link_training@non-uhbr-sst:
- shard-adlp: NOTRUN -> [SKIP][77] ([Intel XE#4354])
[77]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_dp_link_training@non-uhbr-sst.html
* igt@kms_dp_link_training@uhbr-mst:
- shard-lnl: NOTRUN -> [SKIP][78] ([Intel XE#4354])
[78]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@kms_dp_link_training@uhbr-mst.html
* igt@kms_dp_link_training@uhbr-sst:
- shard-adlp: NOTRUN -> [SKIP][79] ([Intel XE#4356])
[79]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@kms_dp_link_training@uhbr-sst.html
* igt@kms_fbc_dirty_rect@fbc-dirty-rectangle-out-visible-area:
- shard-adlp: NOTRUN -> [SKIP][80] ([Intel XE#4422]) +1 other test skip
[80]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@kms_fbc_dirty_rect@fbc-dirty-rectangle-out-visible-area.html
* igt@kms_feature_discovery@display-3x:
- shard-adlp: NOTRUN -> [SKIP][81] ([Intel XE#703])
[81]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@kms_feature_discovery@display-3x.html
* igt@kms_flip@2x-flip-vs-panning-interruptible:
- shard-lnl: NOTRUN -> [SKIP][82] ([Intel XE#1421])
[82]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@kms_flip@2x-flip-vs-panning-interruptible.html
* igt@kms_flip@2x-plain-flip-ts-check-interruptible:
- shard-adlp: NOTRUN -> [SKIP][83] ([Intel XE#310]) +10 other tests skip
[83]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@kms_flip@2x-plain-flip-ts-check-interruptible.html
* igt@kms_flip@dpms-off-confusion-interruptible@c-hdmi-a1:
- shard-adlp: [PASS][84] -> [DMESG-WARN][85] ([Intel XE#4543]) +1 other test dmesg-warn
[84]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-1/igt@kms_flip@dpms-off-confusion-interruptible@c-hdmi-a1.html
[85]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_flip@dpms-off-confusion-interruptible@c-hdmi-a1.html
* igt@kms_flip@flip-vs-expired-vblank-interruptible:
- shard-adlp: NOTRUN -> [DMESG-WARN][86] ([Intel XE#4543]) +13 other tests dmesg-warn
[86]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@kms_flip@flip-vs-expired-vblank-interruptible.html
* igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1:
- shard-lnl: [PASS][87] -> [FAIL][88] ([Intel XE#301] / [Intel XE#3149]) +1 other test fail
[87]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-lnl-8/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1.html
[88]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-7/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1.html
* igt@kms_flip@flip-vs-suspend:
- shard-bmg: [PASS][89] -> [INCOMPLETE][90] ([Intel XE#2049] / [Intel XE#2597]) +1 other test incomplete
[89]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-5/igt@kms_flip@flip-vs-suspend.html
[90]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@kms_flip@flip-vs-suspend.html
* igt@kms_flip@flip-vs-suspend@d-dp4:
- shard-dg2-set2: [PASS][91] -> [INCOMPLETE][92] ([Intel XE#2049] / [Intel XE#2597]) +1 other test incomplete
[91]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-dg2-434/igt@kms_flip@flip-vs-suspend@d-dp4.html
[92]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-432/igt@kms_flip@flip-vs-suspend@d-dp4.html
* igt@kms_flip_scaled_crc@flip-32bpp-yftile-to-64bpp-yftile-downscaling:
- shard-bmg: NOTRUN -> [SKIP][93] ([Intel XE#2293] / [Intel XE#2380])
[93]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_flip_scaled_crc@flip-32bpp-yftile-to-64bpp-yftile-downscaling.html
* igt@kms_flip_scaled_crc@flip-32bpp-yftile-to-64bpp-yftile-downscaling@pipe-a-valid-mode:
- shard-bmg: NOTRUN -> [SKIP][94] ([Intel XE#2293])
[94]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_flip_scaled_crc@flip-32bpp-yftile-to-64bpp-yftile-downscaling@pipe-a-valid-mode.html
* igt@kms_flip_scaled_crc@flip-64bpp-4tile-to-32bpp-4tiledg2rcccs-downscaling:
- shard-bmg: NOTRUN -> [SKIP][95] ([Intel XE#2380]) +1 other test skip
[95]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@kms_flip_scaled_crc@flip-64bpp-4tile-to-32bpp-4tiledg2rcccs-downscaling.html
* igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-16bpp-ytile-downscaling@pipe-a-valid-mode:
- shard-adlp: NOTRUN -> [DMESG-FAIL][96] ([Intel XE#4543] / [Intel XE#4921]) +5 other tests dmesg-fail
[96]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-16bpp-ytile-downscaling@pipe-a-valid-mode.html
* igt@kms_frontbuffer_tracking@drrs-1p-pri-indfb-multidraw:
- shard-lnl: NOTRUN -> [SKIP][97] ([Intel XE#651]) +1 other test skip
[97]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@kms_frontbuffer_tracking@drrs-1p-pri-indfb-multidraw.html
* igt@kms_frontbuffer_tracking@drrs-1p-primscrn-shrfb-pgflip-blt:
- shard-adlp: NOTRUN -> [SKIP][98] ([Intel XE#651]) +12 other tests skip
[98]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-9/igt@kms_frontbuffer_tracking@drrs-1p-primscrn-shrfb-pgflip-blt.html
* igt@kms_frontbuffer_tracking@drrs-2p-primscrn-indfb-plflip-blt:
- shard-dg2-set2: NOTRUN -> [SKIP][99] ([Intel XE#651]) +7 other tests skip
[99]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@kms_frontbuffer_tracking@drrs-2p-primscrn-indfb-plflip-blt.html
* igt@kms_frontbuffer_tracking@drrs-2p-scndscrn-pri-shrfb-draw-mmap-wc:
- shard-bmg: NOTRUN -> [SKIP][100] ([Intel XE#2311]) +8 other tests skip
[100]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@kms_frontbuffer_tracking@drrs-2p-scndscrn-pri-shrfb-draw-mmap-wc.html
* igt@kms_frontbuffer_tracking@fbc-stridechange:
- shard-bmg: NOTRUN -> [SKIP][101] ([Intel XE#5390]) +5 other tests skip
[101]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_frontbuffer_tracking@fbc-stridechange.html
* igt@kms_frontbuffer_tracking@fbcdrrs-tiling-4:
- shard-adlp: NOTRUN -> [SKIP][102] ([Intel XE#1151])
[102]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@kms_frontbuffer_tracking@fbcdrrs-tiling-4.html
* igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-spr-indfb-draw-blt:
- shard-dg2-set2: NOTRUN -> [SKIP][103] ([Intel XE#653]) +5 other tests skip
[103]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-466/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-spr-indfb-draw-blt.html
* igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-spr-indfb-draw-mmap-wc:
- shard-adlp: NOTRUN -> [SKIP][104] ([Intel XE#653]) +13 other tests skip
[104]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-spr-indfb-draw-mmap-wc.html
* igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-shrfb-plflip-blt:
- shard-bmg: NOTRUN -> [SKIP][105] ([Intel XE#2313]) +4 other tests skip
[105]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-shrfb-plflip-blt.html
* igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-spr-indfb-draw-render:
- shard-lnl: NOTRUN -> [SKIP][106] ([Intel XE#656]) +6 other tests skip
[106]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-spr-indfb-draw-render.html
* igt@kms_frontbuffer_tracking@psr-2p-scndscrn-pri-shrfb-draw-blt:
- shard-adlp: NOTRUN -> [SKIP][107] ([Intel XE#656]) +51 other tests skip
[107]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_frontbuffer_tracking@psr-2p-scndscrn-pri-shrfb-draw-blt.html
* igt@kms_hdr@static-swap:
- shard-adlp: NOTRUN -> [SKIP][108] ([Intel XE#455]) +16 other tests skip
[108]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_hdr@static-swap.html
* igt@kms_joiner@invalid-modeset-big-joiner:
- shard-adlp: NOTRUN -> [SKIP][109] ([Intel XE#346])
[109]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_joiner@invalid-modeset-big-joiner.html
- shard-bmg: NOTRUN -> [SKIP][110] ([Intel XE#346])
[110]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_joiner@invalid-modeset-big-joiner.html
* igt@kms_joiner@switch-modeset-ultra-joiner-big-joiner:
- shard-adlp: NOTRUN -> [SKIP][111] ([Intel XE#2925])
[111]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@kms_joiner@switch-modeset-ultra-joiner-big-joiner.html
* igt@kms_panel_fitting@legacy:
- shard-bmg: NOTRUN -> [SKIP][112] ([Intel XE#2486])
[112]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@kms_panel_fitting@legacy.html
* igt@kms_plane_lowres@tiling-4:
- shard-lnl: NOTRUN -> [SKIP][113] ([Intel XE#599]) +3 other tests skip
[113]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@kms_plane_lowres@tiling-4.html
* igt@kms_plane_multiple@2x-tiling-y:
- shard-adlp: NOTRUN -> [SKIP][114] ([Intel XE#4596])
[114]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@kms_plane_multiple@2x-tiling-y.html
* igt@kms_plane_multiple@tiling-yf:
- shard-adlp: NOTRUN -> [SKIP][115] ([Intel XE#5020])
[115]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@kms_plane_multiple@tiling-yf.html
* igt@kms_pm_backlight@fade:
- shard-adlp: NOTRUN -> [SKIP][116] ([Intel XE#870])
[116]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@kms_pm_backlight@fade.html
* igt@kms_pm_dc@dc5-retention-flops:
- shard-adlp: NOTRUN -> [SKIP][117] ([Intel XE#3309])
[117]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@kms_pm_dc@dc5-retention-flops.html
* igt@kms_pm_dc@dc6-dpms:
- shard-adlp: NOTRUN -> [FAIL][118] ([Intel XE#718])
[118]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@kms_pm_dc@dc6-dpms.html
* igt@kms_pm_rpm@dpms-mode-unset-lpsp:
- shard-adlp: NOTRUN -> [SKIP][119] ([Intel XE#6070]) +1 other test skip
[119]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_pm_rpm@dpms-mode-unset-lpsp.html
* igt@kms_pm_rpm@dpms-mode-unset-non-lpsp:
- shard-adlp: NOTRUN -> [SKIP][120] ([Intel XE#836])
[120]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_pm_rpm@dpms-mode-unset-non-lpsp.html
* igt@kms_pm_rpm@legacy-planes-dpms:
- shard-adlp: [PASS][121] -> [SKIP][122] ([Intel XE#6070]) +2 other tests skip
[121]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-1/igt@kms_pm_rpm@legacy-planes-dpms.html
[122]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@kms_pm_rpm@legacy-planes-dpms.html
* igt@kms_psr2_sf@fbc-psr2-cursor-plane-move-continuous-sf:
- shard-adlp: NOTRUN -> [SKIP][123] ([Intel XE#1406] / [Intel XE#1489]) +11 other tests skip
[123]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-9/igt@kms_psr2_sf@fbc-psr2-cursor-plane-move-continuous-sf.html
* igt@kms_psr2_sf@psr2-cursor-plane-move-continuous-exceed-fully-sf:
- shard-bmg: NOTRUN -> [SKIP][124] ([Intel XE#1406] / [Intel XE#1489])
[124]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_psr2_sf@psr2-cursor-plane-move-continuous-exceed-fully-sf.html
* igt@kms_psr2_sf@psr2-primary-plane-update-sf-dmg-area:
- shard-dg2-set2: NOTRUN -> [SKIP][125] ([Intel XE#1406] / [Intel XE#1489]) +3 other tests skip
[125]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-466/igt@kms_psr2_sf@psr2-primary-plane-update-sf-dmg-area.html
* igt@kms_psr2_su@page_flip-xrgb8888:
- shard-adlp: NOTRUN -> [SKIP][126] ([Intel XE#1122] / [Intel XE#1406] / [Intel XE#5580])
[126]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@kms_psr2_su@page_flip-xrgb8888.html
* igt@kms_psr@fbc-psr2-cursor-plane-move:
- shard-adlp: NOTRUN -> [SKIP][127] ([Intel XE#1406] / [Intel XE#2850] / [Intel XE#929]) +16 other tests skip
[127]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@kms_psr@fbc-psr2-cursor-plane-move.html
* igt@kms_psr@fbc-psr2-primary-render:
- shard-dg2-set2: NOTRUN -> [SKIP][128] ([Intel XE#1406] / [Intel XE#2850] / [Intel XE#929]) +2 other tests skip
[128]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@kms_psr@fbc-psr2-primary-render.html
* igt@kms_psr@pr-suspend:
- shard-bmg: NOTRUN -> [SKIP][129] ([Intel XE#1406] / [Intel XE#2234] / [Intel XE#2850]) +1 other test skip
[129]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_psr@pr-suspend.html
* igt@kms_psr_stress_test@flip-primary-invalidate-overlay:
- shard-bmg: NOTRUN -> [SKIP][130] ([Intel XE#1406] / [Intel XE#2414])
[130]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html
* igt@kms_rotation_crc@primary-yf-tiled-reflect-x-0:
- shard-bmg: NOTRUN -> [SKIP][131] ([Intel XE#2330])
[131]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@kms_rotation_crc@primary-yf-tiled-reflect-x-0.html
* igt@kms_rotation_crc@sprite-rotation-90-pos-100-0:
- shard-adlp: NOTRUN -> [SKIP][132] ([Intel XE#3414])
[132]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@kms_rotation_crc@sprite-rotation-90-pos-100-0.html
* igt@xe_ccs@block-copy-compressed-inc-dimension:
- shard-adlp: NOTRUN -> [SKIP][133] ([Intel XE#455] / [Intel XE#488] / [Intel XE#5607]) +1 other test skip
[133]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@xe_ccs@block-copy-compressed-inc-dimension.html
* igt@xe_copy_basic@mem-copy-linear-0x3fff:
- shard-dg2-set2: NOTRUN -> [SKIP][134] ([Intel XE#1123])
[134]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@xe_copy_basic@mem-copy-linear-0x3fff.html
* igt@xe_copy_basic@mem-copy-linear-0xfd:
- shard-adlp: NOTRUN -> [SKIP][135] ([Intel XE#1123]) +2 other tests skip
[135]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@xe_copy_basic@mem-copy-linear-0xfd.html
* igt@xe_copy_basic@mem-set-linear-0x3fff:
- shard-adlp: NOTRUN -> [SKIP][136] ([Intel XE#1126])
[136]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@xe_copy_basic@mem-set-linear-0x3fff.html
* igt@xe_create@multigpu-create-massive-size:
- shard-adlp: NOTRUN -> [SKIP][137] ([Intel XE#944]) +4 other tests skip
[137]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@xe_create@multigpu-create-massive-size.html
* igt@xe_eudebug@basic-exec-queues:
- shard-adlp: NOTRUN -> [SKIP][138] ([Intel XE#4837] / [Intel XE#5565]) +18 other tests skip
[138]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@xe_eudebug@basic-exec-queues.html
* igt@xe_eudebug@basic-vm-access:
- shard-lnl: NOTRUN -> [SKIP][139] ([Intel XE#4837]) +1 other test skip
[139]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@xe_eudebug@basic-vm-access.html
* igt@xe_eudebug@read-metadata:
- shard-bmg: NOTRUN -> [SKIP][140] ([Intel XE#4837]) +2 other tests skip
[140]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@xe_eudebug@read-metadata.html
* igt@xe_eudebug@vm-bind-clear-faultable:
- shard-dg2-set2: NOTRUN -> [SKIP][141] ([Intel XE#4837]) +1 other test skip
[141]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@xe_eudebug@vm-bind-clear-faultable.html
* igt@xe_evict@evict-beng-threads-large:
- shard-adlp: NOTRUN -> [SKIP][142] ([Intel XE#261]) +2 other tests skip
[142]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@xe_evict@evict-beng-threads-large.html
* igt@xe_evict@evict-large-external:
- shard-adlp: NOTRUN -> [SKIP][143] ([Intel XE#261] / [Intel XE#5564]) +1 other test skip
[143]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@xe_evict@evict-large-external.html
* igt@xe_evict@evict-small-multi-vm:
- shard-adlp: NOTRUN -> [SKIP][144] ([Intel XE#261] / [Intel XE#5564] / [Intel XE#688]) +2 other tests skip
[144]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-9/igt@xe_evict@evict-small-multi-vm.html
* igt@xe_evict_ccs@evict-overcommit-parallel-instantfree-samefd:
- shard-lnl: NOTRUN -> [SKIP][145] ([Intel XE#688]) +1 other test skip
[145]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@xe_evict_ccs@evict-overcommit-parallel-instantfree-samefd.html
* igt@xe_evict_ccs@evict-overcommit-standalone-nofree-reopen:
- shard-adlp: NOTRUN -> [SKIP][146] ([Intel XE#688]) +4 other tests skip
[146]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@xe_evict_ccs@evict-overcommit-standalone-nofree-reopen.html
* igt@xe_exec_basic@multigpu-many-execqueues-many-vm-bindexecqueue:
- shard-lnl: NOTRUN -> [SKIP][147] ([Intel XE#1392])
[147]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@xe_exec_basic@multigpu-many-execqueues-many-vm-bindexecqueue.html
* igt@xe_exec_basic@multigpu-once-bindexecqueue-rebind:
- shard-bmg: NOTRUN -> [SKIP][148] ([Intel XE#2322]) +2 other tests skip
[148]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@xe_exec_basic@multigpu-once-bindexecqueue-rebind.html
* igt@xe_exec_basic@multigpu-once-bindexecqueue-userptr-rebind:
- shard-adlp: NOTRUN -> [SKIP][149] ([Intel XE#1392] / [Intel XE#5575]) +12 other tests skip
[149]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@xe_exec_basic@multigpu-once-bindexecqueue-userptr-rebind.html
* igt@xe_exec_fault_mode@many-userptr-invalidate-race-prefetch:
- shard-adlp: NOTRUN -> [SKIP][150] ([Intel XE#288] / [Intel XE#5561]) +30 other tests skip
[150]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@xe_exec_fault_mode@many-userptr-invalidate-race-prefetch.html
* igt@xe_exec_fault_mode@twice-userptr-prefetch:
- shard-dg2-set2: NOTRUN -> [SKIP][151] ([Intel XE#288]) +6 other tests skip
[151]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@xe_exec_fault_mode@twice-userptr-prefetch.html
* igt@xe_exec_mix_modes@exec-simple-batch-store-dma-fence:
- shard-adlp: NOTRUN -> [SKIP][152] ([Intel XE#2360])
[152]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@xe_exec_mix_modes@exec-simple-batch-store-dma-fence.html
* igt@xe_exec_system_allocator@once-large-mmap-free-huge:
- shard-lnl: NOTRUN -> [SKIP][153] ([Intel XE#4943]) +5 other tests skip
[153]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-3/igt@xe_exec_system_allocator@once-large-mmap-free-huge.html
* igt@xe_exec_system_allocator@once-mmap-remap-ro-dontunmap:
- shard-adlp: NOTRUN -> [SKIP][154] ([Intel XE#4915]) +308 other tests skip
[154]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@xe_exec_system_allocator@once-mmap-remap-ro-dontunmap.html
* igt@xe_exec_system_allocator@process-many-execqueues-mmap-free-huge:
- shard-bmg: NOTRUN -> [SKIP][155] ([Intel XE#4943])
[155]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@xe_exec_system_allocator@process-many-execqueues-mmap-free-huge.html
* igt@xe_exec_system_allocator@threads-many-large-mmap-shared-remap-dontunmap-eocheck:
- shard-dg2-set2: NOTRUN -> [SKIP][156] ([Intel XE#4915]) +50 other tests skip
[156]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@xe_exec_system_allocator@threads-many-large-mmap-shared-remap-dontunmap-eocheck.html
* igt@xe_fault_injection@inject-fault-probe-function-wait_for_lmem_ready:
- shard-adlp: [PASS][157] -> [ABORT][158] ([Intel XE#5530]) +1 other test abort
[157]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-8/igt@xe_fault_injection@inject-fault-probe-function-wait_for_lmem_ready.html
[158]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-9/igt@xe_fault_injection@inject-fault-probe-function-wait_for_lmem_ready.html
* igt@xe_fault_injection@inject-fault-probe-function-xe_wopcm_init:
- shard-bmg: [PASS][159] -> [ABORT][160] ([Intel XE#5530]) +7 other tests abort
[159]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-4/igt@xe_fault_injection@inject-fault-probe-function-xe_wopcm_init.html
[160]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-1/igt@xe_fault_injection@inject-fault-probe-function-xe_wopcm_init.html
* igt@xe_fault_injection@vm-create-fail-xe_exec_queue_create_bind:
- shard-adlp: NOTRUN -> [ABORT][161] ([Intel XE#5530])
[161]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-9/igt@xe_fault_injection@vm-create-fail-xe_exec_queue_create_bind.html
* igt@xe_mmap@pci-membarrier-parallel:
- shard-adlp: NOTRUN -> [SKIP][162] ([Intel XE#5100]) +1 other test skip
[162]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@xe_mmap@pci-membarrier-parallel.html
* igt@xe_mmap@small-bar:
- shard-adlp: NOTRUN -> [SKIP][163] ([Intel XE#512])
[163]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@xe_mmap@small-bar.html
* igt@xe_module_load@load:
- shard-adlp: ([PASS][164], [PASS][165], [PASS][166], [PASS][167], [PASS][168], [PASS][169], [PASS][170], [PASS][171], [PASS][172], [PASS][173], [PASS][174], [PASS][175], [PASS][176], [PASS][177]) -> ([PASS][178], [PASS][179], [PASS][180], [PASS][181], [PASS][182], [PASS][183], [PASS][184], [PASS][185], [PASS][186], [PASS][187], [PASS][188], [PASS][189], [PASS][190], [PASS][191], [SKIP][192], [PASS][193]) ([Intel XE#378] / [Intel XE#5612])
[164]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-9/igt@xe_module_load@load.html
[165]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-1/igt@xe_module_load@load.html
[166]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-6/igt@xe_module_load@load.html
[167]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-6/igt@xe_module_load@load.html
[168]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-6/igt@xe_module_load@load.html
[169]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-2/igt@xe_module_load@load.html
[170]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-2/igt@xe_module_load@load.html
[171]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-9/igt@xe_module_load@load.html
[172]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-9/igt@xe_module_load@load.html
[173]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-1/igt@xe_module_load@load.html
[174]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-8/igt@xe_module_load@load.html
[175]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-8/igt@xe_module_load@load.html
[176]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-8/igt@xe_module_load@load.html
[177]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-1/igt@xe_module_load@load.html
[178]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@xe_module_load@load.html
[179]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@xe_module_load@load.html
[180]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@xe_module_load@load.html
[181]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@xe_module_load@load.html
[182]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@xe_module_load@load.html
[183]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@xe_module_load@load.html
[184]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-9/igt@xe_module_load@load.html
[185]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@xe_module_load@load.html
[186]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-9/igt@xe_module_load@load.html
[187]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@xe_module_load@load.html
[188]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@xe_module_load@load.html
[189]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@xe_module_load@load.html
[190]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@xe_module_load@load.html
[191]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@xe_module_load@load.html
[192]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@xe_module_load@load.html
[193]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-9/igt@xe_module_load@load.html
* igt@xe_module_load@many-reload:
- shard-bmg: [PASS][194] -> [ABORT][195] ([Intel XE#5087])
[194]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-6/igt@xe_module_load@many-reload.html
[195]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-5/igt@xe_module_load@many-reload.html
* igt@xe_module_load@reload-no-display:
- shard-dg2-set2: [PASS][196] -> [ABORT][197] ([Intel XE#5087])
[196]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-dg2-464/igt@xe_module_load@reload-no-display.html
[197]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-463/igt@xe_module_load@reload-no-display.html
- shard-lnl: [PASS][198] -> [ABORT][199] ([Intel XE#5087]) +1 other test abort
[198]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-lnl-5/igt@xe_module_load@reload-no-display.html
[199]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-8/igt@xe_module_load@reload-no-display.html
* igt@xe_oa@buffer-size:
- shard-adlp: NOTRUN -> [SKIP][200] ([Intel XE#6032])
[200]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@xe_oa@buffer-size.html
* igt@xe_oa@invalid-oa-exponent:
- shard-dg2-set2: NOTRUN -> [SKIP][201] ([Intel XE#3573])
[201]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@xe_oa@invalid-oa-exponent.html
* igt@xe_oa@oa-tlb-invalidate:
- shard-bmg: NOTRUN -> [SKIP][202] ([Intel XE#2248])
[202]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@xe_oa@oa-tlb-invalidate.html
* igt@xe_oa@syncs-syncobj-cfg:
- shard-adlp: NOTRUN -> [SKIP][203] ([Intel XE#3573]) +7 other tests skip
[203]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-8/igt@xe_oa@syncs-syncobj-cfg.html
* igt@xe_pm@d3cold-mmap-vram:
- shard-bmg: NOTRUN -> [SKIP][204] ([Intel XE#2284])
[204]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@xe_pm@d3cold-mmap-vram.html
* igt@xe_pm@d3hot-i2c:
- shard-adlp: NOTRUN -> [SKIP][205] ([Intel XE#5742])
[205]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@xe_pm@d3hot-i2c.html
* igt@xe_pm@s2idle-d3hot-basic-exec:
- shard-adlp: NOTRUN -> [WARN][206] ([Intel XE#4504])
[206]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-6/igt@xe_pm@s2idle-d3hot-basic-exec.html
- shard-bmg: [PASS][207] -> [WARN][208] ([Intel XE#4504])
[207]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-7/igt@xe_pm@s2idle-d3hot-basic-exec.html
[208]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-8/igt@xe_pm@s2idle-d3hot-basic-exec.html
- shard-lnl: [PASS][209] -> [WARN][210] ([Intel XE#4504])
[209]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-lnl-4/igt@xe_pm@s2idle-d3hot-basic-exec.html
[210]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-lnl-4/igt@xe_pm@s2idle-d3hot-basic-exec.html
* igt@xe_pm@s4-d3cold-basic-exec:
- shard-adlp: NOTRUN -> [SKIP][211] ([Intel XE#2284] / [Intel XE#366]) +1 other test skip
[211]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@xe_pm@s4-d3cold-basic-exec.html
* igt@xe_pxp@display-black-pxp-fb:
- shard-adlp: NOTRUN -> [SKIP][212] ([Intel XE#4733])
[212]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-9/igt@xe_pxp@display-black-pxp-fb.html
* igt@xe_pxp@pxp-stale-bo-bind-post-termination-irq:
- shard-adlp: NOTRUN -> [SKIP][213] ([Intel XE#4733] / [Intel XE#5594]) +1 other test skip
[213]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@xe_pxp@pxp-stale-bo-bind-post-termination-irq.html
* igt@xe_query@multigpu-query-mem-usage:
- shard-bmg: NOTRUN -> [SKIP][214] ([Intel XE#944])
[214]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@xe_query@multigpu-query-mem-usage.html
* igt@xe_render_copy@render-stress-0-copies:
- shard-adlp: NOTRUN -> [SKIP][215] ([Intel XE#4814] / [Intel XE#5614]) +1 other test skip
[215]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-2/igt@xe_render_copy@render-stress-0-copies.html
- shard-dg2-set2: NOTRUN -> [SKIP][216] ([Intel XE#4814])
[216]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@xe_render_copy@render-stress-0-copies.html
* igt@xe_sriov_auto_provisioning@resources-released-on-vfs-disabling:
- shard-dg2-set2: NOTRUN -> [SKIP][217] ([Intel XE#4130]) +1 other test skip
[217]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@xe_sriov_auto_provisioning@resources-released-on-vfs-disabling.html
#### Possible fixes ####
* igt@kms_async_flips@async-flip-suspend-resume@pipe-a-hdmi-a-1:
- shard-adlp: [DMESG-WARN][218] ([Intel XE#2953] / [Intel XE#4173]) -> [PASS][219]
[218]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-2/igt@kms_async_flips@async-flip-suspend-resume@pipe-a-hdmi-a-1.html
[219]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@kms_async_flips@async-flip-suspend-resume@pipe-a-hdmi-a-1.html
* igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs:
- shard-dg2-set2: [INCOMPLETE][220] ([Intel XE#2705] / [Intel XE#4212] / [Intel XE#4345]) -> [PASS][221]
[220]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-dg2-433/igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs.html
[221]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs.html
* igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs@pipe-a-dp-4:
- shard-dg2-set2: [INCOMPLETE][222] ([Intel XE#2705] / [Intel XE#4212] / [Intel XE#6014]) -> [PASS][223]
[222]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-dg2-433/igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs@pipe-a-dp-4.html
[223]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-434/igt@kms_ccs@random-ccs-data-4-tiled-dg2-mc-ccs@pipe-a-dp-4.html
* igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs:
- shard-dg2-set2: [INCOMPLETE][224] ([Intel XE#1727] / [Intel XE#2705] / [Intel XE#3113] / [Intel XE#4212] / [Intel XE#4345] / [Intel XE#4522]) -> [PASS][225]
[224]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-dg2-436/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs.html
[225]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-433/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs.html
* igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs@pipe-c-dp-4:
- shard-dg2-set2: [INCOMPLETE][226] ([Intel XE#1727] / [Intel XE#2705] / [Intel XE#3113] / [Intel XE#4212] / [Intel XE#4522]) -> [PASS][227]
[226]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-dg2-436/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs@pipe-c-dp-4.html
[227]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-433/igt@kms_ccs@random-ccs-data-4-tiled-dg2-rc-ccs@pipe-c-dp-4.html
* igt@kms_cursor_legacy@cursora-vs-flipb-atomic-transitions-varying-size:
- shard-bmg: [SKIP][228] ([Intel XE#2291]) -> [PASS][229] +3 other tests pass
[228]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-6/igt@kms_cursor_legacy@cursora-vs-flipb-atomic-transitions-varying-size.html
[229]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_cursor_legacy@cursora-vs-flipb-atomic-transitions-varying-size.html
* igt@kms_cursor_legacy@flip-vs-cursor-atomic:
- shard-bmg: [FAIL][230] ([Intel XE#4633]) -> [PASS][231]
[230]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-4/igt@kms_cursor_legacy@flip-vs-cursor-atomic.html
[231]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_cursor_legacy@flip-vs-cursor-atomic.html
* igt@kms_flip@2x-nonexisting-fb:
- shard-bmg: [SKIP][232] ([Intel XE#2316]) -> [PASS][233] +6 other tests pass
[232]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-6/igt@kms_flip@2x-nonexisting-fb.html
[233]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_flip@2x-nonexisting-fb.html
* igt@kms_joiner@basic-force-big-joiner:
- shard-bmg: [SKIP][234] ([Intel XE#3012]) -> [PASS][235]
[234]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-6/igt@kms_joiner@basic-force-big-joiner.html
[235]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_joiner@basic-force-big-joiner.html
* igt@kms_pm_rpm@basic-pci-d3-state:
- shard-dg2-set2: [FAIL][236] ([Intel XE#4741]) -> [PASS][237]
[236]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-dg2-463/igt@kms_pm_rpm@basic-pci-d3-state.html
[237]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-dg2-463/igt@kms_pm_rpm@basic-pci-d3-state.html
* igt@kms_setmode@invalid-clone-single-crtc-stealing:
- shard-bmg: [SKIP][238] ([Intel XE#1435]) -> [PASS][239]
[238]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-6/igt@kms_setmode@invalid-clone-single-crtc-stealing.html
[239]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_setmode@invalid-clone-single-crtc-stealing.html
* igt@xe_evict@evict-mixed-many-threads-small:
- shard-bmg: [INCOMPLETE][240] ([Intel XE#6321]) -> [PASS][241]
[240]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-6/igt@xe_evict@evict-mixed-many-threads-small.html
[241]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-6/igt@xe_evict@evict-mixed-many-threads-small.html
* {igt@xe_pmu@fn-engine-activity-sched-if-idle@engine-drm_xe_engine_class_video_enhance1}:
- shard-bmg: [DMESG-WARN][242] ([Intel XE#3876]) -> [PASS][243] +1 other test pass
[242]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-2/igt@xe_pmu@fn-engine-activity-sched-if-idle@engine-drm_xe_engine_class_video_enhance1.html
[243]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-3/igt@xe_pmu@fn-engine-activity-sched-if-idle@engine-drm_xe_engine_class_video_enhance1.html
#### Warnings ####
* igt@kms_async_flips@async-flip-suspend-resume:
- shard-adlp: [DMESG-WARN][244] ([Intel XE#2953] / [Intel XE#4173] / [Intel XE#4543]) -> [DMESG-WARN][245] ([Intel XE#4543])
[244]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-2/igt@kms_async_flips@async-flip-suspend-resume.html
[245]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@kms_async_flips@async-flip-suspend-resume.html
* igt@kms_frontbuffer_tracking@drrs-2p-primscrn-pri-shrfb-draw-mmap-wc:
- shard-bmg: [SKIP][246] ([Intel XE#2312]) -> [SKIP][247] ([Intel XE#2311]) +8 other tests skip
[246]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-6/igt@kms_frontbuffer_tracking@drrs-2p-primscrn-pri-shrfb-draw-mmap-wc.html
[247]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_frontbuffer_tracking@drrs-2p-primscrn-pri-shrfb-draw-mmap-wc.html
* igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-shrfb-pgflip-blt:
- shard-bmg: [SKIP][248] ([Intel XE#2312]) -> [SKIP][249] ([Intel XE#5390]) +4 other tests skip
[248]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-6/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-shrfb-pgflip-blt.html
[249]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-4/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-shrfb-pgflip-blt.html
* igt@kms_frontbuffer_tracking@fbcdrrs-2p-primscrn-cur-indfb-draw-mmap-wc:
- shard-bmg: [SKIP][250] ([Intel XE#2311]) -> [SKIP][251] ([Intel XE#2312])
[250]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-3/igt@kms_frontbuffer_tracking@fbcdrrs-2p-primscrn-cur-indfb-draw-mmap-wc.html
[251]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-6/igt@kms_frontbuffer_tracking@fbcdrrs-2p-primscrn-cur-indfb-draw-mmap-wc.html
* igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-spr-indfb-move:
- shard-bmg: [SKIP][252] ([Intel XE#2313]) -> [SKIP][253] ([Intel XE#2312]) +1 other test skip
[252]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-3/igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-spr-indfb-move.html
[253]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-6/igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-spr-indfb-move.html
* igt@kms_frontbuffer_tracking@psr-2p-primscrn-indfb-plflip-blt:
- shard-bmg: [SKIP][254] ([Intel XE#2312]) -> [SKIP][255] ([Intel XE#2313]) +11 other tests skip
[254]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-6/igt@kms_frontbuffer_tracking@psr-2p-primscrn-indfb-plflip-blt.html
[255]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-2/igt@kms_frontbuffer_tracking@psr-2p-primscrn-indfb-plflip-blt.html
* igt@kms_plane_multiple@2x-tiling-y:
- shard-bmg: [SKIP][256] ([Intel XE#4596]) -> [SKIP][257] ([Intel XE#5021])
[256]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-bmg-6/igt@kms_plane_multiple@2x-tiling-y.html
[257]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-bmg-4/igt@kms_plane_multiple@2x-tiling-y.html
* igt@kms_pm_rpm@modeset-non-lpsp:
- shard-adlp: [SKIP][258] ([Intel XE#836]) -> [SKIP][259] ([Intel XE#6070])
[258]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad/shard-adlp-2/igt@kms_pm_rpm@modeset-non-lpsp.html
[259]: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/shard-adlp-1/igt@kms_pm_rpm@modeset-non-lpsp.html
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
[Intel XE#1122]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1122
[Intel XE#1123]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1123
[Intel XE#1124]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1124
[Intel XE#1126]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1126
[Intel XE#1151]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1151
[Intel XE#1178]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1178
[Intel XE#1392]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1392
[Intel XE#1406]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1406
[Intel XE#1421]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1421
[Intel XE#1424]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1424
[Intel XE#1428]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1428
[Intel XE#1435]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1435
[Intel XE#1439]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1439
[Intel XE#1489]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1489
[Intel XE#1727]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1727
[Intel XE#2049]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2049
[Intel XE#2191]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2191
[Intel XE#2234]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2234
[Intel XE#2248]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2248
[Intel XE#2252]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2252
[Intel XE#2284]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2284
[Intel XE#2291]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2291
[Intel XE#2293]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2293
[Intel XE#2311]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2311
[Intel XE#2312]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2312
[Intel XE#2313]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2313
[Intel XE#2316]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2316
[Intel XE#2320]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2320
[Intel XE#2321]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2321
[Intel XE#2322]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2322
[Intel XE#2327]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2327
[Intel XE#2330]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2330
[Intel XE#2360]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2360
[Intel XE#2380]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2380
[Intel XE#2414]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2414
[Intel XE#2486]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2486
[Intel XE#2597]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2597
[Intel XE#261]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/261
[Intel XE#2652]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2652
[Intel XE#2705]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2705
[Intel XE#2850]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2850
[Intel XE#288]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/288
[Intel XE#2887]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2887
[Intel XE#2907]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2907
[Intel XE#2925]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2925
[Intel XE#2953]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2953
[Intel XE#301]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/301
[Intel XE#3012]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3012
[Intel XE#306]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/306
[Intel XE#308]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/308
[Intel XE#309]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/309
[Intel XE#310]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/310
[Intel XE#3113]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3113
[Intel XE#3141]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3141
[Intel XE#3149]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3149
[Intel XE#316]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/316
[Intel XE#323]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/323
[Intel XE#3309]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3309
[Intel XE#3414]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3414
[Intel XE#346]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/346
[Intel XE#3573]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3573
[Intel XE#366]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/366
[Intel XE#367]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/367
[Intel XE#373]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/373
[Intel XE#378]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/378
[Intel XE#3862]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3862
[Intel XE#3876]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/3876
[Intel XE#4130]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4130
[Intel XE#4173]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4173
[Intel XE#4212]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4212
[Intel XE#4302]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4302
[Intel XE#4345]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4345
[Intel XE#4354]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4354
[Intel XE#4356]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4356
[Intel XE#4416]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4416
[Intel XE#4418]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4418
[Intel XE#4422]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4422
[Intel XE#4504]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4504
[Intel XE#4522]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4522
[Intel XE#4543]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4543
[Intel XE#455]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/455
[Intel XE#4596]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4596
[Intel XE#4633]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4633
[Intel XE#4733]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4733
[Intel XE#4741]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4741
[Intel XE#4757]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4757
[Intel XE#4814]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4814
[Intel XE#4837]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4837
[Intel XE#488]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/488
[Intel XE#4915]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4915
[Intel XE#4921]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4921
[Intel XE#4943]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/4943
[Intel XE#5007]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5007
[Intel XE#5020]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5020
[Intel XE#5021]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5021
[Intel XE#5087]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5087
[Intel XE#5100]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5100
[Intel XE#512]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/512
[Intel XE#5300]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5300
[Intel XE#5390]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5390
[Intel XE#5530]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5530
[Intel XE#5561]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5561
[Intel XE#5564]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5564
[Intel XE#5565]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5565
[Intel XE#5575]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5575
[Intel XE#5580]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5580
[Intel XE#5594]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5594
[Intel XE#5607]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5607
[Intel XE#5612]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5612
[Intel XE#5614]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5614
[Intel XE#5624]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5624
[Intel XE#5742]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/5742
[Intel XE#599]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/599
[Intel XE#6010]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6010
[Intel XE#6014]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6014
[Intel XE#6032]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6032
[Intel XE#6070]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6070
[Intel XE#610]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/610
[Intel XE#6259]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6259
[Intel XE#6281]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6281
[Intel XE#6312]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6312
[Intel XE#6313]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6313
[Intel XE#6321]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6321
[Intel XE#6376]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/6376
[Intel XE#651]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/651
[Intel XE#653]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/653
[Intel XE#656]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/656
[Intel XE#688]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/688
[Intel XE#703]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/703
[Intel XE#718]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/718
[Intel XE#787]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/787
[Intel XE#836]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/836
[Intel XE#870]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/870
[Intel XE#929]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/929
[Intel XE#944]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/944
Build changes
-------------
* Linux: xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad -> xe-pw-155314v3
IGT_8588: 8588
xe-3935-bf6a6de7a240324a2d0bfca44a9760d9f9750bad: bf6a6de7a240324a2d0bfca44a9760d9f9750bad
xe-pw-155314v3: 155314v3
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-155314v3/index.html
[-- Attachment #2: Type: text/html, Size: 84271 bytes --]
* Re: [PATCH v3 7/7] drm/xe: Only toggle scheduling in TDR if GuC is running
2025-10-16 20:48 ` [PATCH v3 7/7] drm/xe: Only toggle scheduling in TDR if GuC is running Matthew Brost
@ 2025-11-15 1:01 ` Niranjana Vishwanathapura
2025-11-18 18:06 ` Matthew Brost
0 siblings, 1 reply; 31+ messages in thread
From: Niranjana Vishwanathapura @ 2025-11-15 1:01 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Thu, Oct 16, 2025 at 01:48:26PM -0700, Matthew Brost wrote:
>If the firmware is not running during TDR (e.g., when the driver is
>unloading), there's no need to toggle scheduling in the GuC. In such
>cases, skip this step.
>
>Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>---
> drivers/gpu/drm/xe/xe_guc_submit.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
>index bb1f2929441c..ea0cfd866981 100644
>--- a/drivers/gpu/drm/xe/xe_guc_submit.c
>+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
>@@ -1146,7 +1146,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> if (exec_queue_reset(q))
> err = -EIO;
>
>- if (!exec_queue_destroyed(q)) {
>+ if (!exec_queue_destroyed(q) && xe_uc_fw_is_running(&guc->fw)) {
> /*
> * Wait for any pending G2H to flush out before
> * modifying state
Looking at the code, it seems that if we skip this 'if' statement (when the fw
is not running), we will still go wait on ct->wq. I am not sure how that gets
woken up, and the logic might then try to reset the GT afterwards. Instead of
checking xe_uc_fw_is_running() here, could one of the conditions of the
wait_event_timeout() call cover this case, so we can handle it appropriately
after wait_event_timeout() returns?
Niranjana
>--
>2.34.1
>
* Re: [PATCH v3 1/7] drm/sched: Add pending job list iterator
2025-10-16 20:48 ` [PATCH v3 1/7] drm/sched: Add pending job list iterator Matthew Brost
@ 2025-11-15 1:25 ` Niranjana Vishwanathapura
2025-11-18 17:52 ` Matthew Brost
0 siblings, 1 reply; 31+ messages in thread
From: Niranjana Vishwanathapura @ 2025-11-15 1:25 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Thu, Oct 16, 2025 at 01:48:20PM -0700, Matthew Brost wrote:
>Stop open coding pending job list in drivers. Add pending job list
>iterator which safely walks DRM scheduler list asserting DRM scheduler
>is stopped.
>
>v2:
> - Fix checkpatch (CI)
>v3:
> - Drop locked version (Christian)
>
>Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>---
> include/drm/gpu_scheduler.h | 52 +++++++++++++++++++++++++++++++++++++
> 1 file changed, 52 insertions(+)
>
>diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>index fb88301b3c45..7f31eba3bd61 100644
>--- a/include/drm/gpu_scheduler.h
>+++ b/include/drm/gpu_scheduler.h
>@@ -698,4 +698,56 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> struct drm_gpu_scheduler **sched_list,
> unsigned int num_sched_list);
>
>+/* Inlines */
>+
>+/**
>+ * struct drm_sched_pending_job_iter - DRM scheduler pending job iterator state
>+ * @sched: DRM scheduler associated with pending job iterator
>+ */
>+struct drm_sched_pending_job_iter {
>+ struct drm_gpu_scheduler *sched;
>+};
>+
>+/* Drivers should never call this directly */
>+static inline struct drm_sched_pending_job_iter
>+__drm_sched_pending_job_iter_begin(struct drm_gpu_scheduler *sched)
>+{
>+ struct drm_sched_pending_job_iter iter = {
>+ .sched = sched,
>+ };
>+
>+ WARN_ON(!READ_ONCE(sched->pause_submit));
>+ return iter;
>+}
>+
>+/* Drivers should never call this directly */
>+static inline void
>+__drm_sched_pending_job_iter_end(const struct drm_sched_pending_job_iter iter)
>+{
>+ WARN_ON(!READ_ONCE(iter.sched->pause_submit));
>+}
Maybe instead of these inline functions, we could add the code in a '({' block
in the DEFINE_CLASS below itself, to prevent drivers from calling these inline
functions directly? Though I agree the inline functions make it cleaner to read.
>+
>+DEFINE_CLASS(drm_sched_pending_job_iter, struct drm_sched_pending_job_iter,
>+ __drm_sched_pending_job_iter_end(_T),
>+ __drm_sched_pending_job_iter_begin(__sched),
>+ struct drm_gpu_scheduler *__sched);
>+static inline void *
>+class_drm_sched_pending_job_iter_lock_ptr(class_drm_sched_pending_job_iter_t *_T)
>+{ return _T; }
>+#define class_drm_sched_pending_job_iter_is_conditional false
>+
>+/**
>+ * drm_sched_for_each_pending_job() - Iterator for each pending job in scheduler
>+ * @__job: Current pending job being iterated over
>+ * @__sched: DRM scheduler to iterate over pending jobs
>+ * @__entity: DRM scheduler entity to filter jobs, NULL indicates no filter
>+ *
>+ * Iterator for each pending job in scheduler, filtering on an entity, and
>+ * enforcing scheduler is fully stopped
>+ */
>+#define drm_sched_for_each_pending_job(__job, __sched, __entity) \
>+ scoped_guard(drm_sched_pending_job_iter, (__sched)) \
>+ list_for_each_entry((__job), &(__sched)->pending_list, list) \
>+ for_each_if(!(__entity) || (__job)->entity == (__entity))
>+
I am comparing this with the DEFINE_CLASS usage in the ttm driver.
It looks like the body of this macro (where list_for_each_entry() is called)
doesn't use drm_sched_pending_job_iter at all. So the only reason for using
DEFINE_CLASS with scoped_guard here is those WARN_ON() checks at the beginning
and end of the loop, which are not fully foolproof. Right?
I wonder if we really need DEFINE_CLASS just for that, though I am not
against using it.
Niranjana
> #endif
>--
>2.34.1
>
* Re: [PATCH v3 2/7] drm/sched: Add several job helpers to avoid drivers touching scheduler state
2025-10-16 20:48 ` [PATCH v3 2/7] drm/sched: Add several job helpers to avoid drivers touching scheduler state Matthew Brost
@ 2025-11-17 19:57 ` Niranjana Vishwanathapura
2025-11-18 17:45 ` Matthew Brost
0 siblings, 1 reply; 31+ messages in thread
From: Niranjana Vishwanathapura @ 2025-11-17 19:57 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Thu, Oct 16, 2025 at 01:48:21PM -0700, Matthew Brost wrote:
>Add helpers to check whether the scheduler is stopped and whether a job is
>signaled. These are expected to be used driver-side in recovery and debug flows.
>
>Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>---
> drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
> include/drm/gpu_scheduler.h | 32 ++++++++++++++++++++++++--
> 2 files changed, 32 insertions(+), 4 deletions(-)
>
>diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>index 46119aacb809..69bd6e482268 100644
>--- a/drivers/gpu/drm/scheduler/sched_main.c
>+++ b/drivers/gpu/drm/scheduler/sched_main.c
>@@ -344,7 +344,7 @@ drm_sched_rq_select_entity_fifo(struct drm_gpu_scheduler *sched,
> */
> static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
> {
>- if (!READ_ONCE(sched->pause_submit))
>+ if (!drm_sched_is_stopped(sched))
> queue_work(sched->submit_wq, &sched->work_run_job);
> }
>
>@@ -354,7 +354,7 @@ static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
> */
> static void drm_sched_run_free_queue(struct drm_gpu_scheduler *sched)
> {
>- if (!READ_ONCE(sched->pause_submit))
>+ if (!drm_sched_is_stopped(sched))
> queue_work(sched->submit_wq, &sched->work_free_job);
> }
>
>diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>index 7f31eba3bd61..d1a2d7f61c1d 100644
>--- a/include/drm/gpu_scheduler.h
>+++ b/include/drm/gpu_scheduler.h
>@@ -700,6 +700,17 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>
> /* Inlines */
>
>+/**
>+ * drm_sched_is_stopped() - Check if the DRM scheduler is stopped
>+ * @sched: DRM scheduler
>+ *
>+ * Return: True if sched is stopped, False otherwise
>+ */
>+static inline bool drm_sched_is_stopped(struct drm_gpu_scheduler *sched)
>+{
>+ return READ_ONCE(sched->pause_submit);
>+}
>+
> /**
> * struct drm_sched_pending_job_iter - DRM scheduler pending job iterator state
> * @sched: DRM scheduler associated with pending job iterator
>@@ -716,7 +727,7 @@ __drm_sched_pending_job_iter_begin(struct drm_gpu_scheduler *sched)
> .sched = sched,
> };
>
>- WARN_ON(!READ_ONCE(sched->pause_submit));
>+ WARN_ON(!drm_sched_is_stopped(sched));
> return iter;
> }
NIT: instead of modifying the functions added in the previous patch, maybe
this patch should go in first, and the previous patch could then use
drm_sched_is_stopped() from the start?
>
>@@ -724,7 +735,7 @@ __drm_sched_pending_job_iter_begin(struct drm_gpu_scheduler *sched)
> static inline void
> __drm_sched_pending_job_iter_end(const struct drm_sched_pending_job_iter iter)
> {
>- WARN_ON(!READ_ONCE(iter.sched->pause_submit));
>+ WARN_ON(!drm_sched_is_stopped(iter.sched));
> }
>
> DEFINE_CLASS(drm_sched_pending_job_iter, struct drm_sched_pending_job_iter,
>@@ -750,4 +761,21 @@ class_drm_sched_pending_job_iter_lock_ptr(class_drm_sched_pending_job_iter_t *_T
> list_for_each_entry((__job), &(__sched)->pending_list, list) \
> for_each_if(!(__entity) || (__job)->entity == (__entity))
>
>+/**
>+ * drm_sched_job_is_signaled() - DRM scheduler job is signaled
>+ * @job: DRM scheduler job
>+ *
>+ * Determine if DRM scheduler job is signaled. DRM scheduler should be stopped
>+ * to obtain a stable snapshot of state.
>+ *
>+ * Return: True if job is signaled, False otherwise
>+ */
>+static inline bool drm_sched_job_is_signaled(struct drm_sched_job *job)
>+{
>+ struct drm_sched_fence *s_fence = job->s_fence;
>+
>+ WARN_ON(!drm_sched_is_stopped(job->sched));
>+ return dma_fence_is_signaled(&s_fence->finished);
>+}
NIT: in patch #4, where the xe driver uses this function in a couple of
places, the original code checks whether s_fence->parent is signaled
rather than &s_fence->finished as done here.
I do see this note in the 's_fence->parent' kernel-doc:
"We signal the &drm_sched_fence.finished fence once parent is signalled."
So it is probably fine, but I just want to make sure.
Niranjana
>+
> #endif
>--
>2.34.1
>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v3 3/7] drm/xe: Add dedicated message lock
2025-10-16 20:48 ` [PATCH v3 3/7] drm/xe: Add dedicated message lock Matthew Brost
@ 2025-11-17 19:58 ` Niranjana Vishwanathapura
2025-11-18 17:53 ` Matthew Brost
0 siblings, 1 reply; 31+ messages in thread
From: Niranjana Vishwanathapura @ 2025-11-17 19:58 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Thu, Oct 16, 2025 at 01:48:22PM -0700, Matthew Brost wrote:
>Stop abusing DRM scheduler job list lock for messages, add dedicated
>message lock.
>
>Signed-off-by: Matthew Brost <matthew.brost@intel.com>
LGTM.
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>---
> drivers/gpu/drm/xe/xe_gpu_scheduler.c | 5 +++--
> drivers/gpu/drm/xe/xe_gpu_scheduler.h | 4 ++--
> drivers/gpu/drm/xe/xe_gpu_scheduler_types.h | 2 ++
> 3 files changed, 7 insertions(+), 4 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
>index f91e06d03511..f4f23317191f 100644
>--- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
>+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
>@@ -77,6 +77,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
> };
>
> sched->ops = xe_ops;
>+ spin_lock_init(&sched->msg_lock);
> INIT_LIST_HEAD(&sched->msgs);
> INIT_WORK(&sched->work_process_msg, xe_sched_process_msg_work);
>
>@@ -117,7 +118,7 @@ void xe_sched_add_msg(struct xe_gpu_scheduler *sched,
> void xe_sched_add_msg_locked(struct xe_gpu_scheduler *sched,
> struct xe_sched_msg *msg)
> {
>- lockdep_assert_held(&sched->base.job_list_lock);
>+ lockdep_assert_held(&sched->msg_lock);
>
> list_add_tail(&msg->link, &sched->msgs);
> xe_sched_process_msg_queue(sched);
>@@ -131,7 +132,7 @@ void xe_sched_add_msg_locked(struct xe_gpu_scheduler *sched,
> void xe_sched_add_msg_head(struct xe_gpu_scheduler *sched,
> struct xe_sched_msg *msg)
> {
>- lockdep_assert_held(&sched->base.job_list_lock);
>+ lockdep_assert_held(&sched->msg_lock);
>
> list_add(&msg->link, &sched->msgs);
> xe_sched_process_msg_queue(sched);
>diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
>index 9955397aaaa9..b971b6b69419 100644
>--- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
>+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
>@@ -33,12 +33,12 @@ void xe_sched_add_msg_head(struct xe_gpu_scheduler *sched,
>
> static inline void xe_sched_msg_lock(struct xe_gpu_scheduler *sched)
> {
>- spin_lock(&sched->base.job_list_lock);
>+ spin_lock(&sched->msg_lock);
> }
>
> static inline void xe_sched_msg_unlock(struct xe_gpu_scheduler *sched)
> {
>- spin_unlock(&sched->base.job_list_lock);
>+ spin_unlock(&sched->msg_lock);
> }
>
> static inline void xe_sched_stop(struct xe_gpu_scheduler *sched)
>diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h b/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
>index 6731b13da8bb..63d9bf92583c 100644
>--- a/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
>+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
>@@ -47,6 +47,8 @@ struct xe_gpu_scheduler {
> const struct xe_sched_backend_ops *ops;
> /** @msgs: list of messages to be processed in @work_process_msg */
> struct list_head msgs;
>+ /** @msg_lock: Message lock */
>+ spinlock_t msg_lock;
> /** @work_process_msg: processes messages */
> struct work_struct work_process_msg;
> };
>--
>2.34.1
>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v3 4/7] drm/xe: Stop abusing DRM scheduler internals
2025-10-16 20:48 ` [PATCH v3 4/7] drm/xe: Stop abusing DRM scheduler internals Matthew Brost
@ 2025-11-18 6:39 ` Niranjana Vishwanathapura
2025-11-18 17:59 ` Matthew Brost
0 siblings, 1 reply; 31+ messages in thread
From: Niranjana Vishwanathapura @ 2025-11-18 6:39 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Thu, Oct 16, 2025 at 01:48:23PM -0700, Matthew Brost wrote:
>Use new pending job list iterator and new helper functions in Xe to
>avoid reaching into DRM scheduler internals.
>
>Part of this change involves removing pending jobs debug information
>from debugfs and devcoredump. As agreed, the pending job list should
>only be accessed when the scheduler is stopped. However, it's not
>straightforward to determine whether the scheduler is stopped from the
>shared debugfs/devcoredump code path. Additionally, the pending job list
>provides little useful information, as pending jobs can be inferred from
>seqnos and ring head/tail positions. Therefore, this debug information
>is being removed.
>
>Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>---
> drivers/gpu/drm/xe/xe_gpu_scheduler.c | 4 +-
> drivers/gpu/drm/xe/xe_gpu_scheduler.h | 34 +++--------
> drivers/gpu/drm/xe/xe_guc_submit.c | 74 ++++--------------------
> drivers/gpu/drm/xe/xe_guc_submit_types.h | 11 ----
> drivers/gpu/drm/xe/xe_hw_fence.c | 16 -----
> drivers/gpu/drm/xe/xe_hw_fence.h | 2 -
> 6 files changed, 20 insertions(+), 121 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
>index f4f23317191f..9c8004d5dd91 100644
>--- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
>+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
>@@ -7,7 +7,7 @@
>
> static void xe_sched_process_msg_queue(struct xe_gpu_scheduler *sched)
> {
>- if (!READ_ONCE(sched->base.pause_submit))
>+ if (!drm_sched_is_stopped(&sched->base))
> queue_work(sched->base.submit_wq, &sched->work_process_msg);
> }
>
>@@ -43,7 +43,7 @@ static void xe_sched_process_msg_work(struct work_struct *w)
> container_of(w, struct xe_gpu_scheduler, work_process_msg);
> struct xe_sched_msg *msg;
>
>- if (READ_ONCE(sched->base.pause_submit))
>+ if (drm_sched_is_stopped(&sched->base))
> return;
>
> msg = xe_sched_get_msg(sched);
>diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
>index b971b6b69419..583372a78140 100644
>--- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
>+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
>@@ -55,14 +55,10 @@ static inline void xe_sched_resubmit_jobs(struct xe_gpu_scheduler *sched)
> {
> struct drm_sched_job *s_job;
>
>- list_for_each_entry(s_job, &sched->base.pending_list, list) {
>- struct drm_sched_fence *s_fence = s_job->s_fence;
>- struct dma_fence *hw_fence = s_fence->parent;
>-
>+ drm_sched_for_each_pending_job(s_job, &sched->base, NULL)
> if (to_xe_sched_job(s_job)->skip_emit ||
>- (hw_fence && !dma_fence_is_signaled(hw_fence)))
>+ !drm_sched_job_is_signaled(s_job))
> sched->base.ops->run_job(s_job);
>- }
> }
>
> static inline bool
>@@ -71,14 +67,6 @@ xe_sched_invalidate_job(struct xe_sched_job *job, int threshold)
> return drm_sched_invalidate_job(&job->drm, threshold);
> }
>
>-static inline void xe_sched_add_pending_job(struct xe_gpu_scheduler *sched,
>- struct xe_sched_job *job)
>-{
>- spin_lock(&sched->base.job_list_lock);
>- list_add(&job->drm.list, &sched->base.pending_list);
>- spin_unlock(&sched->base.job_list_lock);
>-}
>-
> /**
> * xe_sched_first_pending_job() - Find first pending job which is unsignaled
> * @sched: Xe GPU scheduler
>@@ -88,21 +76,13 @@ static inline void xe_sched_add_pending_job(struct xe_gpu_scheduler *sched,
> static inline
> struct xe_sched_job *xe_sched_first_pending_job(struct xe_gpu_scheduler *sched)
> {
>- struct xe_sched_job *job, *r_job = NULL;
>-
>- spin_lock(&sched->base.job_list_lock);
>- list_for_each_entry(job, &sched->base.pending_list, drm.list) {
>- struct drm_sched_fence *s_fence = job->drm.s_fence;
>- struct dma_fence *hw_fence = s_fence->parent;
>+ struct drm_sched_job *job;
>
>- if (hw_fence && !dma_fence_is_signaled(hw_fence)) {
>- r_job = job;
>- break;
>- }
>- }
>- spin_unlock(&sched->base.job_list_lock);
>+ drm_sched_for_each_pending_job(job, &sched->base, NULL)
>+ if (!drm_sched_job_is_signaled(job))
>+ return to_xe_sched_job(job);
>
>- return r_job;
>+ return NULL;
> }
>
> static inline int
>diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
>index 0ef67d3523a7..680696efc434 100644
>--- a/drivers/gpu/drm/xe/xe_guc_submit.c
>+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
>@@ -1032,7 +1032,7 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
> struct xe_exec_queue *q = ge->q;
> struct xe_guc *guc = exec_queue_to_guc(q);
> struct xe_gpu_scheduler *sched = &ge->sched;
>- struct xe_sched_job *job;
>+ struct drm_sched_job *job;
> bool wedged = false;
>
> xe_gt_assert(guc_to_gt(guc), xe_exec_queue_is_lr(q));
>@@ -1091,16 +1091,10 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
> if (!exec_queue_killed(q) && !xe_lrc_ring_is_idle(q->lrc[0]))
> xe_devcoredump(q, NULL, "LR job cleanup, guc_id=%d", q->guc->id);
>
>- xe_hw_fence_irq_stop(q->fence_irq);
>+ drm_sched_for_each_pending_job(job, &sched->base, NULL)
>+ xe_sched_job_set_error(to_xe_sched_job(job), -ECANCELED);
>
> xe_sched_submission_start(sched);
>-
>- spin_lock(&sched->base.job_list_lock);
>- list_for_each_entry(job, &sched->base.pending_list, drm.list)
>- xe_sched_job_set_error(job, -ECANCELED);
>- spin_unlock(&sched->base.job_list_lock);
>-
>- xe_hw_fence_irq_start(q->fence_irq);
> }
>
> #define ADJUST_FIVE_PERCENT(__t) mul_u64_u32_div(__t, 105, 100)
>@@ -1219,7 +1213,7 @@ static enum drm_gpu_sched_stat
> guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> {
> struct xe_sched_job *job = to_xe_sched_job(drm_job);
>- struct xe_sched_job *tmp_job;
>+ struct drm_sched_job *tmp_job;
> struct xe_exec_queue *q = job->q;
> struct xe_gpu_scheduler *sched = &q->guc->sched;
> struct xe_guc *guc = exec_queue_to_guc(q);
>@@ -1228,7 +1222,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> unsigned int fw_ref;
> int err = -ETIME;
> pid_t pid = -1;
>- int i = 0;
> bool wedged = false, skip_timeout_check;
>
> xe_gt_assert(guc_to_gt(guc), !xe_exec_queue_is_lr(q));
>@@ -1395,28 +1388,15 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> __deregister_exec_queue(guc, q);
> }
>
>- /* Stop fence signaling */
>- xe_hw_fence_irq_stop(q->fence_irq);
>+ /* Mark all outstanding jobs as bad, thus completing them */
>+ xe_sched_job_set_error(job, err);
Setting the error on this timed-out job is newly added.
Why was it not there before, and why is it being added now?
>+ drm_sched_for_each_pending_job(tmp_job, &sched->base, NULL)
>+ xe_sched_job_set_error(to_xe_sched_job(tmp_job), -ECANCELED);
>
>- /*
>- * Fence state now stable, stop / start scheduler which cleans up any
>- * fences that are complete
>- */
>- xe_sched_add_pending_job(sched, job);
Why was xe_sched_add_pending_job() needed before?
> xe_sched_submission_start(sched);
>-
> xe_guc_exec_queue_trigger_cleanup(q);
Why do we need to trigger cleanup again here?
>
>- /* Mark all outstanding jobs as bad, thus completing them */
>- spin_lock(&sched->base.job_list_lock);
>- list_for_each_entry(tmp_job, &sched->base.pending_list, drm.list)
>- xe_sched_job_set_error(tmp_job, !i++ ? err : -ECANCELED);
>- spin_unlock(&sched->base.job_list_lock);
>-
>- /* Start fence signaling */
>- xe_hw_fence_irq_start(q->fence_irq);
>-
>- return DRM_GPU_SCHED_STAT_RESET;
>+ return DRM_GPU_SCHED_STAT_NO_HANG;
This is the error case, so why is the return value changed to NO_HANG?
Niranjana
>
> sched_enable:
> set_exec_queue_pending_tdr_exit(q);
>@@ -2244,7 +2224,7 @@ static void guc_exec_queue_unpause_prepare(struct xe_guc *guc,
> struct drm_sched_job *s_job;
> struct xe_sched_job *job = NULL;
>
>- list_for_each_entry(s_job, &sched->base.pending_list, list) {
>+ drm_sched_for_each_pending_job(s_job, &sched->base, NULL) {
> job = to_xe_sched_job(s_job);
>
> xe_gt_dbg(guc_to_gt(guc), "Replay JOB - guc_id=%d, seqno=%d",
>@@ -2349,7 +2329,7 @@ void xe_guc_submit_unpause(struct xe_guc *guc)
> * created after resfix done.
> */
> if (q->guc->id != index ||
>- !READ_ONCE(q->guc->sched.base.pause_submit))
>+ !drm_sched_is_stopped(&q->guc->sched.base))
> continue;
>
> guc_exec_queue_unpause(guc, q);
>@@ -2771,30 +2751,6 @@ xe_guc_exec_queue_snapshot_capture(struct xe_exec_queue *q)
> if (snapshot->parallel_execution)
> guc_exec_queue_wq_snapshot_capture(q, snapshot);
>
>- spin_lock(&sched->base.job_list_lock);
>- snapshot->pending_list_size = list_count_nodes(&sched->base.pending_list);
>- snapshot->pending_list = kmalloc_array(snapshot->pending_list_size,
>- sizeof(struct pending_list_snapshot),
>- GFP_ATOMIC);
>-
>- if (snapshot->pending_list) {
>- struct xe_sched_job *job_iter;
>-
>- i = 0;
>- list_for_each_entry(job_iter, &sched->base.pending_list, drm.list) {
>- snapshot->pending_list[i].seqno =
>- xe_sched_job_seqno(job_iter);
>- snapshot->pending_list[i].fence =
>- dma_fence_is_signaled(job_iter->fence) ? 1 : 0;
>- snapshot->pending_list[i].finished =
>- dma_fence_is_signaled(&job_iter->drm.s_fence->finished)
>- ? 1 : 0;
>- i++;
>- }
>- }
>-
>- spin_unlock(&sched->base.job_list_lock);
>-
> return snapshot;
> }
>
>@@ -2852,13 +2808,6 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
>
> if (snapshot->parallel_execution)
> guc_exec_queue_wq_snapshot_print(snapshot, p);
>-
>- for (i = 0; snapshot->pending_list && i < snapshot->pending_list_size;
>- i++)
>- drm_printf(p, "\tJob: seqno=%d, fence=%d, finished=%d\n",
>- snapshot->pending_list[i].seqno,
>- snapshot->pending_list[i].fence,
>- snapshot->pending_list[i].finished);
> }
>
> /**
>@@ -2881,7 +2830,6 @@ void xe_guc_exec_queue_snapshot_free(struct xe_guc_submit_exec_queue_snapshot *s
> xe_lrc_snapshot_free(snapshot->lrc[i]);
> kfree(snapshot->lrc);
> }
>- kfree(snapshot->pending_list);
> kfree(snapshot);
> }
>
>diff --git a/drivers/gpu/drm/xe/xe_guc_submit_types.h b/drivers/gpu/drm/xe/xe_guc_submit_types.h
>index dc7456c34583..0b08c79cf3b9 100644
>--- a/drivers/gpu/drm/xe/xe_guc_submit_types.h
>+++ b/drivers/gpu/drm/xe/xe_guc_submit_types.h
>@@ -61,12 +61,6 @@ struct guc_submit_parallel_scratch {
> u32 wq[WQ_SIZE / sizeof(u32)];
> };
>
>-struct pending_list_snapshot {
>- u32 seqno;
>- bool fence;
>- bool finished;
>-};
>-
> /**
> * struct xe_guc_submit_exec_queue_snapshot - Snapshot for devcoredump
> */
>@@ -134,11 +128,6 @@ struct xe_guc_submit_exec_queue_snapshot {
> /** @wq: Workqueue Items */
> u32 wq[WQ_SIZE / sizeof(u32)];
> } parallel;
>-
>- /** @pending_list_size: Size of the pending list snapshot array */
>- int pending_list_size;
>- /** @pending_list: snapshot of the pending list info */
>- struct pending_list_snapshot *pending_list;
> };
>
> #endif
>diff --git a/drivers/gpu/drm/xe/xe_hw_fence.c b/drivers/gpu/drm/xe/xe_hw_fence.c
>index b2a0c46dfcd4..e65dfcdfdbc5 100644
>--- a/drivers/gpu/drm/xe/xe_hw_fence.c
>+++ b/drivers/gpu/drm/xe/xe_hw_fence.c
>@@ -110,22 +110,6 @@ void xe_hw_fence_irq_run(struct xe_hw_fence_irq *irq)
> irq_work_queue(&irq->work);
> }
>
>-void xe_hw_fence_irq_stop(struct xe_hw_fence_irq *irq)
>-{
>- spin_lock_irq(&irq->lock);
>- irq->enabled = false;
>- spin_unlock_irq(&irq->lock);
>-}
>-
>-void xe_hw_fence_irq_start(struct xe_hw_fence_irq *irq)
>-{
>- spin_lock_irq(&irq->lock);
>- irq->enabled = true;
>- spin_unlock_irq(&irq->lock);
>-
>- irq_work_queue(&irq->work);
>-}
>-
> void xe_hw_fence_ctx_init(struct xe_hw_fence_ctx *ctx, struct xe_gt *gt,
> struct xe_hw_fence_irq *irq, const char *name)
> {
>diff --git a/drivers/gpu/drm/xe/xe_hw_fence.h b/drivers/gpu/drm/xe/xe_hw_fence.h
>index f13a1c4982c7..599492c13f80 100644
>--- a/drivers/gpu/drm/xe/xe_hw_fence.h
>+++ b/drivers/gpu/drm/xe/xe_hw_fence.h
>@@ -17,8 +17,6 @@ void xe_hw_fence_module_exit(void);
> void xe_hw_fence_irq_init(struct xe_hw_fence_irq *irq);
> void xe_hw_fence_irq_finish(struct xe_hw_fence_irq *irq);
> void xe_hw_fence_irq_run(struct xe_hw_fence_irq *irq);
>-void xe_hw_fence_irq_stop(struct xe_hw_fence_irq *irq);
>-void xe_hw_fence_irq_start(struct xe_hw_fence_irq *irq);
>
> void xe_hw_fence_ctx_init(struct xe_hw_fence_ctx *ctx, struct xe_gt *gt,
> struct xe_hw_fence_irq *irq, const char *name);
>--
>2.34.1
>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v3 5/7] drm/xe: Do not deregister queues in TDR
2025-10-16 20:48 ` [PATCH v3 5/7] drm/xe: Do not deregister queues in TDR Matthew Brost
@ 2025-11-18 6:41 ` Niranjana Vishwanathapura
2025-11-18 18:02 ` Matthew Brost
0 siblings, 1 reply; 31+ messages in thread
From: Niranjana Vishwanathapura @ 2025-11-18 6:41 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Thu, Oct 16, 2025 at 01:48:24PM -0700, Matthew Brost wrote:
>Deregistering queues in the TDR introduces unnecessary complexity,
>requiring reference counting tricks to function correctly. All that's
>needed in the TDR is to kick the queue off the hardware, which is
>achieved by disabling scheduling. Queue deregistration should be handled
>in a single, well-defined point in the cleanup path, tied to the queue's
>reference count.
>
Overall this looks good to me.
But it would help if the commit text described why this extra reference
was taken for LR jobs before and why it is no longer needed.
Niranjana
>Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>---
> drivers/gpu/drm/xe/xe_guc_submit.c | 57 +++---------------------------
> 1 file changed, 5 insertions(+), 52 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
>index 680696efc434..ab0f1a2d4871 100644
>--- a/drivers/gpu/drm/xe/xe_guc_submit.c
>+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
>@@ -69,9 +69,8 @@ exec_queue_to_guc(struct xe_exec_queue *q)
> #define EXEC_QUEUE_STATE_WEDGED (1 << 8)
> #define EXEC_QUEUE_STATE_BANNED (1 << 9)
> #define EXEC_QUEUE_STATE_CHECK_TIMEOUT (1 << 10)
>-#define EXEC_QUEUE_STATE_EXTRA_REF (1 << 11)
>-#define EXEC_QUEUE_STATE_PENDING_RESUME (1 << 12)
>-#define EXEC_QUEUE_STATE_PENDING_TDR_EXIT (1 << 13)
>+#define EXEC_QUEUE_STATE_PENDING_RESUME (1 << 11)
>+#define EXEC_QUEUE_STATE_PENDING_TDR_EXIT (1 << 12)
>
> static bool exec_queue_registered(struct xe_exec_queue *q)
> {
>@@ -218,21 +217,6 @@ static void clear_exec_queue_check_timeout(struct xe_exec_queue *q)
> atomic_and(~EXEC_QUEUE_STATE_CHECK_TIMEOUT, &q->guc->state);
> }
>
>-static bool exec_queue_extra_ref(struct xe_exec_queue *q)
>-{
>- return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_EXTRA_REF;
>-}
>-
>-static void set_exec_queue_extra_ref(struct xe_exec_queue *q)
>-{
>- atomic_or(EXEC_QUEUE_STATE_EXTRA_REF, &q->guc->state);
>-}
>-
>-static void clear_exec_queue_extra_ref(struct xe_exec_queue *q)
>-{
>- atomic_and(~EXEC_QUEUE_STATE_EXTRA_REF, &q->guc->state);
>-}
>-
> static bool exec_queue_pending_resume(struct xe_exec_queue *q)
> {
> return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_PENDING_RESUME;
>@@ -1190,25 +1174,6 @@ static void disable_scheduling(struct xe_exec_queue *q, bool immediate)
> G2H_LEN_DW_SCHED_CONTEXT_MODE_SET, 1);
> }
>
>-static void __deregister_exec_queue(struct xe_guc *guc, struct xe_exec_queue *q)
>-{
>- u32 action[] = {
>- XE_GUC_ACTION_DEREGISTER_CONTEXT,
>- q->guc->id,
>- };
>-
>- xe_gt_assert(guc_to_gt(guc), !exec_queue_destroyed(q));
>- xe_gt_assert(guc_to_gt(guc), exec_queue_registered(q));
>- xe_gt_assert(guc_to_gt(guc), !exec_queue_pending_enable(q));
>- xe_gt_assert(guc_to_gt(guc), !exec_queue_pending_disable(q));
>-
>- set_exec_queue_destroyed(q);
>- trace_xe_exec_queue_deregister(q);
>-
>- xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action),
>- G2H_LEN_DW_DEREGISTER_CONTEXT, 1);
>-}
>-
> static enum drm_gpu_sched_stat
> guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> {
>@@ -1326,8 +1291,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> xe_devcoredump(q, job,
> "Schedule disable failed to respond, guc_id=%d, ret=%d, guc_read=%d",
> q->guc->id, ret, xe_guc_read_stopped(guc));
>- set_exec_queue_extra_ref(q);
>- xe_exec_queue_get(q); /* GT reset owns this */
> set_exec_queue_banned(q);
> xe_gt_reset_async(q->gt);
> xe_sched_tdr_queue_imm(sched);
>@@ -1380,13 +1343,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> }
> }
>
>- /* Finish cleaning up exec queue via deregister */
> set_exec_queue_banned(q);
>- if (!wedged && exec_queue_registered(q) && !exec_queue_destroyed(q)) {
>- set_exec_queue_extra_ref(q);
>- xe_exec_queue_get(q);
>- __deregister_exec_queue(guc, q);
>- }
>
> /* Mark all outstanding jobs as bad, thus completing them */
> xe_sched_job_set_error(job, err);
>@@ -1928,7 +1885,7 @@ static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
>
> /* Clean up lost G2H + reset engine state */
> if (exec_queue_registered(q)) {
>- if (exec_queue_extra_ref(q) || xe_exec_queue_is_lr(q))
>+ if (xe_exec_queue_is_lr(q))
> xe_exec_queue_put(q);
> else if (exec_queue_destroyed(q))
> __guc_exec_queue_destroy(guc, q);
>@@ -2062,11 +2019,7 @@ static void guc_exec_queue_revert_pending_state_change(struct xe_guc *guc,
>
> if (exec_queue_destroyed(q) && exec_queue_registered(q)) {
> clear_exec_queue_destroyed(q);
>- if (exec_queue_extra_ref(q))
>- xe_exec_queue_put(q);
>- else
>- q->guc->needs_cleanup = true;
>- clear_exec_queue_extra_ref(q);
>+ q->guc->needs_cleanup = true;
> xe_gt_dbg(guc_to_gt(guc), "Replay CLEANUP - guc_id=%d",
> q->guc->id);
> }
>@@ -2483,7 +2436,7 @@ static void handle_deregister_done(struct xe_guc *guc, struct xe_exec_queue *q)
>
> clear_exec_queue_registered(q);
>
>- if (exec_queue_extra_ref(q) || xe_exec_queue_is_lr(q))
>+ if (xe_exec_queue_is_lr(q))
> xe_exec_queue_put(q);
> else
> __guc_exec_queue_destroy(guc, q);
>--
>2.34.1
>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v3 6/7] drm/xe: Remove special casing for LR queues in submission
2025-10-16 20:48 ` [PATCH v3 6/7] drm/xe: Remove special casing for LR queues in submission Matthew Brost
@ 2025-11-18 6:45 ` Niranjana Vishwanathapura
2025-11-18 18:03 ` Matthew Brost
0 siblings, 1 reply; 31+ messages in thread
From: Niranjana Vishwanathapura @ 2025-11-18 6:45 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Thu, Oct 16, 2025 at 01:48:25PM -0700, Matthew Brost wrote:
>Now that LR jobs are tracked by the DRM scheduler, there's no longer a
>need to special-case LR queues. This change removes all LR
>queue-specific handling, including dedicated TDR logic, reference
>counting schemes, and other related mechanisms.
>
>Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>---
> drivers/gpu/drm/xe/xe_guc_exec_queue_types.h | 2 -
> drivers/gpu/drm/xe/xe_guc_submit.c | 129 +------------------
> 2 files changed, 7 insertions(+), 124 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h b/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h
>index a3b034e4b205..fd0915ed8eb1 100644
>--- a/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h
>+++ b/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h
>@@ -33,8 +33,6 @@ struct xe_guc_exec_queue {
> */
> #define MAX_STATIC_MSG_TYPE 3
> struct xe_sched_msg static_msgs[MAX_STATIC_MSG_TYPE];
>- /** @lr_tdr: long running TDR worker */
>- struct work_struct lr_tdr;
> /** @destroy_async: do final destroy async from this worker */
> struct work_struct destroy_async;
> /** @resume_time: time of last resume */
>diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
>index ab0f1a2d4871..bb1f2929441c 100644
>--- a/drivers/gpu/drm/xe/xe_guc_submit.c
>+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
>@@ -674,14 +674,6 @@ static void register_exec_queue(struct xe_exec_queue *q, int ctx_type)
> parallel_write(xe, map, wq_desc.wq_status, WQ_STATUS_ACTIVE);
> }
>
>- /*
>- * We must keep a reference for LR engines if engine is registered with
>- * the GuC as jobs signal immediately and can't destroy an engine if the
>- * GuC has a reference to it.
>- */
>- if (xe_exec_queue_is_lr(q))
>- xe_exec_queue_get(q);
>-
> set_exec_queue_registered(q);
> trace_xe_exec_queue_register(q);
> if (xe_exec_queue_is_parallel(q))
>@@ -854,7 +846,7 @@ guc_exec_queue_run_job(struct drm_sched_job *drm_job)
> struct xe_sched_job *job = to_xe_sched_job(drm_job);
> struct xe_exec_queue *q = job->q;
> struct xe_guc *guc = exec_queue_to_guc(q);
>- bool lr = xe_exec_queue_is_lr(q), killed_or_banned_or_wedged =
>+ bool killed_or_banned_or_wedged =
> exec_queue_killed_or_banned_or_wedged(q);
>
> xe_gt_assert(guc_to_gt(guc), !(exec_queue_destroyed(q) || exec_queue_pending_disable(q)) ||
>@@ -871,15 +863,6 @@ guc_exec_queue_run_job(struct drm_sched_job *drm_job)
> job->skip_emit = false;
> }
>
>- /*
>- * We don't care about job-fence ordering in LR VMs because these fences
>- * are never exported; they are used solely to keep jobs on the pending
>- * list. Once a queue enters an error state, there's no need to track
>- * them.
>- */
>- if (killed_or_banned_or_wedged && lr)
>- xe_sched_job_set_error(job, -ECANCELED);
>-
Why is this piece of code being removed?
> return job->fence;
> }
>
>@@ -923,8 +906,7 @@ static void disable_scheduling_deregister(struct xe_guc *guc,
> xe_gt_warn(q->gt, "Pending enable/disable failed to respond\n");
> xe_sched_submission_start(sched);
> xe_gt_reset_async(q->gt);
>- if (!xe_exec_queue_is_lr(q))
>- xe_sched_tdr_queue_imm(sched);
>+ xe_sched_tdr_queue_imm(sched);
> return;
> }
>
>@@ -950,10 +932,7 @@ static void xe_guc_exec_queue_trigger_cleanup(struct xe_exec_queue *q)
> /** to wakeup xe_wait_user_fence ioctl if exec queue is reset */
> wake_up_all(&xe->ufence_wq);
>
>- if (xe_exec_queue_is_lr(q))
>- queue_work(guc_to_gt(guc)->ordered_wq, &q->guc->lr_tdr);
>- else
>- xe_sched_tdr_queue_imm(&q->guc->sched);
>+ xe_sched_tdr_queue_imm(&q->guc->sched);
> }
>
> /**
>@@ -1009,78 +988,6 @@ static bool guc_submit_hint_wedged(struct xe_guc *guc)
> return true;
> }
>
>-static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
>-{
>- struct xe_guc_exec_queue *ge =
>- container_of(w, struct xe_guc_exec_queue, lr_tdr);
>- struct xe_exec_queue *q = ge->q;
>- struct xe_guc *guc = exec_queue_to_guc(q);
>- struct xe_gpu_scheduler *sched = &ge->sched;
>- struct drm_sched_job *job;
>- bool wedged = false;
>-
>- xe_gt_assert(guc_to_gt(guc), xe_exec_queue_is_lr(q));
>-
>- if (vf_recovery(guc))
>- return;
>-
>- trace_xe_exec_queue_lr_cleanup(q);
Should the corresponding trace event in xe_trace.h be removed as well?
Niranjana
>-
>- if (!exec_queue_killed(q))
>- wedged = guc_submit_hint_wedged(exec_queue_to_guc(q));
>-
>- /* Kill the run_job / process_msg entry points */
>- xe_sched_submission_stop(sched);
>-
>- /*
>- * Engine state now mostly stable, disable scheduling / deregister if
>- * needed. This cleanup routine might be called multiple times, where
>- * the actual async engine deregister drops the final engine ref.
>- * Calling disable_scheduling_deregister will mark the engine as
>- * destroyed and fire off the CT requests to disable scheduling /
>- * deregister, which we only want to do once. We also don't want to mark
>- * the engine as pending_disable again as this may race with the
>- * xe_guc_deregister_done_handler() which treats it as an unexpected
>- * state.
>- */
>- if (!wedged && exec_queue_registered(q) && !exec_queue_destroyed(q)) {
>- struct xe_guc *guc = exec_queue_to_guc(q);
>- int ret;
>-
>- set_exec_queue_banned(q);
>- disable_scheduling_deregister(guc, q);
>-
>- /*
>- * Must wait for scheduling to be disabled before signalling
>- * any fences, if GT broken the GT reset code should signal us.
>- */
>- ret = wait_event_timeout(guc->ct.wq,
>- !exec_queue_pending_disable(q) ||
>- xe_guc_read_stopped(guc) ||
>- vf_recovery(guc), HZ * 5);
>- if (vf_recovery(guc))
>- return;
>-
>- if (!ret) {
>- xe_gt_warn(q->gt, "Schedule disable failed to respond, guc_id=%d\n",
>- q->guc->id);
>- xe_devcoredump(q, NULL, "Schedule disable failed to respond, guc_id=%d\n",
>- q->guc->id);
>- xe_sched_submission_start(sched);
>- xe_gt_reset_async(q->gt);
>- return;
>- }
>- }
>-
>- if (!exec_queue_killed(q) && !xe_lrc_ring_is_idle(q->lrc[0]))
>- xe_devcoredump(q, NULL, "LR job cleanup, guc_id=%d", q->guc->id);
>-
>- drm_sched_for_each_pending_job(job, &sched->base, NULL)
>- xe_sched_job_set_error(to_xe_sched_job(job), -ECANCELED);
>-
>- xe_sched_submission_start(sched);
>-}
>-
> #define ADJUST_FIVE_PERCENT(__t) mul_u64_u32_div(__t, 105, 100)
>
> static bool check_timeout(struct xe_exec_queue *q, struct xe_sched_job *job)
>@@ -1150,8 +1057,7 @@ static void enable_scheduling(struct xe_exec_queue *q)
> xe_gt_warn(guc_to_gt(guc), "Schedule enable failed to respond");
> set_exec_queue_banned(q);
> xe_gt_reset_async(q->gt);
>- if (!xe_exec_queue_is_lr(q))
>- xe_sched_tdr_queue_imm(&q->guc->sched);
>+ xe_sched_tdr_queue_imm(&q->guc->sched);
> }
> }
>
>@@ -1189,8 +1095,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> pid_t pid = -1;
> bool wedged = false, skip_timeout_check;
>
>- xe_gt_assert(guc_to_gt(guc), !xe_exec_queue_is_lr(q));
>-
> /*
> * TDR has fired before free job worker. Common if exec queue
> * immediately closed after last fence signaled. Add back to pending
>@@ -1395,8 +1299,6 @@ static void __guc_exec_queue_destroy_async(struct work_struct *w)
> xe_pm_runtime_get(guc_to_xe(guc));
> trace_xe_exec_queue_destroy(q);
>
>- if (xe_exec_queue_is_lr(q))
>- cancel_work_sync(&ge->lr_tdr);
> /* Confirm no work left behind accessing device structures */
> cancel_delayed_work_sync(&ge->sched.base.work_tdr);
>
>@@ -1629,9 +1531,6 @@ static int guc_exec_queue_init(struct xe_exec_queue *q)
> if (err)
> goto err_sched;
>
>- if (xe_exec_queue_is_lr(q))
>- INIT_WORK(&q->guc->lr_tdr, xe_guc_exec_queue_lr_cleanup);
>-
> mutex_lock(&guc->submission_state.lock);
>
> err = alloc_guc_id(guc, q);
>@@ -1885,9 +1784,7 @@ static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
>
> /* Clean up lost G2H + reset engine state */
> if (exec_queue_registered(q)) {
>- if (xe_exec_queue_is_lr(q))
>- xe_exec_queue_put(q);
>- else if (exec_queue_destroyed(q))
>+ if (exec_queue_destroyed(q))
> __guc_exec_queue_destroy(guc, q);
> }
> if (q->guc->suspend_pending) {
>@@ -1917,9 +1814,6 @@ static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
> trace_xe_sched_job_ban(job);
> ban = true;
> }
>- } else if (xe_exec_queue_is_lr(q) &&
>- !xe_lrc_ring_is_idle(q->lrc[0])) {
>- ban = true;
> }
>
> if (ban) {
>@@ -2002,8 +1896,6 @@ static void guc_exec_queue_revert_pending_state_change(struct xe_guc *guc,
> if (pending_enable && !pending_resume &&
> !exec_queue_pending_tdr_exit(q)) {
> clear_exec_queue_registered(q);
>- if (xe_exec_queue_is_lr(q))
>- xe_exec_queue_put(q);
> xe_gt_dbg(guc_to_gt(guc), "Replay REGISTER - guc_id=%d",
> q->guc->id);
> }
>@@ -2060,10 +1952,7 @@ static void guc_exec_queue_pause(struct xe_guc *guc, struct xe_exec_queue *q)
>
> /* Stop scheduling + flush any DRM scheduler operations */
> xe_sched_submission_stop(sched);
>- if (xe_exec_queue_is_lr(q))
>- cancel_work_sync(&q->guc->lr_tdr);
>- else
>- cancel_delayed_work_sync(&sched->base.work_tdr);
>+ cancel_delayed_work_sync(&sched->base.work_tdr);
>
> guc_exec_queue_revert_pending_state_change(guc, q);
>
>@@ -2435,11 +2324,7 @@ static void handle_deregister_done(struct xe_guc *guc, struct xe_exec_queue *q)
> trace_xe_exec_queue_deregister_done(q);
>
> clear_exec_queue_registered(q);
>-
>- if (xe_exec_queue_is_lr(q))
>- xe_exec_queue_put(q);
>- else
>- __guc_exec_queue_destroy(guc, q);
>+ __guc_exec_queue_destroy(guc, q);
> }
>
> int xe_guc_deregister_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
>--
>2.34.1
>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v3 2/7] drm/sched: Add several job helpers to avoid drivers touching scheduler state
2025-11-17 19:57 ` Niranjana Vishwanathapura
@ 2025-11-18 17:45 ` Matthew Brost
0 siblings, 0 replies; 31+ messages in thread
From: Matthew Brost @ 2025-11-18 17:45 UTC (permalink / raw)
To: Niranjana Vishwanathapura
Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Mon, Nov 17, 2025 at 11:57:44AM -0800, Niranjana Vishwanathapura wrote:
> On Thu, Oct 16, 2025 at 01:48:21PM -0700, Matthew Brost wrote:
> > Add helpers to see if scheduler is stopped and a jobs signaled state.
> > Expected to be used driver side on recovery and debug flows.
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
> > include/drm/gpu_scheduler.h | 32 ++++++++++++++++++++++++--
> > 2 files changed, 32 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 46119aacb809..69bd6e482268 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -344,7 +344,7 @@ drm_sched_rq_select_entity_fifo(struct drm_gpu_scheduler *sched,
> > */
> > static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
> > {
> > - if (!READ_ONCE(sched->pause_submit))
> > + if (!drm_sched_is_stopped(sched))
> > queue_work(sched->submit_wq, &sched->work_run_job);
> > }
> >
> > @@ -354,7 +354,7 @@ static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
> > */
> > static void drm_sched_run_free_queue(struct drm_gpu_scheduler *sched)
> > {
> > - if (!READ_ONCE(sched->pause_submit))
> > + if (!drm_sched_is_stopped(sched))
> > queue_work(sched->submit_wq, &sched->work_free_job);
> > }
> >
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 7f31eba3bd61..d1a2d7f61c1d 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -700,6 +700,17 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> >
> > /* Inlines */
> >
> > +/**
> > + * drm_sched_is_stopped() - DRM is stopped
> > + * @sched: DRM scheduler
> > + *
> > + * Return: True if sched is stopped, False otherwise
> > + */
> > +static inline bool drm_sched_is_stopped(struct drm_gpu_scheduler *sched)
> > +{
> > + return READ_ONCE(sched->pause_submit);
> > +}
> > +
> > /**
> > * struct drm_sched_pending_job_iter - DRM scheduler pending job iterator state
> > * @sched: DRM scheduler associated with pending job iterator
> > @@ -716,7 +727,7 @@ __drm_sched_pending_job_iter_begin(struct drm_gpu_scheduler *sched)
> > .sched = sched,
> > };
> >
> > - WARN_ON(!READ_ONCE(sched->pause_submit));
> > + WARN_ON(!drm_sched_is_stopped(sched));
> > return iter;
> > }
>
> NIT...instead of modifying the functions added in the previous patch, maybe
> this patch should go in first and the previous patch can be added after that
> with drm_sched_is_stopped() usage?
>
Yes, I think that would be better ordering. Will fix.
> >
> > @@ -724,7 +735,7 @@ __drm_sched_pending_job_iter_begin(struct drm_gpu_scheduler *sched)
> > static inline void
> > __drm_sched_pending_job_iter_end(const struct drm_sched_pending_job_iter iter)
> > {
> > - WARN_ON(!READ_ONCE(iter.sched->pause_submit));
> > + WARN_ON(!drm_sched_is_stopped(iter.sched));
> > }
> >
> > DEFINE_CLASS(drm_sched_pending_job_iter, struct drm_sched_pending_job_iter,
> > @@ -750,4 +761,21 @@ class_drm_sched_pending_job_iter_lock_ptr(class_drm_sched_pending_job_iter_t *_T
> > list_for_each_entry((__job), &(__sched)->pending_list, list) \
> > for_each_if(!(__entity) || (__job)->entity == (__entity))
> >
> > +/**
> > + * drm_sched_job_is_signaled() - DRM scheduler job is signaled
> > + * @job: DRM scheduler job
> > + *
> > + * Determine if DRM scheduler job is signaled. DRM scheduler should be stopped
> > + * to obtain a stable snapshot of state.
> > + *
> > + * Return: True if job is signaled, False otherwise
> > + */
> > +static inline bool drm_sched_job_is_signaled(struct drm_sched_job *job)
> > +{
> > + struct drm_sched_fence *s_fence = job->s_fence;
> > +
> > + WARN_ON(!drm_sched_is_stopped(job->sched));
> > + return dma_fence_is_signaled(&s_fence->finished);
> > +}
>
> NIT..In patch#4 where xe driver uses this function in couple places,
> I am seeing originally it checks if the s_fence->parent is signaled
> instead of &s_fence->finished as done here.
> I do see below message in the 's_fence->parent' kernel-doc,
> "We signal the &drm_sched_fence.finished fence once parent is signalled."
> So, probably it is fine, but just want to ensure.
>
It is more or less the same check. Technically the hardware fence
(parent) can signal before the software fence (finished), but that is a
small race window which in practice should never be hit. Still, I could
make this function more robust and check the parent fence too; that is
probably better. Let me change this.
Matt
> Niranjana
>
> > +
> > #endif
> > --
> > 2.34.1
> >
* Re: [PATCH v3 1/7] drm/sched: Add pending job list iterator
2025-11-15 1:25 ` Niranjana Vishwanathapura
@ 2025-11-18 17:52 ` Matthew Brost
2025-11-18 21:12 ` Niranjana Vishwanathapura
0 siblings, 1 reply; 31+ messages in thread
From: Matthew Brost @ 2025-11-18 17:52 UTC (permalink / raw)
To: Niranjana Vishwanathapura
Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Fri, Nov 14, 2025 at 05:25:47PM -0800, Niranjana Vishwanathapura wrote:
> On Thu, Oct 16, 2025 at 01:48:20PM -0700, Matthew Brost wrote:
> > Stop open coding pending job list in drivers. Add pending job list
> > iterator which safely walks DRM scheduler list asserting DRM scheduler
> > is stopped.
> >
> > v2:
> > - Fix checkpatch (CI)
> > v3:
> > - Drop locked version (Christian)
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > include/drm/gpu_scheduler.h | 52 +++++++++++++++++++++++++++++++++++++
> > 1 file changed, 52 insertions(+)
> >
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index fb88301b3c45..7f31eba3bd61 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -698,4 +698,56 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> > struct drm_gpu_scheduler **sched_list,
> > unsigned int num_sched_list);
> >
> > +/* Inlines */
> > +
> > +/**
> > + * struct drm_sched_pending_job_iter - DRM scheduler pending job iterator state
> > + * @sched: DRM scheduler associated with pending job iterator
> > + */
> > +struct drm_sched_pending_job_iter {
> > + struct drm_gpu_scheduler *sched;
> > +};
> > +
> > +/* Drivers should never call this directly */
> > +static inline struct drm_sched_pending_job_iter
> > +__drm_sched_pending_job_iter_begin(struct drm_gpu_scheduler *sched)
> > +{
> > + struct drm_sched_pending_job_iter iter = {
> > + .sched = sched,
> > + };
> > +
> > + WARN_ON(!READ_ONCE(sched->pause_submit));
> > + return iter;
> > +}
> > +
> > +/* Drivers should never call this directly */
> > +static inline void
> > +__drm_sched_pending_job_iter_end(const struct drm_sched_pending_job_iter iter)
> > +{
> > + WARN_ON(!READ_ONCE(iter.sched->pause_submit));
> > +}
>
> Maybe instead of these inline functions, we can add the code in a '({' block
> in the below DEFINE_CLASS itself to avoid drivers from calling these inline
> functions? Though I agree these inline functions make it cleaner to read.
>
I'm not sure we can just put code inline in DEFINE_CLASS; as far as I
know, only function calls work there.
> > +
> > +DEFINE_CLASS(drm_sched_pending_job_iter, struct drm_sched_pending_job_iter,
> > + __drm_sched_pending_job_iter_end(_T),
> > + __drm_sched_pending_job_iter_begin(__sched),
> > + struct drm_gpu_scheduler *__sched);
> > +static inline void *
> > +class_drm_sched_pending_job_iter_lock_ptr(class_drm_sched_pending_job_iter_t *_T)
> > +{ return _T; }
> > +#define class_drm_sched_pending_job_iter_is_conditional false
> > +
> > +/**
> > + * drm_sched_for_each_pending_job() - Iterator for each pending job in scheduler
> > + * @__job: Current pending job being iterated over
> > + * @__sched: DRM scheduler to iterate over pending jobs
> > + * @__entity: DRM scheduler entity to filter jobs, NULL indicates no filter
> > + *
> > + * Iterator for each pending job in scheduler, filtering on an entity, and
> > + * enforcing scheduler is fully stopped
> > + */
> > +#define drm_sched_for_each_pending_job(__job, __sched, __entity) \
> > + scoped_guard(drm_sched_pending_job_iter, (__sched)) \
> > + list_for_each_entry((__job), &(__sched)->pending_list, list) \
> > + for_each_if(!(__entity) || (__job)->entity == (__entity))
> > +
>
> I am comparing it with DEFINE_CLASS usage in ttm driver here.
> It looks like the body of this macro (where we call list_for_each_entry()),
> doesn't use the drm_sched_pending_job_iter at all. So, looks like the only
> reason we are using a DEFINE_CLASS with scoped_guard here is for those
> WARN_ON() messages at the beginning and end of loop iteration, which is not
> fully fool proof. Right?
The drm_sched_pending_job_iter is for future proofing (e.g., if we ever
need more information than the drm_gpu_scheduler, we already have an
iterator structure to extend).
The purpose of the DEFINE_CLASS is to ensure that at the start and the
end of the iteration the scheduler is paused, which is the only time we
(the DRM scheduler maintainers) have agreed it is safe for a driver to
look at the pending list. FWIW, this caught some bugs in the Xe VF
restore implementation.
> I wonder if we really need DEFINE_CLASS here for that, though I am not
> against using it.
>
So yes, I think a DEFINE_CLASS is appropriate here to implement the
iterator.
Matt
> Niranjana
>
> > #endif
> > --
> > 2.34.1
> >
* Re: [PATCH v3 3/7] drm/xe: Add dedicated message lock
2025-11-17 19:58 ` Niranjana Vishwanathapura
@ 2025-11-18 17:53 ` Matthew Brost
0 siblings, 0 replies; 31+ messages in thread
From: Matthew Brost @ 2025-11-18 17:53 UTC (permalink / raw)
To: Niranjana Vishwanathapura
Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Mon, Nov 17, 2025 at 11:58:46AM -0800, Niranjana Vishwanathapura wrote:
> On Thu, Oct 16, 2025 at 01:48:22PM -0700, Matthew Brost wrote:
> > Stop abusing DRM scheduler job list lock for messages, add dedicated
> > message lock.
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>
> LGTM.
> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>
Going to send this out on its own for CI and merge. Thanks for the review.
Matt
> > ---
> > drivers/gpu/drm/xe/xe_gpu_scheduler.c | 5 +++--
> > drivers/gpu/drm/xe/xe_gpu_scheduler.h | 4 ++--
> > drivers/gpu/drm/xe/xe_gpu_scheduler_types.h | 2 ++
> > 3 files changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > index f91e06d03511..f4f23317191f 100644
> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > @@ -77,6 +77,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
> > };
> >
> > sched->ops = xe_ops;
> > + spin_lock_init(&sched->msg_lock);
> > INIT_LIST_HEAD(&sched->msgs);
> > INIT_WORK(&sched->work_process_msg, xe_sched_process_msg_work);
> >
> > @@ -117,7 +118,7 @@ void xe_sched_add_msg(struct xe_gpu_scheduler *sched,
> > void xe_sched_add_msg_locked(struct xe_gpu_scheduler *sched,
> > struct xe_sched_msg *msg)
> > {
> > - lockdep_assert_held(&sched->base.job_list_lock);
> > + lockdep_assert_held(&sched->msg_lock);
> >
> > list_add_tail(&msg->link, &sched->msgs);
> > xe_sched_process_msg_queue(sched);
> > @@ -131,7 +132,7 @@ void xe_sched_add_msg_locked(struct xe_gpu_scheduler *sched,
> > void xe_sched_add_msg_head(struct xe_gpu_scheduler *sched,
> > struct xe_sched_msg *msg)
> > {
> > - lockdep_assert_held(&sched->base.job_list_lock);
> > + lockdep_assert_held(&sched->msg_lock);
> >
> > list_add(&msg->link, &sched->msgs);
> > xe_sched_process_msg_queue(sched);
> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > index 9955397aaaa9..b971b6b69419 100644
> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > @@ -33,12 +33,12 @@ void xe_sched_add_msg_head(struct xe_gpu_scheduler *sched,
> >
> > static inline void xe_sched_msg_lock(struct xe_gpu_scheduler *sched)
> > {
> > - spin_lock(&sched->base.job_list_lock);
> > + spin_lock(&sched->msg_lock);
> > }
> >
> > static inline void xe_sched_msg_unlock(struct xe_gpu_scheduler *sched)
> > {
> > - spin_unlock(&sched->base.job_list_lock);
> > + spin_unlock(&sched->msg_lock);
> > }
> >
> > static inline void xe_sched_stop(struct xe_gpu_scheduler *sched)
> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h b/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
> > index 6731b13da8bb..63d9bf92583c 100644
> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler_types.h
> > @@ -47,6 +47,8 @@ struct xe_gpu_scheduler {
> > const struct xe_sched_backend_ops *ops;
> > /** @msgs: list of messages to be processed in @work_process_msg */
> > struct list_head msgs;
> > + /** @msg_lock: Message lock */
> > + spinlock_t msg_lock;
> > /** @work_process_msg: processes messages */
> > struct work_struct work_process_msg;
> > };
> > --
> > 2.34.1
> >
* Re: [PATCH v3 4/7] drm/xe: Stop abusing DRM scheduler internals
2025-11-18 6:39 ` Niranjana Vishwanathapura
@ 2025-11-18 17:59 ` Matthew Brost
2025-11-18 21:17 ` Niranjana Vishwanathapura
0 siblings, 1 reply; 31+ messages in thread
From: Matthew Brost @ 2025-11-18 17:59 UTC (permalink / raw)
To: Niranjana Vishwanathapura
Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Mon, Nov 17, 2025 at 10:39:42PM -0800, Niranjana Vishwanathapura wrote:
> On Thu, Oct 16, 2025 at 01:48:23PM -0700, Matthew Brost wrote:
> > Use new pending job list iterator and new helper functions in Xe to
> > avoid reaching into DRM scheduler internals.
> >
> > Part of this change involves removing pending jobs debug information
> > from debugfs and devcoredump. As agreed, the pending job list should
> > only be accessed when the scheduler is stopped. However, it's not
> > straightforward to determine whether the scheduler is stopped from the
> > shared debugfs/devcoredump code path. Additionally, the pending job list
> > provides little useful information, as pending jobs can be inferred from
> > seqnos and ring head/tail positions. Therefore, this debug information
> > is being removed.
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gpu_scheduler.c | 4 +-
> > drivers/gpu/drm/xe/xe_gpu_scheduler.h | 34 +++--------
> > drivers/gpu/drm/xe/xe_guc_submit.c | 74 ++++--------------------
> > drivers/gpu/drm/xe/xe_guc_submit_types.h | 11 ----
> > drivers/gpu/drm/xe/xe_hw_fence.c | 16 -----
> > drivers/gpu/drm/xe/xe_hw_fence.h | 2 -
> > 6 files changed, 20 insertions(+), 121 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > index f4f23317191f..9c8004d5dd91 100644
> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > @@ -7,7 +7,7 @@
> >
> > static void xe_sched_process_msg_queue(struct xe_gpu_scheduler *sched)
> > {
> > - if (!READ_ONCE(sched->base.pause_submit))
> > + if (!drm_sched_is_stopped(&sched->base))
> > queue_work(sched->base.submit_wq, &sched->work_process_msg);
> > }
> >
> > @@ -43,7 +43,7 @@ static void xe_sched_process_msg_work(struct work_struct *w)
> > container_of(w, struct xe_gpu_scheduler, work_process_msg);
> > struct xe_sched_msg *msg;
> >
> > - if (READ_ONCE(sched->base.pause_submit))
> > + if (drm_sched_is_stopped(&sched->base))
> > return;
> >
> > msg = xe_sched_get_msg(sched);
> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > index b971b6b69419..583372a78140 100644
> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > @@ -55,14 +55,10 @@ static inline void xe_sched_resubmit_jobs(struct xe_gpu_scheduler *sched)
> > {
> > struct drm_sched_job *s_job;
> >
> > - list_for_each_entry(s_job, &sched->base.pending_list, list) {
> > - struct drm_sched_fence *s_fence = s_job->s_fence;
> > - struct dma_fence *hw_fence = s_fence->parent;
> > -
> > + drm_sched_for_each_pending_job(s_job, &sched->base, NULL)
> > if (to_xe_sched_job(s_job)->skip_emit ||
> > - (hw_fence && !dma_fence_is_signaled(hw_fence)))
> > + !drm_sched_job_is_signaled(s_job))
> > sched->base.ops->run_job(s_job);
> > - }
> > }
> >
> > static inline bool
> > @@ -71,14 +67,6 @@ xe_sched_invalidate_job(struct xe_sched_job *job, int threshold)
> > return drm_sched_invalidate_job(&job->drm, threshold);
> > }
> >
> > -static inline void xe_sched_add_pending_job(struct xe_gpu_scheduler *sched,
> > - struct xe_sched_job *job)
> > -{
> > - spin_lock(&sched->base.job_list_lock);
> > - list_add(&job->drm.list, &sched->base.pending_list);
> > - spin_unlock(&sched->base.job_list_lock);
> > -}
> > -
> > /**
> > * xe_sched_first_pending_job() - Find first pending job which is unsignaled
> > * @sched: Xe GPU scheduler
> > @@ -88,21 +76,13 @@ static inline void xe_sched_add_pending_job(struct xe_gpu_scheduler *sched,
> > static inline
> > struct xe_sched_job *xe_sched_first_pending_job(struct xe_gpu_scheduler *sched)
> > {
> > - struct xe_sched_job *job, *r_job = NULL;
> > -
> > - spin_lock(&sched->base.job_list_lock);
> > - list_for_each_entry(job, &sched->base.pending_list, drm.list) {
> > - struct drm_sched_fence *s_fence = job->drm.s_fence;
> > - struct dma_fence *hw_fence = s_fence->parent;
> > + struct drm_sched_job *job;
> >
> > - if (hw_fence && !dma_fence_is_signaled(hw_fence)) {
> > - r_job = job;
> > - break;
> > - }
> > - }
> > - spin_unlock(&sched->base.job_list_lock);
> > + drm_sched_for_each_pending_job(job, &sched->base, NULL)
> > + if (!drm_sched_job_is_signaled(job))
> > + return to_xe_sched_job(job);
> >
> > - return r_job;
> > + return NULL;
> > }
> >
> > static inline int
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > index 0ef67d3523a7..680696efc434 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > @@ -1032,7 +1032,7 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
> > struct xe_exec_queue *q = ge->q;
> > struct xe_guc *guc = exec_queue_to_guc(q);
> > struct xe_gpu_scheduler *sched = &ge->sched;
> > - struct xe_sched_job *job;
> > + struct drm_sched_job *job;
> > bool wedged = false;
> >
> > xe_gt_assert(guc_to_gt(guc), xe_exec_queue_is_lr(q));
> > @@ -1091,16 +1091,10 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
> > if (!exec_queue_killed(q) && !xe_lrc_ring_is_idle(q->lrc[0]))
> > xe_devcoredump(q, NULL, "LR job cleanup, guc_id=%d", q->guc->id);
> >
> > - xe_hw_fence_irq_stop(q->fence_irq);
> > + drm_sched_for_each_pending_job(job, &sched->base, NULL)
> > + xe_sched_job_set_error(to_xe_sched_job(job), -ECANCELED);
> >
> > xe_sched_submission_start(sched);
> > -
> > - spin_lock(&sched->base.job_list_lock);
> > - list_for_each_entry(job, &sched->base.pending_list, drm.list)
> > - xe_sched_job_set_error(job, -ECANCELED);
> > - spin_unlock(&sched->base.job_list_lock);
> > -
> > - xe_hw_fence_irq_start(q->fence_irq);
> > }
> >
> > #define ADJUST_FIVE_PERCENT(__t) mul_u64_u32_div(__t, 105, 100)
> > @@ -1219,7 +1213,7 @@ static enum drm_gpu_sched_stat
> > guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > {
> > struct xe_sched_job *job = to_xe_sched_job(drm_job);
> > - struct xe_sched_job *tmp_job;
> > + struct drm_sched_job *tmp_job;
> > struct xe_exec_queue *q = job->q;
> > struct xe_gpu_scheduler *sched = &q->guc->sched;
> > struct xe_guc *guc = exec_queue_to_guc(q);
> > @@ -1228,7 +1222,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > unsigned int fw_ref;
> > int err = -ETIME;
> > pid_t pid = -1;
> > - int i = 0;
> > bool wedged = false, skip_timeout_check;
> >
> > xe_gt_assert(guc_to_gt(guc), !xe_exec_queue_is_lr(q));
> > @@ -1395,28 +1388,15 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > __deregister_exec_queue(guc, q);
> > }
> >
> > - /* Stop fence signaling */
> > - xe_hw_fence_irq_stop(q->fence_irq);
> > + /* Mark all outstanding jobs as bad, thus completing them */
> > + xe_sched_job_set_error(job, err);
>
> This setting error for this timed out job is newly added.
> Why was it not there before and being added now?
>
Because the TDR job was added back into the pending list first, the
loop did in fact set the error on that job as well.
> > + drm_sched_for_each_pending_job(tmp_job, &sched->base, NULL)
> > + xe_sched_job_set_error(to_xe_sched_job(tmp_job), -ECANCELED);
> >
> > - /*
> > - * Fence state now stable, stop / start scheduler which cleans up any
> > - * fences that are complete
> > - */
> > - xe_sched_add_pending_job(sched, job);
>
> Why xe_sched_add_pending_job() was there before?
>
We (the DRM scheduler maintainers) agreed that drivers shouldn't touch
the pending list; returning DRM_GPU_SCHED_STAT_NO_HANG below defers
this step to the DRM scheduler core.
> > xe_sched_submission_start(sched);
> > -
> > xe_guc_exec_queue_trigger_cleanup(q);
>
> Why do we need to trigger cleanup again here?
>
This is existing code, and it should only be called once in this
function. At this point in time, we don't know whether the TDR fired
naturally with a normal timeout value or whether we are already in the
process of cleaning up. If it is the former, we switch to immediate
cleanup mode, which is why this call is needed.
> >
> > - /* Mark all outstanding jobs as bad, thus completing them */
> > - spin_lock(&sched->base.job_list_lock);
> > - list_for_each_entry(tmp_job, &sched->base.pending_list, drm.list)
> > - xe_sched_job_set_error(tmp_job, !i++ ? err : -ECANCELED);
> > - spin_unlock(&sched->base.job_list_lock);
> > -
> > - /* Start fence signaling */
> > - xe_hw_fence_irq_start(q->fence_irq);
> > -
> > - return DRM_GPU_SCHED_STAT_RESET;
> > + return DRM_GPU_SCHED_STAT_NO_HANG;
>
> This is error case. So, why return is changed to NO_HANG?
>
See above; this is how we can delete xe_sched_add_pending_job.
> Niranjana
>
> >
> > sched_enable:
> > set_exec_queue_pending_tdr_exit(q);
> > @@ -2244,7 +2224,7 @@ static void guc_exec_queue_unpause_prepare(struct xe_guc *guc,
> > struct drm_sched_job *s_job;
> > struct xe_sched_job *job = NULL;
> >
> > - list_for_each_entry(s_job, &sched->base.pending_list, list) {
> > + drm_sched_for_each_pending_job(s_job, &sched->base, NULL) {
> > job = to_xe_sched_job(s_job);
> >
> > xe_gt_dbg(guc_to_gt(guc), "Replay JOB - guc_id=%d, seqno=%d",
> > @@ -2349,7 +2329,7 @@ void xe_guc_submit_unpause(struct xe_guc *guc)
> > * created after resfix done.
> > */
> > if (q->guc->id != index ||
> > - !READ_ONCE(q->guc->sched.base.pause_submit))
> > + !drm_sched_is_stopped(&q->guc->sched.base))
> > continue;
> >
> > guc_exec_queue_unpause(guc, q);
> > @@ -2771,30 +2751,6 @@ xe_guc_exec_queue_snapshot_capture(struct xe_exec_queue *q)
> > if (snapshot->parallel_execution)
> > guc_exec_queue_wq_snapshot_capture(q, snapshot);
> >
> > - spin_lock(&sched->base.job_list_lock);
> > - snapshot->pending_list_size = list_count_nodes(&sched->base.pending_list);
> > - snapshot->pending_list = kmalloc_array(snapshot->pending_list_size,
> > - sizeof(struct pending_list_snapshot),
> > - GFP_ATOMIC);
> > -
> > - if (snapshot->pending_list) {
> > - struct xe_sched_job *job_iter;
> > -
> > - i = 0;
> > - list_for_each_entry(job_iter, &sched->base.pending_list, drm.list) {
> > - snapshot->pending_list[i].seqno =
> > - xe_sched_job_seqno(job_iter);
> > - snapshot->pending_list[i].fence =
> > - dma_fence_is_signaled(job_iter->fence) ? 1 : 0;
> > - snapshot->pending_list[i].finished =
> > - dma_fence_is_signaled(&job_iter->drm.s_fence->finished)
> > - ? 1 : 0;
> > - i++;
> > - }
> > - }
> > -
> > - spin_unlock(&sched->base.job_list_lock);
> > -
> > return snapshot;
> > }
> >
> > @@ -2852,13 +2808,6 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
> >
> > if (snapshot->parallel_execution)
> > guc_exec_queue_wq_snapshot_print(snapshot, p);
> > -
> > - for (i = 0; snapshot->pending_list && i < snapshot->pending_list_size;
> > - i++)
> > - drm_printf(p, "\tJob: seqno=%d, fence=%d, finished=%d\n",
> > - snapshot->pending_list[i].seqno,
> > - snapshot->pending_list[i].fence,
> > - snapshot->pending_list[i].finished);
> > }
> >
> > /**
> > @@ -2881,7 +2830,6 @@ void xe_guc_exec_queue_snapshot_free(struct xe_guc_submit_exec_queue_snapshot *s
> > xe_lrc_snapshot_free(snapshot->lrc[i]);
> > kfree(snapshot->lrc);
> > }
> > - kfree(snapshot->pending_list);
> > kfree(snapshot);
> > }
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit_types.h b/drivers/gpu/drm/xe/xe_guc_submit_types.h
> > index dc7456c34583..0b08c79cf3b9 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit_types.h
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit_types.h
> > @@ -61,12 +61,6 @@ struct guc_submit_parallel_scratch {
> > u32 wq[WQ_SIZE / sizeof(u32)];
> > };
> >
> > -struct pending_list_snapshot {
> > - u32 seqno;
> > - bool fence;
> > - bool finished;
> > -};
> > -
> > /**
> > * struct xe_guc_submit_exec_queue_snapshot - Snapshot for devcoredump
> > */
> > @@ -134,11 +128,6 @@ struct xe_guc_submit_exec_queue_snapshot {
> > /** @wq: Workqueue Items */
> > u32 wq[WQ_SIZE / sizeof(u32)];
> > } parallel;
> > -
> > - /** @pending_list_size: Size of the pending list snapshot array */
> > - int pending_list_size;
> > - /** @pending_list: snapshot of the pending list info */
> > - struct pending_list_snapshot *pending_list;
> > };
> >
> > #endif
> > diff --git a/drivers/gpu/drm/xe/xe_hw_fence.c b/drivers/gpu/drm/xe/xe_hw_fence.c
> > index b2a0c46dfcd4..e65dfcdfdbc5 100644
> > --- a/drivers/gpu/drm/xe/xe_hw_fence.c
> > +++ b/drivers/gpu/drm/xe/xe_hw_fence.c
> > @@ -110,22 +110,6 @@ void xe_hw_fence_irq_run(struct xe_hw_fence_irq *irq)
> > irq_work_queue(&irq->work);
> > }
> >
> > -void xe_hw_fence_irq_stop(struct xe_hw_fence_irq *irq)
> > -{
> > - spin_lock_irq(&irq->lock);
> > - irq->enabled = false;
> > - spin_unlock_irq(&irq->lock);
> > -}
> > -
> > -void xe_hw_fence_irq_start(struct xe_hw_fence_irq *irq)
> > -{
> > - spin_lock_irq(&irq->lock);
> > - irq->enabled = true;
> > - spin_unlock_irq(&irq->lock);
> > -
> > - irq_work_queue(&irq->work);
> > -}
> > -
> > void xe_hw_fence_ctx_init(struct xe_hw_fence_ctx *ctx, struct xe_gt *gt,
> > struct xe_hw_fence_irq *irq, const char *name)
> > {
> > diff --git a/drivers/gpu/drm/xe/xe_hw_fence.h b/drivers/gpu/drm/xe/xe_hw_fence.h
> > index f13a1c4982c7..599492c13f80 100644
> > --- a/drivers/gpu/drm/xe/xe_hw_fence.h
> > +++ b/drivers/gpu/drm/xe/xe_hw_fence.h
> > @@ -17,8 +17,6 @@ void xe_hw_fence_module_exit(void);
> > void xe_hw_fence_irq_init(struct xe_hw_fence_irq *irq);
> > void xe_hw_fence_irq_finish(struct xe_hw_fence_irq *irq);
> > void xe_hw_fence_irq_run(struct xe_hw_fence_irq *irq);
> > -void xe_hw_fence_irq_stop(struct xe_hw_fence_irq *irq);
> > -void xe_hw_fence_irq_start(struct xe_hw_fence_irq *irq);
> >
> > void xe_hw_fence_ctx_init(struct xe_hw_fence_ctx *ctx, struct xe_gt *gt,
> > struct xe_hw_fence_irq *irq, const char *name);
> > --
> > 2.34.1
> >
* Re: [PATCH v3 5/7] drm/xe: Do not deregister queues in TDR
2025-11-18 6:41 ` Niranjana Vishwanathapura
@ 2025-11-18 18:02 ` Matthew Brost
2025-11-18 21:19 ` Niranjana Vishwanathapura
0 siblings, 1 reply; 31+ messages in thread
From: Matthew Brost @ 2025-11-18 18:02 UTC (permalink / raw)
To: Niranjana Vishwanathapura
Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Mon, Nov 17, 2025 at 10:41:52PM -0800, Niranjana Vishwanathapura wrote:
> On Thu, Oct 16, 2025 at 01:48:24PM -0700, Matthew Brost wrote:
> > Deregistering queues in the TDR introduces unnecessary complexity,
> > requiring reference counting tricks to function correctly. All that's
> > needed in the TDR is to kick the queue off the hardware, which is
> > achieved by disabling scheduling. Queue deregistration should be handled
> > in a single, well-defined point in the cleanup path, tied to the queue's
> > reference count.
> >
>
> Overall looks good to me.
> But it would help if the commit text describes why this extra reference
> taking was there before for lr jobs and why it is not needed now.
>
This patch isn't related to LR jobs; the following patch is.
Deregistering queues in the TDR was never required, and this patch
removes that flow.
Matt
> Niranjana
>
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_guc_submit.c | 57 +++---------------------------
> > 1 file changed, 5 insertions(+), 52 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > index 680696efc434..ab0f1a2d4871 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > @@ -69,9 +69,8 @@ exec_queue_to_guc(struct xe_exec_queue *q)
> > #define EXEC_QUEUE_STATE_WEDGED (1 << 8)
> > #define EXEC_QUEUE_STATE_BANNED (1 << 9)
> > #define EXEC_QUEUE_STATE_CHECK_TIMEOUT (1 << 10)
> > -#define EXEC_QUEUE_STATE_EXTRA_REF (1 << 11)
> > -#define EXEC_QUEUE_STATE_PENDING_RESUME (1 << 12)
> > -#define EXEC_QUEUE_STATE_PENDING_TDR_EXIT (1 << 13)
> > +#define EXEC_QUEUE_STATE_PENDING_RESUME (1 << 11)
> > +#define EXEC_QUEUE_STATE_PENDING_TDR_EXIT (1 << 12)
> >
> > static bool exec_queue_registered(struct xe_exec_queue *q)
> > {
> > @@ -218,21 +217,6 @@ static void clear_exec_queue_check_timeout(struct xe_exec_queue *q)
> > atomic_and(~EXEC_QUEUE_STATE_CHECK_TIMEOUT, &q->guc->state);
> > }
> >
> > -static bool exec_queue_extra_ref(struct xe_exec_queue *q)
> > -{
> > - return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_EXTRA_REF;
> > -}
> > -
> > -static void set_exec_queue_extra_ref(struct xe_exec_queue *q)
> > -{
> > - atomic_or(EXEC_QUEUE_STATE_EXTRA_REF, &q->guc->state);
> > -}
> > -
> > -static void clear_exec_queue_extra_ref(struct xe_exec_queue *q)
> > -{
> > - atomic_and(~EXEC_QUEUE_STATE_EXTRA_REF, &q->guc->state);
> > -}
> > -
> > static bool exec_queue_pending_resume(struct xe_exec_queue *q)
> > {
> > return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_PENDING_RESUME;
> > @@ -1190,25 +1174,6 @@ static void disable_scheduling(struct xe_exec_queue *q, bool immediate)
> > G2H_LEN_DW_SCHED_CONTEXT_MODE_SET, 1);
> > }
> >
> > -static void __deregister_exec_queue(struct xe_guc *guc, struct xe_exec_queue *q)
> > -{
> > - u32 action[] = {
> > - XE_GUC_ACTION_DEREGISTER_CONTEXT,
> > - q->guc->id,
> > - };
> > -
> > - xe_gt_assert(guc_to_gt(guc), !exec_queue_destroyed(q));
> > - xe_gt_assert(guc_to_gt(guc), exec_queue_registered(q));
> > - xe_gt_assert(guc_to_gt(guc), !exec_queue_pending_enable(q));
> > - xe_gt_assert(guc_to_gt(guc), !exec_queue_pending_disable(q));
> > -
> > - set_exec_queue_destroyed(q);
> > - trace_xe_exec_queue_deregister(q);
> > -
> > - xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action),
> > - G2H_LEN_DW_DEREGISTER_CONTEXT, 1);
> > -}
> > -
> > static enum drm_gpu_sched_stat
> > guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > {
> > @@ -1326,8 +1291,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > xe_devcoredump(q, job,
> > "Schedule disable failed to respond, guc_id=%d, ret=%d, guc_read=%d",
> > q->guc->id, ret, xe_guc_read_stopped(guc));
> > - set_exec_queue_extra_ref(q);
> > - xe_exec_queue_get(q); /* GT reset owns this */
> > set_exec_queue_banned(q);
> > xe_gt_reset_async(q->gt);
> > xe_sched_tdr_queue_imm(sched);
> > @@ -1380,13 +1343,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > }
> > }
> >
> > - /* Finish cleaning up exec queue via deregister */
> > set_exec_queue_banned(q);
> > - if (!wedged && exec_queue_registered(q) && !exec_queue_destroyed(q)) {
> > - set_exec_queue_extra_ref(q);
> > - xe_exec_queue_get(q);
> > - __deregister_exec_queue(guc, q);
> > - }
> >
> > /* Mark all outstanding jobs as bad, thus completing them */
> > xe_sched_job_set_error(job, err);
> > @@ -1928,7 +1885,7 @@ static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
> >
> > /* Clean up lost G2H + reset engine state */
> > if (exec_queue_registered(q)) {
> > - if (exec_queue_extra_ref(q) || xe_exec_queue_is_lr(q))
> > + if (xe_exec_queue_is_lr(q))
> > xe_exec_queue_put(q);
> > else if (exec_queue_destroyed(q))
> > __guc_exec_queue_destroy(guc, q);
> > @@ -2062,11 +2019,7 @@ static void guc_exec_queue_revert_pending_state_change(struct xe_guc *guc,
> >
> > if (exec_queue_destroyed(q) && exec_queue_registered(q)) {
> > clear_exec_queue_destroyed(q);
> > - if (exec_queue_extra_ref(q))
> > - xe_exec_queue_put(q);
> > - else
> > - q->guc->needs_cleanup = true;
> > - clear_exec_queue_extra_ref(q);
> > + q->guc->needs_cleanup = true;
> > xe_gt_dbg(guc_to_gt(guc), "Replay CLEANUP - guc_id=%d",
> > q->guc->id);
> > }
> > @@ -2483,7 +2436,7 @@ static void handle_deregister_done(struct xe_guc *guc, struct xe_exec_queue *q)
> >
> > clear_exec_queue_registered(q);
> >
> > - if (exec_queue_extra_ref(q) || xe_exec_queue_is_lr(q))
> > + if (xe_exec_queue_is_lr(q))
> > xe_exec_queue_put(q);
> > else
> > __guc_exec_queue_destroy(guc, q);
> > --
> > 2.34.1
> >
* Re: [PATCH v3 6/7] drm/xe: Remove special casing for LR queues in submission
2025-11-18 6:45 ` Niranjana Vishwanathapura
@ 2025-11-18 18:03 ` Matthew Brost
0 siblings, 0 replies; 31+ messages in thread
From: Matthew Brost @ 2025-11-18 18:03 UTC (permalink / raw)
To: Niranjana Vishwanathapura
Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Mon, Nov 17, 2025 at 10:45:31PM -0800, Niranjana Vishwanathapura wrote:
> On Thu, Oct 16, 2025 at 01:48:25PM -0700, Matthew Brost wrote:
> > Now that LR jobs are tracked by the DRM scheduler, there's no longer a
> > need to special-case LR queues. This change removes all LR
> > queue-specific handling, including dedicated TDR logic, reference
> > counting schemes, and other related mechanisms.
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_guc_exec_queue_types.h | 2 -
> > drivers/gpu/drm/xe/xe_guc_submit.c | 129 +------------------
> > 2 files changed, 7 insertions(+), 124 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h b/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h
> > index a3b034e4b205..fd0915ed8eb1 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h
> > +++ b/drivers/gpu/drm/xe/xe_guc_exec_queue_types.h
> > @@ -33,8 +33,6 @@ struct xe_guc_exec_queue {
> > */
> > #define MAX_STATIC_MSG_TYPE 3
> > struct xe_sched_msg static_msgs[MAX_STATIC_MSG_TYPE];
> > - /** @lr_tdr: long running TDR worker */
> > - struct work_struct lr_tdr;
> > /** @destroy_async: do final destroy async from this worker */
> > struct work_struct destroy_async;
> > /** @resume_time: time of last resume */
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > index ab0f1a2d4871..bb1f2929441c 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > @@ -674,14 +674,6 @@ static void register_exec_queue(struct xe_exec_queue *q, int ctx_type)
> > parallel_write(xe, map, wq_desc.wq_status, WQ_STATUS_ACTIVE);
> > }
> >
> > - /*
> > - * We must keep a reference for LR engines if engine is registered with
> > - * the GuC as jobs signal immediately and can't destroy an engine if the
> > - * GuC has a reference to it.
> > - */
> > - if (xe_exec_queue_is_lr(q))
> > - xe_exec_queue_get(q);
> > -
> > set_exec_queue_registered(q);
> > trace_xe_exec_queue_register(q);
> > if (xe_exec_queue_is_parallel(q))
> > @@ -854,7 +846,7 @@ guc_exec_queue_run_job(struct drm_sched_job *drm_job)
> > struct xe_sched_job *job = to_xe_sched_job(drm_job);
> > struct xe_exec_queue *q = job->q;
> > struct xe_guc *guc = exec_queue_to_guc(q);
> > - bool lr = xe_exec_queue_is_lr(q), killed_or_banned_or_wedged =
> > + bool killed_or_banned_or_wedged =
> > exec_queue_killed_or_banned_or_wedged(q);
> >
> > xe_gt_assert(guc_to_gt(guc), !(exec_queue_destroyed(q) || exec_queue_pending_disable(q)) ||
> > @@ -871,15 +863,6 @@ guc_exec_queue_run_job(struct drm_sched_job *drm_job)
> > job->skip_emit = false;
> > }
> >
> > - /*
> > - * We don't care about job-fence ordering in LR VMs because these fences
> > - * are never exported; they are used solely to keep jobs on the pending
> > - * list. Once a queue enters an error state, there's no need to track
> > - * them.
> > - */
> > - if (killed_or_banned_or_wedged && lr)
> > - xe_sched_job_set_error(job, -ECANCELED);
> > -
>
> Why this piece of code here is being removed?
>
The TDR will always run for LR jobs now; that path will error out the
job. Prior to this, the LR cleanup function only ran once.
> > return job->fence;
> > }
> >
> > @@ -923,8 +906,7 @@ static void disable_scheduling_deregister(struct xe_guc *guc,
> > xe_gt_warn(q->gt, "Pending enable/disable failed to respond\n");
> > xe_sched_submission_start(sched);
> > xe_gt_reset_async(q->gt);
> > - if (!xe_exec_queue_is_lr(q))
> > - xe_sched_tdr_queue_imm(sched);
> > + xe_sched_tdr_queue_imm(sched);
> > return;
> > }
> >
> > @@ -950,10 +932,7 @@ static void xe_guc_exec_queue_trigger_cleanup(struct xe_exec_queue *q)
> > /** to wakeup xe_wait_user_fence ioctl if exec queue is reset */
> > wake_up_all(&xe->ufence_wq);
> >
> > - if (xe_exec_queue_is_lr(q))
> > - queue_work(guc_to_gt(guc)->ordered_wq, &q->guc->lr_tdr);
> > - else
> > - xe_sched_tdr_queue_imm(&q->guc->sched);
> > + xe_sched_tdr_queue_imm(&q->guc->sched);
> > }
> >
> > /**
> > @@ -1009,78 +988,6 @@ static bool guc_submit_hint_wedged(struct xe_guc *guc)
> > return true;
> > }
> >
> > -static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
> > -{
> > - struct xe_guc_exec_queue *ge =
> > - container_of(w, struct xe_guc_exec_queue, lr_tdr);
> > - struct xe_exec_queue *q = ge->q;
> > - struct xe_guc *guc = exec_queue_to_guc(q);
> > - struct xe_gpu_scheduler *sched = &ge->sched;
> > - struct drm_sched_job *job;
> > - bool wedged = false;
> > -
> > - xe_gt_assert(guc_to_gt(guc), xe_exec_queue_is_lr(q));
> > -
> > - if (vf_recovery(guc))
> > - return;
> > -
> > - trace_xe_exec_queue_lr_cleanup(q);
>
> Remove the trace event as well in xe_trace.h?
>
Yes, will do.
Matt
> Niranjana
>
> > -
> > - if (!exec_queue_killed(q))
> > - wedged = guc_submit_hint_wedged(exec_queue_to_guc(q));
> > -
> > - /* Kill the run_job / process_msg entry points */
> > - xe_sched_submission_stop(sched);
> > -
> > - /*
> > - * Engine state now mostly stable, disable scheduling / deregister if
> > - * needed. This cleanup routine might be called multiple times, where
> > - * the actual async engine deregister drops the final engine ref.
> > - * Calling disable_scheduling_deregister will mark the engine as
> > - * destroyed and fire off the CT requests to disable scheduling /
> > - * deregister, which we only want to do once. We also don't want to mark
> > - * the engine as pending_disable again as this may race with the
> > - * xe_guc_deregister_done_handler() which treats it as an unexpected
> > - * state.
> > - */
> > - if (!wedged && exec_queue_registered(q) && !exec_queue_destroyed(q)) {
> > - struct xe_guc *guc = exec_queue_to_guc(q);
> > - int ret;
> > -
> > - set_exec_queue_banned(q);
> > - disable_scheduling_deregister(guc, q);
> > -
> > - /*
> > - * Must wait for scheduling to be disabled before signalling
> > - * any fences, if GT broken the GT reset code should signal us.
> > - */
> > - ret = wait_event_timeout(guc->ct.wq,
> > - !exec_queue_pending_disable(q) ||
> > - xe_guc_read_stopped(guc) ||
> > - vf_recovery(guc), HZ * 5);
> > - if (vf_recovery(guc))
> > - return;
> > -
> > - if (!ret) {
> > - xe_gt_warn(q->gt, "Schedule disable failed to respond, guc_id=%d\n",
> > - q->guc->id);
> > - xe_devcoredump(q, NULL, "Schedule disable failed to respond, guc_id=%d\n",
> > - q->guc->id);
> > - xe_sched_submission_start(sched);
> > - xe_gt_reset_async(q->gt);
> > - return;
> > - }
> > - }
> > -
> > - if (!exec_queue_killed(q) && !xe_lrc_ring_is_idle(q->lrc[0]))
> > - xe_devcoredump(q, NULL, "LR job cleanup, guc_id=%d", q->guc->id);
> > -
> > - drm_sched_for_each_pending_job(job, &sched->base, NULL)
> > - xe_sched_job_set_error(to_xe_sched_job(job), -ECANCELED);
> > -
> > - xe_sched_submission_start(sched);
> > -}
> > -
> > #define ADJUST_FIVE_PERCENT(__t) mul_u64_u32_div(__t, 105, 100)
> >
> > static bool check_timeout(struct xe_exec_queue *q, struct xe_sched_job *job)
> > @@ -1150,8 +1057,7 @@ static void enable_scheduling(struct xe_exec_queue *q)
> > xe_gt_warn(guc_to_gt(guc), "Schedule enable failed to respond");
> > set_exec_queue_banned(q);
> > xe_gt_reset_async(q->gt);
> > - if (!xe_exec_queue_is_lr(q))
> > - xe_sched_tdr_queue_imm(&q->guc->sched);
> > + xe_sched_tdr_queue_imm(&q->guc->sched);
> > }
> > }
> >
> > @@ -1189,8 +1095,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > pid_t pid = -1;
> > bool wedged = false, skip_timeout_check;
> >
> > - xe_gt_assert(guc_to_gt(guc), !xe_exec_queue_is_lr(q));
> > -
> > /*
> > * TDR has fired before free job worker. Common if exec queue
> > * immediately closed after last fence signaled. Add back to pending
> > @@ -1395,8 +1299,6 @@ static void __guc_exec_queue_destroy_async(struct work_struct *w)
> > xe_pm_runtime_get(guc_to_xe(guc));
> > trace_xe_exec_queue_destroy(q);
> >
> > - if (xe_exec_queue_is_lr(q))
> > - cancel_work_sync(&ge->lr_tdr);
> > /* Confirm no work left behind accessing device structures */
> > cancel_delayed_work_sync(&ge->sched.base.work_tdr);
> >
> > @@ -1629,9 +1531,6 @@ static int guc_exec_queue_init(struct xe_exec_queue *q)
> > if (err)
> > goto err_sched;
> >
> > - if (xe_exec_queue_is_lr(q))
> > - INIT_WORK(&q->guc->lr_tdr, xe_guc_exec_queue_lr_cleanup);
> > -
> > mutex_lock(&guc->submission_state.lock);
> >
> > err = alloc_guc_id(guc, q);
> > @@ -1885,9 +1784,7 @@ static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
> >
> > /* Clean up lost G2H + reset engine state */
> > if (exec_queue_registered(q)) {
> > - if (xe_exec_queue_is_lr(q))
> > - xe_exec_queue_put(q);
> > - else if (exec_queue_destroyed(q))
> > + if (exec_queue_destroyed(q))
> > __guc_exec_queue_destroy(guc, q);
> > }
> > if (q->guc->suspend_pending) {
> > @@ -1917,9 +1814,6 @@ static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
> > trace_xe_sched_job_ban(job);
> > ban = true;
> > }
> > - } else if (xe_exec_queue_is_lr(q) &&
> > - !xe_lrc_ring_is_idle(q->lrc[0])) {
> > - ban = true;
> > }
> >
> > if (ban) {
> > @@ -2002,8 +1896,6 @@ static void guc_exec_queue_revert_pending_state_change(struct xe_guc *guc,
> > if (pending_enable && !pending_resume &&
> > !exec_queue_pending_tdr_exit(q)) {
> > clear_exec_queue_registered(q);
> > - if (xe_exec_queue_is_lr(q))
> > - xe_exec_queue_put(q);
> > xe_gt_dbg(guc_to_gt(guc), "Replay REGISTER - guc_id=%d",
> > q->guc->id);
> > }
> > @@ -2060,10 +1952,7 @@ static void guc_exec_queue_pause(struct xe_guc *guc, struct xe_exec_queue *q)
> >
> > /* Stop scheduling + flush any DRM scheduler operations */
> > xe_sched_submission_stop(sched);
> > - if (xe_exec_queue_is_lr(q))
> > - cancel_work_sync(&q->guc->lr_tdr);
> > - else
> > - cancel_delayed_work_sync(&sched->base.work_tdr);
> > + cancel_delayed_work_sync(&sched->base.work_tdr);
> >
> > guc_exec_queue_revert_pending_state_change(guc, q);
> >
> > @@ -2435,11 +2324,7 @@ static void handle_deregister_done(struct xe_guc *guc, struct xe_exec_queue *q)
> > trace_xe_exec_queue_deregister_done(q);
> >
> > clear_exec_queue_registered(q);
> > -
> > - if (xe_exec_queue_is_lr(q))
> > - xe_exec_queue_put(q);
> > - else
> > - __guc_exec_queue_destroy(guc, q);
> > + __guc_exec_queue_destroy(guc, q);
> > }
> >
> > int xe_guc_deregister_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
> > --
> > 2.34.1
> >
* Re: [PATCH v3 7/7] drm/xe: Only toggle scheduling in TDR if GuC is running
2025-11-15 1:01 ` Niranjana Vishwanathapura
@ 2025-11-18 18:06 ` Matthew Brost
0 siblings, 0 replies; 31+ messages in thread
From: Matthew Brost @ 2025-11-18 18:06 UTC (permalink / raw)
To: Niranjana Vishwanathapura
Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Fri, Nov 14, 2025 at 05:01:45PM -0800, Niranjana Vishwanathapura wrote:
> On Thu, Oct 16, 2025 at 01:48:26PM -0700, Matthew Brost wrote:
> > If the firmware is not running during TDR (e.g., when the driver is
> > unloading), there's no need to toggle scheduling in the GuC. In such
> > cases, skip this step.
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_guc_submit.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > index bb1f2929441c..ea0cfd866981 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > @@ -1146,7 +1146,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > if (exec_queue_reset(q))
> > err = -EIO;
> >
> > - if (!exec_queue_destroyed(q)) {
> > + if (!exec_queue_destroyed(q) && xe_uc_fw_is_running(&guc->fw)) {
> > /*
> > * Wait for any pending G2H to flush out before
> > * modifying state
>
> Looking at the code, it seems like if we skip this 'if' statement (when fw is
> not running), then it will go wait for ct->wq. Not sure how that gets woken up,
> and the logic might try to reset the GT after that. Instead of checking
> xe_uc_fw_is_running() here, would making it one of the conditions to the
> wait_event_timeout() call cover this case, so we can handle it appropriately
> after wait_event_timeout() returns?
I believe exec_queue_pending_disable will never be true, but maybe there
is a race there. Let me add this condition for safety.
Matt
>
> Niranjana
>
> > --
> > 2.34.1
> >
* Re: [PATCH v3 1/7] drm/sched: Add pending job list iterator
2025-11-18 17:52 ` Matthew Brost
@ 2025-11-18 21:12 ` Niranjana Vishwanathapura
0 siblings, 0 replies; 31+ messages in thread
From: Niranjana Vishwanathapura @ 2025-11-18 21:12 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Tue, Nov 18, 2025 at 09:52:40AM -0800, Matthew Brost wrote:
>On Fri, Nov 14, 2025 at 05:25:47PM -0800, Niranjana Vishwanathapura wrote:
>> On Thu, Oct 16, 2025 at 01:48:20PM -0700, Matthew Brost wrote:
>> > Stop open coding pending job list in drivers. Add pending job list
>> > iterator which safely walks DRM scheduler list asserting DRM scheduler
>> > is stopped.
>> >
>> > v2:
>> > - Fix checkpatch (CI)
>> > v3:
>> > - Drop locked version (Christian)
>> >
>> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>> > ---
>> > include/drm/gpu_scheduler.h | 52 +++++++++++++++++++++++++++++++++++++
>> > 1 file changed, 52 insertions(+)
>> >
>> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> > index fb88301b3c45..7f31eba3bd61 100644
>> > --- a/include/drm/gpu_scheduler.h
>> > +++ b/include/drm/gpu_scheduler.h
>> > @@ -698,4 +698,56 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>> > struct drm_gpu_scheduler **sched_list,
>> > unsigned int num_sched_list);
>> >
>> > +/* Inlines */
>> > +
>> > +/**
>> > + * struct drm_sched_pending_job_iter - DRM scheduler pending job iterator state
>> > + * @sched: DRM scheduler associated with pending job iterator
>> > + */
>> > +struct drm_sched_pending_job_iter {
>> > + struct drm_gpu_scheduler *sched;
>> > +};
>> > +
>> > +/* Drivers should never call this directly */
>> > +static inline struct drm_sched_pending_job_iter
>> > +__drm_sched_pending_job_iter_begin(struct drm_gpu_scheduler *sched)
>> > +{
>> > + struct drm_sched_pending_job_iter iter = {
>> > + .sched = sched,
>> > + };
>> > +
>> > + WARN_ON(!READ_ONCE(sched->pause_submit));
>> > + return iter;
>> > +}
>> > +
>> > +/* Drivers should never call this directly */
>> > +static inline void
>> > +__drm_sched_pending_job_iter_end(const struct drm_sched_pending_job_iter iter)
>> > +{
>> > + WARN_ON(!READ_ONCE(iter.sched->pause_submit));
>> > +}
>>
>> May be instead of these inline functions, we can add the code in a '({' block
>> in the below DEFINE_CLASS itself to avoid drivers from calling these inline
>> funcions? Though I agree these inline functions makes it cleaner to read.
>>
>
>I'm not sure we can just call code inline from DEFINE_CLASS, rather only
>functions.
I do see some examples of it.
https://elixir.bootlin.com/linux/v6.18-rc6/source/drivers/gpu/drm/xe/xe_validation.h#L167
https://elixir.bootlin.com/linux/v6.18-rc6/source/drivers/gpio/gpiolib.h#L229
But DEFINE_CLASS also inserts static inline functions here. So, not super critical.
>
>> > +
>> > +DEFINE_CLASS(drm_sched_pending_job_iter, struct drm_sched_pending_job_iter,
>> > + __drm_sched_pending_job_iter_end(_T),
>> > + __drm_sched_pending_job_iter_begin(__sched),
>> > + struct drm_gpu_scheduler *__sched);
>> > +static inline void *
>> > +class_drm_sched_pending_job_iter_lock_ptr(class_drm_sched_pending_job_iter_t *_T)
>> > +{ return _T; }
>> > +#define class_drm_sched_pending_job_iter_is_conditional false
>> > +
>> > +/**
>> > + * drm_sched_for_each_pending_job() - Iterator for each pending job in scheduler
>> > + * @__job: Current pending job being iterated over
>> > + * @__sched: DRM scheduler to iterate over pending jobs
>> > + * @__entity: DRM scheduler entity to filter jobs, NULL indicates no filter
>> > + *
>> > + * Iterator for each pending job in scheduler, filtering on an entity, and
>> > + * enforcing scheduler is fully stopped
>> > + */
>> > +#define drm_sched_for_each_pending_job(__job, __sched, __entity) \
>> > + scoped_guard(drm_sched_pending_job_iter, (__sched)) \
>> > + list_for_each_entry((__job), &(__sched)->pending_list, list) \
>> > + for_each_if(!(__entity) || (__job)->entity == (__entity))
>> > +
>>
>> I am comparing it with DEFINE_CLASS usage in ttm driver here.
>> It looks like the body of this macro (where we call list_for_each_entry()),
>> doesn't use the drm_sched_pending_job_iter at all. So, looks like the only
>> reason we are using a DEFINE_CLASS with scoped_guard here is for those
>> WARN_ON() messages at the beginning and end of loop iteration, which is not
>> fully fool proof. Right?
>
>The drm_sched_pending_job_iter is for future proofing (e.g., if we
>need more information than drm_gpu_scheduler, we have an iterator
>structure).
>
>The DEFINE_CLASS's purpose is to ensure that, at the start and end of the
>iterator, the scheduler is paused, which is the only time we (DRM
>scheduler maintainers) have agreed it is safe for a driver to look at the
>pending list. FWIW, this caught some bugs in the Xe VF restore
>implementation.
>
>> I wonder if we really need DEFINE_CLASS here for that, though I am not
>> against using it.
>>
>
>So yes, I think a DEFINE_CLASS is appropriate here to implement the
>iterator.
>
Ok, thanks.
Niranjana
>Matt
>
>> Niranjana
>>
>> > #endif
>> > --
>> > 2.34.1
>> >
* Re: [PATCH v3 4/7] drm/xe: Stop abusing DRM scheduler internals
2025-11-18 17:59 ` Matthew Brost
@ 2025-11-18 21:17 ` Niranjana Vishwanathapura
2025-11-18 22:54 ` Matthew Brost
0 siblings, 1 reply; 31+ messages in thread
From: Niranjana Vishwanathapura @ 2025-11-18 21:17 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Tue, Nov 18, 2025 at 09:59:32AM -0800, Matthew Brost wrote:
>On Mon, Nov 17, 2025 at 10:39:42PM -0800, Niranjana Vishwanathapura wrote:
>> On Thu, Oct 16, 2025 at 01:48:23PM -0700, Matthew Brost wrote:
>> > Use new pending job list iterator and new helper functions in Xe to
>> > avoid reaching into DRM scheduler internals.
>> >
>> > Part of this change involves removing pending jobs debug information
>> > from debugfs and devcoredump. As agreed, the pending job list should
>> > only be accessed when the scheduler is stopped. However, it's not
>> > straightforward to determine whether the scheduler is stopped from the
>> > shared debugfs/devcoredump code path. Additionally, the pending job list
>> > provides little useful information, as pending jobs can be inferred from
>> > seqnos and ring head/tail positions. Therefore, this debug information
>> > is being removed.
>> >
>> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>> > ---
>> > drivers/gpu/drm/xe/xe_gpu_scheduler.c | 4 +-
>> > drivers/gpu/drm/xe/xe_gpu_scheduler.h | 34 +++--------
>> > drivers/gpu/drm/xe/xe_guc_submit.c | 74 ++++--------------------
>> > drivers/gpu/drm/xe/xe_guc_submit_types.h | 11 ----
>> > drivers/gpu/drm/xe/xe_hw_fence.c | 16 -----
>> > drivers/gpu/drm/xe/xe_hw_fence.h | 2 -
>> > 6 files changed, 20 insertions(+), 121 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
>> > index f4f23317191f..9c8004d5dd91 100644
>> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
>> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
>> > @@ -7,7 +7,7 @@
>> >
>> > static void xe_sched_process_msg_queue(struct xe_gpu_scheduler *sched)
>> > {
>> > - if (!READ_ONCE(sched->base.pause_submit))
>> > + if (!drm_sched_is_stopped(&sched->base))
>> > queue_work(sched->base.submit_wq, &sched->work_process_msg);
>> > }
>> >
>> > @@ -43,7 +43,7 @@ static void xe_sched_process_msg_work(struct work_struct *w)
>> > container_of(w, struct xe_gpu_scheduler, work_process_msg);
>> > struct xe_sched_msg *msg;
>> >
>> > - if (READ_ONCE(sched->base.pause_submit))
>> > + if (drm_sched_is_stopped(&sched->base))
>> > return;
>> >
>> > msg = xe_sched_get_msg(sched);
>> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
>> > index b971b6b69419..583372a78140 100644
>> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
>> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
>> > @@ -55,14 +55,10 @@ static inline void xe_sched_resubmit_jobs(struct xe_gpu_scheduler *sched)
>> > {
>> > struct drm_sched_job *s_job;
>> >
>> > - list_for_each_entry(s_job, &sched->base.pending_list, list) {
>> > - struct drm_sched_fence *s_fence = s_job->s_fence;
>> > - struct dma_fence *hw_fence = s_fence->parent;
>> > -
>> > + drm_sched_for_each_pending_job(s_job, &sched->base, NULL)
>> > if (to_xe_sched_job(s_job)->skip_emit ||
>> > - (hw_fence && !dma_fence_is_signaled(hw_fence)))
>> > + !drm_sched_job_is_signaled(s_job))
>> > sched->base.ops->run_job(s_job);
>> > - }
>> > }
>> >
>> > static inline bool
>> > @@ -71,14 +67,6 @@ xe_sched_invalidate_job(struct xe_sched_job *job, int threshold)
>> > return drm_sched_invalidate_job(&job->drm, threshold);
>> > }
>> >
>> > -static inline void xe_sched_add_pending_job(struct xe_gpu_scheduler *sched,
>> > - struct xe_sched_job *job)
>> > -{
>> > - spin_lock(&sched->base.job_list_lock);
>> > - list_add(&job->drm.list, &sched->base.pending_list);
>> > - spin_unlock(&sched->base.job_list_lock);
>> > -}
>> > -
>> > /**
>> > * xe_sched_first_pending_job() - Find first pending job which is unsignaled
>> > * @sched: Xe GPU scheduler
>> > @@ -88,21 +76,13 @@ static inline void xe_sched_add_pending_job(struct xe_gpu_scheduler *sched,
>> > static inline
>> > struct xe_sched_job *xe_sched_first_pending_job(struct xe_gpu_scheduler *sched)
>> > {
>> > - struct xe_sched_job *job, *r_job = NULL;
>> > -
>> > - spin_lock(&sched->base.job_list_lock);
>> > - list_for_each_entry(job, &sched->base.pending_list, drm.list) {
>> > - struct drm_sched_fence *s_fence = job->drm.s_fence;
>> > - struct dma_fence *hw_fence = s_fence->parent;
>> > + struct drm_sched_job *job;
>> >
>> > - if (hw_fence && !dma_fence_is_signaled(hw_fence)) {
>> > - r_job = job;
>> > - break;
>> > - }
>> > - }
>> > - spin_unlock(&sched->base.job_list_lock);
>> > + drm_sched_for_each_pending_job(job, &sched->base, NULL)
>> > + if (!drm_sched_job_is_signaled(job))
>> > + return to_xe_sched_job(job);
>> >
>> > - return r_job;
>> > + return NULL;
>> > }
>> >
>> > static inline int
>> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
>> > index 0ef67d3523a7..680696efc434 100644
>> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
>> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
>> > @@ -1032,7 +1032,7 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
>> > struct xe_exec_queue *q = ge->q;
>> > struct xe_guc *guc = exec_queue_to_guc(q);
>> > struct xe_gpu_scheduler *sched = &ge->sched;
>> > - struct xe_sched_job *job;
>> > + struct drm_sched_job *job;
>> > bool wedged = false;
>> >
>> > xe_gt_assert(guc_to_gt(guc), xe_exec_queue_is_lr(q));
>> > @@ -1091,16 +1091,10 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
>> > if (!exec_queue_killed(q) && !xe_lrc_ring_is_idle(q->lrc[0]))
>> > xe_devcoredump(q, NULL, "LR job cleanup, guc_id=%d", q->guc->id);
>> >
>> > - xe_hw_fence_irq_stop(q->fence_irq);
>> > + drm_sched_for_each_pending_job(job, &sched->base, NULL)
>> > + xe_sched_job_set_error(to_xe_sched_job(job), -ECANCELED);
>> >
>> > xe_sched_submission_start(sched);
>> > -
>> > - spin_lock(&sched->base.job_list_lock);
>> > - list_for_each_entry(job, &sched->base.pending_list, drm.list)
>> > - xe_sched_job_set_error(job, -ECANCELED);
>> > - spin_unlock(&sched->base.job_list_lock);
>> > -
>> > - xe_hw_fence_irq_start(q->fence_irq);
>> > }
>> >
>> > #define ADJUST_FIVE_PERCENT(__t) mul_u64_u32_div(__t, 105, 100)
>> > @@ -1219,7 +1213,7 @@ static enum drm_gpu_sched_stat
>> > guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
>> > {
>> > struct xe_sched_job *job = to_xe_sched_job(drm_job);
>> > - struct xe_sched_job *tmp_job;
>> > + struct drm_sched_job *tmp_job;
>> > struct xe_exec_queue *q = job->q;
>> > struct xe_gpu_scheduler *sched = &q->guc->sched;
>> > struct xe_guc *guc = exec_queue_to_guc(q);
>> > @@ -1228,7 +1222,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
>> > unsigned int fw_ref;
>> > int err = -ETIME;
>> > pid_t pid = -1;
>> > - int i = 0;
>> > bool wedged = false, skip_timeout_check;
>> >
>> > xe_gt_assert(guc_to_gt(guc), !xe_exec_queue_is_lr(q));
>> > @@ -1395,28 +1388,15 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
>> > __deregister_exec_queue(guc, q);
>> > }
>> >
>> > - /* Stop fence signaling */
>> > - xe_hw_fence_irq_stop(q->fence_irq);
>> > + /* Mark all outstanding jobs as bad, thus completing them */
>> > + xe_sched_job_set_error(job, err);
>>
>> This setting error for this timed out job is newly added.
>> Why was it not there before and being added now?
>>
>
>The TDR job was added back into the pending list first, so in
>fact we did set the error on the job.
>
Ok, got it. Thanks.
>> > + drm_sched_for_each_pending_job(tmp_job, &sched->base, NULL)
>> > + xe_sched_job_set_error(to_xe_sched_job(tmp_job), -ECANCELED);
>> >
>> > - /*
>> > - * Fence state now stable, stop / start scheduler which cleans up any
>> > - * fences that are complete
>> > - */
>> > - xe_sched_add_pending_job(sched, job);
>>
>> Why xe_sched_add_pending_job() was there before?
>>
>
>We (DRM scheduler maintainers) agreed drivers shouldn't touch the pending
>list; below, returning DRM_GPU_SCHED_STAT_NO_HANG defers this step to
>the DRM scheduler core.
>
>> > xe_sched_submission_start(sched);
>> > -
>> > xe_guc_exec_queue_trigger_cleanup(q);
>>
>> Why do we need to trigger cleanup again here?
>>
>
>This is existing code, and it should only be called once in this
>function. At this point, we don't know if the TDR fired
>naturally with a normal timeout value or if we are already in the process
>of cleaning up. If it is the former, then we switch to immediate-cleanup
>mode, which is why this call is needed.
>
>> >
>> > - /* Mark all outstanding jobs as bad, thus completing them */
>> > - spin_lock(&sched->base.job_list_lock);
>> > - list_for_each_entry(tmp_job, &sched->base.pending_list, drm.list)
>> > - xe_sched_job_set_error(tmp_job, !i++ ? err : -ECANCELED);
>> > - spin_unlock(&sched->base.job_list_lock);
>> > -
>> > - /* Start fence signaling */
>> > - xe_hw_fence_irq_start(q->fence_irq);
>> > -
>> > - return DRM_GPU_SCHED_STAT_RESET;
>> > + return DRM_GPU_SCHED_STAT_NO_HANG;
>>
>> This is error case. So, why return is changed to NO_HANG?
>>
>
>See above, this how we can delete xe_sched_add_pending_job.
>
Ok, returning NO_HANG here so that the DRM scheduler adds the job
back into the pending list. It is a bit confusing to the reader why
we return NO_HANG even in the case of a hang (error) condition here.
Maybe a comment would help.
Niranjana
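The walk over pending jobs discussed above (drm_sched_for_each_pending_job) is, per the v3 changelog, only legal while the scheduler is stopped. A minimal userspace model of that contract follows; the list, flag, and function names are simplified stand-ins, not the DRM scheduler types, and the assertion stands in for whatever check the real iterator performs.

```c
#include <assert.h>

/* Modeled pending job: a singly linked list node plus the error
 * state the TDR would set. */
struct pending_job {
	struct pending_job *next;
	int canceled;
};

struct sched {
	struct pending_job *pending_list;
	int stopped;            /* stand-in for "scheduler is stopped" */
};

/* Cancel every pending job, as the TDR does with -ECANCELED. This is
 * legal without taking job_list_lock only because the scheduler is
 * stopped, so the list cannot change underneath the walk. */
static int cancel_pending_jobs(struct sched *s)
{
	struct pending_job *job;
	int n = 0;

	assert(s->stopped);     /* the v3 iterator enforces this */
	for (job = s->pending_list; job; job = job->next) {
		job->canceled = 1;  /* xe_sched_job_set_error() stand-in */
		n++;
	}
	return n;
}
```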
>> Niranjana
>>
>> >
>> > sched_enable:
>> > set_exec_queue_pending_tdr_exit(q);
>> > @@ -2244,7 +2224,7 @@ static void guc_exec_queue_unpause_prepare(struct xe_guc *guc,
>> > struct drm_sched_job *s_job;
>> > struct xe_sched_job *job = NULL;
>> >
>> > - list_for_each_entry(s_job, &sched->base.pending_list, list) {
>> > + drm_sched_for_each_pending_job(s_job, &sched->base, NULL) {
>> > job = to_xe_sched_job(s_job);
>> >
>> > xe_gt_dbg(guc_to_gt(guc), "Replay JOB - guc_id=%d, seqno=%d",
>> > @@ -2349,7 +2329,7 @@ void xe_guc_submit_unpause(struct xe_guc *guc)
>> > * created after resfix done.
>> > */
>> > if (q->guc->id != index ||
>> > - !READ_ONCE(q->guc->sched.base.pause_submit))
>> > + !drm_sched_is_stopped(&q->guc->sched.base))
>> > continue;
>> >
>> > guc_exec_queue_unpause(guc, q);
>> > @@ -2771,30 +2751,6 @@ xe_guc_exec_queue_snapshot_capture(struct xe_exec_queue *q)
>> > if (snapshot->parallel_execution)
>> > guc_exec_queue_wq_snapshot_capture(q, snapshot);
>> >
>> > - spin_lock(&sched->base.job_list_lock);
>> > - snapshot->pending_list_size = list_count_nodes(&sched->base.pending_list);
>> > - snapshot->pending_list = kmalloc_array(snapshot->pending_list_size,
>> > - sizeof(struct pending_list_snapshot),
>> > - GFP_ATOMIC);
>> > -
>> > - if (snapshot->pending_list) {
>> > - struct xe_sched_job *job_iter;
>> > -
>> > - i = 0;
>> > - list_for_each_entry(job_iter, &sched->base.pending_list, drm.list) {
>> > - snapshot->pending_list[i].seqno =
>> > - xe_sched_job_seqno(job_iter);
>> > - snapshot->pending_list[i].fence =
>> > - dma_fence_is_signaled(job_iter->fence) ? 1 : 0;
>> > - snapshot->pending_list[i].finished =
>> > - dma_fence_is_signaled(&job_iter->drm.s_fence->finished)
>> > - ? 1 : 0;
>> > - i++;
>> > - }
>> > - }
>> > -
>> > - spin_unlock(&sched->base.job_list_lock);
>> > -
>> > return snapshot;
>> > }
>> >
>> > @@ -2852,13 +2808,6 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
>> >
>> > if (snapshot->parallel_execution)
>> > guc_exec_queue_wq_snapshot_print(snapshot, p);
>> > -
>> > - for (i = 0; snapshot->pending_list && i < snapshot->pending_list_size;
>> > - i++)
>> > - drm_printf(p, "\tJob: seqno=%d, fence=%d, finished=%d\n",
>> > - snapshot->pending_list[i].seqno,
>> > - snapshot->pending_list[i].fence,
>> > - snapshot->pending_list[i].finished);
>> > }
>> >
>> > /**
>> > @@ -2881,7 +2830,6 @@ void xe_guc_exec_queue_snapshot_free(struct xe_guc_submit_exec_queue_snapshot *s
>> > xe_lrc_snapshot_free(snapshot->lrc[i]);
>> > kfree(snapshot->lrc);
>> > }
>> > - kfree(snapshot->pending_list);
>> > kfree(snapshot);
>> > }
>> >
>> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit_types.h b/drivers/gpu/drm/xe/xe_guc_submit_types.h
>> > index dc7456c34583..0b08c79cf3b9 100644
>> > --- a/drivers/gpu/drm/xe/xe_guc_submit_types.h
>> > +++ b/drivers/gpu/drm/xe/xe_guc_submit_types.h
>> > @@ -61,12 +61,6 @@ struct guc_submit_parallel_scratch {
>> > u32 wq[WQ_SIZE / sizeof(u32)];
>> > };
>> >
>> > -struct pending_list_snapshot {
>> > - u32 seqno;
>> > - bool fence;
>> > - bool finished;
>> > -};
>> > -
>> > /**
>> > * struct xe_guc_submit_exec_queue_snapshot - Snapshot for devcoredump
>> > */
>> > @@ -134,11 +128,6 @@ struct xe_guc_submit_exec_queue_snapshot {
>> > /** @wq: Workqueue Items */
>> > u32 wq[WQ_SIZE / sizeof(u32)];
>> > } parallel;
>> > -
>> > - /** @pending_list_size: Size of the pending list snapshot array */
>> > - int pending_list_size;
>> > - /** @pending_list: snapshot of the pending list info */
>> > - struct pending_list_snapshot *pending_list;
>> > };
>> >
>> > #endif
>> > diff --git a/drivers/gpu/drm/xe/xe_hw_fence.c b/drivers/gpu/drm/xe/xe_hw_fence.c
>> > index b2a0c46dfcd4..e65dfcdfdbc5 100644
>> > --- a/drivers/gpu/drm/xe/xe_hw_fence.c
>> > +++ b/drivers/gpu/drm/xe/xe_hw_fence.c
>> > @@ -110,22 +110,6 @@ void xe_hw_fence_irq_run(struct xe_hw_fence_irq *irq)
>> > irq_work_queue(&irq->work);
>> > }
>> >
>> > -void xe_hw_fence_irq_stop(struct xe_hw_fence_irq *irq)
>> > -{
>> > - spin_lock_irq(&irq->lock);
>> > - irq->enabled = false;
>> > - spin_unlock_irq(&irq->lock);
>> > -}
>> > -
>> > -void xe_hw_fence_irq_start(struct xe_hw_fence_irq *irq)
>> > -{
>> > - spin_lock_irq(&irq->lock);
>> > - irq->enabled = true;
>> > - spin_unlock_irq(&irq->lock);
>> > -
>> > - irq_work_queue(&irq->work);
>> > -}
>> > -
>> > void xe_hw_fence_ctx_init(struct xe_hw_fence_ctx *ctx, struct xe_gt *gt,
>> > struct xe_hw_fence_irq *irq, const char *name)
>> > {
>> > diff --git a/drivers/gpu/drm/xe/xe_hw_fence.h b/drivers/gpu/drm/xe/xe_hw_fence.h
>> > index f13a1c4982c7..599492c13f80 100644
>> > --- a/drivers/gpu/drm/xe/xe_hw_fence.h
>> > +++ b/drivers/gpu/drm/xe/xe_hw_fence.h
>> > @@ -17,8 +17,6 @@ void xe_hw_fence_module_exit(void);
>> > void xe_hw_fence_irq_init(struct xe_hw_fence_irq *irq);
>> > void xe_hw_fence_irq_finish(struct xe_hw_fence_irq *irq);
>> > void xe_hw_fence_irq_run(struct xe_hw_fence_irq *irq);
>> > -void xe_hw_fence_irq_stop(struct xe_hw_fence_irq *irq);
>> > -void xe_hw_fence_irq_start(struct xe_hw_fence_irq *irq);
>> >
>> > void xe_hw_fence_ctx_init(struct xe_hw_fence_ctx *ctx, struct xe_gt *gt,
>> > struct xe_hw_fence_irq *irq, const char *name);
>> > --
>> > 2.34.1
>> >
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v3 5/7] drm/xe: Do not deregister queues in TDR
2025-11-18 18:02 ` Matthew Brost
@ 2025-11-18 21:19 ` Niranjana Vishwanathapura
2025-11-18 22:59 ` Matthew Brost
0 siblings, 1 reply; 31+ messages in thread
From: Niranjana Vishwanathapura @ 2025-11-18 21:19 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Tue, Nov 18, 2025 at 10:02:00AM -0800, Matthew Brost wrote:
>On Mon, Nov 17, 2025 at 10:41:52PM -0800, Niranjana Vishwanathapura wrote:
>> On Thu, Oct 16, 2025 at 01:48:24PM -0700, Matthew Brost wrote:
>> > Deregistering queues in the TDR introduces unnecessary complexity,
>> > requiring reference counting tricks to function correctly. All that's
>> > needed in the TDR is to kick the queue off the hardware, which is
>> > achieved by disabling scheduling. Queue deregistration should be handled
>> > in a single, well-defined point in the cleanup path, tied to the queue's
>> > reference count.
>> >
>>
>> Overall looks good to me.
>> But it would help if the commit text described why this extra reference
>> was taken before for LR jobs and why it is not needed now.
>>
>
>This patch isn't related to LR jobs; the following patch is.
>
I was talking about the set/clear_exec_queue_extra_ref() and its usage
being removed in this patchset.
>Deregistering queues in the TDR was never required, and this patch
>removes that flow.
>
Ok, thanks.
Niranjana
>Matt
>
>> Niranjana
>>
>> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>> > ---
>> > drivers/gpu/drm/xe/xe_guc_submit.c | 57 +++---------------------------
>> > 1 file changed, 5 insertions(+), 52 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
>> > index 680696efc434..ab0f1a2d4871 100644
>> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
>> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
>> > @@ -69,9 +69,8 @@ exec_queue_to_guc(struct xe_exec_queue *q)
>> > #define EXEC_QUEUE_STATE_WEDGED (1 << 8)
>> > #define EXEC_QUEUE_STATE_BANNED (1 << 9)
>> > #define EXEC_QUEUE_STATE_CHECK_TIMEOUT (1 << 10)
>> > -#define EXEC_QUEUE_STATE_EXTRA_REF (1 << 11)
>> > -#define EXEC_QUEUE_STATE_PENDING_RESUME (1 << 12)
>> > -#define EXEC_QUEUE_STATE_PENDING_TDR_EXIT (1 << 13)
>> > +#define EXEC_QUEUE_STATE_PENDING_RESUME (1 << 11)
>> > +#define EXEC_QUEUE_STATE_PENDING_TDR_EXIT (1 << 12)
>> >
>> > static bool exec_queue_registered(struct xe_exec_queue *q)
>> > {
>> > @@ -218,21 +217,6 @@ static void clear_exec_queue_check_timeout(struct xe_exec_queue *q)
>> > atomic_and(~EXEC_QUEUE_STATE_CHECK_TIMEOUT, &q->guc->state);
>> > }
>> >
>> > -static bool exec_queue_extra_ref(struct xe_exec_queue *q)
>> > -{
>> > - return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_EXTRA_REF;
>> > -}
>> > -
>> > -static void set_exec_queue_extra_ref(struct xe_exec_queue *q)
>> > -{
>> > - atomic_or(EXEC_QUEUE_STATE_EXTRA_REF, &q->guc->state);
>> > -}
>> > -
>> > -static void clear_exec_queue_extra_ref(struct xe_exec_queue *q)
>> > -{
>> > - atomic_and(~EXEC_QUEUE_STATE_EXTRA_REF, &q->guc->state);
>> > -}
>> > -
>> > static bool exec_queue_pending_resume(struct xe_exec_queue *q)
>> > {
>> > return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_PENDING_RESUME;
>> > @@ -1190,25 +1174,6 @@ static void disable_scheduling(struct xe_exec_queue *q, bool immediate)
>> > G2H_LEN_DW_SCHED_CONTEXT_MODE_SET, 1);
>> > }
>> >
>> > -static void __deregister_exec_queue(struct xe_guc *guc, struct xe_exec_queue *q)
>> > -{
>> > - u32 action[] = {
>> > - XE_GUC_ACTION_DEREGISTER_CONTEXT,
>> > - q->guc->id,
>> > - };
>> > -
>> > - xe_gt_assert(guc_to_gt(guc), !exec_queue_destroyed(q));
>> > - xe_gt_assert(guc_to_gt(guc), exec_queue_registered(q));
>> > - xe_gt_assert(guc_to_gt(guc), !exec_queue_pending_enable(q));
>> > - xe_gt_assert(guc_to_gt(guc), !exec_queue_pending_disable(q));
>> > -
>> > - set_exec_queue_destroyed(q);
>> > - trace_xe_exec_queue_deregister(q);
>> > -
>> > - xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action),
>> > - G2H_LEN_DW_DEREGISTER_CONTEXT, 1);
>> > -}
>> > -
>> > static enum drm_gpu_sched_stat
>> > guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
>> > {
>> > @@ -1326,8 +1291,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
>> > xe_devcoredump(q, job,
>> > "Schedule disable failed to respond, guc_id=%d, ret=%d, guc_read=%d",
>> > q->guc->id, ret, xe_guc_read_stopped(guc));
>> > - set_exec_queue_extra_ref(q);
>> > - xe_exec_queue_get(q); /* GT reset owns this */
>> > set_exec_queue_banned(q);
>> > xe_gt_reset_async(q->gt);
>> > xe_sched_tdr_queue_imm(sched);
>> > @@ -1380,13 +1343,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
>> > }
>> > }
>> >
>> > - /* Finish cleaning up exec queue via deregister */
>> > set_exec_queue_banned(q);
>> > - if (!wedged && exec_queue_registered(q) && !exec_queue_destroyed(q)) {
>> > - set_exec_queue_extra_ref(q);
>> > - xe_exec_queue_get(q);
>> > - __deregister_exec_queue(guc, q);
>> > - }
>> >
>> > /* Mark all outstanding jobs as bad, thus completing them */
>> > xe_sched_job_set_error(job, err);
>> > @@ -1928,7 +1885,7 @@ static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
>> >
>> > /* Clean up lost G2H + reset engine state */
>> > if (exec_queue_registered(q)) {
>> > - if (exec_queue_extra_ref(q) || xe_exec_queue_is_lr(q))
>> > + if (xe_exec_queue_is_lr(q))
>> > xe_exec_queue_put(q);
>> > else if (exec_queue_destroyed(q))
>> > __guc_exec_queue_destroy(guc, q);
>> > @@ -2062,11 +2019,7 @@ static void guc_exec_queue_revert_pending_state_change(struct xe_guc *guc,
>> >
>> > if (exec_queue_destroyed(q) && exec_queue_registered(q)) {
>> > clear_exec_queue_destroyed(q);
>> > - if (exec_queue_extra_ref(q))
>> > - xe_exec_queue_put(q);
>> > - else
>> > - q->guc->needs_cleanup = true;
>> > - clear_exec_queue_extra_ref(q);
>> > + q->guc->needs_cleanup = true;
>> > xe_gt_dbg(guc_to_gt(guc), "Replay CLEANUP - guc_id=%d",
>> > q->guc->id);
>> > }
>> > @@ -2483,7 +2436,7 @@ static void handle_deregister_done(struct xe_guc *guc, struct xe_exec_queue *q)
>> >
>> > clear_exec_queue_registered(q);
>> >
>> > - if (exec_queue_extra_ref(q) || xe_exec_queue_is_lr(q))
>> > + if (xe_exec_queue_is_lr(q))
>> > xe_exec_queue_put(q);
>> > else
>> > __guc_exec_queue_destroy(guc, q);
>> > --
>> > 2.34.1
>> >
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v3 4/7] drm/xe: Stop abusing DRM scheduler internals
2025-11-18 21:17 ` Niranjana Vishwanathapura
@ 2025-11-18 22:54 ` Matthew Brost
0 siblings, 0 replies; 31+ messages in thread
From: Matthew Brost @ 2025-11-18 22:54 UTC (permalink / raw)
To: Niranjana Vishwanathapura
Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Tue, Nov 18, 2025 at 01:17:22PM -0800, Niranjana Vishwanathapura wrote:
> On Tue, Nov 18, 2025 at 09:59:32AM -0800, Matthew Brost wrote:
> > On Mon, Nov 17, 2025 at 10:39:42PM -0800, Niranjana Vishwanathapura wrote:
> > > On Thu, Oct 16, 2025 at 01:48:23PM -0700, Matthew Brost wrote:
> > > > Use new pending job list iterator and new helper functions in Xe to
> > > > avoid reaching into DRM scheduler internals.
> > > >
> > > > Part of this change involves removing pending jobs debug information
> > > > from debugfs and devcoredump. As agreed, the pending job list should
> > > > only be accessed when the scheduler is stopped. However, it's not
> > > > straightforward to determine whether the scheduler is stopped from the
> > > > shared debugfs/devcoredump code path. Additionally, the pending job list
> > > > provides little useful information, as pending jobs can be inferred from
> > > > seqnos and ring head/tail positions. Therefore, this debug information
> > > > is being removed.
> > > >
> > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > > ---
> > > > drivers/gpu/drm/xe/xe_gpu_scheduler.c | 4 +-
> > > > drivers/gpu/drm/xe/xe_gpu_scheduler.h | 34 +++--------
> > > > drivers/gpu/drm/xe/xe_guc_submit.c | 74 ++++--------------------
> > > > drivers/gpu/drm/xe/xe_guc_submit_types.h | 11 ----
> > > > drivers/gpu/drm/xe/xe_hw_fence.c | 16 -----
> > > > drivers/gpu/drm/xe/xe_hw_fence.h | 2 -
> > > > 6 files changed, 20 insertions(+), 121 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > > > index f4f23317191f..9c8004d5dd91 100644
> > > > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > > > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > > > @@ -7,7 +7,7 @@
> > > >
> > > > static void xe_sched_process_msg_queue(struct xe_gpu_scheduler *sched)
> > > > {
> > > > - if (!READ_ONCE(sched->base.pause_submit))
> > > > + if (!drm_sched_is_stopped(&sched->base))
> > > > queue_work(sched->base.submit_wq, &sched->work_process_msg);
> > > > }
> > > >
> > > > @@ -43,7 +43,7 @@ static void xe_sched_process_msg_work(struct work_struct *w)
> > > > container_of(w, struct xe_gpu_scheduler, work_process_msg);
> > > > struct xe_sched_msg *msg;
> > > >
> > > > - if (READ_ONCE(sched->base.pause_submit))
> > > > + if (drm_sched_is_stopped(&sched->base))
> > > > return;
> > > >
> > > > msg = xe_sched_get_msg(sched);
> > > > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > > > index b971b6b69419..583372a78140 100644
> > > > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > > > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > > > @@ -55,14 +55,10 @@ static inline void xe_sched_resubmit_jobs(struct xe_gpu_scheduler *sched)
> > > > {
> > > > struct drm_sched_job *s_job;
> > > >
> > > > - list_for_each_entry(s_job, &sched->base.pending_list, list) {
> > > > - struct drm_sched_fence *s_fence = s_job->s_fence;
> > > > - struct dma_fence *hw_fence = s_fence->parent;
> > > > -
> > > > + drm_sched_for_each_pending_job(s_job, &sched->base, NULL)
> > > > if (to_xe_sched_job(s_job)->skip_emit ||
> > > > - (hw_fence && !dma_fence_is_signaled(hw_fence)))
> > > > + !drm_sched_job_is_signaled(s_job))
> > > > sched->base.ops->run_job(s_job);
> > > > - }
> > > > }
> > > >
> > > > static inline bool
> > > > @@ -71,14 +67,6 @@ xe_sched_invalidate_job(struct xe_sched_job *job, int threshold)
> > > > return drm_sched_invalidate_job(&job->drm, threshold);
> > > > }
> > > >
> > > > -static inline void xe_sched_add_pending_job(struct xe_gpu_scheduler *sched,
> > > > - struct xe_sched_job *job)
> > > > -{
> > > > - spin_lock(&sched->base.job_list_lock);
> > > > - list_add(&job->drm.list, &sched->base.pending_list);
> > > > - spin_unlock(&sched->base.job_list_lock);
> > > > -}
> > > > -
> > > > /**
> > > > * xe_sched_first_pending_job() - Find first pending job which is unsignaled
> > > > * @sched: Xe GPU scheduler
> > > > @@ -88,21 +76,13 @@ static inline void xe_sched_add_pending_job(struct xe_gpu_scheduler *sched,
> > > > static inline
> > > > struct xe_sched_job *xe_sched_first_pending_job(struct xe_gpu_scheduler *sched)
> > > > {
> > > > - struct xe_sched_job *job, *r_job = NULL;
> > > > -
> > > > - spin_lock(&sched->base.job_list_lock);
> > > > - list_for_each_entry(job, &sched->base.pending_list, drm.list) {
> > > > - struct drm_sched_fence *s_fence = job->drm.s_fence;
> > > > - struct dma_fence *hw_fence = s_fence->parent;
> > > > + struct drm_sched_job *job;
> > > >
> > > > - if (hw_fence && !dma_fence_is_signaled(hw_fence)) {
> > > > - r_job = job;
> > > > - break;
> > > > - }
> > > > - }
> > > > - spin_unlock(&sched->base.job_list_lock);
> > > > + drm_sched_for_each_pending_job(job, &sched->base, NULL)
> > > > + if (!drm_sched_job_is_signaled(job))
> > > > + return to_xe_sched_job(job);
> > > >
> > > > - return r_job;
> > > > + return NULL;
> > > > }
> > > >
> > > > static inline int
> > > > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > index 0ef67d3523a7..680696efc434 100644
> > > > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > @@ -1032,7 +1032,7 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
> > > > struct xe_exec_queue *q = ge->q;
> > > > struct xe_guc *guc = exec_queue_to_guc(q);
> > > > struct xe_gpu_scheduler *sched = &ge->sched;
> > > > - struct xe_sched_job *job;
> > > > + struct drm_sched_job *job;
> > > > bool wedged = false;
> > > >
> > > > xe_gt_assert(guc_to_gt(guc), xe_exec_queue_is_lr(q));
> > > > @@ -1091,16 +1091,10 @@ static void xe_guc_exec_queue_lr_cleanup(struct work_struct *w)
> > > > if (!exec_queue_killed(q) && !xe_lrc_ring_is_idle(q->lrc[0]))
> > > > xe_devcoredump(q, NULL, "LR job cleanup, guc_id=%d", q->guc->id);
> > > >
> > > > - xe_hw_fence_irq_stop(q->fence_irq);
> > > > + drm_sched_for_each_pending_job(job, &sched->base, NULL)
> > > > + xe_sched_job_set_error(to_xe_sched_job(job), -ECANCELED);
> > > >
> > > > xe_sched_submission_start(sched);
> > > > -
> > > > - spin_lock(&sched->base.job_list_lock);
> > > > - list_for_each_entry(job, &sched->base.pending_list, drm.list)
> > > > - xe_sched_job_set_error(job, -ECANCELED);
> > > > - spin_unlock(&sched->base.job_list_lock);
> > > > -
> > > > - xe_hw_fence_irq_start(q->fence_irq);
> > > > }
> > > >
> > > > #define ADJUST_FIVE_PERCENT(__t) mul_u64_u32_div(__t, 105, 100)
> > > > @@ -1219,7 +1213,7 @@ static enum drm_gpu_sched_stat
> > > > guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > > > {
> > > > struct xe_sched_job *job = to_xe_sched_job(drm_job);
> > > > - struct xe_sched_job *tmp_job;
> > > > + struct drm_sched_job *tmp_job;
> > > > struct xe_exec_queue *q = job->q;
> > > > struct xe_gpu_scheduler *sched = &q->guc->sched;
> > > > struct xe_guc *guc = exec_queue_to_guc(q);
> > > > @@ -1228,7 +1222,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > > > unsigned int fw_ref;
> > > > int err = -ETIME;
> > > > pid_t pid = -1;
> > > > - int i = 0;
> > > > bool wedged = false, skip_timeout_check;
> > > >
> > > > xe_gt_assert(guc_to_gt(guc), !xe_exec_queue_is_lr(q));
> > > > @@ -1395,28 +1388,15 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > > > __deregister_exec_queue(guc, q);
> > > > }
> > > >
> > > > - /* Stop fence signaling */
> > > > - xe_hw_fence_irq_stop(q->fence_irq);
> > > > + /* Mark all outstanding jobs as bad, thus completing them */
> > > > + xe_sched_job_set_error(job, err);
> > >
> > > This setting error for this timed out job is newly added.
> > > Why was it not there before and being added now?
> > >
> >
> > Because the TDR job was added back into the pending list first, the
> > error was in fact set on the job by the subsequent list walk.
> >
>
> Ok, got it. Thanks.
>
> > > > + drm_sched_for_each_pending_job(tmp_job, &sched->base, NULL)
> > > > + xe_sched_job_set_error(to_xe_sched_job(tmp_job), -ECANCELED);
> > > >
> > > > - /*
> > > > - * Fence state now stable, stop / start scheduler which cleans up any
> > > > - * fences that are complete
> > > > - */
> > > > - xe_sched_add_pending_job(sched, job);
> > >
> > > Why xe_sched_add_pending_job() was there before?
> > >
> >
> > We (the DRM scheduler maintainers) agreed that drivers shouldn't touch
> > the pending list; returning DRM_GPU_SCHED_STAT_NO_HANG below defers this
> > step to the DRM scheduler core.
> >
> > > > xe_sched_submission_start(sched);
> > > > -
> > > > xe_guc_exec_queue_trigger_cleanup(q);
> > >
> > > Why do we need to trigger cleanup again here?
> > >
> >
> > This is existing code and it should only be called once in this
> > function. At this point in time, we don't know if the TDR fired
> > naturally with a normal timeout value or if we are already in the
> > process of cleaning up. If it is the former, then we switch to
> > immediate-cleanup mode, which is why this call is needed.
> >
> > > >
> > > > - /* Mark all outstanding jobs as bad, thus completing them */
> > > > - spin_lock(&sched->base.job_list_lock);
> > > > - list_for_each_entry(tmp_job, &sched->base.pending_list, drm.list)
> > > > - xe_sched_job_set_error(tmp_job, !i++ ? err : -ECANCELED);
> > > > - spin_unlock(&sched->base.job_list_lock);
> > > > -
> > > > - /* Start fence signaling */
> > > > - xe_hw_fence_irq_start(q->fence_irq);
> > > > -
> > > > - return DRM_GPU_SCHED_STAT_RESET;
> > > > + return DRM_GPU_SCHED_STAT_NO_HANG;
> > >
> > > This is error case. So, why return is changed to NO_HANG?
> > >
> >
> > See above, this how we can delete xe_sched_add_pending_job.
> >
>
> Ok, returning NO_HANG here so that the DRM scheduler adds the job
> back into the pending list. It is a bit confusing to the reader why
> we return NO_HANG even in the case of a hang (error) condition here.
> Maybe a comment would help.
>
This is in the DRM scheduler documentation, but I will add a comment here too.
Matt
> Niranjana
>
> > > Niranjana
> > >
> > > >
> > > > sched_enable:
> > > > set_exec_queue_pending_tdr_exit(q);
> > > > @@ -2244,7 +2224,7 @@ static void guc_exec_queue_unpause_prepare(struct xe_guc *guc,
> > > > struct drm_sched_job *s_job;
> > > > struct xe_sched_job *job = NULL;
> > > >
> > > > - list_for_each_entry(s_job, &sched->base.pending_list, list) {
> > > > + drm_sched_for_each_pending_job(s_job, &sched->base, NULL) {
> > > > job = to_xe_sched_job(s_job);
> > > >
> > > > xe_gt_dbg(guc_to_gt(guc), "Replay JOB - guc_id=%d, seqno=%d",
> > > > @@ -2349,7 +2329,7 @@ void xe_guc_submit_unpause(struct xe_guc *guc)
> > > > * created after resfix done.
> > > > */
> > > > if (q->guc->id != index ||
> > > > - !READ_ONCE(q->guc->sched.base.pause_submit))
> > > > + !drm_sched_is_stopped(&q->guc->sched.base))
> > > > continue;
> > > >
> > > > guc_exec_queue_unpause(guc, q);
> > > > @@ -2771,30 +2751,6 @@ xe_guc_exec_queue_snapshot_capture(struct xe_exec_queue *q)
> > > > if (snapshot->parallel_execution)
> > > > guc_exec_queue_wq_snapshot_capture(q, snapshot);
> > > >
> > > > - spin_lock(&sched->base.job_list_lock);
> > > > - snapshot->pending_list_size = list_count_nodes(&sched->base.pending_list);
> > > > - snapshot->pending_list = kmalloc_array(snapshot->pending_list_size,
> > > > - sizeof(struct pending_list_snapshot),
> > > > - GFP_ATOMIC);
> > > > -
> > > > - if (snapshot->pending_list) {
> > > > - struct xe_sched_job *job_iter;
> > > > -
> > > > - i = 0;
> > > > - list_for_each_entry(job_iter, &sched->base.pending_list, drm.list) {
> > > > - snapshot->pending_list[i].seqno =
> > > > - xe_sched_job_seqno(job_iter);
> > > > - snapshot->pending_list[i].fence =
> > > > - dma_fence_is_signaled(job_iter->fence) ? 1 : 0;
> > > > - snapshot->pending_list[i].finished =
> > > > - dma_fence_is_signaled(&job_iter->drm.s_fence->finished)
> > > > - ? 1 : 0;
> > > > - i++;
> > > > - }
> > > > - }
> > > > -
> > > > - spin_unlock(&sched->base.job_list_lock);
> > > > -
> > > > return snapshot;
> > > > }
> > > >
> > > > @@ -2852,13 +2808,6 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
> > > >
> > > > if (snapshot->parallel_execution)
> > > > guc_exec_queue_wq_snapshot_print(snapshot, p);
> > > > -
> > > > - for (i = 0; snapshot->pending_list && i < snapshot->pending_list_size;
> > > > - i++)
> > > > - drm_printf(p, "\tJob: seqno=%d, fence=%d, finished=%d\n",
> > > > - snapshot->pending_list[i].seqno,
> > > > - snapshot->pending_list[i].fence,
> > > > - snapshot->pending_list[i].finished);
> > > > }
> > > >
> > > > /**
> > > > @@ -2881,7 +2830,6 @@ void xe_guc_exec_queue_snapshot_free(struct xe_guc_submit_exec_queue_snapshot *s
> > > > xe_lrc_snapshot_free(snapshot->lrc[i]);
> > > > kfree(snapshot->lrc);
> > > > }
> > > > - kfree(snapshot->pending_list);
> > > > kfree(snapshot);
> > > > }
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_guc_submit_types.h b/drivers/gpu/drm/xe/xe_guc_submit_types.h
> > > > index dc7456c34583..0b08c79cf3b9 100644
> > > > --- a/drivers/gpu/drm/xe/xe_guc_submit_types.h
> > > > +++ b/drivers/gpu/drm/xe/xe_guc_submit_types.h
> > > > @@ -61,12 +61,6 @@ struct guc_submit_parallel_scratch {
> > > > u32 wq[WQ_SIZE / sizeof(u32)];
> > > > };
> > > >
> > > > -struct pending_list_snapshot {
> > > > - u32 seqno;
> > > > - bool fence;
> > > > - bool finished;
> > > > -};
> > > > -
> > > > /**
> > > > * struct xe_guc_submit_exec_queue_snapshot - Snapshot for devcoredump
> > > > */
> > > > @@ -134,11 +128,6 @@ struct xe_guc_submit_exec_queue_snapshot {
> > > > /** @wq: Workqueue Items */
> > > > u32 wq[WQ_SIZE / sizeof(u32)];
> > > > } parallel;
> > > > -
> > > > - /** @pending_list_size: Size of the pending list snapshot array */
> > > > - int pending_list_size;
> > > > - /** @pending_list: snapshot of the pending list info */
> > > > - struct pending_list_snapshot *pending_list;
> > > > };
> > > >
> > > > #endif
> > > > diff --git a/drivers/gpu/drm/xe/xe_hw_fence.c b/drivers/gpu/drm/xe/xe_hw_fence.c
> > > > index b2a0c46dfcd4..e65dfcdfdbc5 100644
> > > > --- a/drivers/gpu/drm/xe/xe_hw_fence.c
> > > > +++ b/drivers/gpu/drm/xe/xe_hw_fence.c
> > > > @@ -110,22 +110,6 @@ void xe_hw_fence_irq_run(struct xe_hw_fence_irq *irq)
> > > > irq_work_queue(&irq->work);
> > > > }
> > > >
> > > > -void xe_hw_fence_irq_stop(struct xe_hw_fence_irq *irq)
> > > > -{
> > > > - spin_lock_irq(&irq->lock);
> > > > - irq->enabled = false;
> > > > - spin_unlock_irq(&irq->lock);
> > > > -}
> > > > -
> > > > -void xe_hw_fence_irq_start(struct xe_hw_fence_irq *irq)
> > > > -{
> > > > - spin_lock_irq(&irq->lock);
> > > > - irq->enabled = true;
> > > > - spin_unlock_irq(&irq->lock);
> > > > -
> > > > - irq_work_queue(&irq->work);
> > > > -}
> > > > -
> > > > void xe_hw_fence_ctx_init(struct xe_hw_fence_ctx *ctx, struct xe_gt *gt,
> > > > struct xe_hw_fence_irq *irq, const char *name)
> > > > {
> > > > diff --git a/drivers/gpu/drm/xe/xe_hw_fence.h b/drivers/gpu/drm/xe/xe_hw_fence.h
> > > > index f13a1c4982c7..599492c13f80 100644
> > > > --- a/drivers/gpu/drm/xe/xe_hw_fence.h
> > > > +++ b/drivers/gpu/drm/xe/xe_hw_fence.h
> > > > @@ -17,8 +17,6 @@ void xe_hw_fence_module_exit(void);
> > > > void xe_hw_fence_irq_init(struct xe_hw_fence_irq *irq);
> > > > void xe_hw_fence_irq_finish(struct xe_hw_fence_irq *irq);
> > > > void xe_hw_fence_irq_run(struct xe_hw_fence_irq *irq);
> > > > -void xe_hw_fence_irq_stop(struct xe_hw_fence_irq *irq);
> > > > -void xe_hw_fence_irq_start(struct xe_hw_fence_irq *irq);
> > > >
> > > > void xe_hw_fence_ctx_init(struct xe_hw_fence_ctx *ctx, struct xe_gt *gt,
> > > > struct xe_hw_fence_irq *irq, const char *name);
> > > > --
> > > > 2.34.1
> > > >
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v3 5/7] drm/xe: Do not deregister queues in TDR
2025-11-18 21:19 ` Niranjana Vishwanathapura
@ 2025-11-18 22:59 ` Matthew Brost
0 siblings, 0 replies; 31+ messages in thread
From: Matthew Brost @ 2025-11-18 22:59 UTC (permalink / raw)
To: Niranjana Vishwanathapura
Cc: intel-xe, dri-devel, christian.koenig, pstanner, dakr
On Tue, Nov 18, 2025 at 01:19:16PM -0800, Niranjana Vishwanathapura wrote:
> On Tue, Nov 18, 2025 at 10:02:00AM -0800, Matthew Brost wrote:
> > On Mon, Nov 17, 2025 at 10:41:52PM -0800, Niranjana Vishwanathapura wrote:
> > > On Thu, Oct 16, 2025 at 01:48:24PM -0700, Matthew Brost wrote:
> > > > Deregistering queues in the TDR introduces unnecessary complexity,
> > > > requiring reference counting tricks to function correctly. All that's
> > > > needed in the TDR is to kick the queue off the hardware, which is
> > > > achieved by disabling scheduling. Queue deregistration should be handled
> > > > in a single, well-defined point in the cleanup path, tied to the queue's
> > > > reference count.
> > > >
> > >
> > > Overall looks good to me.
> > > But it would help if the commit text described why this extra reference
> > > was taken before for LR jobs and why it is not needed now.
> > >
> >
> > This patch isn't related to LR jobs; the following patch is.
> >
>
> I was talking about the set/clear_exec_queue_extra_ref() and its usage
> being removed in this patchset.
>
Oh, the extra reference was needed before to prevent the queue from
disappearing on the final put, or via a GT reset, while a deregister
issued from the TDR was in flight. It was a pretty hacky workaround for
this odd UAF case. I can adjust the commit message to include this
information.
Matt
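The UAF window described above can be sketched in a few lines of userspace C. The refcount, the "freed" marker, and all function names are illustrative stand-ins for the Xe exec-queue lifetime, not the real API: the TDR pins the queue before firing the asynchronous deregister, so a concurrent final put cannot free it, and the deregister-done handler drops that pin as the last legal user.

```c
#include <assert.h>

struct queue {
	int refcount;
	int freed;              /* stand-in for the queue being kfree'd */
};

static void queue_get(struct queue *q)
{
	q->refcount++;
}

static void queue_put(struct queue *q)
{
	if (--q->refcount == 0)
		q->freed = 1;   /* final put releases the queue */
}

/* Old TDR flow: take the extra reference before the async deregister
 * goes out (set_exec_queue_extra_ref() + xe_exec_queue_get() stand-in),
 * so the queue survives a racing final put or GT reset. */
static void tdr_issue_deregister(struct queue *q)
{
	queue_get(q);
}

/* Deregister-done G2H handler: drop the pin taken in the TDR; only
 * now may the queue actually be freed. */
static void handle_deregister_done(struct queue *q)
{
	queue_put(q);
}
```

With the deregister moved out of the TDR and tied to the queue's normal reference count, this extra get/put pair (and the EXTRA_REF state bit tracking it) becomes unnecessary.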
> > Deregistering queues in the TDR was never required, and this patch
> > removes that flow.
> >
>
> Ok, thanks.
>
> Niranjana
>
> > Matt
> >
> > > Niranjana
> > >
> > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > > ---
> > > > drivers/gpu/drm/xe/xe_guc_submit.c | 57 +++---------------------------
> > > > 1 file changed, 5 insertions(+), 52 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > index 680696efc434..ab0f1a2d4871 100644
> > > > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > @@ -69,9 +69,8 @@ exec_queue_to_guc(struct xe_exec_queue *q)
> > > > #define EXEC_QUEUE_STATE_WEDGED (1 << 8)
> > > > #define EXEC_QUEUE_STATE_BANNED (1 << 9)
> > > > #define EXEC_QUEUE_STATE_CHECK_TIMEOUT (1 << 10)
> > > > -#define EXEC_QUEUE_STATE_EXTRA_REF (1 << 11)
> > > > -#define EXEC_QUEUE_STATE_PENDING_RESUME (1 << 12)
> > > > -#define EXEC_QUEUE_STATE_PENDING_TDR_EXIT (1 << 13)
> > > > +#define EXEC_QUEUE_STATE_PENDING_RESUME (1 << 11)
> > > > +#define EXEC_QUEUE_STATE_PENDING_TDR_EXIT (1 << 12)
> > > >
> > > > static bool exec_queue_registered(struct xe_exec_queue *q)
> > > > {
> > > > @@ -218,21 +217,6 @@ static void clear_exec_queue_check_timeout(struct xe_exec_queue *q)
> > > > atomic_and(~EXEC_QUEUE_STATE_CHECK_TIMEOUT, &q->guc->state);
> > > > }
> > > >
> > > > -static bool exec_queue_extra_ref(struct xe_exec_queue *q)
> > > > -{
> > > > - return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_EXTRA_REF;
> > > > -}
> > > > -
> > > > -static void set_exec_queue_extra_ref(struct xe_exec_queue *q)
> > > > -{
> > > > - atomic_or(EXEC_QUEUE_STATE_EXTRA_REF, &q->guc->state);
> > > > -}
> > > > -
> > > > -static void clear_exec_queue_extra_ref(struct xe_exec_queue *q)
> > > > -{
> > > > - atomic_and(~EXEC_QUEUE_STATE_EXTRA_REF, &q->guc->state);
> > > > -}
> > > > -
> > > > static bool exec_queue_pending_resume(struct xe_exec_queue *q)
> > > > {
> > > > return atomic_read(&q->guc->state) & EXEC_QUEUE_STATE_PENDING_RESUME;
> > > > @@ -1190,25 +1174,6 @@ static void disable_scheduling(struct xe_exec_queue *q, bool immediate)
> > > > G2H_LEN_DW_SCHED_CONTEXT_MODE_SET, 1);
> > > > }
> > > >
> > > > -static void __deregister_exec_queue(struct xe_guc *guc, struct xe_exec_queue *q)
> > > > -{
> > > > - u32 action[] = {
> > > > - XE_GUC_ACTION_DEREGISTER_CONTEXT,
> > > > - q->guc->id,
> > > > - };
> > > > -
> > > > - xe_gt_assert(guc_to_gt(guc), !exec_queue_destroyed(q));
> > > > - xe_gt_assert(guc_to_gt(guc), exec_queue_registered(q));
> > > > - xe_gt_assert(guc_to_gt(guc), !exec_queue_pending_enable(q));
> > > > - xe_gt_assert(guc_to_gt(guc), !exec_queue_pending_disable(q));
> > > > -
> > > > - set_exec_queue_destroyed(q);
> > > > - trace_xe_exec_queue_deregister(q);
> > > > -
> > > > - xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action),
> > > > - G2H_LEN_DW_DEREGISTER_CONTEXT, 1);
> > > > -}
> > > > -
> > > > static enum drm_gpu_sched_stat
> > > > guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > > > {
> > > > @@ -1326,8 +1291,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > > > xe_devcoredump(q, job,
> > > > "Schedule disable failed to respond, guc_id=%d, ret=%d, guc_read=%d",
> > > > q->guc->id, ret, xe_guc_read_stopped(guc));
> > > > - set_exec_queue_extra_ref(q);
> > > > - xe_exec_queue_get(q); /* GT reset owns this */
> > > > set_exec_queue_banned(q);
> > > > xe_gt_reset_async(q->gt);
> > > > xe_sched_tdr_queue_imm(sched);
> > > > @@ -1380,13 +1343,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> > > > }
> > > > }
> > > >
> > > > - /* Finish cleaning up exec queue via deregister */
> > > > set_exec_queue_banned(q);
> > > > - if (!wedged && exec_queue_registered(q) && !exec_queue_destroyed(q)) {
> > > > - set_exec_queue_extra_ref(q);
> > > > - xe_exec_queue_get(q);
> > > > - __deregister_exec_queue(guc, q);
> > > > - }
> > > >
> > > > /* Mark all outstanding jobs as bad, thus completing them */
> > > > xe_sched_job_set_error(job, err);
> > > > @@ -1928,7 +1885,7 @@ static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
> > > >
> > > > /* Clean up lost G2H + reset engine state */
> > > > if (exec_queue_registered(q)) {
> > > > - if (exec_queue_extra_ref(q) || xe_exec_queue_is_lr(q))
> > > > + if (xe_exec_queue_is_lr(q))
> > > > xe_exec_queue_put(q);
> > > > else if (exec_queue_destroyed(q))
> > > > __guc_exec_queue_destroy(guc, q);
> > > > @@ -2062,11 +2019,7 @@ static void guc_exec_queue_revert_pending_state_change(struct xe_guc *guc,
> > > >
> > > > if (exec_queue_destroyed(q) && exec_queue_registered(q)) {
> > > > clear_exec_queue_destroyed(q);
> > > > - if (exec_queue_extra_ref(q))
> > > > - xe_exec_queue_put(q);
> > > > - else
> > > > - q->guc->needs_cleanup = true;
> > > > - clear_exec_queue_extra_ref(q);
> > > > + q->guc->needs_cleanup = true;
> > > > xe_gt_dbg(guc_to_gt(guc), "Replay CLEANUP - guc_id=%d",
> > > > q->guc->id);
> > > > }
> > > > @@ -2483,7 +2436,7 @@ static void handle_deregister_done(struct xe_guc *guc, struct xe_exec_queue *q)
> > > >
> > > > clear_exec_queue_registered(q);
> > > >
> > > > - if (exec_queue_extra_ref(q) || xe_exec_queue_is_lr(q))
> > > > + if (xe_exec_queue_is_lr(q))
> > > > xe_exec_queue_put(q);
> > > > else
> > > > __guc_exec_queue_destroy(guc, q);
> > > > --
> > > > 2.34.1
> > > >
End of thread (newest message: 2025-11-18 22:59 UTC)
Thread overview: 31+ messages
2025-10-16 20:48 [PATCH v3 0/7] Fix DRM scheduler layering violations in Xe Matthew Brost
2025-10-16 20:48 ` [PATCH v3 1/7] drm/sched: Add pending job list iterator Matthew Brost
2025-11-15 1:25 ` Niranjana Vishwanathapura
2025-11-18 17:52 ` Matthew Brost
2025-11-18 21:12 ` Niranjana Vishwanathapura
2025-10-16 20:48 ` [PATCH v3 2/7] drm/sched: Add several job helpers to avoid drivers touching scheduler state Matthew Brost
2025-11-17 19:57 ` Niranjana Vishwanathapura
2025-11-18 17:45 ` Matthew Brost
2025-10-16 20:48 ` [PATCH v3 3/7] drm/xe: Add dedicated message lock Matthew Brost
2025-11-17 19:58 ` Niranjana Vishwanathapura
2025-11-18 17:53 ` Matthew Brost
2025-10-16 20:48 ` [PATCH v3 4/7] drm/xe: Stop abusing DRM scheduler internals Matthew Brost
2025-11-18 6:39 ` Niranjana Vishwanathapura
2025-11-18 17:59 ` Matthew Brost
2025-11-18 21:17 ` Niranjana Vishwanathapura
2025-11-18 22:54 ` Matthew Brost
2025-10-16 20:48 ` [PATCH v3 5/7] drm/xe: Do not deregister queues in TDR Matthew Brost
2025-11-18 6:41 ` Niranjana Vishwanathapura
2025-11-18 18:02 ` Matthew Brost
2025-11-18 21:19 ` Niranjana Vishwanathapura
2025-11-18 22:59 ` Matthew Brost
2025-10-16 20:48 ` [PATCH v3 6/7] drm/xe: Remove special casing for LR queues in submission Matthew Brost
2025-11-18 6:45 ` Niranjana Vishwanathapura
2025-11-18 18:03 ` Matthew Brost
2025-10-16 20:48 ` [PATCH v3 7/7] drm/xe: Only toggle scheduling in TDR if GuC is running Matthew Brost
2025-11-15 1:01 ` Niranjana Vishwanathapura
2025-11-18 18:06 ` Matthew Brost
2025-10-16 20:55 ` ✗ CI.checkpatch: warning for Fix DRM scheduler layering violations in Xe (rev3) Patchwork
2025-10-16 20:56 ` ✓ CI.KUnit: success " Patchwork
2025-10-16 21:36 ` ✓ Xe.CI.BAT: " Patchwork
2025-10-17 18:43 ` ✗ Xe.CI.Full: failure " Patchwork