[PATCH v3 0/4] Improve gpu_scheduler trace events

* [PATCH v3 0/4] Improve gpu_scheduler trace events
@ 2024-06-06 13:06 Pierre-Eric Pelloux-Prayer
  2024-06-06 13:06 ` [PATCH v3 1/4] drm/sched: store the drm_device instead of the device Pierre-Eric Pelloux-Prayer
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-06-06 13:06 UTC
  To: alexander.deucher, christian.koenig, ltuikov89, matthew.brost,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel,
	dri-devel, ville.syrjala, rostedt
  Cc: Pierre-Eric Pelloux-Prayer

Hi,

This is the new version of my patch series aiming at improving the
trace events around gpu jobs.
The main ideas implemented are: trace dependencies between jobs and
identify the GPU running jobs (because 'ring' is not a unique attribute).

Changes from v2:
* dropped all amdgpu changes. The goal here is to make the gpu_scheduler
trace events usable by a tool, without dependencies on driver-specific
events
* dropped the vblank related changes. I'll post a separate series to
cover drm/vblank events later.
* reworked fence printing so it's coherent between all events.
* added a dev_index so that tools parsing the events can tell which GPU is
running a job. It wasn't needed before the gpu scheduler switched to
workqueues because the queue pid was enough to identify the hardware queue.
* dropped the changes to trace the "why" of a dependency. I implemented
a version based on Sima's suggestion using xa_tag_pointer but I'm not
convinced it's really useful, so I'm dropping it for now.

Open questions:
* assuming the new fence printing format is agreed on,
should we add some code to preserve the old format?
* checkpatch doesn't like the indentation in the last patch, because
everything is vertically aligned to 'TP_fast_assign('. How to best fix it?

   WARNING: Statements should start on a tabstop
   #82: FILE: drivers/gpu/drm/scheduler/gpu_scheduler_trace.h:80:
   +        unsigned long idx;

v2: https://lists.freedesktop.org/archives/dri-devel/2024-February/441341.html

Pierre-Eric Pelloux-Prayer (4):
  drm/sched: store the drm_device instead of the device
  drm/sched: add dev_index=xx to the drm_sched_process_job event
  drm/sched: cleanup gpu_scheduler trace events
  drm/sched: trace dependencies for gpu jobs

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  2 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c       |  2 +-
 drivers/gpu/drm/imagination/pvr_queue.c       |  2 +-
 drivers/gpu/drm/lima/lima_sched.c             |  2 +-
 drivers/gpu/drm/msm/msm_ringbuffer.c          |  2 +-
 drivers/gpu/drm/nouveau/nouveau_sched.c       |  2 +-
 drivers/gpu/drm/panfrost/panfrost_job.c       |  2 +-
 drivers/gpu/drm/panthor/panthor_mmu.c         |  2 +-
 drivers/gpu/drm/panthor/panthor_sched.c       |  2 +-
 .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 88 +++++++++++++++----
 drivers/gpu/drm/scheduler/sched_entity.c      | 11 ++-
 drivers/gpu/drm/scheduler/sched_main.c        | 28 +++---
 drivers/gpu/drm/v3d/v3d_sched.c               | 12 +--
 drivers/gpu/drm/xe/xe_execlist.c              |  2 +-
 drivers/gpu/drm/xe/xe_gpu_scheduler.c         |  2 +-
 drivers/gpu/drm/xe/xe_gpu_scheduler.h         |  2 +-
 drivers/gpu/drm/xe/xe_guc_submit.c            |  2 +-
 include/drm/gpu_scheduler.h                   |  4 +-
 18 files changed, 115 insertions(+), 54 deletions(-)

-- 
2.40.1

* [PATCH v3 1/4] drm/sched: store the drm_device instead of the device
  2024-06-06 13:06 [PATCH v3 0/4] Improve gpu_scheduler trace events Pierre-Eric Pelloux-Prayer
@ 2024-06-06 13:06 ` Pierre-Eric Pelloux-Prayer
  2024-06-06 13:18   ` Christian König
                     ` (2 more replies)
  2024-06-06 13:06 ` [PATCH v3 2/4] drm/sched: add dev_index=xx to the drm_sched_process_job event Pierre-Eric Pelloux-Prayer
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 14+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-06-06 13:06 UTC
  To: alexander.deucher, christian.koenig, ltuikov89, matthew.brost,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel,
	dri-devel, ville.syrjala, rostedt
  Cc: Pierre-Eric Pelloux-Prayer

When tracing is enabled, being able to identify which device is sending
events is useful; for this the next commit will extend events to include
drm_device::primary::index.

Since the device member is only used in the drm_* log macros, we can
replace it with a drm_device pointer.
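
To illustrate the pointer chain this change relies on, here is a minimal
userspace sketch. The struct definitions are simplified stand-ins written
for this example (the real kernel structures have many more members), and
gpu_sched_model, sched_dev_index() and sched_parent_device() are
hypothetical names that are not part of the series.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types for illustration only, not the kernel definitions. */
struct device { const char *name; };
struct drm_minor { int index; };
struct drm_device {
	struct device *dev;        /* parent device, still reachable */
	struct drm_minor *primary; /* primary minor, carries the index */
};

/* Hypothetical model of the scheduler after the change. */
struct gpu_sched_model {
	struct drm_device *dev;    /* previously a struct device pointer */
};

/* Index that a trace event could report to identify the GPU. */
static int sched_dev_index(const struct gpu_sched_model *sched)
{
	assert(sched && sched->dev && sched->dev->primary);
	return sched->dev->primary->index;
}

/* The plain device is now one extra hop away (sched->dev->dev). */
static struct device *sched_parent_device(const struct gpu_sched_model *sched)
{
	assert(sched && sched->dev);
	return sched->dev->dev;
}
```

Callers that still need the plain device, such as the DRM_DEV_ERROR call
sites in the diff, take the extra hop, while the tracing code can reach the
minor index directly.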

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c    |  2 +-
 drivers/gpu/drm/imagination/pvr_queue.c    |  2 +-
 drivers/gpu/drm/lima/lima_sched.c          |  2 +-
 drivers/gpu/drm/msm/msm_ringbuffer.c       |  2 +-
 drivers/gpu/drm/nouveau/nouveau_sched.c    |  2 +-
 drivers/gpu/drm/panfrost/panfrost_job.c    |  2 +-
 drivers/gpu/drm/panthor/panthor_mmu.c      |  2 +-
 drivers/gpu/drm/panthor/panthor_sched.c    |  2 +-
 drivers/gpu/drm/scheduler/sched_entity.c   |  2 +-
 drivers/gpu/drm/scheduler/sched_main.c     | 26 +++++++++++-----------
 drivers/gpu/drm/v3d/v3d_sched.c            | 12 +++++-----
 drivers/gpu/drm/xe/xe_execlist.c           |  2 +-
 drivers/gpu/drm/xe/xe_gpu_scheduler.c      |  2 +-
 drivers/gpu/drm/xe/xe_gpu_scheduler.h      |  2 +-
 drivers/gpu/drm/xe/xe_guc_submit.c         |  2 +-
 include/drm/gpu_scheduler.h                |  4 ++--
 17 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 932dc93b2e63..7f2a68ad8034 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2738,7 +2738,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
 				   ring->num_hw_submission, 0,
 				   timeout, adev->reset_domain->wq,
 				   ring->sched_score, ring->name,
-				   adev->dev);
+				   &adev->ddev);
 		if (r) {
 			DRM_ERROR("Failed to create scheduler on ring %s.\n",
 				  ring->name);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index c4b04b0dee16..c4345b68a51f 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -138,7 +138,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
 			     DRM_SCHED_PRIORITY_COUNT,
 			     etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
 			     msecs_to_jiffies(500), NULL, NULL,
-			     dev_name(gpu->dev), gpu->dev);
+			     dev_name(gpu->dev), gpu->drm);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/imagination/pvr_queue.c b/drivers/gpu/drm/imagination/pvr_queue.c
index 5ed9c98fb599..cdbb6c01e952 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.c
+++ b/drivers/gpu/drm/imagination/pvr_queue.c
@@ -1287,7 +1287,7 @@ struct pvr_queue *pvr_queue_create(struct pvr_context *ctx,
 			     pvr_dev->sched_wq, 1, 64 * 1024, 1,
 			     msecs_to_jiffies(500),
 			     pvr_dev->sched_wq, NULL, "pvr-queue",
-			     pvr_dev->base.dev);
+			     &pvr_dev->base);
 	if (err)
 		goto err_release_ufo;
 
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index bbf3f8feab94..db6ee7650468 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -526,7 +526,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
 			      1,
 			      lima_job_hang_limit,
 			      msecs_to_jiffies(timeout), NULL,
-			      NULL, name, pipe->ldev->dev);
+			      NULL, name, pipe->ldev->ddev);
 }
 
 void lima_sched_pipe_fini(struct lima_sched_pipe *pipe)
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 9d6655f96f0c..3a4b3816f2c9 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -101,7 +101,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
 	ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
 			     DRM_SCHED_PRIORITY_COUNT,
 			     num_hw_submissions, 0, sched_timeout,
-			     NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
+			     NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev);
 	if (ret) {
 		goto fail;
 	}
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
index 32fa2e273965..386839bed8a2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -419,7 +419,7 @@ nouveau_sched_init(struct nouveau_sched *sched, struct nouveau_drm *drm,
 	ret = drm_sched_init(drm_sched, &nouveau_sched_ops, wq,
 			     NOUVEAU_SCHED_PRIORITY_COUNT,
 			     credit_limit, 0, job_hang_limit,
-			     NULL, NULL, "nouveau_sched", drm->dev->dev);
+			     NULL, NULL, "nouveau_sched", drm->dev);
 	if (ret)
 		goto fail_wq;
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index a61ef0af9a4e..28c7680a8dbf 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -875,7 +875,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
 				     nentries, 0,
 				     msecs_to_jiffies(JOB_TIMEOUT_MS),
 				     pfdev->reset.wq,
-				     NULL, "pan_js", pfdev->dev);
+				     NULL, "pan_js", pfdev->ddev);
 		if (ret) {
 			dev_err(pfdev->dev, "Failed to create scheduler: %d.", ret);
 			goto err_sched;
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index fa0a002b1016..b9c5b500b7d1 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -2293,7 +2293,7 @@ panthor_vm_create(struct panthor_device *ptdev, bool for_mcu,
 	ret = drm_sched_init(&vm->sched, &panthor_vm_bind_ops, ptdev->mmu->vm.wq,
 			     1, 1, 0,
 			     MAX_SCHEDULE_TIMEOUT, NULL, NULL,
-			     "panthor-vm-bind", ptdev->base.dev);
+			     "panthor-vm-bind", &ptdev->base);
 	if (ret)
 		goto err_free_io_pgtable;
 
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 79ffcbc41d78..47e52f61571b 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -3043,7 +3043,7 @@ group_create_queue(struct panthor_group *group,
 			     args->ringbuf_size / (NUM_INSTRS_PER_SLOT * sizeof(u64)),
 			     0, msecs_to_jiffies(JOB_TIMEOUT_MS),
 			     group->ptdev->reset.wq,
-			     NULL, "panthor-queue", group->ptdev->base.dev);
+			     NULL, "panthor-queue", &group->ptdev->base);
 	if (ret)
 		goto err_free_queue;
 
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 58c8161289fe..194798b9ce09 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -92,7 +92,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
 		 * the lowest priority available.
 		 */
 		if (entity->priority >= sched_list[0]->num_rqs) {
-			drm_err(sched_list[0], "entity with out-of-bounds priority:%u num_rqs:%u\n",
+			drm_err(sched_list[0]->dev, "entity with out-of-bounds priority:%u num_rqs:%u\n",
 				entity->priority, sched_list[0]->num_rqs);
 			entity->priority = max_t(s32, (s32) sched_list[0]->num_rqs - 1,
 						 (s32) DRM_SCHED_PRIORITY_KERNEL);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 7e90c9f95611..74a2fe51e653 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -103,9 +103,9 @@ static u32 drm_sched_available_credits(struct drm_gpu_scheduler *sched)
 {
 	u32 credits;
 
-	drm_WARN_ON(sched, check_sub_overflow(sched->credit_limit,
-					      atomic_read(&sched->credit_count),
-					      &credits));
+	drm_WARN_ON(sched->dev, check_sub_overflow(sched->credit_limit,
+						  atomic_read(&sched->credit_count),
+						  &credits));
 
 	return credits;
 }
@@ -130,14 +130,14 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched,
 	if (sched->ops->update_job_credits) {
 		s_job->credits = sched->ops->update_job_credits(s_job);
 
-		drm_WARN(sched, !s_job->credits,
+		drm_WARN(sched->dev, !s_job->credits,
 			 "Jobs with zero credits bypass job-flow control.\n");
 	}
 
 	/* If a job exceeds the credit limit, truncate it to the credit limit
 	 * itself to guarantee forward progress.
 	 */
-	if (drm_WARN(sched, s_job->credits > sched->credit_limit,
+	if (drm_WARN(sched->dev, s_job->credits > sched->credit_limit,
 		     "Jobs may not exceed the credit limit, truncate.\n"))
 		s_job->credits = sched->credit_limit;
 
@@ -701,7 +701,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 			if (r == -ENOENT)
 				drm_sched_job_done(s_job, fence->error);
 			else if (r)
-				DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n",
+				DRM_DEV_ERROR(sched->dev->dev, "fence add callback failed (%d)\n",
 					  r);
 		} else
 			drm_sched_job_done(s_job, -ECANCELED);
@@ -797,7 +797,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
 		 * or worse--a blank screen--leave a trail in the
 		 * logs, so this can be debugged easier.
 		 */
-		drm_err(job->sched, "%s: entity has no rq!\n", __func__);
+		drm_err(job->sched->dev, "%s: entity has no rq!\n", __func__);
 		return -ENOENT;
 	}
 
@@ -1215,7 +1215,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
 		if (r == -ENOENT)
 			drm_sched_job_done(sched_job, fence->error);
 		else if (r)
-			DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n", r);
+			DRM_DEV_ERROR(sched->dev->dev, "fence add callback failed (%d)\n", r);
 	} else {
 		drm_sched_job_done(sched_job, IS_ERR(fence) ?
 				   PTR_ERR(fence) : 0);
@@ -1240,7 +1240,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
  *		used
  * @score: optional score atomic shared with other schedulers
  * @name: name used for debugging
- * @dev: target &struct device
+ * @dev: target &struct drm_device
  *
  * Return 0 on success, otherwise error code.
  */
@@ -1249,7 +1249,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 		   struct workqueue_struct *submit_wq,
 		   u32 num_rqs, u32 credit_limit, unsigned int hang_limit,
 		   long timeout, struct workqueue_struct *timeout_wq,
-		   atomic_t *score, const char *name, struct device *dev)
+		   atomic_t *score, const char *name, struct drm_device *dev)
 {
 	int i;
 
@@ -1265,7 +1265,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 	if (num_rqs > DRM_SCHED_PRIORITY_COUNT) {
 		/* This is a gross violation--tell drivers what the  problem is.
 		 */
-		drm_err(sched, "%s: num_rqs cannot be greater than DRM_SCHED_PRIORITY_COUNT\n",
+		drm_err(dev, "%s: num_rqs cannot be greater than DRM_SCHED_PRIORITY_COUNT\n",
 			__func__);
 		return -EINVAL;
 	} else if (sched->sched_rq) {
@@ -1273,7 +1273,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 		 * fine-tune their DRM calling order, and return all
 		 * is good.
 		 */
-		drm_warn(sched, "%s: scheduler already initialized!\n", __func__);
+		drm_warn(dev, "%s: scheduler already initialized!\n", __func__);
 		return 0;
 	}
 
@@ -1322,7 +1322,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 Out_check_own:
 	if (sched->own_submit_wq)
 		destroy_workqueue(sched->submit_wq);
-	drm_err(sched, "%s: Failed to setup GPU scheduler--out of memory\n", __func__);
+	drm_err(dev, "%s: Failed to setup GPU scheduler--out of memory\n", __func__);
 	return -ENOMEM;
 }
 EXPORT_SYMBOL(drm_sched_init);
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 7cd8c335cd9b..73383b6ef9bb 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -740,7 +740,7 @@ v3d_sched_init(struct v3d_dev *v3d)
 			     DRM_SCHED_PRIORITY_COUNT,
 			     hw_jobs_limit, job_hang_limit,
 			     msecs_to_jiffies(hang_limit_ms), NULL,
-			     NULL, "v3d_bin", v3d->drm.dev);
+			     NULL, "v3d_bin", &v3d->drm);
 	if (ret)
 		return ret;
 
@@ -749,7 +749,7 @@ v3d_sched_init(struct v3d_dev *v3d)
 			     DRM_SCHED_PRIORITY_COUNT,
 			     hw_jobs_limit, job_hang_limit,
 			     msecs_to_jiffies(hang_limit_ms), NULL,
-			     NULL, "v3d_render", v3d->drm.dev);
+			     NULL, "v3d_render", &v3d->drm);
 	if (ret)
 		goto fail;
 
@@ -758,7 +758,7 @@ v3d_sched_init(struct v3d_dev *v3d)
 			     DRM_SCHED_PRIORITY_COUNT,
 			     hw_jobs_limit, job_hang_limit,
 			     msecs_to_jiffies(hang_limit_ms), NULL,
-			     NULL, "v3d_tfu", v3d->drm.dev);
+			     NULL, "v3d_tfu", &v3d->drm);
 	if (ret)
 		goto fail;
 
@@ -768,7 +768,7 @@ v3d_sched_init(struct v3d_dev *v3d)
 				     DRM_SCHED_PRIORITY_COUNT,
 				     hw_jobs_limit, job_hang_limit,
 				     msecs_to_jiffies(hang_limit_ms), NULL,
-				     NULL, "v3d_csd", v3d->drm.dev);
+				     NULL, "v3d_csd", &v3d->drm);
 		if (ret)
 			goto fail;
 
@@ -777,7 +777,7 @@ v3d_sched_init(struct v3d_dev *v3d)
 				     DRM_SCHED_PRIORITY_COUNT,
 				     hw_jobs_limit, job_hang_limit,
 				     msecs_to_jiffies(hang_limit_ms), NULL,
-				     NULL, "v3d_cache_clean", v3d->drm.dev);
+				     NULL, "v3d_cache_clean", &v3d->drm);
 		if (ret)
 			goto fail;
 	}
@@ -787,7 +787,7 @@ v3d_sched_init(struct v3d_dev *v3d)
 			     DRM_SCHED_PRIORITY_COUNT,
 			     1, job_hang_limit,
 			     msecs_to_jiffies(hang_limit_ms), NULL,
-			     NULL, "v3d_cpu", v3d->drm.dev);
+			     NULL, "v3d_cpu", &v3d->drm);
 	if (ret)
 		goto fail;
 
diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c
index dece2785933c..dc81e9f39727 100644
--- a/drivers/gpu/drm/xe/xe_execlist.c
+++ b/drivers/gpu/drm/xe/xe_execlist.c
@@ -336,7 +336,7 @@ static int execlist_exec_queue_init(struct xe_exec_queue *q)
 			     q->lrc[0].ring.size / MAX_JOB_SIZE_BYTES,
 			     XE_SCHED_HANG_LIMIT, XE_SCHED_JOB_TIMEOUT,
 			     NULL, NULL, q->hwe->name,
-			     gt_to_xe(q->gt)->drm.dev);
+			     &gt_to_xe(q->gt)->drm);
 	if (err)
 		goto err_free;
 
diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
index e4ad1d6ce1d5..66d36cac82a0 100644
--- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
@@ -61,7 +61,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
 		  uint32_t hw_submission, unsigned hang_limit,
 		  long timeout, struct workqueue_struct *timeout_wq,
 		  atomic_t *score, const char *name,
-		  struct device *dev)
+		  struct drm_device *dev)
 {
 	sched->ops = xe_ops;
 	INIT_LIST_HEAD(&sched->msgs);
diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
index 10c6bb9c9386..9a75457813f2 100644
--- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
@@ -16,7 +16,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
 		  uint32_t hw_submission, unsigned hang_limit,
 		  long timeout, struct workqueue_struct *timeout_wq,
 		  atomic_t *score, const char *name,
-		  struct device *dev);
+		  struct drm_device *dev);
 void xe_sched_fini(struct xe_gpu_scheduler *sched);
 
 void xe_sched_submission_start(struct xe_gpu_scheduler *sched);
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index e4e3658e6a13..b9c114f2c715 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -1208,7 +1208,7 @@ static int guc_exec_queue_init(struct xe_exec_queue *q)
 			    get_submit_wq(guc),
 			    q->lrc[0].ring.size / MAX_JOB_SIZE_BYTES, 64,
 			    timeout, guc_to_gt(guc)->ordered_wq, NULL,
-			    q->name, gt_to_xe(q->gt)->drm.dev);
+			    q->name, &gt_to_xe(q->gt)->drm);
 	if (err)
 		goto err_free;
 
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 5acc64954a88..0ba8716ec069 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -539,7 +539,7 @@ struct drm_gpu_scheduler {
 	bool				free_guilty;
 	bool				pause_submit;
 	bool				own_submit_wq;
-	struct device			*dev;
+	struct drm_device		*dev;
 };
 
 int drm_sched_init(struct drm_gpu_scheduler *sched,
@@ -547,7 +547,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 		   struct workqueue_struct *submit_wq,
 		   u32 num_rqs, u32 credit_limit, unsigned int hang_limit,
 		   long timeout, struct workqueue_struct *timeout_wq,
-		   atomic_t *score, const char *name, struct device *dev);
+		   atomic_t *score, const char *name, struct drm_device *dev);
 
 void drm_sched_fini(struct drm_gpu_scheduler *sched);
 int drm_sched_job_init(struct drm_sched_job *job,
-- 
2.40.1


* [PATCH v3 2/4] drm/sched: add dev_index=xx to the drm_sched_process_job event
  2024-06-06 13:06 [PATCH v3 0/4] Improve gpu_scheduler trace events Pierre-Eric Pelloux-Prayer
  2024-06-06 13:06 ` [PATCH v3 1/4] drm/sched: store the drm_device instead of the device Pierre-Eric Pelloux-Prayer
@ 2024-06-06 13:06 ` Pierre-Eric Pelloux-Prayer
  2024-06-06 13:06 ` [PATCH v3 3/4] drm/sched: cleanup gpu_scheduler trace events Pierre-Eric Pelloux-Prayer
  2024-06-06 13:06 ` [PATCH v3 4/4] drm/sched: trace dependencies for gpu jobs Pierre-Eric Pelloux-Prayer
  3 siblings, 0 replies; 14+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-06-06 13:06 UTC
  To: alexander.deucher, christian.koenig, ltuikov89, matthew.brost,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel,
	dri-devel, ville.syrjala, rostedt
  Cc: Pierre-Eric Pelloux-Prayer

Until the switch from kthread to workqueue, a userspace application could
determine the device index from the pid of the thread sending this event.

With workqueues this is no longer possible, so the event needs to contain
this information (the ring name alone is not enough, because ring names are
not unique).
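
As a rough sketch of how a userspace tool might consume the new field, the
helper below pulls dev_index and ring out of one drm_run_job line. It
assumes the "dev_index=%d" and "ring=%s," layout from the TP_printk in this
patch; parse_run_job() is a hypothetical name and not part of any existing
tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract dev_index and ring from one drm_run_job trace line.
 * Returns 0 on success, -1 if either field is missing.
 */
static int parse_run_job(const char *line, int *dev_index,
			 char *ring, size_t ring_len)
{
	const char *p;
	size_t n;

	assert(line && dev_index && ring && ring_len > 0);

	/* Look fields up by key so the order of other fields does not matter. */
	p = strstr(line, "dev_index=");
	if (!p || sscanf(p, "dev_index=%d", dev_index) != 1)
		return -1;

	p = strstr(line, "ring=");
	if (!p)
		return -1;
	p += strlen("ring=");

	/* Assumption: the ring name ends at the next comma. */
	n = strcspn(p, ",");
	if (n >= ring_len)
		n = ring_len - 1;   /* truncate to fit the caller's buffer */
	memcpy(ring, p, n);
	ring[n] = '\0';
	return 0;
}
```

A tool can then key its per-queue state on the (dev_index, ring) pair
rather than on the ring name alone.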

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/scheduler/gpu_scheduler_trace.h | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
index c75302ca3427..0a19c121bda5 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
@@ -28,6 +28,9 @@
 #include <linux/types.h>
 #include <linux/tracepoint.h>
 
+#include "drm/drm_device.h"
+#include "drm/drm_file.h"
+
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM gpu_scheduler
 #define TRACE_INCLUDE_FILE gpu_scheduler_trace
@@ -42,6 +45,7 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 			     __field(uint64_t, id)
 			     __field(u32, job_count)
 			     __field(int, hw_job_count)
+			     __field(int, dev_index)
 			     ),
 
 	    TP_fast_assign(
@@ -52,6 +56,7 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 			   __entry->job_count = spsc_queue_count(&entity->job_queue);
 			   __entry->hw_job_count = atomic_read(
 				   &sched_job->sched->credit_count);
+			   __entry->dev_index = sched_job->sched->dev->primary->index;
 			   ),
 	    TP_printk("entity=%p, id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
 		      __entry->entity, __entry->id,
@@ -64,9 +69,13 @@ DEFINE_EVENT(drm_sched_job, drm_sched_job,
 	    TP_ARGS(sched_job, entity)
 );
 
-DEFINE_EVENT(drm_sched_job, drm_run_job,
+DEFINE_EVENT_PRINT(drm_sched_job, drm_run_job,
 	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
-	    TP_ARGS(sched_job, entity)
+	    TP_ARGS(sched_job, entity),
+	    TP_printk("dev_index=%d entity=%p id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
+		      __entry->dev_index, __entry->entity, __entry->id,
+		      __entry->fence, __get_str(name),
+		      __entry->job_count, __entry->hw_job_count)
 );
 
 TRACE_EVENT(drm_sched_process_job,
-- 
2.40.1


* [PATCH v3 3/4] drm/sched: cleanup gpu_scheduler trace events
  2024-06-06 13:06 [PATCH v3 0/4] Improve gpu_scheduler trace events Pierre-Eric Pelloux-Prayer
  2024-06-06 13:06 ` [PATCH v3 1/4] drm/sched: store the drm_device instead of the device Pierre-Eric Pelloux-Prayer
  2024-06-06 13:06 ` [PATCH v3 2/4] drm/sched: add dev_index=xx to the drm_sched_process_job event Pierre-Eric Pelloux-Prayer
@ 2024-06-06 13:06 ` Pierre-Eric Pelloux-Prayer
  2024-06-06 13:19   ` Steven Rostedt
  2024-06-06 13:06 ` [PATCH v3 4/4] drm/sched: trace dependencies for gpu jobs Pierre-Eric Pelloux-Prayer
  3 siblings, 1 reply; 14+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-06-06 13:06 UTC
  To: alexander.deucher, christian.koenig, ltuikov89, matthew.brost,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel,
	dri-devel, ville.syrjala, rostedt
  Cc: Pierre-Eric Pelloux-Prayer

Print identifiers instead of pointers:
* "fence=%p" is replaced by "fence=(context:%llu, seqno:%lld)" so that
fences are printed the same way in every event. A possible follow-up change
would be to use the same format in traces/../dma-fence.h.
* "entity=%p" is removed because the fence's context already identifies
the job owner.
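
With a single format, a parser only needs one routine to read a fence from
any of these events. The following is a minimal sketch under that
assumption; struct fence_id, parse_fence() and fence_equal() are
hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A fence as identified in the trace output. */
struct fence_id {
	unsigned long long context;
	long long seqno;
};

/*
 * Find the first "fence=(context:N, seqno:M)" in a trace line.
 * Returns 0 on success, -1 if the pattern is absent or malformed.
 */
static int parse_fence(const char *line, struct fence_id *out)
{
	const char *p;

	assert(line && out);

	p = strstr(line, "fence=(context:");
	if (!p)
		return -1;
	if (sscanf(p, "fence=(context:%llu, seqno:%lld)",
		   &out->context, &out->seqno) != 2)
		return -1;
	return 0;
}

/* Two events refer to the same fence when both fields match. */
static int fence_equal(const struct fence_id *a, const struct fence_id *b)
{
	return a->context == b->context && a->seqno == b->seqno;
}
```

Matching the fence of a drm_sched_job event against the one reported by
drm_sched_process_job is then a plain comparison of the two fields.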

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 27 ++++++++++---------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
index 0a19c121bda5..2d7f2252eb5d 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
@@ -58,9 +58,9 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 				   &sched_job->sched->credit_count);
 			   __entry->dev_index = sched_job->sched->dev->primary->index;
 			   ),
-	    TP_printk("entity=%p, id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
-		      __entry->entity, __entry->id,
-		      __entry->fence, __get_str(name),
+	    TP_printk("id=%llu, fence=(context:%llu, seqno:%lld), ring=%s, job count:%u, hw job count:%d",
+		      __entry->id,
+		      __entry->fence->context, __entry->fence->seqno, __get_str(name),
 		      __entry->job_count, __entry->hw_job_count)
 );
 
@@ -72,9 +72,9 @@ DEFINE_EVENT(drm_sched_job, drm_sched_job,
 DEFINE_EVENT_PRINT(drm_sched_job, drm_run_job,
 	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
 	    TP_ARGS(sched_job, entity),
-	    TP_printk("dev_index=%d entity=%p id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
-		      __entry->dev_index, __entry->entity, __entry->id,
-		      __entry->fence, __get_str(name),
+	    TP_printk("dev_index=%d id=%llu, fence=(context:%llu, seqno:%lld), ring=%s, job count:%u, hw job count:%d",
+		      __entry->dev_index, __entry->id,
+		      __entry->fence->context, __entry->fence->seqno, __get_str(name),
 		      __entry->job_count, __entry->hw_job_count)
 );
 
@@ -88,7 +88,8 @@ TRACE_EVENT(drm_sched_process_job,
 	    TP_fast_assign(
 		    __entry->fence = &fence->finished;
 		    ),
-	    TP_printk("fence=%p signaled", __entry->fence)
+	    TP_printk("fence=(context:%llu, seqno:%lld) signaled",
+		      __entry->fence->context, __entry->fence->seqno)
 );
 
 TRACE_EVENT(drm_sched_job_wait_dep,
@@ -96,23 +97,25 @@ TRACE_EVENT(drm_sched_job_wait_dep,
 	    TP_ARGS(sched_job, fence),
 	    TP_STRUCT__entry(
 			     __string(name, sched_job->sched->name)
+			     __field(uint64_t, fence_context)
 			     __field(uint64_t, id)
 			     __field(struct dma_fence *, fence)
 			     __field(uint64_t, ctx)
-			     __field(unsigned, seqno)
+			     __field(uint64_t, seqno)
 			     ),
 
 	    TP_fast_assign(
 			   __assign_str(name);
+			   /* Store the hw exec fence context. */
+			   __entry->fence_context = sched_job->entity->fence_context + 1;
 			   __entry->id = sched_job->id;
 			   __entry->fence = fence;
 			   __entry->ctx = fence->context;
 			   __entry->seqno = fence->seqno;
 			   ),
-	    TP_printk("job ring=%s, id=%llu, depends fence=%p, context=%llu, seq=%u",
-		      __get_str(name), __entry->id,
-		      __entry->fence, __entry->ctx,
-		      __entry->seqno)
+	    TP_printk("job ring=%s, fence_context=%llu, id=%llu, depends fence=(context:%llu, seqno:%lld)",
+		      __get_str(name), __entry->fence_context, __entry->id,
+		      __entry->ctx, __entry->seqno)
 );
 
 #endif
-- 
2.40.1


* [PATCH v3 4/4] drm/sched: trace dependencies for gpu jobs
  2024-06-06 13:06 [PATCH v3 0/4] Improve gpu_scheduler trace events Pierre-Eric Pelloux-Prayer
                   ` (2 preceding siblings ...)
  2024-06-06 13:06 ` [PATCH v3 3/4] drm/sched: cleanup gpu_scheduler trace events Pierre-Eric Pelloux-Prayer
@ 2024-06-06 13:06 ` Pierre-Eric Pelloux-Prayer
  3 siblings, 0 replies; 14+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-06-06 13:06 UTC
  To: alexander.deucher, christian.koenig, ltuikov89, matthew.brost,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel,
	dri-devel, ville.syrjala, rostedt
  Cc: Pierre-Eric Pelloux-Prayer

Trace the fence dependencies similarly to how we print fences:

 ... , dependencies:{(context:606, seqno:38006)}

This allows tools to analyze the dependencies between the jobs (previously
it was only possible for fences traced by drm_sched_job_wait_dep).

Since drm_sched_job and drm_run_job use the same base event class,
the caller of trace_drm_run_job has to pass a dep_count of 0 (which
is ignored because dependencies are only considered at submit time).
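
For illustration, a tool could read the dependency list back with
something like the sketch below. It assumes the
"dependencies:{(context:N, seqno:M),...}" layout shown above, with entries
separated by a bare comma; struct fence_id and parse_dependencies() are
hypothetical names for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A fence as identified in the trace output. */
struct fence_id {
	unsigned long long context;
	long long seqno;
};

/*
 * Parse "dependencies:{(context:A, seqno:B),(context:C, seqno:D)}".
 * Stores at most max entries in deps and returns how many were stored,
 * or -1 if the field is missing or malformed.
 */
static int parse_dependencies(const char *line, struct fence_id *deps, int max)
{
	const char *p;
	int n = 0;

	assert(line && deps && max >= 0);

	p = strstr(line, "dependencies:{");
	if (!p)
		return -1;
	p += strlen("dependencies:{");

	while (*p && *p != '}' && n < max) {
		int used = 0;

		if (*p == ',')
			p++;    /* skip the separator between entries */

		/* %n is only reached if the closing ')' matched as well. */
		if (sscanf(p, "(context:%llu, seqno:%lld)%n",
			   &deps[n].context, &deps[n].seqno, &used) != 2 ||
		    used == 0)
			return -1;
		p += used;
		n++;
	}
	return n;
}
```

Each returned entry can be matched against the fence printed by an earlier
drm_sched_job event to build the dependency graph between jobs.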

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 58 ++++++++++++++++---
 drivers/gpu/drm/scheduler/sched_entity.c      |  8 ++-
 drivers/gpu/drm/scheduler/sched_main.c        |  2 +-
 3 files changed, 58 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
index 2d7f2252eb5d..9d90237793a1 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
@@ -26,6 +26,7 @@
 
 #include <linux/stringify.h>
 #include <linux/types.h>
+#include <linux/trace_seq.h>
 #include <linux/tracepoint.h>
 
 #include "drm/drm_device.h"
@@ -35,9 +36,34 @@
 #define TRACE_SYSTEM gpu_scheduler
 #define TRACE_INCLUDE_FILE gpu_scheduler_trace
 
+#ifndef __TRACE_EVENT_GPU_SCHEDULER_PRINT_FN
+#define __TRACE_EVENT_GPU_SCHEDULER_PRINT_FN
+/* Similar to trace_print_array_seq but for fences. */
+static inline const char *__print_dma_fence_array(struct trace_seq *p, const void *buf, int count)
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+	u64 *fences = (u64 *) buf;
+	const char *prefix = "";
+
+	trace_seq_putc(p, '{');
+	for (int i = 0; i < count; i++) {
+		u64 context = fences[2 * i], seqno = fences[2 * i + 1];
+
+		trace_seq_printf(p, "%s(context:%llu, seqno:%lld)",
+				 prefix, context, seqno);
+		prefix = ",";
+	}
+	trace_seq_putc(p, '}');
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+#endif
+
 DECLARE_EVENT_CLASS(drm_sched_job,
-	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
-	    TP_ARGS(sched_job, entity),
+	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity,
+		     unsigned int dep_count),
+	    TP_ARGS(sched_job, entity, dep_count),
 	    TP_STRUCT__entry(
 			     __field(struct drm_sched_entity *, entity)
 			     __field(struct dma_fence *, fence)
@@ -46,9 +72,14 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 			     __field(u32, job_count)
 			     __field(int, hw_job_count)
 			     __field(int, dev_index)
+			     __field(int, n_deps)
+			     __dynamic_array(u64, deps, dep_count * 2)
 			     ),
 
 	    TP_fast_assign(
+			   unsigned long idx;
+			   struct dma_fence *fence;
+			   u64 *dyn_arr;
 			   __entry->entity = entity;
 			   __entry->id = sched_job->id;
 			   __entry->fence = &sched_job->s_fence->finished;
@@ -57,21 +88,32 @@ DECLARE_EVENT_CLASS(drm_sched_job,
 			   __entry->hw_job_count = atomic_read(
 				   &sched_job->sched->credit_count);
 			   __entry->dev_index = sched_job->sched->dev->primary->index;
+			   __entry->n_deps = dep_count;
+			   if (dep_count) {
+				dyn_arr = __get_dynamic_array(deps);
+				xa_for_each(&sched_job->dependencies, idx, fence) {
+					dyn_arr[2 * idx] = fence->context;
+					dyn_arr[2 * idx + 1] = fence->seqno;
+				}
+			   }
 			   ),
-	    TP_printk("id=%llu, fence=(context:%llu, seqno:%lld), ring=%s, job count:%u, hw job count:%d",
+	    TP_printk("id=%llu, fence=(context:%llu, seqno:%lld), ring=%s, job count:%u, hw job count:%d, dependencies:%s",
 		      __entry->id,
 		      __entry->fence->context, __entry->fence->seqno, __get_str(name),
-		      __entry->job_count, __entry->hw_job_count)
+		      __entry->job_count, __entry->hw_job_count,
+		      __print_dma_fence_array(p, __get_dynamic_array(deps), __entry->n_deps))
 );
 
 DEFINE_EVENT(drm_sched_job, drm_sched_job,
-	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
-	    TP_ARGS(sched_job, entity)
+	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity,
+		     unsigned int dep_count),
+	    TP_ARGS(sched_job, entity, dep_count)
 );
 
 DEFINE_EVENT_PRINT(drm_sched_job, drm_run_job,
-	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
-	    TP_ARGS(sched_job, entity),
+	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity,
+		     unsigned int dep_count),
+	    TP_ARGS(sched_job, entity, 0),
 	    TP_printk("dev_index=%d id=%llu, fence=(context:%llu, seqno:%lld), ring=%s, job count:%u, hw job count:%d",
 		      __entry->dev_index, __entry->id,
 		      __entry->fence->context, __entry->fence->seqno, __get_str(name),
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 194798b9ce09..d252f3aeed47 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -583,7 +583,13 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
 	bool first;
 	ktime_t submit_ts;
 
-	trace_drm_sched_job(sched_job, entity);
+	if (trace_drm_sched_job_enabled()) {
+		unsigned long index;
+		void *f;
+
+		xa_for_each(&sched_job->dependencies, index, f) { }
+		trace_drm_sched_job(sched_job, entity, index);
+	}
 	atomic_inc(entity->rq->sched->score);
 	WRITE_ONCE(entity->last_user, current->group_leader);
 
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 74a2fe51e653..b3b89ea0d96d 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1201,7 +1201,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
 	atomic_add(sched_job->credits, &sched->credit_count);
 	drm_sched_job_begin(sched_job);
 
-	trace_drm_run_job(sched_job, entity);
+	trace_drm_run_job(sched_job, entity, 0);
 	fence = sched->ops->run_job(sched_job);
 	complete_all(&entity->entity_idle);
 	drm_sched_fence_scheduled(s_fence, fence);
-- 
2.40.1
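
For readers who want to see the output format of the new "dependencies" field without building a kernel, here is a minimal userspace sketch of the formatting done by __print_dma_fence_array(). It is not kernel code: format_fence_pairs() is a hypothetical name, snprintf() stands in for the trace_seq helpers, and the pair count is passed explicitly by the caller.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Userspace stand-in for __print_dma_fence_array(). 'pairs' holds 'count'
 * (context, seqno) tuples stored back to back, like the 'deps' dynamic
 * array in the patch. format_fence_pairs() is a hypothetical helper, not
 * a kernel API. Returns the length written, or -1 if 'out' is too small.
 */
static int format_fence_pairs(char *out, size_t len,
			      const uint64_t *pairs, int count)
{
	const char *prefix = "";
	size_t used;
	int n;

	assert(out != NULL && len > 0);

	n = snprintf(out, len, "{");
	if (n < 0 || (size_t)n >= len)
		return -1;
	used = (size_t)n;

	for (int i = 0; i < count; i++) {
		/* Slot 2*i is the fence context, slot 2*i + 1 its seqno. */
		n = snprintf(out + used, len - used,
			     "%s(context:%" PRIu64 ", seqno:%" PRId64 ")",
			     prefix, pairs[2 * i], (int64_t)pairs[2 * i + 1]);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
		prefix = ",";
	}

	n = snprintf(out + used, len - used, "}");
	if (n < 0 || (size_t)n >= len - used)
		return -1;
	used += (size_t)n;

	/* Sanity check: the buffer is NUL-terminated at the expected length. */
	assert(strlen(out) == used);
	return (int)used;
}
```

With two dependencies stored as {7, 42, 9, 3}, the helper produces "{(context:7, seqno:42),(context:9, seqno:3)}"; with no dependencies it produces "{}".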



* Re: [PATCH v3 1/4] drm/sched: store the drm_device instead of the device
  2024-06-06 13:06 ` [PATCH v3 1/4] drm/sched: store the drm_device instead of the device Pierre-Eric Pelloux-Prayer
@ 2024-06-06 13:18   ` Christian König
  2024-06-06 13:23     ` Ville Syrjälä
                       ` (2 more replies)
  2024-06-07  0:34   ` kernel test robot
  2024-06-08  0:36   ` kernel test robot
  2 siblings, 3 replies; 14+ messages in thread
From: Christian König @ 2024-06-06 13:18 UTC
  To: Pierre-Eric Pelloux-Prayer, alexander.deucher, ltuikov89,
	matthew.brost, maarten.lankhorst, mripard, tzimmermann, airlied,
	daniel, dri-devel, ville.syrjala, rostedt

On 06.06.24 at 15:06, Pierre-Eric Pelloux-Prayer wrote:
> When tracing is enabled, being able to identify which device is sending
> events is useful; for this the next commit will extend events to include
> drm_device::primary::index.

That sounds like a rather bad idea since the primary index is really 
just an arbitrary number and not defined for all devices.

Why not use the device name instead? This way you don't need this change 
in the first place.

Regards,
Christian.

>
> Since the device member is only used in the drm_* log macros, we can
> replace it by a drm_device pointer.
>
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
>   drivers/gpu/drm/etnaviv/etnaviv_sched.c    |  2 +-
>   drivers/gpu/drm/imagination/pvr_queue.c    |  2 +-
>   drivers/gpu/drm/lima/lima_sched.c          |  2 +-
>   drivers/gpu/drm/msm/msm_ringbuffer.c       |  2 +-
>   drivers/gpu/drm/nouveau/nouveau_sched.c    |  2 +-
>   drivers/gpu/drm/panfrost/panfrost_job.c    |  2 +-
>   drivers/gpu/drm/panthor/panthor_mmu.c      |  2 +-
>   drivers/gpu/drm/panthor/panthor_sched.c    |  2 +-
>   drivers/gpu/drm/scheduler/sched_entity.c   |  2 +-
>   drivers/gpu/drm/scheduler/sched_main.c     | 26 +++++++++++-----------
>   drivers/gpu/drm/v3d/v3d_sched.c            | 12 +++++-----
>   drivers/gpu/drm/xe/xe_execlist.c           |  2 +-
>   drivers/gpu/drm/xe/xe_gpu_scheduler.c      |  2 +-
>   drivers/gpu/drm/xe/xe_gpu_scheduler.h      |  2 +-
>   drivers/gpu/drm/xe/xe_guc_submit.c         |  2 +-
>   include/drm/gpu_scheduler.h                |  4 ++--
>   17 files changed, 35 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 932dc93b2e63..7f2a68ad8034 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2738,7 +2738,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
>   				   ring->num_hw_submission, 0,
>   				   timeout, adev->reset_domain->wq,
>   				   ring->sched_score, ring->name,
> -				   adev->dev);
> +				   &adev->ddev);
>   		if (r) {
>   			DRM_ERROR("Failed to create scheduler on ring %s.\n",
>   				  ring->name);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index c4b04b0dee16..c4345b68a51f 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -138,7 +138,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
>   			     DRM_SCHED_PRIORITY_COUNT,
>   			     etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
>   			     msecs_to_jiffies(500), NULL, NULL,
> -			     dev_name(gpu->dev), gpu->dev);
> +			     dev_name(gpu->dev), gpu->drm);
>   	if (ret)
>   		return ret;
>   
> diff --git a/drivers/gpu/drm/imagination/pvr_queue.c b/drivers/gpu/drm/imagination/pvr_queue.c
> index 5ed9c98fb599..cdbb6c01e952 100644
> --- a/drivers/gpu/drm/imagination/pvr_queue.c
> +++ b/drivers/gpu/drm/imagination/pvr_queue.c
> @@ -1287,7 +1287,7 @@ struct pvr_queue *pvr_queue_create(struct pvr_context *ctx,
>   			     pvr_dev->sched_wq, 1, 64 * 1024, 1,
>   			     msecs_to_jiffies(500),
>   			     pvr_dev->sched_wq, NULL, "pvr-queue",
> -			     pvr_dev->base.dev);
> +			     &pvr_dev->base);
>   	if (err)
>   		goto err_release_ufo;
>   
> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> index bbf3f8feab94..db6ee7650468 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -526,7 +526,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
>   			      1,
>   			      lima_job_hang_limit,
>   			      msecs_to_jiffies(timeout), NULL,
> -			      NULL, name, pipe->ldev->dev);
> +			      NULL, name, pipe->ldev->ddev);
>   }
>   
>   void lima_sched_pipe_fini(struct lima_sched_pipe *pipe)
> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
> index 9d6655f96f0c..3a4b3816f2c9 100644
> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
> @@ -101,7 +101,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
>   	ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
>   			     DRM_SCHED_PRIORITY_COUNT,
>   			     num_hw_submissions, 0, sched_timeout,
> -			     NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
> +			     NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev);
>   	if (ret) {
>   		goto fail;
>   	}
> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
> index 32fa2e273965..386839bed8a2 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
> @@ -419,7 +419,7 @@ nouveau_sched_init(struct nouveau_sched *sched, struct nouveau_drm *drm,
>   	ret = drm_sched_init(drm_sched, &nouveau_sched_ops, wq,
>   			     NOUVEAU_SCHED_PRIORITY_COUNT,
>   			     credit_limit, 0, job_hang_limit,
> -			     NULL, NULL, "nouveau_sched", drm->dev->dev);
> +			     NULL, NULL, "nouveau_sched", drm->dev);
>   	if (ret)
>   		goto fail_wq;
>   
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index a61ef0af9a4e..28c7680a8dbf 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -875,7 +875,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>   				     nentries, 0,
>   				     msecs_to_jiffies(JOB_TIMEOUT_MS),
>   				     pfdev->reset.wq,
> -				     NULL, "pan_js", pfdev->dev);
> +				     NULL, "pan_js", pfdev->ddev);
>   		if (ret) {
>   			dev_err(pfdev->dev, "Failed to create scheduler: %d.", ret);
>   			goto err_sched;
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> index fa0a002b1016..b9c5b500b7d1 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -2293,7 +2293,7 @@ panthor_vm_create(struct panthor_device *ptdev, bool for_mcu,
>   	ret = drm_sched_init(&vm->sched, &panthor_vm_bind_ops, ptdev->mmu->vm.wq,
>   			     1, 1, 0,
>   			     MAX_SCHEDULE_TIMEOUT, NULL, NULL,
> -			     "panthor-vm-bind", ptdev->base.dev);
> +			     "panthor-vm-bind", &ptdev->base);
>   	if (ret)
>   		goto err_free_io_pgtable;
>   
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index 79ffcbc41d78..47e52f61571b 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -3043,7 +3043,7 @@ group_create_queue(struct panthor_group *group,
>   			     args->ringbuf_size / (NUM_INSTRS_PER_SLOT * sizeof(u64)),
>   			     0, msecs_to_jiffies(JOB_TIMEOUT_MS),
>   			     group->ptdev->reset.wq,
> -			     NULL, "panthor-queue", group->ptdev->base.dev);
> +			     NULL, "panthor-queue", &group->ptdev->base);
>   	if (ret)
>   		goto err_free_queue;
>   
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 58c8161289fe..194798b9ce09 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -92,7 +92,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>   		 * the lowest priority available.
>   		 */
>   		if (entity->priority >= sched_list[0]->num_rqs) {
> -			drm_err(sched_list[0], "entity with out-of-bounds priority:%u num_rqs:%u\n",
> +			drm_err(sched_list[0]->dev, "entity with out-of-bounds priority:%u num_rqs:%u\n",
>   				entity->priority, sched_list[0]->num_rqs);
>   			entity->priority = max_t(s32, (s32) sched_list[0]->num_rqs - 1,
>   						 (s32) DRM_SCHED_PRIORITY_KERNEL);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 7e90c9f95611..74a2fe51e653 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -103,9 +103,9 @@ static u32 drm_sched_available_credits(struct drm_gpu_scheduler *sched)
>   {
>   	u32 credits;
>   
> -	drm_WARN_ON(sched, check_sub_overflow(sched->credit_limit,
> -					      atomic_read(&sched->credit_count),
> -					      &credits));
> +	drm_WARN_ON(sched->dev, check_sub_overflow(sched->credit_limit,
> +						  atomic_read(&sched->credit_count),
> +						  &credits));
>   
>   	return credits;
>   }
> @@ -130,14 +130,14 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched,
>   	if (sched->ops->update_job_credits) {
>   		s_job->credits = sched->ops->update_job_credits(s_job);
>   
> -		drm_WARN(sched, !s_job->credits,
> +		drm_WARN(sched->dev, !s_job->credits,
>   			 "Jobs with zero credits bypass job-flow control.\n");
>   	}
>   
>   	/* If a job exceeds the credit limit, truncate it to the credit limit
>   	 * itself to guarantee forward progress.
>   	 */
> -	if (drm_WARN(sched, s_job->credits > sched->credit_limit,
> +	if (drm_WARN(sched->dev, s_job->credits > sched->credit_limit,
>   		     "Jobs may not exceed the credit limit, truncate.\n"))
>   		s_job->credits = sched->credit_limit;
>   
> @@ -701,7 +701,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   			if (r == -ENOENT)
>   				drm_sched_job_done(s_job, fence->error);
>   			else if (r)
> -				DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n",
> +				DRM_DEV_ERROR(sched->dev->dev, "fence add callback failed (%d)\n",
>   					  r);
>   		} else
>   			drm_sched_job_done(s_job, -ECANCELED);
> @@ -797,7 +797,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>   		 * or worse--a blank screen--leave a trail in the
>   		 * logs, so this can be debugged easier.
>   		 */
> -		drm_err(job->sched, "%s: entity has no rq!\n", __func__);
> +		drm_err(job->sched->dev, "%s: entity has no rq!\n", __func__);
>   		return -ENOENT;
>   	}
>   
> @@ -1215,7 +1215,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
>   		if (r == -ENOENT)
>   			drm_sched_job_done(sched_job, fence->error);
>   		else if (r)
> -			DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n", r);
> +			DRM_DEV_ERROR(sched->dev->dev, "fence add callback failed (%d)\n", r);
>   	} else {
>   		drm_sched_job_done(sched_job, IS_ERR(fence) ?
>   				   PTR_ERR(fence) : 0);
> @@ -1240,7 +1240,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
>    *		used
>    * @score: optional score atomic shared with other schedulers
>    * @name: name used for debugging
> - * @dev: target &struct device
> + * @dev: target &struct drm_device
>    *
>    * Return 0 on success, otherwise error code.
>    */
> @@ -1249,7 +1249,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   		   struct workqueue_struct *submit_wq,
>   		   u32 num_rqs, u32 credit_limit, unsigned int hang_limit,
>   		   long timeout, struct workqueue_struct *timeout_wq,
> -		   atomic_t *score, const char *name, struct device *dev)
> +		   atomic_t *score, const char *name, struct drm_device *dev)
>   {
>   	int i;
>   
> @@ -1265,7 +1265,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   	if (num_rqs > DRM_SCHED_PRIORITY_COUNT) {
>   		/* This is a gross violation--tell drivers what the  problem is.
>   		 */
> -		drm_err(sched, "%s: num_rqs cannot be greater than DRM_SCHED_PRIORITY_COUNT\n",
> +		drm_err(dev, "%s: num_rqs cannot be greater than DRM_SCHED_PRIORITY_COUNT\n",
>   			__func__);
>   		return -EINVAL;
>   	} else if (sched->sched_rq) {
> @@ -1273,7 +1273,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   		 * fine-tune their DRM calling order, and return all
>   		 * is good.
>   		 */
> -		drm_warn(sched, "%s: scheduler already initialized!\n", __func__);
> +		drm_warn(dev, "%s: scheduler already initialized!\n", __func__);
>   		return 0;
>   	}
>   
> @@ -1322,7 +1322,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   Out_check_own:
>   	if (sched->own_submit_wq)
>   		destroy_workqueue(sched->submit_wq);
> -	drm_err(sched, "%s: Failed to setup GPU scheduler--out of memory\n", __func__);
> +	drm_err(dev, "%s: Failed to setup GPU scheduler--out of memory\n", __func__);
>   	return -ENOMEM;
>   }
>   EXPORT_SYMBOL(drm_sched_init);
> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> index 7cd8c335cd9b..73383b6ef9bb 100644
> --- a/drivers/gpu/drm/v3d/v3d_sched.c
> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
> @@ -740,7 +740,7 @@ v3d_sched_init(struct v3d_dev *v3d)
>   			     DRM_SCHED_PRIORITY_COUNT,
>   			     hw_jobs_limit, job_hang_limit,
>   			     msecs_to_jiffies(hang_limit_ms), NULL,
> -			     NULL, "v3d_bin", v3d->drm.dev);
> +			     NULL, "v3d_bin", &v3d->drm);
>   	if (ret)
>   		return ret;
>   
> @@ -749,7 +749,7 @@ v3d_sched_init(struct v3d_dev *v3d)
>   			     DRM_SCHED_PRIORITY_COUNT,
>   			     hw_jobs_limit, job_hang_limit,
>   			     msecs_to_jiffies(hang_limit_ms), NULL,
> -			     NULL, "v3d_render", v3d->drm.dev);
> +			     NULL, "v3d_render", &v3d->drm);
>   	if (ret)
>   		goto fail;
>   
> @@ -758,7 +758,7 @@ v3d_sched_init(struct v3d_dev *v3d)
>   			     DRM_SCHED_PRIORITY_COUNT,
>   			     hw_jobs_limit, job_hang_limit,
>   			     msecs_to_jiffies(hang_limit_ms), NULL,
> -			     NULL, "v3d_tfu", v3d->drm.dev);
> +			     NULL, "v3d_tfu", &v3d->drm);
>   	if (ret)
>   		goto fail;
>   
> @@ -768,7 +768,7 @@ v3d_sched_init(struct v3d_dev *v3d)
>   				     DRM_SCHED_PRIORITY_COUNT,
>   				     hw_jobs_limit, job_hang_limit,
>   				     msecs_to_jiffies(hang_limit_ms), NULL,
> -				     NULL, "v3d_csd", v3d->drm.dev);
> +				     NULL, "v3d_csd", &v3d->drm);
>   		if (ret)
>   			goto fail;
>   
> @@ -777,7 +777,7 @@ v3d_sched_init(struct v3d_dev *v3d)
>   				     DRM_SCHED_PRIORITY_COUNT,
>   				     hw_jobs_limit, job_hang_limit,
>   				     msecs_to_jiffies(hang_limit_ms), NULL,
> -				     NULL, "v3d_cache_clean", v3d->drm.dev);
> +				     NULL, "v3d_cache_clean", &v3d->drm);
>   		if (ret)
>   			goto fail;
>   	}
> @@ -787,7 +787,7 @@ v3d_sched_init(struct v3d_dev *v3d)
>   			     DRM_SCHED_PRIORITY_COUNT,
>   			     1, job_hang_limit,
>   			     msecs_to_jiffies(hang_limit_ms), NULL,
> -			     NULL, "v3d_cpu", v3d->drm.dev);
> +			     NULL, "v3d_cpu", &v3d->drm);
>   	if (ret)
>   		goto fail;
>   
> diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c
> index dece2785933c..dc81e9f39727 100644
> --- a/drivers/gpu/drm/xe/xe_execlist.c
> +++ b/drivers/gpu/drm/xe/xe_execlist.c
> @@ -336,7 +336,7 @@ static int execlist_exec_queue_init(struct xe_exec_queue *q)
>   			     q->lrc[0].ring.size / MAX_JOB_SIZE_BYTES,
>   			     XE_SCHED_HANG_LIMIT, XE_SCHED_JOB_TIMEOUT,
>   			     NULL, NULL, q->hwe->name,
> -			     gt_to_xe(q->gt)->drm.dev);
> +			     &gt_to_xe(q->gt)->drm);
>   	if (err)
>   		goto err_free;
>   
> diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> index e4ad1d6ce1d5..66d36cac82a0 100644
> --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> @@ -61,7 +61,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
>   		  uint32_t hw_submission, unsigned hang_limit,
>   		  long timeout, struct workqueue_struct *timeout_wq,
>   		  atomic_t *score, const char *name,
> -		  struct device *dev)
> +		  struct drm_device *dev)
>   {
>   	sched->ops = xe_ops;
>   	INIT_LIST_HEAD(&sched->msgs);
> diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> index 10c6bb9c9386..9a75457813f2 100644
> --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> @@ -16,7 +16,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
>   		  uint32_t hw_submission, unsigned hang_limit,
>   		  long timeout, struct workqueue_struct *timeout_wq,
>   		  atomic_t *score, const char *name,
> -		  struct device *dev);
> +		  struct drm_device *dev);
>   void xe_sched_fini(struct xe_gpu_scheduler *sched);
>   
>   void xe_sched_submission_start(struct xe_gpu_scheduler *sched);
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index e4e3658e6a13..b9c114f2c715 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -1208,7 +1208,7 @@ static int guc_exec_queue_init(struct xe_exec_queue *q)
>   			    get_submit_wq(guc),
>   			    q->lrc[0].ring.size / MAX_JOB_SIZE_BYTES, 64,
>   			    timeout, guc_to_gt(guc)->ordered_wq, NULL,
> -			    q->name, gt_to_xe(q->gt)->drm.dev);
> +			    q->name, &gt_to_xe(q->gt)->drm);
>   	if (err)
>   		goto err_free;
>   
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 5acc64954a88..0ba8716ec069 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -539,7 +539,7 @@ struct drm_gpu_scheduler {
>   	bool				free_guilty;
>   	bool				pause_submit;
>   	bool				own_submit_wq;
> -	struct device			*dev;
> +	struct drm_device		*dev;
>   };
>   
>   int drm_sched_init(struct drm_gpu_scheduler *sched,
> @@ -547,7 +547,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   		   struct workqueue_struct *submit_wq,
>   		   u32 num_rqs, u32 credit_limit, unsigned int hang_limit,
>   		   long timeout, struct workqueue_struct *timeout_wq,
> -		   atomic_t *score, const char *name, struct device *dev);
> +		   atomic_t *score, const char *name, struct drm_device *dev);
>   
>   void drm_sched_fini(struct drm_gpu_scheduler *sched);
>   int drm_sched_job_init(struct drm_sched_job *job,



* Re: [PATCH v3 3/4] drm/sched: cleanup gpu_scheduler trace events
  2024-06-06 13:06 ` [PATCH v3 3/4] drm/sched: cleanup gpu_scheduler trace events Pierre-Eric Pelloux-Prayer
@ 2024-06-06 13:19   ` Steven Rostedt
  2024-06-06 13:23     ` Christian König
  2024-06-07 13:21     ` Pierre-Eric Pelloux-Prayer
  0 siblings, 2 replies; 14+ messages in thread
From: Steven Rostedt @ 2024-06-06 13:19 UTC
  To: Pierre-Eric Pelloux-Prayer
  Cc: alexander.deucher, christian.koenig, ltuikov89, matthew.brost,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel,
	dri-devel, ville.syrjala

On Thu, 6 Jun 2024 15:06:24 +0200
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> wrote:

> Print identifiers instead of pointers:
> * "fence=%p" is replaced by "fence=(context:%llu, seqno:%lld)" to have a
> coherent way to print the fence. A possible follow up change would be
> to use the same format in traces/../dma-fence.h.
> * "entity=%p" is removed because the fence's context is already an
> identifier of the job owner.
> 
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> ---
>  .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 27 ++++++++++---------
>  1 file changed, 15 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> index 0a19c121bda5..2d7f2252eb5d 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
> @@ -58,9 +58,9 @@ DECLARE_EVENT_CLASS(drm_sched_job,
>  				   &sched_job->sched->credit_count);
>  			   __entry->dev_index = sched_job->sched->dev->primary->index;
>  			   ),
> -	    TP_printk("entity=%p, id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
> -		      __entry->entity, __entry->id,
> -		      __entry->fence, __get_str(name),
> +	    TP_printk("id=%llu, fence=(context:%llu, seqno:%lld), ring=%s, job count:%u, hw job count:%d",
> +		      __entry->id,
> +		      __entry->fence->context, __entry->fence->seqno, __get_str(name),
>  		      __entry->job_count, __entry->hw_job_count)
>  );
>  
> @@ -72,9 +72,9 @@ DEFINE_EVENT(drm_sched_job, drm_sched_job,
>  DEFINE_EVENT_PRINT(drm_sched_job, drm_run_job,
>  	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
>  	    TP_ARGS(sched_job, entity),
> -	    TP_printk("dev_index=%d entity=%p id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
> -		      __entry->dev_index, __entry->entity, __entry->id,
> -		      __entry->fence, __get_str(name),
> +	    TP_printk("dev_index=%d id=%llu, fence=(context:%llu, seqno:%lld), ring=%s, job count:%u, hw job count:%d",
> +		      __entry->dev_index, __entry->id,
> +		      __entry->fence->context, __entry->fence->seqno, __get_str(name),
>  		      __entry->job_count, __entry->hw_job_count)

NACK!

You can't dereference pointers from TP_printk(). This is called seconds,
minutes, hours, even days or months after that pointer was assigned. How do
you know that pointer still points to anything?

-- Steve


>  );
>  
> @@ -88,7 +88,8 @@ TRACE_EVENT(drm_sched_process_job,
>  	    TP_fast_assign(
>  		    __entry->fence = &fence->finished;
>  		    ),
> -	    TP_printk("fence=%p signaled", __entry->fence)
> +	    TP_printk("fence=(context:%llu, seqno:%lld) signaled",
> +		      __entry->fence->context, __entry->fence->seqno)
>  );
>  
>  TRACE_EVENT(drm_sched_job_wait_dep,
> @@ -96,23 +97,25 @@ TRACE_EVENT(drm_sched_job_wait_dep,
>  	    TP_ARGS(sched_job, fence),
>  	    TP_STRUCT__entry(
>  			     __string(name, sched_job->sched->name)
> +			     __field(uint64_t, fence_context)
>  			     __field(uint64_t, id)
>  			     __field(struct dma_fence *, fence)
>  			     __field(uint64_t, ctx)
> -			     __field(unsigned, seqno)
> +			     __field(uint64_t, seqno)
>  			     ),
>  
>  	    TP_fast_assign(
>  			   __assign_str(name);
> +			   /* Store the hw exec fence context. */
> +			   __entry->fence_context = sched_job->entity->fence_context + 1;
>  			   __entry->id = sched_job->id;
>  			   __entry->fence = fence;
>  			   __entry->ctx = fence->context;
>  			   __entry->seqno = fence->seqno;
>  			   ),
> -	    TP_printk("job ring=%s, id=%llu, depends fence=%p, context=%llu, seq=%u",
> -		      __get_str(name), __entry->id,
> -		      __entry->fence, __entry->ctx,
> -		      __entry->seqno)
> +	    TP_printk("job ring=%s, fence_context=%llu, id=%llu, depends fence=(context:%llu, seqno:%lld)",
> +		      __get_str(name), __entry->fence_context, __entry->id,
> +		      __entry->ctx, __entry->seqno)
>  );
>  
>  #endif



* Re: [PATCH v3 3/4] drm/sched: cleanup gpu_scheduler trace events
  2024-06-06 13:19   ` Steven Rostedt
@ 2024-06-06 13:23     ` Christian König
  2024-06-07 13:21     ` Pierre-Eric Pelloux-Prayer
  1 sibling, 0 replies; 14+ messages in thread
From: Christian König @ 2024-06-06 13:23 UTC
  To: Steven Rostedt, Pierre-Eric Pelloux-Prayer
  Cc: alexander.deucher, ltuikov89, matthew.brost, maarten.lankhorst,
	mripard, tzimmermann, airlied, daniel, dri-devel, ville.syrjala

On 06.06.24 at 15:19, Steven Rostedt wrote:
> On Thu, 6 Jun 2024 15:06:24 +0200
> Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> wrote:
>
>> Print identifiers instead of pointers:
>> * "fence=%p" is replaced by "fence=(context:%llu, seqno:%lld)" to have a
>> coherent way to print the fence. A possible follow up change would be
>> to use the same format in traces/../dma-fence.h.
>> * "entity=%p" is removed because the fence's context is already an
>> identifier of the job owner.
>>
>> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
>> ---
>>   .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 27 ++++++++++---------
>>   1 file changed, 15 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> index 0a19c121bda5..2d7f2252eb5d 100644
>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> @@ -58,9 +58,9 @@ DECLARE_EVENT_CLASS(drm_sched_job,
>>   				   &sched_job->sched->credit_count);
>>   			   __entry->dev_index = sched_job->sched->dev->primary->index;
>>   			   ),
>> -	    TP_printk("entity=%p, id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
>> -		      __entry->entity, __entry->id,
>> -		      __entry->fence, __get_str(name),
>> +	    TP_printk("id=%llu, fence=(context:%llu, seqno:%lld), ring=%s, job count:%u, hw job count:%d",
>> +		      __entry->id,
>> +		      __entry->fence->context, __entry->fence->seqno, __get_str(name),
>>   		      __entry->job_count, __entry->hw_job_count)
>>   );
>>   
>> @@ -72,9 +72,9 @@ DEFINE_EVENT(drm_sched_job, drm_sched_job,
>>   DEFINE_EVENT_PRINT(drm_sched_job, drm_run_job,
>>   	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
>>   	    TP_ARGS(sched_job, entity),
>> -	    TP_printk("dev_index=%d entity=%p id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
>> -		      __entry->dev_index, __entry->entity, __entry->id,
>> -		      __entry->fence, __get_str(name),
>> +	    TP_printk("dev_index=%d id=%llu, fence=(context:%llu, seqno:%lld), ring=%s, job count:%u, hw job count:%d",
>> +		      __entry->dev_index, __entry->id,
>> +		      __entry->fence->context, __entry->fence->seqno, __get_str(name),
>>   		      __entry->job_count, __entry->hw_job_count)
> NACK!
>
> You can't dereference pointers from TP_printk(). This is called seconds,
> minutes, hours, even days or months after that pointer was assigned. How do
> you know that pointer still points to anything?

Yeah, I was about to reply with the same thing. That is a really, really bad idea.

You could in theory grab a reference to the fence, but we usually try to 
avoid that as well, since it prevents modules from unloading.

Instead, store the context and seqno directly as values in the trace event.

Christian.
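
As a rough illustration of the suggestion above, here is a userspace sketch (not kernel code) of copying the two values when the event is recorded, so that printing never touches the fence again. struct fake_fence, struct fence_record and all three helpers are hypothetical names; record_fence() plays the role of TP_fast_assign() and print_fence() the role of TP_printk().

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the two struct dma_fence fields of interest. */
struct fake_fence {
	uint64_t context;
	uint64_t seqno;
};

/* What the ring buffer would hold: plain values, no pointer to the fence. */
struct fence_record {
	uint64_t context;
	uint64_t seqno;
};

/* Analogue of TP_fast_assign(): runs while the fence is known to be alive. */
static void record_fence(struct fence_record *rec, const struct fake_fence *f)
{
	assert(rec != NULL && f != NULL);
	rec->context = f->context;
	rec->seqno = f->seqno;
}

/* Simulate the fence memory being released and reused by something else. */
static void poison_fence(struct fake_fence *f)
{
	memset(f, 0xff, sizeof(*f));
}

/* Analogue of TP_printk(): may run much later, so it reads only the record. */
static int print_fence(char *out, size_t len, const struct fence_record *rec)
{
	return snprintf(out, len,
			"fence=(context:%" PRIu64 ", seqno:%" PRIu64 ")",
			rec->context, rec->seqno);
}
```

Calling poison_fence() between record_fence() and print_fence() does not change the printed text, because the record owns its own copy of both values.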

>
> -- Steve
>
>
>>   );
>>   
>> @@ -88,7 +88,8 @@ TRACE_EVENT(drm_sched_process_job,
>>   	    TP_fast_assign(
>>   		    __entry->fence = &fence->finished;
>>   		    ),
>> -	    TP_printk("fence=%p signaled", __entry->fence)
>> +	    TP_printk("fence=(context:%llu, seqno:%lld) signaled",
>> +		      __entry->fence->context, __entry->fence->seqno)
>>   );
>>   
>>   TRACE_EVENT(drm_sched_job_wait_dep,
>> @@ -96,23 +97,25 @@ TRACE_EVENT(drm_sched_job_wait_dep,
>>   	    TP_ARGS(sched_job, fence),
>>   	    TP_STRUCT__entry(
>>   			     __string(name, sched_job->sched->name)
>> +			     __field(uint64_t, fence_context)
>>   			     __field(uint64_t, id)
>>   			     __field(struct dma_fence *, fence)
>>   			     __field(uint64_t, ctx)
>> -			     __field(unsigned, seqno)
>> +			     __field(uint64_t, seqno)
>>   			     ),
>>   
>>   	    TP_fast_assign(
>>   			   __assign_str(name);
>> +			   /* Store the hw exec fence context. */
>> +			   __entry->fence_context = sched_job->entity->fence_context + 1;
>>   			   __entry->id = sched_job->id;
>>   			   __entry->fence = fence;
>>   			   __entry->ctx = fence->context;
>>   			   __entry->seqno = fence->seqno;
>>   			   ),
>> -	    TP_printk("job ring=%s, id=%llu, depends fence=%p, context=%llu, seq=%u",
>> -		      __get_str(name), __entry->id,
>> -		      __entry->fence, __entry->ctx,
>> -		      __entry->seqno)
>> +	    TP_printk("job ring=%s, fence_context=%llu, id=%llu, depends fence=(context:%llu, seqno:%lld)",
>> +		      __get_str(name), __entry->fence_context, __entry->id,
>> +		      __entry->ctx, __entry->seqno)
>>   );
>>   
>>   #endif



* Re: [PATCH v3 1/4] drm/sched: store the drm_device instead of the device
  2024-06-06 13:18   ` Christian König
@ 2024-06-06 13:23     ` Ville Syrjälä
  2024-06-06 16:38     ` Matthew Brost
  2024-06-07 13:55     ` Pierre-Eric Pelloux-Prayer
  2 siblings, 0 replies; 14+ messages in thread
From: Ville Syrjälä @ 2024-06-06 13:23 UTC
  To: Christian König
  Cc: Pierre-Eric Pelloux-Prayer, alexander.deucher, ltuikov89,
	matthew.brost, maarten.lankhorst, mripard, tzimmermann, airlied,
	daniel, dri-devel, rostedt

On Thu, Jun 06, 2024 at 03:18:14PM +0200, Christian König wrote:
> On 06.06.24 at 15:06, Pierre-Eric Pelloux-Prayer wrote:
> > When tracing is enabled, being able to identify which device is sending
> > events is useful; for this the next commit will extend events to include
> > drm_device::primary::index.
> 
> That sounds like a rather bad idea since the primary index is really 
> just an arbitrary number and not defined for all devices.
> 
> Why not use the device name instead? This way you don't need this change 
> in the first place.

FWIW dev_name() is what I added to all i915 display tracepoints.

-- 
Ville Syrjälä
Intel


* Re: [PATCH v3 1/4] drm/sched: store the drm_device instead of the device
  2024-06-06 13:18   ` Christian König
  2024-06-06 13:23     ` Ville Syrjälä
@ 2024-06-06 16:38     ` Matthew Brost
  2024-06-07 13:55     ` Pierre-Eric Pelloux-Prayer
  2 siblings, 0 replies; 14+ messages in thread
From: Matthew Brost @ 2024-06-06 16:38 UTC (permalink / raw)
  To: Christian König
  Cc: Pierre-Eric Pelloux-Prayer, alexander.deucher, ltuikov89,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel,
	dri-devel, ville.syrjala, rostedt

On Thu, Jun 06, 2024 at 03:18:14PM +0200, Christian König wrote:
> On 06.06.24 at 15:06, Pierre-Eric Pelloux-Prayer wrote:
> > When tracing is enabled, being able to identify which device is sending
> > events is useful; for this the next commit will extend events to include
> > drm_device::primary::index.
> 
> That sounds like a rather bad idea since the primary index is really just an
> arbitrary number and not defined for all devices.
> 
> Why not use the device name instead? This way you don't need this change in
> the first place.
> 

+1.

Matt

> Regards,
> Christian.
> 
> > 
> > Since the device member is only used in the drm_* log macros, we can
> > replace it by a drm_device pointer.
> > 
> > Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
> >   drivers/gpu/drm/etnaviv/etnaviv_sched.c    |  2 +-
> >   drivers/gpu/drm/imagination/pvr_queue.c    |  2 +-
> >   drivers/gpu/drm/lima/lima_sched.c          |  2 +-
> >   drivers/gpu/drm/msm/msm_ringbuffer.c       |  2 +-
> >   drivers/gpu/drm/nouveau/nouveau_sched.c    |  2 +-
> >   drivers/gpu/drm/panfrost/panfrost_job.c    |  2 +-
> >   drivers/gpu/drm/panthor/panthor_mmu.c      |  2 +-
> >   drivers/gpu/drm/panthor/panthor_sched.c    |  2 +-
> >   drivers/gpu/drm/scheduler/sched_entity.c   |  2 +-
> >   drivers/gpu/drm/scheduler/sched_main.c     | 26 +++++++++++-----------
> >   drivers/gpu/drm/v3d/v3d_sched.c            | 12 +++++-----
> >   drivers/gpu/drm/xe/xe_execlist.c           |  2 +-
> >   drivers/gpu/drm/xe/xe_gpu_scheduler.c      |  2 +-
> >   drivers/gpu/drm/xe/xe_gpu_scheduler.h      |  2 +-
> >   drivers/gpu/drm/xe/xe_guc_submit.c         |  2 +-
> >   include/drm/gpu_scheduler.h                |  4 ++--
> >   17 files changed, 35 insertions(+), 35 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 932dc93b2e63..7f2a68ad8034 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -2738,7 +2738,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
> >   				   ring->num_hw_submission, 0,
> >   				   timeout, adev->reset_domain->wq,
> >   				   ring->sched_score, ring->name,
> > -				   adev->dev);
> > +				   &adev->ddev);
> >   		if (r) {
> >   			DRM_ERROR("Failed to create scheduler on ring %s.\n",
> >   				  ring->name);
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > index c4b04b0dee16..c4345b68a51f 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > @@ -138,7 +138,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
> >   			     DRM_SCHED_PRIORITY_COUNT,
> >   			     etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
> >   			     msecs_to_jiffies(500), NULL, NULL,
> > -			     dev_name(gpu->dev), gpu->dev);
> > +			     dev_name(gpu->dev), gpu->drm);
> >   	if (ret)
> >   		return ret;
> > diff --git a/drivers/gpu/drm/imagination/pvr_queue.c b/drivers/gpu/drm/imagination/pvr_queue.c
> > index 5ed9c98fb599..cdbb6c01e952 100644
> > --- a/drivers/gpu/drm/imagination/pvr_queue.c
> > +++ b/drivers/gpu/drm/imagination/pvr_queue.c
> > @@ -1287,7 +1287,7 @@ struct pvr_queue *pvr_queue_create(struct pvr_context *ctx,
> >   			     pvr_dev->sched_wq, 1, 64 * 1024, 1,
> >   			     msecs_to_jiffies(500),
> >   			     pvr_dev->sched_wq, NULL, "pvr-queue",
> > -			     pvr_dev->base.dev);
> > +			     &pvr_dev->base);
> >   	if (err)
> >   		goto err_release_ufo;
> > diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> > index bbf3f8feab94..db6ee7650468 100644
> > --- a/drivers/gpu/drm/lima/lima_sched.c
> > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > @@ -526,7 +526,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
> >   			      1,
> >   			      lima_job_hang_limit,
> >   			      msecs_to_jiffies(timeout), NULL,
> > -			      NULL, name, pipe->ldev->dev);
> > +			      NULL, name, pipe->ldev->ddev);
> >   }
> >   void lima_sched_pipe_fini(struct lima_sched_pipe *pipe)
> > diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
> > index 9d6655f96f0c..3a4b3816f2c9 100644
> > --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
> > +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
> > @@ -101,7 +101,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
> >   	ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
> >   			     DRM_SCHED_PRIORITY_COUNT,
> >   			     num_hw_submissions, 0, sched_timeout,
> > -			     NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
> > +			     NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev);
> >   	if (ret) {
> >   		goto fail;
> >   	}
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
> > index 32fa2e273965..386839bed8a2 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
> > @@ -419,7 +419,7 @@ nouveau_sched_init(struct nouveau_sched *sched, struct nouveau_drm *drm,
> >   	ret = drm_sched_init(drm_sched, &nouveau_sched_ops, wq,
> >   			     NOUVEAU_SCHED_PRIORITY_COUNT,
> >   			     credit_limit, 0, job_hang_limit,
> > -			     NULL, NULL, "nouveau_sched", drm->dev->dev);
> > +			     NULL, NULL, "nouveau_sched", drm->dev);
> >   	if (ret)
> >   		goto fail_wq;
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index a61ef0af9a4e..28c7680a8dbf 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -875,7 +875,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
> >   				     nentries, 0,
> >   				     msecs_to_jiffies(JOB_TIMEOUT_MS),
> >   				     pfdev->reset.wq,
> > -				     NULL, "pan_js", pfdev->dev);
> > +				     NULL, "pan_js", pfdev->ddev);
> >   		if (ret) {
> >   			dev_err(pfdev->dev, "Failed to create scheduler: %d.", ret);
> >   			goto err_sched;
> > diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> > index fa0a002b1016..b9c5b500b7d1 100644
> > --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> > +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> > @@ -2293,7 +2293,7 @@ panthor_vm_create(struct panthor_device *ptdev, bool for_mcu,
> >   	ret = drm_sched_init(&vm->sched, &panthor_vm_bind_ops, ptdev->mmu->vm.wq,
> >   			     1, 1, 0,
> >   			     MAX_SCHEDULE_TIMEOUT, NULL, NULL,
> > -			     "panthor-vm-bind", ptdev->base.dev);
> > +			     "panthor-vm-bind", &ptdev->base);
> >   	if (ret)
> >   		goto err_free_io_pgtable;
> > diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> > index 79ffcbc41d78..47e52f61571b 100644
> > --- a/drivers/gpu/drm/panthor/panthor_sched.c
> > +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> > @@ -3043,7 +3043,7 @@ group_create_queue(struct panthor_group *group,
> >   			     args->ringbuf_size / (NUM_INSTRS_PER_SLOT * sizeof(u64)),
> >   			     0, msecs_to_jiffies(JOB_TIMEOUT_MS),
> >   			     group->ptdev->reset.wq,
> > -			     NULL, "panthor-queue", group->ptdev->base.dev);
> > +			     NULL, "panthor-queue", &group->ptdev->base);
> >   	if (ret)
> >   		goto err_free_queue;
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index 58c8161289fe..194798b9ce09 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -92,7 +92,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> >   		 * the lowest priority available.
> >   		 */
> >   		if (entity->priority >= sched_list[0]->num_rqs) {
> > -			drm_err(sched_list[0], "entity with out-of-bounds priority:%u num_rqs:%u\n",
> > +			drm_err(sched_list[0]->dev, "entity with out-of-bounds priority:%u num_rqs:%u\n",
> >   				entity->priority, sched_list[0]->num_rqs);
> >   			entity->priority = max_t(s32, (s32) sched_list[0]->num_rqs - 1,
> >   						 (s32) DRM_SCHED_PRIORITY_KERNEL);
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 7e90c9f95611..74a2fe51e653 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -103,9 +103,9 @@ static u32 drm_sched_available_credits(struct drm_gpu_scheduler *sched)
> >   {
> >   	u32 credits;
> > -	drm_WARN_ON(sched, check_sub_overflow(sched->credit_limit,
> > -					      atomic_read(&sched->credit_count),
> > -					      &credits));
> > +	drm_WARN_ON(sched->dev, check_sub_overflow(sched->credit_limit,
> > +						  atomic_read(&sched->credit_count),
> > +						  &credits));
> >   	return credits;
> >   }
> > @@ -130,14 +130,14 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched,
> >   	if (sched->ops->update_job_credits) {
> >   		s_job->credits = sched->ops->update_job_credits(s_job);
> > -		drm_WARN(sched, !s_job->credits,
> > +		drm_WARN(sched->dev, !s_job->credits,
> >   			 "Jobs with zero credits bypass job-flow control.\n");
> >   	}
> >   	/* If a job exceeds the credit limit, truncate it to the credit limit
> >   	 * itself to guarantee forward progress.
> >   	 */
> > -	if (drm_WARN(sched, s_job->credits > sched->credit_limit,
> > +	if (drm_WARN(sched->dev, s_job->credits > sched->credit_limit,
> >   		     "Jobs may not exceed the credit limit, truncate.\n"))
> >   		s_job->credits = sched->credit_limit;
> > @@ -701,7 +701,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
> >   			if (r == -ENOENT)
> >   				drm_sched_job_done(s_job, fence->error);
> >   			else if (r)
> > -				DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n",
> > +				DRM_DEV_ERROR(sched->dev->dev, "fence add callback failed (%d)\n",
> >   					  r);
> >   		} else
> >   			drm_sched_job_done(s_job, -ECANCELED);
> > @@ -797,7 +797,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >   		 * or worse--a blank screen--leave a trail in the
> >   		 * logs, so this can be debugged easier.
> >   		 */
> > -		drm_err(job->sched, "%s: entity has no rq!\n", __func__);
> > +		drm_err(job->sched->dev, "%s: entity has no rq!\n", __func__);
> >   		return -ENOENT;
> >   	}
> > @@ -1215,7 +1215,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
> >   		if (r == -ENOENT)
> >   			drm_sched_job_done(sched_job, fence->error);
> >   		else if (r)
> > -			DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n", r);
> > +			DRM_DEV_ERROR(sched->dev->dev, "fence add callback failed (%d)\n", r);
> >   	} else {
> >   		drm_sched_job_done(sched_job, IS_ERR(fence) ?
> >   				   PTR_ERR(fence) : 0);
> > @@ -1240,7 +1240,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
> >    *		used
> >    * @score: optional score atomic shared with other schedulers
> >    * @name: name used for debugging
> > - * @dev: target &struct device
> > + * @dev: target &struct drm_device
> >    *
> >    * Return 0 on success, otherwise error code.
> >    */
> > @@ -1249,7 +1249,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> >   		   struct workqueue_struct *submit_wq,
> >   		   u32 num_rqs, u32 credit_limit, unsigned int hang_limit,
> >   		   long timeout, struct workqueue_struct *timeout_wq,
> > -		   atomic_t *score, const char *name, struct device *dev)
> > +		   atomic_t *score, const char *name, struct drm_device *dev)
> >   {
> >   	int i;
> > @@ -1265,7 +1265,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> >   	if (num_rqs > DRM_SCHED_PRIORITY_COUNT) {
> >   		/* This is a gross violation--tell drivers what the  problem is.
> >   		 */
> > -		drm_err(sched, "%s: num_rqs cannot be greater than DRM_SCHED_PRIORITY_COUNT\n",
> > +		drm_err(dev, "%s: num_rqs cannot be greater than DRM_SCHED_PRIORITY_COUNT\n",
> >   			__func__);
> >   		return -EINVAL;
> >   	} else if (sched->sched_rq) {
> > @@ -1273,7 +1273,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> >   		 * fine-tune their DRM calling order, and return all
> >   		 * is good.
> >   		 */
> > -		drm_warn(sched, "%s: scheduler already initialized!\n", __func__);
> > +		drm_warn(dev, "%s: scheduler already initialized!\n", __func__);
> >   		return 0;
> >   	}
> > @@ -1322,7 +1322,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> >   Out_check_own:
> >   	if (sched->own_submit_wq)
> >   		destroy_workqueue(sched->submit_wq);
> > -	drm_err(sched, "%s: Failed to setup GPU scheduler--out of memory\n", __func__);
> > +	drm_err(dev, "%s: Failed to setup GPU scheduler--out of memory\n", __func__);
> >   	return -ENOMEM;
> >   }
> >   EXPORT_SYMBOL(drm_sched_init);
> > diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> > index 7cd8c335cd9b..73383b6ef9bb 100644
> > --- a/drivers/gpu/drm/v3d/v3d_sched.c
> > +++ b/drivers/gpu/drm/v3d/v3d_sched.c
> > @@ -740,7 +740,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> >   			     DRM_SCHED_PRIORITY_COUNT,
> >   			     hw_jobs_limit, job_hang_limit,
> >   			     msecs_to_jiffies(hang_limit_ms), NULL,
> > -			     NULL, "v3d_bin", v3d->drm.dev);
> > +			     NULL, "v3d_bin", &v3d->drm);
> >   	if (ret)
> >   		return ret;
> > @@ -749,7 +749,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> >   			     DRM_SCHED_PRIORITY_COUNT,
> >   			     hw_jobs_limit, job_hang_limit,
> >   			     msecs_to_jiffies(hang_limit_ms), NULL,
> > -			     NULL, "v3d_render", v3d->drm.dev);
> > +			     NULL, "v3d_render", &v3d->drm);
> >   	if (ret)
> >   		goto fail;
> > @@ -758,7 +758,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> >   			     DRM_SCHED_PRIORITY_COUNT,
> >   			     hw_jobs_limit, job_hang_limit,
> >   			     msecs_to_jiffies(hang_limit_ms), NULL,
> > -			     NULL, "v3d_tfu", v3d->drm.dev);
> > +			     NULL, "v3d_tfu", &v3d->drm);
> >   	if (ret)
> >   		goto fail;
> > @@ -768,7 +768,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> >   				     DRM_SCHED_PRIORITY_COUNT,
> >   				     hw_jobs_limit, job_hang_limit,
> >   				     msecs_to_jiffies(hang_limit_ms), NULL,
> > -				     NULL, "v3d_csd", v3d->drm.dev);
> > +				     NULL, "v3d_csd", &v3d->drm);
> >   		if (ret)
> >   			goto fail;
> > @@ -777,7 +777,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> >   				     DRM_SCHED_PRIORITY_COUNT,
> >   				     hw_jobs_limit, job_hang_limit,
> >   				     msecs_to_jiffies(hang_limit_ms), NULL,
> > -				     NULL, "v3d_cache_clean", v3d->drm.dev);
> > +				     NULL, "v3d_cache_clean", &v3d->drm);
> >   		if (ret)
> >   			goto fail;
> >   	}
> > @@ -787,7 +787,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> >   			     DRM_SCHED_PRIORITY_COUNT,
> >   			     1, job_hang_limit,
> >   			     msecs_to_jiffies(hang_limit_ms), NULL,
> > -			     NULL, "v3d_cpu", v3d->drm.dev);
> > +			     NULL, "v3d_cpu", &v3d->drm);
> >   	if (ret)
> >   		goto fail;
> > diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c
> > index dece2785933c..dc81e9f39727 100644
> > --- a/drivers/gpu/drm/xe/xe_execlist.c
> > +++ b/drivers/gpu/drm/xe/xe_execlist.c
> > @@ -336,7 +336,7 @@ static int execlist_exec_queue_init(struct xe_exec_queue *q)
> >   			     q->lrc[0].ring.size / MAX_JOB_SIZE_BYTES,
> >   			     XE_SCHED_HANG_LIMIT, XE_SCHED_JOB_TIMEOUT,
> >   			     NULL, NULL, q->hwe->name,
> > -			     gt_to_xe(q->gt)->drm.dev);
> > +			     &gt_to_xe(q->gt)->drm);
> >   	if (err)
> >   		goto err_free;
> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > index e4ad1d6ce1d5..66d36cac82a0 100644
> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > @@ -61,7 +61,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
> >   		  uint32_t hw_submission, unsigned hang_limit,
> >   		  long timeout, struct workqueue_struct *timeout_wq,
> >   		  atomic_t *score, const char *name,
> > -		  struct device *dev)
> > +		  struct drm_device *dev)
> >   {
> >   	sched->ops = xe_ops;
> >   	INIT_LIST_HEAD(&sched->msgs);
> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > index 10c6bb9c9386..9a75457813f2 100644
> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > @@ -16,7 +16,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
> >   		  uint32_t hw_submission, unsigned hang_limit,
> >   		  long timeout, struct workqueue_struct *timeout_wq,
> >   		  atomic_t *score, const char *name,
> > -		  struct device *dev);
> > +		  struct drm_device *dev);
> >   void xe_sched_fini(struct xe_gpu_scheduler *sched);
> >   void xe_sched_submission_start(struct xe_gpu_scheduler *sched);
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > index e4e3658e6a13..b9c114f2c715 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > @@ -1208,7 +1208,7 @@ static int guc_exec_queue_init(struct xe_exec_queue *q)
> >   			    get_submit_wq(guc),
> >   			    q->lrc[0].ring.size / MAX_JOB_SIZE_BYTES, 64,
> >   			    timeout, guc_to_gt(guc)->ordered_wq, NULL,
> > -			    q->name, gt_to_xe(q->gt)->drm.dev);
> > +			    q->name, &gt_to_xe(q->gt)->drm);
> >   	if (err)
> >   		goto err_free;
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 5acc64954a88..0ba8716ec069 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -539,7 +539,7 @@ struct drm_gpu_scheduler {
> >   	bool				free_guilty;
> >   	bool				pause_submit;
> >   	bool				own_submit_wq;
> > -	struct device			*dev;
> > +	struct drm_device		*dev;
> >   };
> >   int drm_sched_init(struct drm_gpu_scheduler *sched,
> > @@ -547,7 +547,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> >   		   struct workqueue_struct *submit_wq,
> >   		   u32 num_rqs, u32 credit_limit, unsigned int hang_limit,
> >   		   long timeout, struct workqueue_struct *timeout_wq,
> > -		   atomic_t *score, const char *name, struct device *dev);
> > +		   atomic_t *score, const char *name, struct drm_device *dev);
> >   void drm_sched_fini(struct drm_gpu_scheduler *sched);
> >   int drm_sched_job_init(struct drm_sched_job *job,
> 


* Re: [PATCH v3 1/4] drm/sched: store the drm_device instead of the device
  2024-06-06 13:06 ` [PATCH v3 1/4] drm/sched: store the drm_device instead of the device Pierre-Eric Pelloux-Prayer
  2024-06-06 13:18   ` Christian König
@ 2024-06-07  0:34   ` kernel test robot
  2024-06-08  0:36   ` kernel test robot
  2 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2024-06-07  0:34 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, alexander.deucher, christian.koenig,
	ltuikov89, matthew.brost, maarten.lankhorst, mripard, tzimmermann,
	airlied, daniel, dri-devel, ville.syrjala, rostedt
  Cc: oe-kbuild-all, Pierre-Eric Pelloux-Prayer

Hi Pierre-Eric,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.10-rc2 next-20240606]
[cannot apply to drm-xe/drm-xe-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Pierre-Eric-Pelloux-Prayer/drm-sched-store-the-drm_device-instead-of-the-device/20240606-211050
base:   linus/master
patch link:    https://lore.kernel.org/r/20240606130629.214827-2-pierre-eric.pelloux-prayer%40amd.com
patch subject: [PATCH v3 1/4] drm/sched: store the drm_device instead of the device
config: arm64-defconfig (https://download.01.org/0day-ci/archive/20240607/202406070826.ASQuMJ48-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240607/202406070826.ASQuMJ48-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202406070826.ASQuMJ48-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from include/linux/device.h:15,
                    from include/drm/drm_print.h:31,
                    from include/drm/drm_mm.h:51,
                    from include/drm/drm_vma_manager.h:26,
                    from include/drm/drm_gem.h:42,
                    from drivers/gpu/drm/imagination/pvr_gem.h:12,
                    from drivers/gpu/drm/imagination/pvr_fw.h:9,
                    from drivers/gpu/drm/imagination/pvr_device.h:9,
                    from drivers/gpu/drm/imagination/pvr_context.h:17,
                    from drivers/gpu/drm/imagination/pvr_queue.c:8:
   drivers/gpu/drm/imagination/pvr_queue.c: In function 'pvr_queue_timedout_job':
>> drivers/gpu/drm/imagination/pvr_queue.c:807:22: error: passing argument 1 of '_dev_err' from incompatible pointer type [-Werror=incompatible-pointer-types]
     807 |         dev_err(sched->dev, "Job timeout\n");
         |                 ~~~~~^~~~~
         |                      |
         |                      struct drm_device *
   include/linux/dev_printk.h:110:25: note: in definition of macro 'dev_printk_index_wrap'
     110 |                 _p_func(dev, fmt, ##__VA_ARGS__);                       \
         |                         ^~~
   drivers/gpu/drm/imagination/pvr_queue.c:807:9: note: in expansion of macro 'dev_err'
     807 |         dev_err(sched->dev, "Job timeout\n");
         |         ^~~~~~~
   include/linux/dev_printk.h:50:36: note: expected 'const struct device *' but argument is of type 'struct drm_device *'
      50 | void _dev_err(const struct device *dev, const char *fmt, ...);
         |               ~~~~~~~~~~~~~~~~~~~~~^~~
   cc1: some warnings being treated as errors
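
The error above comes from the scheduler's dev member changing from a
struct device pointer to a struct drm_device pointer, while this caller
still hands it to a helper that expects a struct device. The following is a
minimal userspace sketch of that mismatch and of one way to resolve it. The
types are cut-down stand-ins, and model_sched, model_dev_err() and
model_report_timeout() are hypothetical names used only for illustration;
this is not the actual pvr_queue.c fix.

```c
#include <assert.h>
#include <stdio.h>

/* Cut-down stand-ins for the kernel types named in the error message. */
struct device {
	const char *name;
};

struct drm_device {
	struct device *dev;	/* the underlying struct device */
};

/* Hypothetical model of the scheduler after the patch. */
struct model_sched {
	struct drm_device *dev;	/* now a drm_device, no longer a device */
};

/* Takes the same first parameter type as the _dev_err() prototype above. */
static int model_dev_err(const struct device *dev, const char *msg)
{
	assert(dev != NULL && dev->name != NULL);
	return printf("%s: %s", dev->name, msg);
}

/* The caller goes one level deeper to reach the struct device. */
static int model_report_timeout(const struct model_sched *sched)
{
	return model_dev_err(sched->dev->dev, "Job timeout\n");
}
```

The patch itself uses the same pattern for the DRM_DEV_ERROR() callers in
sched_main.c, which now pass sched->dev->dev. Switching the caller to a
drm_* log macro that takes the drm_device directly would be another option.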


vim +/_dev_err +807 drivers/gpu/drm/imagination/pvr_queue.c

eaf01ee5ba28b9 Sarah Walker 2023-11-22  787  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  788  /**
eaf01ee5ba28b9 Sarah Walker 2023-11-22  789   * pvr_queue_timedout_job() - Handle a job timeout event.
eaf01ee5ba28b9 Sarah Walker 2023-11-22  790   * @s_job: The job this timeout occurred on.
eaf01ee5ba28b9 Sarah Walker 2023-11-22  791   *
eaf01ee5ba28b9 Sarah Walker 2023-11-22  792   * FIXME: We don't do anything here to unblock the situation, we just stop+start
eaf01ee5ba28b9 Sarah Walker 2023-11-22  793   * the scheduler, and re-assign parent fences in the middle.
eaf01ee5ba28b9 Sarah Walker 2023-11-22  794   *
eaf01ee5ba28b9 Sarah Walker 2023-11-22  795   * Return:
eaf01ee5ba28b9 Sarah Walker 2023-11-22  796   *  * DRM_GPU_SCHED_STAT_NOMINAL.
eaf01ee5ba28b9 Sarah Walker 2023-11-22  797   */
eaf01ee5ba28b9 Sarah Walker 2023-11-22  798  static enum drm_gpu_sched_stat
eaf01ee5ba28b9 Sarah Walker 2023-11-22  799  pvr_queue_timedout_job(struct drm_sched_job *s_job)
eaf01ee5ba28b9 Sarah Walker 2023-11-22  800  {
eaf01ee5ba28b9 Sarah Walker 2023-11-22  801  	struct drm_gpu_scheduler *sched = s_job->sched;
eaf01ee5ba28b9 Sarah Walker 2023-11-22  802  	struct pvr_queue *queue = container_of(sched, struct pvr_queue, scheduler);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  803  	struct pvr_device *pvr_dev = queue->ctx->pvr_dev;
eaf01ee5ba28b9 Sarah Walker 2023-11-22  804  	struct pvr_job *job;
eaf01ee5ba28b9 Sarah Walker 2023-11-22  805  	u32 job_count = 0;
eaf01ee5ba28b9 Sarah Walker 2023-11-22  806  
eaf01ee5ba28b9 Sarah Walker 2023-11-22 @807  	dev_err(sched->dev, "Job timeout\n");
eaf01ee5ba28b9 Sarah Walker 2023-11-22  808  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  809  	/* Before we stop the scheduler, make sure the queue is out of any list, so
eaf01ee5ba28b9 Sarah Walker 2023-11-22  810  	 * any call to pvr_queue_update_active_state_locked() that might happen
eaf01ee5ba28b9 Sarah Walker 2023-11-22  811  	 * until the scheduler is really stopped doesn't end up re-inserting the
eaf01ee5ba28b9 Sarah Walker 2023-11-22  812  	 * queue in the active list. This would cause
eaf01ee5ba28b9 Sarah Walker 2023-11-22  813  	 * pvr_queue_signal_done_fences() and drm_sched_stop() to race with each
eaf01ee5ba28b9 Sarah Walker 2023-11-22  814  	 * other when accessing the pending_list, since drm_sched_stop() doesn't
eaf01ee5ba28b9 Sarah Walker 2023-11-22  815  	 * grab the job_list_lock when modifying the list (it's assuming the
eaf01ee5ba28b9 Sarah Walker 2023-11-22  816  	 * only other accessor is the scheduler, and it's safe to not grab the
eaf01ee5ba28b9 Sarah Walker 2023-11-22  817  	 * lock since it's stopped).
eaf01ee5ba28b9 Sarah Walker 2023-11-22  818  	 */
eaf01ee5ba28b9 Sarah Walker 2023-11-22  819  	mutex_lock(&pvr_dev->queues.lock);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  820  	list_del_init(&queue->node);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  821  	mutex_unlock(&pvr_dev->queues.lock);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  822  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  823  	drm_sched_stop(sched, s_job);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  824  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  825  	/* Re-assign job parent fences. */
eaf01ee5ba28b9 Sarah Walker 2023-11-22  826  	list_for_each_entry(job, &sched->pending_list, base.list) {
eaf01ee5ba28b9 Sarah Walker 2023-11-22  827  		job->base.s_fence->parent = dma_fence_get(job->done_fence);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  828  		job_count++;
eaf01ee5ba28b9 Sarah Walker 2023-11-22  829  	}
eaf01ee5ba28b9 Sarah Walker 2023-11-22  830  	WARN_ON(atomic_read(&queue->in_flight_job_count) != job_count);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  831  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  832  	/* Re-insert the queue in the proper list, and kick a queue processing
eaf01ee5ba28b9 Sarah Walker 2023-11-22  833  	 * operation if there were jobs pending.
eaf01ee5ba28b9 Sarah Walker 2023-11-22  834  	 */
eaf01ee5ba28b9 Sarah Walker 2023-11-22  835  	mutex_lock(&pvr_dev->queues.lock);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  836  	if (!job_count) {
eaf01ee5ba28b9 Sarah Walker 2023-11-22  837  		list_move_tail(&queue->node, &pvr_dev->queues.idle);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  838  	} else {
eaf01ee5ba28b9 Sarah Walker 2023-11-22  839  		atomic_set(&queue->in_flight_job_count, job_count);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  840  		list_move_tail(&queue->node, &pvr_dev->queues.active);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  841  		pvr_queue_process(queue);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  842  	}
eaf01ee5ba28b9 Sarah Walker 2023-11-22  843  	mutex_unlock(&pvr_dev->queues.lock);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  844  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  845  	drm_sched_start(sched, true);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  846  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  847  	return DRM_GPU_SCHED_STAT_NOMINAL;
eaf01ee5ba28b9 Sarah Walker 2023-11-22  848  }
eaf01ee5ba28b9 Sarah Walker 2023-11-22  849  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v3 3/4] drm/sched: cleanup gpu_scheduler trace events
  2024-06-06 13:19   ` Steven Rostedt
  2024-06-06 13:23     ` Christian König
@ 2024-06-07 13:21     ` Pierre-Eric Pelloux-Prayer
  1 sibling, 0 replies; 14+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-06-07 13:21 UTC (permalink / raw)
  To: Steven Rostedt, Pierre-Eric Pelloux-Prayer
  Cc: alexander.deucher, christian.koenig, ltuikov89, matthew.brost,
	maarten.lankhorst, mripard, tzimmermann, airlied, daniel,
	dri-devel, ville.syrjala

Hi,

On 06/06/2024 at 15:19, Steven Rostedt wrote:
> On Thu, 6 Jun 2024 15:06:24 +0200
> Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> wrote:
> 
>> Print identifiers instead of pointers:
>> * "fence=%p" is replaced by "fence=(context:%llu, seqno:%lld)" to have a
>> coherent way to print the fence. A possible follow up change would be
>> to use the same format in traces/../dma-fence.h.
>> * "entity=%p" is removed because the fence's context is already an
>> identifier of the job owner.
>>
>> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
>> ---
>>   .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 27 ++++++++++---------
>>   1 file changed, 15 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> index 0a19c121bda5..2d7f2252eb5d 100644
>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
>> @@ -58,9 +58,9 @@ DECLARE_EVENT_CLASS(drm_sched_job,
>>   				   &sched_job->sched->credit_count);
>>   			   __entry->dev_index = sched_job->sched->dev->primary->index;
>>   			   ),
>> -	    TP_printk("entity=%p, id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
>> -		      __entry->entity, __entry->id,
>> -		      __entry->fence, __get_str(name),
>> +	    TP_printk("id=%llu, fence=(context:%llu, seqno:%lld), ring=%s, job count:%u, hw job count:%d",
>> +		      __entry->id,
>> +		      __entry->fence->context, __entry->fence->seqno, __get_str(name),
>>   		      __entry->job_count, __entry->hw_job_count)
>>   );
>>   
>> @@ -72,9 +72,9 @@ DEFINE_EVENT(drm_sched_job, drm_sched_job,
>>   DEFINE_EVENT_PRINT(drm_sched_job, drm_run_job,
>>   	    TP_PROTO(struct drm_sched_job *sched_job, struct drm_sched_entity *entity),
>>   	    TP_ARGS(sched_job, entity),
>> -	    TP_printk("dev_index=%d entity=%p id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
>> -		      __entry->dev_index, __entry->entity, __entry->id,
>> -		      __entry->fence, __get_str(name),
>> +	    TP_printk("dev_index=%d id=%llu, fence=(context:%llu, seqno:%lld), ring=%s, job count:%u, hw job count:%d",
>> +		      __entry->dev_index, __entry->id,
>> +		      __entry->fence->context, __entry->fence->seqno, __get_str(name),
>>   		      __entry->job_count, __entry->hw_job_count)
> 
> NACK!
> 
> You can't dereference pointers from TP_printk(). This is called seconds,
> minutes, hours, even days or months after that pointer was assigned. How do
> you know that pointer still points to anything?

Now that you pointed it out, the problem is obvious indeed.
I have fixed it locally.
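
For illustration only (this is not the local fix mentioned above), here is
a minimal userspace sketch of the usual remedy: copy the fence's context
and seqno into the event record at assignment time, and have the print step
read only those stored values. The names model_fence, model_entry,
model_record() and model_print() are hypothetical stand-ins for struct
dma_fence, the TP_STRUCT__entry() record, TP_fast_assign() and TP_printk().

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for struct dma_fence: only the two fields the events print. */
struct model_fence {
	uint64_t context;
	uint64_t seqno;
};

/* Stand-in for the record declared by TP_STRUCT__entry(). */
struct model_entry {
	uint64_t id;
	uint64_t fence_context;
	uint64_t fence_seqno;
};

/*
 * Plays the role of TP_fast_assign(): it runs while the fence is still
 * alive, so this is the only place where dereferencing it is safe.
 */
static void model_record(struct model_entry *entry, uint64_t job_id,
			 const struct model_fence *fence)
{
	assert(fence != NULL);
	entry->id = job_id;
	entry->fence_context = fence->context;
	entry->fence_seqno = fence->seqno;
}

/*
 * Plays the role of TP_printk(): it may run long after the fence is gone,
 * so it reads only the values stored in the entry.
 */
static int model_print(char *buf, size_t len, const struct model_entry *entry)
{
	return snprintf(buf, len,
			"id=%" PRIu64 ", fence=(context:%" PRIu64 ", seqno:%" PRIu64 ")",
			entry->id, entry->fence_context, entry->fence_seqno);
}
```

Because the entry holds plain integers, the fence can be released or reused
between model_record() and model_print() without affecting the output.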

Thanks!
Pierre-Eric

> 
> -- Steve
> 
> 
>>   );
>>   
>> @@ -88,7 +88,8 @@ TRACE_EVENT(drm_sched_process_job,
>>   	    TP_fast_assign(
>>   		    __entry->fence = &fence->finished;
>>   		    ),
>> -	    TP_printk("fence=%p signaled", __entry->fence)
>> +	    TP_printk("fence=(context:%llu, seqno:%lld) signaled",
>> +		      __entry->fence->context, __entry->fence->seqno)
>>   );
>>   
>>   TRACE_EVENT(drm_sched_job_wait_dep,
>> @@ -96,23 +97,25 @@ TRACE_EVENT(drm_sched_job_wait_dep,
>>   	    TP_ARGS(sched_job, fence),
>>   	    TP_STRUCT__entry(
>>   			     __string(name, sched_job->sched->name)
>> +			     __field(uint64_t, fence_context)
>>   			     __field(uint64_t, id)
>>   			     __field(struct dma_fence *, fence)
>>   			     __field(uint64_t, ctx)
>> -			     __field(unsigned, seqno)
>> +			     __field(uint64_t, seqno)
>>   			     ),
>>   
>>   	    TP_fast_assign(
>>   			   __assign_str(name);
>> +			   /* Store the hw exec fence context. */
>> +			   __entry->fence_context = sched_job->entity->fence_context + 1;
>>   			   __entry->id = sched_job->id;
>>   			   __entry->fence = fence;
>>   			   __entry->ctx = fence->context;
>>   			   __entry->seqno = fence->seqno;
>>   			   ),
>> -	    TP_printk("job ring=%s, id=%llu, depends fence=%p, context=%llu, seq=%u",
>> -		      __get_str(name), __entry->id,
>> -		      __entry->fence, __entry->ctx,
>> -		      __entry->seqno)
>> +	    TP_printk("job ring=%s, fence_context=%llu, id=%llu, depends fence=(context:%llu, seqno:%lld)",
>> +		      __get_str(name), __entry->fence_context, __entry->id,
>> +		      __entry->ctx, __entry->seqno)
>>   );
>>   
>>   #endif


* Re: [PATCH v3 1/4] drm/sched: store the drm_device instead of the device
  2024-06-06 13:18   ` Christian König
  2024-06-06 13:23     ` Ville Syrjälä
  2024-06-06 16:38     ` Matthew Brost
@ 2024-06-07 13:55     ` Pierre-Eric Pelloux-Prayer
  2 siblings, 0 replies; 14+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-06-07 13:55 UTC (permalink / raw)
  To: Christian König, Pierre-Eric Pelloux-Prayer,
	alexander.deucher, ltuikov89, matthew.brost, maarten.lankhorst,
	mripard, tzimmermann, airlied, daniel, dri-devel, ville.syrjala,
	rostedt

Hi,

On 06/06/2024 at 15:18, Christian König wrote:
> On 06.06.24 at 15:06, Pierre-Eric Pelloux-Prayer wrote:
>> When tracing is enabled, being able to identify which device is sending
>> events is useful; for this the next commit will extend events to include
>> drm_device::primary::index.
> 
> That sounds like a rather bad idea since the primary index is really 
> just an arbitrary number and not defined for all devices.
> 
> Why not use the device name instead? This way you don't need this change 
> in the first place.

Good point, that is indeed a better idea. I'll drop this patch and
update the next one.

Thanks,
Pierre-Eric
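
Christian's suggestion amounts to recording the device name itself in each
event instead of a numeric index. In a trace event this is normally done by
copying the string into the record (the __string()/__assign_str() pair seen
elsewhere in this series does that for the ring name). Below is a userspace
sketch of that copy, under the assumption of a fixed-size buffer; the fake_*
names, FAKE_NAME_LEN, and the output format are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FAKE_NAME_LEN 32	/* hypothetical fixed size for the sketch */

/* Stand-in for a trace record: holds a copy of the name, not a pointer. */
struct fake_sched_entry {
	char dev[FAKE_NAME_LEN];
	unsigned long long id;
};

/* Record time: copy the device name into the record (truncating if needed). */
static void fake_assign(struct fake_sched_entry *e, const char *dev_name,
			unsigned long long id)
{
	snprintf(e->dev, sizeof(e->dev), "%s", dev_name);
	e->id = id;
}

/* Print time: only the record is read, so the source string may be gone. */
static int fake_print(char *buf, size_t len, const struct fake_sched_entry *e)
{
	return snprintf(buf, len, "dev=%s, id=%llu", e->dev, e->id);
}
```

A tool parsing the events can then tell devices apart by name without
relying on an index that may not exist for every device.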

> 
> Regards,
> Christian.
> 
>>
>> Since the device member is only used in the drm_* log macros, we can
>> replace it by a drm_device pointer.
>>
>> Signed-off-by: Pierre-Eric Pelloux-Prayer 
>> <pierre-eric.pelloux-prayer@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
>>   drivers/gpu/drm/etnaviv/etnaviv_sched.c    |  2 +-
>>   drivers/gpu/drm/imagination/pvr_queue.c    |  2 +-
>>   drivers/gpu/drm/lima/lima_sched.c          |  2 +-
>>   drivers/gpu/drm/msm/msm_ringbuffer.c       |  2 +-
>>   drivers/gpu/drm/nouveau/nouveau_sched.c    |  2 +-
>>   drivers/gpu/drm/panfrost/panfrost_job.c    |  2 +-
>>   drivers/gpu/drm/panthor/panthor_mmu.c      |  2 +-
>>   drivers/gpu/drm/panthor/panthor_sched.c    |  2 +-
>>   drivers/gpu/drm/scheduler/sched_entity.c   |  2 +-
>>   drivers/gpu/drm/scheduler/sched_main.c     | 26 +++++++++++-----------
>>   drivers/gpu/drm/v3d/v3d_sched.c            | 12 +++++-----
>>   drivers/gpu/drm/xe/xe_execlist.c           |  2 +-
>>   drivers/gpu/drm/xe/xe_gpu_scheduler.c      |  2 +-
>>   drivers/gpu/drm/xe/xe_gpu_scheduler.h      |  2 +-
>>   drivers/gpu/drm/xe/xe_guc_submit.c         |  2 +-
>>   include/drm/gpu_scheduler.h                |  4 ++--
>>   17 files changed, 35 insertions(+), 35 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 932dc93b2e63..7f2a68ad8034 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2738,7 +2738,7 @@ static int amdgpu_device_init_schedulers(struct 
>> amdgpu_device *adev)
>>                      ring->num_hw_submission, 0,
>>                      timeout, adev->reset_domain->wq,
>>                      ring->sched_score, ring->name,
>> -                   adev->dev);
>> +                   &adev->ddev);
>>           if (r) {
>>               DRM_ERROR("Failed to create scheduler on ring %s.\n",
>>                     ring->name);
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
>> b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>> index c4b04b0dee16..c4345b68a51f 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>> @@ -138,7 +138,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
>>                    DRM_SCHED_PRIORITY_COUNT,
>>                    etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
>>                    msecs_to_jiffies(500), NULL, NULL,
>> -                 dev_name(gpu->dev), gpu->dev);
>> +                 dev_name(gpu->dev), gpu->drm);
>>       if (ret)
>>           return ret;
>> diff --git a/drivers/gpu/drm/imagination/pvr_queue.c 
>> b/drivers/gpu/drm/imagination/pvr_queue.c
>> index 5ed9c98fb599..cdbb6c01e952 100644
>> --- a/drivers/gpu/drm/imagination/pvr_queue.c
>> +++ b/drivers/gpu/drm/imagination/pvr_queue.c
>> @@ -1287,7 +1287,7 @@ struct pvr_queue *pvr_queue_create(struct 
>> pvr_context *ctx,
>>                    pvr_dev->sched_wq, 1, 64 * 1024, 1,
>>                    msecs_to_jiffies(500),
>>                    pvr_dev->sched_wq, NULL, "pvr-queue",
>> -                 pvr_dev->base.dev);
>> +                 &pvr_dev->base);
>>       if (err)
>>           goto err_release_ufo;
>> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
>> b/drivers/gpu/drm/lima/lima_sched.c
>> index bbf3f8feab94..db6ee7650468 100644
>> --- a/drivers/gpu/drm/lima/lima_sched.c
>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>> @@ -526,7 +526,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe 
>> *pipe, const char *name)
>>                     1,
>>                     lima_job_hang_limit,
>>                     msecs_to_jiffies(timeout), NULL,
>> -                  NULL, name, pipe->ldev->dev);
>> +                  NULL, name, pipe->ldev->ddev);
>>   }
>>   void lima_sched_pipe_fini(struct lima_sched_pipe *pipe)
>> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c 
>> b/drivers/gpu/drm/msm/msm_ringbuffer.c
>> index 9d6655f96f0c..3a4b3816f2c9 100644
>> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
>> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
>> @@ -101,7 +101,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct 
>> msm_gpu *gpu, int id,
>>       ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
>>                    DRM_SCHED_PRIORITY_COUNT,
>>                    num_hw_submissions, 0, sched_timeout,
>> -                 NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
>> +                 NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev);
>>       if (ret) {
>>           goto fail;
>>       }
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c 
>> b/drivers/gpu/drm/nouveau/nouveau_sched.c
>> index 32fa2e273965..386839bed8a2 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
>> @@ -419,7 +419,7 @@ nouveau_sched_init(struct nouveau_sched *sched, 
>> struct nouveau_drm *drm,
>>       ret = drm_sched_init(drm_sched, &nouveau_sched_ops, wq,
>>                    NOUVEAU_SCHED_PRIORITY_COUNT,
>>                    credit_limit, 0, job_hang_limit,
>> -                 NULL, NULL, "nouveau_sched", drm->dev->dev);
>> +                 NULL, NULL, "nouveau_sched", drm->dev);
>>       if (ret)
>>           goto fail_wq;
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
>> b/drivers/gpu/drm/panfrost/panfrost_job.c
>> index a61ef0af9a4e..28c7680a8dbf 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>> @@ -875,7 +875,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>>                        nentries, 0,
>>                        msecs_to_jiffies(JOB_TIMEOUT_MS),
>>                        pfdev->reset.wq,
>> -                     NULL, "pan_js", pfdev->dev);
>> +                     NULL, "pan_js", pfdev->ddev);
>>           if (ret) {
>>               dev_err(pfdev->dev, "Failed to create scheduler: %d.", 
>> ret);
>>               goto err_sched;
>> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c 
>> b/drivers/gpu/drm/panthor/panthor_mmu.c
>> index fa0a002b1016..b9c5b500b7d1 100644
>> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
>> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
>> @@ -2293,7 +2293,7 @@ panthor_vm_create(struct panthor_device *ptdev, 
>> bool for_mcu,
>>       ret = drm_sched_init(&vm->sched, &panthor_vm_bind_ops, 
>> ptdev->mmu->vm.wq,
>>                    1, 1, 0,
>>                    MAX_SCHEDULE_TIMEOUT, NULL, NULL,
>> -                 "panthor-vm-bind", ptdev->base.dev);
>> +                 "panthor-vm-bind", &ptdev->base);
>>       if (ret)
>>           goto err_free_io_pgtable;
>> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c 
>> b/drivers/gpu/drm/panthor/panthor_sched.c
>> index 79ffcbc41d78..47e52f61571b 100644
>> --- a/drivers/gpu/drm/panthor/panthor_sched.c
>> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
>> @@ -3043,7 +3043,7 @@ group_create_queue(struct panthor_group *group,
>>                    args->ringbuf_size / (NUM_INSTRS_PER_SLOT * 
>> sizeof(u64)),
>>                    0, msecs_to_jiffies(JOB_TIMEOUT_MS),
>>                    group->ptdev->reset.wq,
>> -                 NULL, "panthor-queue", group->ptdev->base.dev);
>> +                 NULL, "panthor-queue", &group->ptdev->base);
>>       if (ret)
>>           goto err_free_queue;
>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
>> b/drivers/gpu/drm/scheduler/sched_entity.c
>> index 58c8161289fe..194798b9ce09 100644
>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>> @@ -92,7 +92,7 @@ int drm_sched_entity_init(struct drm_sched_entity 
>> *entity,
>>            * the lowest priority available.
>>            */
>>           if (entity->priority >= sched_list[0]->num_rqs) {
>> -            drm_err(sched_list[0], "entity with out-of-bounds 
>> priority:%u num_rqs:%u\n",
>> +            drm_err(sched_list[0]->dev, "entity with out-of-bounds 
>> priority:%u num_rqs:%u\n",
>>                   entity->priority, sched_list[0]->num_rqs);
>>               entity->priority = max_t(s32, (s32) 
>> sched_list[0]->num_rqs - 1,
>>                            (s32) DRM_SCHED_PRIORITY_KERNEL);
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index 7e90c9f95611..74a2fe51e653 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -103,9 +103,9 @@ static u32 drm_sched_available_credits(struct 
>> drm_gpu_scheduler *sched)
>>   {
>>       u32 credits;
>> -    drm_WARN_ON(sched, check_sub_overflow(sched->credit_limit,
>> -                          atomic_read(&sched->credit_count),
>> -                          &credits));
>> +    drm_WARN_ON(sched->dev, check_sub_overflow(sched->credit_limit,
>> +                          atomic_read(&sched->credit_count),
>> +                          &credits));
>>       return credits;
>>   }
>> @@ -130,14 +130,14 @@ static bool drm_sched_can_queue(struct 
>> drm_gpu_scheduler *sched,
>>       if (sched->ops->update_job_credits) {
>>           s_job->credits = sched->ops->update_job_credits(s_job);
>> -        drm_WARN(sched, !s_job->credits,
>> +        drm_WARN(sched->dev, !s_job->credits,
>>                "Jobs with zero credits bypass job-flow control.\n");
>>       }
>>       /* If a job exceeds the credit limit, truncate it to the credit 
>> limit
>>        * itself to guarantee forward progress.
>>        */
>> -    if (drm_WARN(sched, s_job->credits > sched->credit_limit,
>> +    if (drm_WARN(sched->dev, s_job->credits > sched->credit_limit,
>>                "Jobs may not exceed the credit limit, truncate.\n"))
>>           s_job->credits = sched->credit_limit;
>> @@ -701,7 +701,7 @@ void drm_sched_start(struct drm_gpu_scheduler 
>> *sched, bool full_recovery)
>>               if (r == -ENOENT)
>>                   drm_sched_job_done(s_job, fence->error);
>>               else if (r)
>> -                DRM_DEV_ERROR(sched->dev, "fence add callback failed 
>> (%d)\n",
>> +                DRM_DEV_ERROR(sched->dev->dev, "fence add callback 
>> failed (%d)\n",
>>                         r);
>>           } else
>>               drm_sched_job_done(s_job, -ECANCELED);
>> @@ -797,7 +797,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>            * or worse--a blank screen--leave a trail in the
>>            * logs, so this can be debugged easier.
>>            */
>> -        drm_err(job->sched, "%s: entity has no rq!\n", __func__);
>> +        drm_err(job->sched->dev, "%s: entity has no rq!\n", __func__);
>>           return -ENOENT;
>>       }
>> @@ -1215,7 +1215,7 @@ static void drm_sched_run_job_work(struct 
>> work_struct *w)
>>           if (r == -ENOENT)
>>               drm_sched_job_done(sched_job, fence->error);
>>           else if (r)
>> -            DRM_DEV_ERROR(sched->dev, "fence add callback failed 
>> (%d)\n", r);
>> +            DRM_DEV_ERROR(sched->dev->dev, "fence add callback failed 
>> (%d)\n", r);
>>       } else {
>>           drm_sched_job_done(sched_job, IS_ERR(fence) ?
>>                      PTR_ERR(fence) : 0);
>> @@ -1240,7 +1240,7 @@ static void drm_sched_run_job_work(struct 
>> work_struct *w)
>>    *        used
>>    * @score: optional score atomic shared with other schedulers
>>    * @name: name used for debugging
>> - * @dev: target &struct device
>> + * @dev: target &struct drm_device
>>    *
>>    * Return 0 on success, otherwise error code.
>>    */
>> @@ -1249,7 +1249,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>              struct workqueue_struct *submit_wq,
>>              u32 num_rqs, u32 credit_limit, unsigned int hang_limit,
>>              long timeout, struct workqueue_struct *timeout_wq,
>> -           atomic_t *score, const char *name, struct device *dev)
>> +           atomic_t *score, const char *name, struct drm_device *dev)
>>   {
>>       int i;
>> @@ -1265,7 +1265,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>       if (num_rqs > DRM_SCHED_PRIORITY_COUNT) {
>>           /* This is a gross violation--tell drivers what the  problem 
>> is.
>>            */
>> -        drm_err(sched, "%s: num_rqs cannot be greater than 
>> DRM_SCHED_PRIORITY_COUNT\n",
>> +        drm_err(dev, "%s: num_rqs cannot be greater than 
>> DRM_SCHED_PRIORITY_COUNT\n",
>>               __func__);
>>           return -EINVAL;
>>       } else if (sched->sched_rq) {
>> @@ -1273,7 +1273,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>            * fine-tune their DRM calling order, and return all
>>            * is good.
>>            */
>> -        drm_warn(sched, "%s: scheduler already initialized!\n", 
>> __func__);
>> +        drm_warn(dev, "%s: scheduler already initialized!\n", __func__);
>>           return 0;
>>       }
>> @@ -1322,7 +1322,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>   Out_check_own:
>>       if (sched->own_submit_wq)
>>           destroy_workqueue(sched->submit_wq);
>> -    drm_err(sched, "%s: Failed to setup GPU scheduler--out of 
>> memory\n", __func__);
>> +    drm_err(dev, "%s: Failed to setup GPU scheduler--out of 
>> memory\n", __func__);
>>       return -ENOMEM;
>>   }
>>   EXPORT_SYMBOL(drm_sched_init);
>> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c 
>> b/drivers/gpu/drm/v3d/v3d_sched.c
>> index 7cd8c335cd9b..73383b6ef9bb 100644
>> --- a/drivers/gpu/drm/v3d/v3d_sched.c
>> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
>> @@ -740,7 +740,7 @@ v3d_sched_init(struct v3d_dev *v3d)
>>                    DRM_SCHED_PRIORITY_COUNT,
>>                    hw_jobs_limit, job_hang_limit,
>>                    msecs_to_jiffies(hang_limit_ms), NULL,
>> -                 NULL, "v3d_bin", v3d->drm.dev);
>> +                 NULL, "v3d_bin", &v3d->drm);
>>       if (ret)
>>           return ret;
>> @@ -749,7 +749,7 @@ v3d_sched_init(struct v3d_dev *v3d)
>>                    DRM_SCHED_PRIORITY_COUNT,
>>                    hw_jobs_limit, job_hang_limit,
>>                    msecs_to_jiffies(hang_limit_ms), NULL,
>> -                 NULL, "v3d_render", v3d->drm.dev);
>> +                 NULL, "v3d_render", &v3d->drm);
>>       if (ret)
>>           goto fail;
>> @@ -758,7 +758,7 @@ v3d_sched_init(struct v3d_dev *v3d)
>>                    DRM_SCHED_PRIORITY_COUNT,
>>                    hw_jobs_limit, job_hang_limit,
>>                    msecs_to_jiffies(hang_limit_ms), NULL,
>> -                 NULL, "v3d_tfu", v3d->drm.dev);
>> +                 NULL, "v3d_tfu", &v3d->drm);
>>       if (ret)
>>           goto fail;
>> @@ -768,7 +768,7 @@ v3d_sched_init(struct v3d_dev *v3d)
>>                        DRM_SCHED_PRIORITY_COUNT,
>>                        hw_jobs_limit, job_hang_limit,
>>                        msecs_to_jiffies(hang_limit_ms), NULL,
>> -                     NULL, "v3d_csd", v3d->drm.dev);
>> +                     NULL, "v3d_csd", &v3d->drm);
>>           if (ret)
>>               goto fail;
>> @@ -777,7 +777,7 @@ v3d_sched_init(struct v3d_dev *v3d)
>>                        DRM_SCHED_PRIORITY_COUNT,
>>                        hw_jobs_limit, job_hang_limit,
>>                        msecs_to_jiffies(hang_limit_ms), NULL,
>> -                     NULL, "v3d_cache_clean", v3d->drm.dev);
>> +                     NULL, "v3d_cache_clean", &v3d->drm);
>>           if (ret)
>>               goto fail;
>>       }
>> @@ -787,7 +787,7 @@ v3d_sched_init(struct v3d_dev *v3d)
>>                    DRM_SCHED_PRIORITY_COUNT,
>>                    1, job_hang_limit,
>>                    msecs_to_jiffies(hang_limit_ms), NULL,
>> -                 NULL, "v3d_cpu", v3d->drm.dev);
>> +                 NULL, "v3d_cpu", &v3d->drm);
>>       if (ret)
>>           goto fail;
>> diff --git a/drivers/gpu/drm/xe/xe_execlist.c 
>> b/drivers/gpu/drm/xe/xe_execlist.c
>> index dece2785933c..dc81e9f39727 100644
>> --- a/drivers/gpu/drm/xe/xe_execlist.c
>> +++ b/drivers/gpu/drm/xe/xe_execlist.c
>> @@ -336,7 +336,7 @@ static int execlist_exec_queue_init(struct 
>> xe_exec_queue *q)
>>                    q->lrc[0].ring.size / MAX_JOB_SIZE_BYTES,
>>                    XE_SCHED_HANG_LIMIT, XE_SCHED_JOB_TIMEOUT,
>>                    NULL, NULL, q->hwe->name,
>> -                 gt_to_xe(q->gt)->drm.dev);
>> +                 &gt_to_xe(q->gt)->drm);
>>       if (err)
>>           goto err_free;
>> diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c 
>> b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
>> index e4ad1d6ce1d5..66d36cac82a0 100644
>> --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
>> +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
>> @@ -61,7 +61,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
>>             uint32_t hw_submission, unsigned hang_limit,
>>             long timeout, struct workqueue_struct *timeout_wq,
>>             atomic_t *score, const char *name,
>> -          struct device *dev)
>> +          struct drm_device *dev)
>>   {
>>       sched->ops = xe_ops;
>>       INIT_LIST_HEAD(&sched->msgs);
>> diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h 
>> b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
>> index 10c6bb9c9386..9a75457813f2 100644
>> --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
>> +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
>> @@ -16,7 +16,7 @@ int xe_sched_init(struct xe_gpu_scheduler *sched,
>>             uint32_t hw_submission, unsigned hang_limit,
>>             long timeout, struct workqueue_struct *timeout_wq,
>>             atomic_t *score, const char *name,
>> -          struct device *dev);
>> +          struct drm_device *dev);
>>   void xe_sched_fini(struct xe_gpu_scheduler *sched);
>>   void xe_sched_submission_start(struct xe_gpu_scheduler *sched);
>> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c 
>> b/drivers/gpu/drm/xe/xe_guc_submit.c
>> index e4e3658e6a13..b9c114f2c715 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
>> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
>> @@ -1208,7 +1208,7 @@ static int guc_exec_queue_init(struct 
>> xe_exec_queue *q)
>>                   get_submit_wq(guc),
>>                   q->lrc[0].ring.size / MAX_JOB_SIZE_BYTES, 64,
>>                   timeout, guc_to_gt(guc)->ordered_wq, NULL,
>> -                q->name, gt_to_xe(q->gt)->drm.dev);
>> +                q->name, &gt_to_xe(q->gt)->drm);
>>       if (err)
>>           goto err_free;
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 5acc64954a88..0ba8716ec069 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -539,7 +539,7 @@ struct drm_gpu_scheduler {
>>       bool                free_guilty;
>>       bool                pause_submit;
>>       bool                own_submit_wq;
>> -    struct device            *dev;
>> +    struct drm_device        *dev;
>>   };
>>   int drm_sched_init(struct drm_gpu_scheduler *sched,
>> @@ -547,7 +547,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>              struct workqueue_struct *submit_wq,
>>              u32 num_rqs, u32 credit_limit, unsigned int hang_limit,
>>              long timeout, struct workqueue_struct *timeout_wq,
>> -           atomic_t *score, const char *name, struct device *dev);
>> +           atomic_t *score, const char *name, struct drm_device *dev);
>>   void drm_sched_fini(struct drm_gpu_scheduler *sched);
>>   int drm_sched_job_init(struct drm_sched_job *job,


* Re: [PATCH v3 1/4] drm/sched: store the drm_device instead of the device
  2024-06-06 13:06 ` [PATCH v3 1/4] drm/sched: store the drm_device instead of the device Pierre-Eric Pelloux-Prayer
  2024-06-06 13:18   ` Christian König
  2024-06-07  0:34   ` kernel test robot
@ 2024-06-08  0:36   ` kernel test robot
  2 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2024-06-08  0:36 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, alexander.deucher, christian.koenig,
	ltuikov89, matthew.brost, maarten.lankhorst, mripard, tzimmermann,
	airlied, daniel, dri-devel, ville.syrjala, rostedt
  Cc: llvm, oe-kbuild-all, Pierre-Eric Pelloux-Prayer

Hi Pierre-Eric,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.10-rc2 next-20240607]
[cannot apply to drm-xe/drm-xe-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Pierre-Eric-Pelloux-Prayer/drm-sched-store-the-drm_device-instead-of-the-device/20240606-211050
base:   linus/master
patch link:    https://lore.kernel.org/r/20240606130629.214827-2-pierre-eric.pelloux-prayer%40amd.com
patch subject: [PATCH v3 1/4] drm/sched: store the drm_device instead of the device
config: arm64-allmodconfig (https://download.01.org/0day-ci/archive/20240608/202406080802.ExIh6NC2-lkp@intel.com/config)
compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project d7d2d4f53fc79b4b58e8d8d08151b577c3699d4a)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240608/202406080802.ExIh6NC2-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202406080802.ExIh6NC2-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from drivers/gpu/drm/imagination/pvr_queue.c:8:
   In file included from drivers/gpu/drm/imagination/pvr_context.h:17:
   In file included from drivers/gpu/drm/imagination/pvr_device.h:9:
   In file included from drivers/gpu/drm/imagination/pvr_fw.h:8:
   In file included from drivers/gpu/drm/imagination/pvr_fw_trace.h:7:
   In file included from include/drm/drm_file.h:39:
   In file included from include/drm/drm_prime.h:37:
   In file included from include/linux/scatterlist.h:8:
   In file included from include/linux/mm.h:2253:
   include/linux/vmstat.h:500:43: error: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Werror,-Wenum-enum-conversion]
     500 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     501 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:507:43: error: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Werror,-Wenum-enum-conversion]
     507 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     508 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:514:36: error: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Werror,-Wenum-enum-conversion]
     514 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:519:43: error: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Werror,-Wenum-enum-conversion]
     519 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     520 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:528:43: error: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Werror,-Wenum-enum-conversion]
     528 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     529 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
>> drivers/gpu/drm/imagination/pvr_queue.c:807:10: error: incompatible pointer types passing 'struct drm_device *' to parameter of type 'const struct device *' [-Werror,-Wincompatible-pointer-types]
     807 |         dev_err(sched->dev, "Job timeout\n");
         |                 ^~~~~~~~~~
   include/linux/dev_printk.h:154:44: note: expanded from macro 'dev_err'
     154 |         dev_printk_index_wrap(_dev_err, KERN_ERR, dev, dev_fmt(fmt), ##__VA_ARGS__)
         |                                                   ^~~
   include/linux/dev_printk.h:110:11: note: expanded from macro 'dev_printk_index_wrap'
     110 |                 _p_func(dev, fmt, ##__VA_ARGS__);                       \
         |                         ^~~
   include/linux/dev_printk.h:50:36: note: passing argument to parameter 'dev' here
      50 | void _dev_err(const struct device *dev, const char *fmt, ...);
         |                                    ^
   6 errors generated.


vim +807 drivers/gpu/drm/imagination/pvr_queue.c

eaf01ee5ba28b9 Sarah Walker 2023-11-22  787  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  788  /**
eaf01ee5ba28b9 Sarah Walker 2023-11-22  789   * pvr_queue_timedout_job() - Handle a job timeout event.
eaf01ee5ba28b9 Sarah Walker 2023-11-22  790   * @s_job: The job this timeout occurred on.
eaf01ee5ba28b9 Sarah Walker 2023-11-22  791   *
eaf01ee5ba28b9 Sarah Walker 2023-11-22  792   * FIXME: We don't do anything here to unblock the situation, we just stop+start
eaf01ee5ba28b9 Sarah Walker 2023-11-22  793   * the scheduler, and re-assign parent fences in the middle.
eaf01ee5ba28b9 Sarah Walker 2023-11-22  794   *
eaf01ee5ba28b9 Sarah Walker 2023-11-22  795   * Return:
eaf01ee5ba28b9 Sarah Walker 2023-11-22  796   *  * DRM_GPU_SCHED_STAT_NOMINAL.
eaf01ee5ba28b9 Sarah Walker 2023-11-22  797   */
eaf01ee5ba28b9 Sarah Walker 2023-11-22  798  static enum drm_gpu_sched_stat
eaf01ee5ba28b9 Sarah Walker 2023-11-22  799  pvr_queue_timedout_job(struct drm_sched_job *s_job)
eaf01ee5ba28b9 Sarah Walker 2023-11-22  800  {
eaf01ee5ba28b9 Sarah Walker 2023-11-22  801  	struct drm_gpu_scheduler *sched = s_job->sched;
eaf01ee5ba28b9 Sarah Walker 2023-11-22  802  	struct pvr_queue *queue = container_of(sched, struct pvr_queue, scheduler);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  803  	struct pvr_device *pvr_dev = queue->ctx->pvr_dev;
eaf01ee5ba28b9 Sarah Walker 2023-11-22  804  	struct pvr_job *job;
eaf01ee5ba28b9 Sarah Walker 2023-11-22  805  	u32 job_count = 0;
eaf01ee5ba28b9 Sarah Walker 2023-11-22  806  
eaf01ee5ba28b9 Sarah Walker 2023-11-22 @807  	dev_err(sched->dev, "Job timeout\n");
eaf01ee5ba28b9 Sarah Walker 2023-11-22  808  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  809  	/* Before we stop the scheduler, make sure the queue is out of any list, so
eaf01ee5ba28b9 Sarah Walker 2023-11-22  810  	 * any call to pvr_queue_update_active_state_locked() that might happen
eaf01ee5ba28b9 Sarah Walker 2023-11-22  811  	 * until the scheduler is really stopped doesn't end up re-inserting the
eaf01ee5ba28b9 Sarah Walker 2023-11-22  812  	 * queue in the active list. This would cause
eaf01ee5ba28b9 Sarah Walker 2023-11-22  813  	 * pvr_queue_signal_done_fences() and drm_sched_stop() to race with each
eaf01ee5ba28b9 Sarah Walker 2023-11-22  814  	 * other when accessing the pending_list, since drm_sched_stop() doesn't
eaf01ee5ba28b9 Sarah Walker 2023-11-22  815  	 * grab the job_list_lock when modifying the list (it's assuming the
eaf01ee5ba28b9 Sarah Walker 2023-11-22  816  	 * only other accessor is the scheduler, and it's safe to not grab the
eaf01ee5ba28b9 Sarah Walker 2023-11-22  817  	 * lock since it's stopped).
eaf01ee5ba28b9 Sarah Walker 2023-11-22  818  	 */
eaf01ee5ba28b9 Sarah Walker 2023-11-22  819  	mutex_lock(&pvr_dev->queues.lock);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  820  	list_del_init(&queue->node);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  821  	mutex_unlock(&pvr_dev->queues.lock);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  822  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  823  	drm_sched_stop(sched, s_job);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  824  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  825  	/* Re-assign job parent fences. */
eaf01ee5ba28b9 Sarah Walker 2023-11-22  826  	list_for_each_entry(job, &sched->pending_list, base.list) {
eaf01ee5ba28b9 Sarah Walker 2023-11-22  827  		job->base.s_fence->parent = dma_fence_get(job->done_fence);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  828  		job_count++;
eaf01ee5ba28b9 Sarah Walker 2023-11-22  829  	}
eaf01ee5ba28b9 Sarah Walker 2023-11-22  830  	WARN_ON(atomic_read(&queue->in_flight_job_count) != job_count);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  831  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  832  	/* Re-insert the queue in the proper list, and kick a queue processing
eaf01ee5ba28b9 Sarah Walker 2023-11-22  833  	 * operation if there were jobs pending.
eaf01ee5ba28b9 Sarah Walker 2023-11-22  834  	 */
eaf01ee5ba28b9 Sarah Walker 2023-11-22  835  	mutex_lock(&pvr_dev->queues.lock);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  836  	if (!job_count) {
eaf01ee5ba28b9 Sarah Walker 2023-11-22  837  		list_move_tail(&queue->node, &pvr_dev->queues.idle);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  838  	} else {
eaf01ee5ba28b9 Sarah Walker 2023-11-22  839  		atomic_set(&queue->in_flight_job_count, job_count);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  840  		list_move_tail(&queue->node, &pvr_dev->queues.active);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  841  		pvr_queue_process(queue);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  842  	}
eaf01ee5ba28b9 Sarah Walker 2023-11-22  843  	mutex_unlock(&pvr_dev->queues.lock);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  844  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  845  	drm_sched_start(sched, true);
eaf01ee5ba28b9 Sarah Walker 2023-11-22  846  
eaf01ee5ba28b9 Sarah Walker 2023-11-22  847  	return DRM_GPU_SCHED_STAT_NOMINAL;
eaf01ee5ba28b9 Sarah Walker 2023-11-22  848  }
eaf01ee5ba28b9 Sarah Walker 2023-11-22  849  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Thread overview: 14+ messages
2024-06-06 13:06 [PATCH v3 0/4] Improve gpu_scheduler trace events Pierre-Eric Pelloux-Prayer
2024-06-06 13:06 ` [PATCH v3 1/4] drm/sched: store the drm_device instead of the device Pierre-Eric Pelloux-Prayer
2024-06-06 13:18   ` Christian König
2024-06-06 13:23     ` Ville Syrjälä
2024-06-06 16:38     ` Matthew Brost
2024-06-07 13:55     ` Pierre-Eric Pelloux-Prayer
2024-06-07  0:34   ` kernel test robot
2024-06-08  0:36   ` kernel test robot
2024-06-06 13:06 ` [PATCH v3 2/4] drm/sched: add dev_index=xx to the drm_sched_process_job event Pierre-Eric Pelloux-Prayer
2024-06-06 13:06 ` [PATCH v3 3/4] drm/sched: cleanup gpu_scheduler trace events Pierre-Eric Pelloux-Prayer
2024-06-06 13:19   ` Steven Rostedt
2024-06-06 13:23     ` Christian König
2024-06-07 13:21     ` Pierre-Eric Pelloux-Prayer
2024-06-06 13:06 ` [PATCH v3 4/4] drm/sched: trace dependencies for gpu jobs Pierre-Eric Pelloux-Prayer
