* [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe
@ 2023-09-19 5:01 Matthew Brost
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 01/10] drm/sched: Add drm_sched_submit_* helpers Matthew Brost
` (12 more replies)
0 siblings, 13 replies; 45+ messages in thread
From: Matthew Brost @ 2023-09-19 5:01 UTC (permalink / raw)
To: dri-devel, intel-xe
Cc: robdclark, lina, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, luben.tuikov, dakr, donald.robson, daniel,
boris.brezillon, airlied, christian.koenig, faith.ekstrand
As a prerequisite to merging the new Intel Xe DRM driver [1] [2], we
have been asked to merge our common DRM scheduler patches first.
This is a continuation of an RFC [3] with all comments addressed, ready for
a full review, and hopefully in a state which can be merged in the near
future. More details of this series can be found in the cover letter of the
RFC [3].
These changes have been tested with the Xe driver.
v2:
- Break run job, free job, and process message in own work items
- This might break other drivers as run job and free job can now run in
parallel; we can fix this up if needed
v3:
- Include missing patch 'drm/sched: Add drm_sched_submit_* helpers'
- Fix issue with setting timestamp too early
- Don't dequeue jobs for single entity after calling entity fini
- Flush pending jobs on entity fini
- Add documentation for entity teardown
- Add Matthew Brost to maintainers of DRM scheduler
v4:
- Drop message interface
- Drop 'Flush pending jobs on entity fini'
- Drop 'Add documentation for entity teardown'
- Address all feedback
Matt
Matthew Brost (10):
drm/sched: Add drm_sched_submit_* helpers
drm/sched: Convert drm scheduler to use a work queue rather than
kthread
drm/sched: Move schedule policy to scheduler
drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy
drm/sched: Split free_job into own work item
drm/sched: Add drm_sched_start_timeout_unlocked helper
drm/sched: Start submission before TDR in drm_sched_start
drm/sched: Submit job before starting TDR
drm/sched: Add helper to queue TDR immediately for current and future
jobs
drm/sched: Update maintainers of GPU scheduler
MAINTAINERS | 1 +
.../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 15 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +-
drivers/gpu/drm/etnaviv/etnaviv_sched.c | 5 +-
drivers/gpu/drm/lima/lima_sched.c | 5 +-
drivers/gpu/drm/msm/adreno/adreno_device.c | 6 +-
drivers/gpu/drm/msm/msm_ringbuffer.c | 5 +-
drivers/gpu/drm/nouveau/nouveau_sched.c | 5 +-
drivers/gpu/drm/panfrost/panfrost_job.c | 5 +-
drivers/gpu/drm/scheduler/sched_entity.c | 85 ++-
drivers/gpu/drm/scheduler/sched_fence.c | 2 +-
drivers/gpu/drm/scheduler/sched_main.c | 491 ++++++++++++------
drivers/gpu/drm/v3d/v3d_sched.c | 25 +-
include/drm/gpu_scheduler.h | 48 +-
15 files changed, 495 insertions(+), 220 deletions(-)
--
2.34.1
* [Intel-xe] [PATCH v4 01/10] drm/sched: Add drm_sched_submit_* helpers
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
@ 2023-09-19 5:01 ` Matthew Brost
2023-09-19 5:58 ` Christian König
2023-09-27 1:07 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread Matthew Brost
` (11 subsequent siblings)
12 siblings, 2 replies; 45+ messages in thread
From: Matthew Brost @ 2023-09-19 5:01 UTC (permalink / raw)
To: dri-devel, intel-xe
Cc: robdclark, lina, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, luben.tuikov, dakr, donald.robson, daniel,
boris.brezillon, airlied, christian.koenig, faith.ekstrand
Add scheduler submit ready, stop, and start helpers to hide the
implementation details of the scheduler from the drivers.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
.../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 15 +++----
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +++---
drivers/gpu/drm/msm/adreno/adreno_device.c | 6 ++-
drivers/gpu/drm/scheduler/sched_main.c | 40 ++++++++++++++++++-
include/drm/gpu_scheduler.h | 3 ++
6 files changed, 60 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
index 625db444df1c..36a1accbc846 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
@@ -290,7 +290,7 @@ static int suspend_resume_compute_scheduler(struct amdgpu_device *adev, bool sus
for (i = 0; i < adev->gfx.num_compute_rings; i++) {
struct amdgpu_ring *ring = &adev->gfx.compute_ring[i];
- if (!(ring && ring->sched.thread))
+ if (!(ring && drm_sched_submit_ready(&ring->sched)))
continue;
/* stop secheduler and drain ring. */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index a4faea4fa0b5..fb5dad687168 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1659,9 +1659,9 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
struct amdgpu_ring *ring = adev->rings[i];
- if (!ring || !ring->sched.thread)
+ if (!ring || !drm_sched_submit_ready(&ring->sched))
continue;
- kthread_park(ring->sched.thread);
+ drm_sched_submit_stop(&ring->sched);
}
seq_puts(m, "run ib test:\n");
@@ -1675,9 +1675,9 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
struct amdgpu_ring *ring = adev->rings[i];
- if (!ring || !ring->sched.thread)
+ if (!ring || !drm_sched_submit_ready(&ring->sched))
continue;
- kthread_unpark(ring->sched.thread);
+ drm_sched_submit_start(&ring->sched);
}
up_write(&adev->reset_domain->sem);
@@ -1897,7 +1897,8 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val)
ring = adev->rings[val];
- if (!ring || !ring->funcs->preempt_ib || !ring->sched.thread)
+ if (!ring || !ring->funcs->preempt_ib ||
+ !drm_sched_submit_ready(&ring->sched))
return -EINVAL;
/* the last preemption failed */
@@ -1915,7 +1916,7 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val)
goto pro_end;
/* stop the scheduler */
- kthread_park(ring->sched.thread);
+ drm_sched_submit_stop(&ring->sched);
/* preempt the IB */
r = amdgpu_ring_preempt_ib(ring);
@@ -1949,7 +1950,7 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val)
failure:
/* restart the scheduler */
- kthread_unpark(ring->sched.thread);
+ drm_sched_submit_start(&ring->sched);
up_read(&adev->reset_domain->sem);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 30c4f5cca02c..e366f61c3aed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4588,7 +4588,7 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
struct amdgpu_ring *ring = adev->rings[i];
- if (!ring || !ring->sched.thread)
+ if (!ring || !drm_sched_submit_ready(&ring->sched))
continue;
spin_lock(&ring->sched.job_list_lock);
@@ -4727,7 +4727,7 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
struct amdgpu_ring *ring = adev->rings[i];
- if (!ring || !ring->sched.thread)
+ if (!ring || !drm_sched_submit_ready(&ring->sched))
continue;
/* Clear job fence from fence drv to avoid force_completion
@@ -5266,7 +5266,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
struct amdgpu_ring *ring = tmp_adev->rings[i];
- if (!ring || !ring->sched.thread)
+ if (!ring || !drm_sched_submit_ready(&ring->sched))
continue;
drm_sched_stop(&ring->sched, job ? &job->base : NULL);
@@ -5341,7 +5341,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
struct amdgpu_ring *ring = tmp_adev->rings[i];
- if (!ring || !ring->sched.thread)
+ if (!ring || !drm_sched_submit_ready(&ring->sched))
continue;
drm_sched_start(&ring->sched, true);
@@ -5667,7 +5667,7 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
struct amdgpu_ring *ring = adev->rings[i];
- if (!ring || !ring->sched.thread)
+ if (!ring || !drm_sched_submit_ready(&ring->sched))
continue;
drm_sched_stop(&ring->sched, NULL);
@@ -5795,7 +5795,7 @@ void amdgpu_pci_resume(struct pci_dev *pdev)
for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
struct amdgpu_ring *ring = adev->rings[i];
- if (!ring || !ring->sched.thread)
+ if (!ring || !drm_sched_submit_ready(&ring->sched))
continue;
drm_sched_start(&ring->sched, true);
diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index fa527935ffd4..e046dc5ff72a 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -809,7 +809,8 @@ static void suspend_scheduler(struct msm_gpu *gpu)
*/
for (i = 0; i < gpu->nr_rings; i++) {
struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
- kthread_park(sched->thread);
+
+ drm_sched_submit_stop(sched);
}
}
@@ -819,7 +820,8 @@ static void resume_scheduler(struct msm_gpu *gpu)
for (i = 0; i < gpu->nr_rings; i++) {
struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
- kthread_unpark(sched->thread);
+
+ drm_sched_submit_start(sched);
}
}
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 506371c42745..e4fa62abca41 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -439,7 +439,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
{
struct drm_sched_job *s_job, *tmp;
- kthread_park(sched->thread);
+ drm_sched_submit_stop(sched);
/*
* Reinsert back the bad job here - now it's safe as
@@ -552,7 +552,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
spin_unlock(&sched->job_list_lock);
}
- kthread_unpark(sched->thread);
+ drm_sched_submit_start(sched);
}
EXPORT_SYMBOL(drm_sched_start);
@@ -1206,3 +1206,39 @@ void drm_sched_increase_karma(struct drm_sched_job *bad)
}
}
EXPORT_SYMBOL(drm_sched_increase_karma);
+
+/**
+ * drm_sched_submit_ready - scheduler ready for submission
+ *
+ * @sched: scheduler instance
+ *
+ * Returns true if submission is ready
+ */
+bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched)
+{
+ return !!sched->thread;
+}
+EXPORT_SYMBOL(drm_sched_submit_ready);
+
+/**
+ * drm_sched_submit_stop - stop scheduler submission
+ *
+ * @sched: scheduler instance
+ */
+void drm_sched_submit_stop(struct drm_gpu_scheduler *sched)
+{
+ kthread_park(sched->thread);
+}
+EXPORT_SYMBOL(drm_sched_submit_stop);
+
+/**
+ * drm_sched_submit_start - start scheduler submission
+ *
+ * @sched: scheduler instance
+ */
+void drm_sched_submit_start(struct drm_gpu_scheduler *sched)
+{
+ kthread_unpark(sched->thread);
+}
+EXPORT_SYMBOL(drm_sched_submit_start);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index f9544d9b670d..f12c5aea5294 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -550,6 +550,9 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
void drm_sched_job_cleanup(struct drm_sched_job *job);
void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched);
+bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched);
+void drm_sched_submit_stop(struct drm_gpu_scheduler *sched);
+void drm_sched_submit_start(struct drm_gpu_scheduler *sched);
void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad);
void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery);
void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched);
--
2.34.1
* [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 01/10] drm/sched: Add drm_sched_submit_* helpers Matthew Brost
@ 2023-09-19 5:01 ` Matthew Brost
2023-09-27 3:32 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 03/10] drm/sched: Move schedule policy to scheduler Matthew Brost
` (10 subsequent siblings)
12 siblings, 1 reply; 45+ messages in thread
From: Matthew Brost @ 2023-09-19 5:01 UTC (permalink / raw)
To: dri-devel, intel-xe
Cc: robdclark, lina, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, luben.tuikov, dakr, donald.robson, daniel,
boris.brezillon, airlied, christian.koenig, faith.ekstrand
In XE, the new Intel GPU driver, a choice has been made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd, but let us explain the reasoning below.
1. In XE the submission order from multiple drm_sched_entity is not
guaranteed to match the completion order, even when targeting the same
hardware engine. This is because XE has a firmware scheduler, the GuC,
which is allowed to reorder, timeslice, and preempt submissions. If a
drm_gpu_scheduler is shared across multiple drm_sched_entity, the TDR falls
apart as the TDR expects submission order == completion order. Using a
dedicated drm_gpu_scheduler per drm_sched_entity solves this problem.
2. In XE, submissions are done by programming a ring buffer (circular
buffer). A drm_gpu_scheduler provides a limit on the number of in-flight
jobs; if this limit is set to RING_SIZE / MAX_SIZE_PER_JOB, we get flow
control on the ring for free.
A problem with this design is that a drm_gpu_scheduler currently uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_scheduler instances are used. To work around the scaling
issue, use a worker rather than a kthread for submission / job cleanup.
v2:
- (Rob Clark) Fix msm build
- Pass in run work queue
v3:
- (Boris) don't have loop in worker
v4:
- (Tvrtko) break out submit ready, stop, start helpers into own patch
v5:
- (Boris) default to ordered work queue
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +-
drivers/gpu/drm/lima/lima_sched.c | 2 +-
drivers/gpu/drm/msm/msm_ringbuffer.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +-
drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
drivers/gpu/drm/scheduler/sched_main.c | 118 ++++++++++-----------
drivers/gpu/drm/v3d/v3d_sched.c | 10 +-
include/drm/gpu_scheduler.h | 14 ++-
9 files changed, 79 insertions(+), 75 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index e366f61c3aed..16f3cfe1574a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2279,7 +2279,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
break;
}
- r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
+ r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL,
ring->num_hw_submission, 0,
timeout, adev->reset_domain->wq,
ring->sched_score, ring->name,
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 345fec6cb1a4..618a804ddc34 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
{
int ret;
- ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops,
+ ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
msecs_to_jiffies(500), NULL, NULL,
dev_name(gpu->dev), gpu->dev);
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index ffd91a5ee299..8d858aed0e56 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
INIT_WORK(&pipe->recover_work, lima_sched_recover_work);
- return drm_sched_init(&pipe->base, &lima_sched_ops, 1,
+ return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
lima_job_hang_limit,
msecs_to_jiffies(timeout), NULL,
NULL, name, pipe->ldev->dev);
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 40c0bc35a44c..b8865e61b40f 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -94,7 +94,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
/* currently managing hangcheck ourselves: */
sched_timeout = MAX_SCHEDULE_TIMEOUT;
- ret = drm_sched_init(&ring->sched, &msm_sched_ops,
+ ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
num_hw_submissions, 0, sched_timeout,
NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
if (ret) {
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
index 88217185e0f3..d458c2227d4f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -429,7 +429,7 @@ int nouveau_sched_init(struct nouveau_drm *drm)
if (!drm->sched_wq)
return -ENOMEM;
- return drm_sched_init(sched, &nouveau_sched_ops,
+ return drm_sched_init(sched, &nouveau_sched_ops, NULL,
NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
NULL, NULL, "nouveau_sched", drm->dev->dev);
}
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 033f5e684707..326ca1ddf1d7 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -831,7 +831,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
js->queue[j].fence_context = dma_fence_context_alloc(1);
ret = drm_sched_init(&js->queue[j].sched,
- &panfrost_sched_ops,
+ &panfrost_sched_ops, NULL,
nentries, 0,
msecs_to_jiffies(JOB_TIMEOUT_MS),
pfdev->reset.wq,
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index e4fa62abca41..ee6281942e36 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -48,7 +48,6 @@
* through the jobs entity pointer.
*/
-#include <linux/kthread.h>
#include <linux/wait.h>
#include <linux/sched.h>
#include <linux/completion.h>
@@ -256,6 +255,16 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
return rb ? rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL;
}
+/**
+ * drm_sched_submit_queue - scheduler queue submission
+ * @sched: scheduler instance
+ */
+static void drm_sched_submit_queue(struct drm_gpu_scheduler *sched)
+{
+ if (!READ_ONCE(sched->pause_submit))
+ queue_work(sched->submit_wq, &sched->work_submit);
+}
+
/**
* drm_sched_job_done - complete a job
* @s_job: pointer to the job which is done
@@ -275,7 +284,7 @@ static void drm_sched_job_done(struct drm_sched_job *s_job, int result)
dma_fence_get(&s_fence->finished);
drm_sched_fence_finished(s_fence, result);
dma_fence_put(&s_fence->finished);
- wake_up_interruptible(&sched->wake_up_worker);
+ drm_sched_submit_queue(sched);
}
/**
@@ -868,7 +877,7 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched)
{
if (drm_sched_can_queue(sched))
- wake_up_interruptible(&sched->wake_up_worker);
+ drm_sched_submit_queue(sched);
}
/**
@@ -978,61 +987,42 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
}
EXPORT_SYMBOL(drm_sched_pick_best);
-/**
- * drm_sched_blocked - check if the scheduler is blocked
- *
- * @sched: scheduler instance
- *
- * Returns true if blocked, otherwise false.
- */
-static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
-{
- if (kthread_should_park()) {
- kthread_parkme();
- return true;
- }
-
- return false;
-}
-
/**
* drm_sched_main - main scheduler thread
*
* @param: scheduler instance
- *
- * Returns 0.
*/
-static int drm_sched_main(void *param)
+static void drm_sched_main(struct work_struct *w)
{
- struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
+ struct drm_gpu_scheduler *sched =
+ container_of(w, struct drm_gpu_scheduler, work_submit);
+ struct drm_sched_entity *entity;
+ struct drm_sched_job *cleanup_job;
int r;
- sched_set_fifo_low(current);
+ if (READ_ONCE(sched->pause_submit))
+ return;
- while (!kthread_should_stop()) {
- struct drm_sched_entity *entity = NULL;
- struct drm_sched_fence *s_fence;
- struct drm_sched_job *sched_job;
- struct dma_fence *fence;
- struct drm_sched_job *cleanup_job = NULL;
+ cleanup_job = drm_sched_get_cleanup_job(sched);
+ entity = drm_sched_select_entity(sched);
- wait_event_interruptible(sched->wake_up_worker,
- (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
- (!drm_sched_blocked(sched) &&
- (entity = drm_sched_select_entity(sched))) ||
- kthread_should_stop());
+ if (!entity && !cleanup_job)
+ return; /* No more work */
- if (cleanup_job)
- sched->ops->free_job(cleanup_job);
+ if (cleanup_job)
+ sched->ops->free_job(cleanup_job);
- if (!entity)
- continue;
+ if (entity) {
+ struct dma_fence *fence;
+ struct drm_sched_fence *s_fence;
+ struct drm_sched_job *sched_job;
sched_job = drm_sched_entity_pop_job(entity);
-
if (!sched_job) {
complete_all(&entity->entity_idle);
- continue;
+ if (!cleanup_job)
+ return; /* No more work */
+ goto again;
}
s_fence = sched_job->s_fence;
@@ -1063,7 +1053,9 @@ static int drm_sched_main(void *param)
wake_up(&sched->job_scheduled);
}
- return 0;
+
+again:
+ drm_sched_submit_queue(sched);
}
/**
@@ -1071,6 +1063,8 @@ static int drm_sched_main(void *param)
*
* @sched: scheduler instance
* @ops: backend operations for this scheduler
+ * @submit_wq: workqueue to use for submission. If NULL, an ordered wq is
+ * allocated and used
* @hw_submission: number of hw submissions that can be in flight
* @hang_limit: number of times to allow a job to hang before dropping it
* @timeout: timeout value in jiffies for the scheduler
@@ -1084,14 +1078,25 @@ static int drm_sched_main(void *param)
*/
int drm_sched_init(struct drm_gpu_scheduler *sched,
const struct drm_sched_backend_ops *ops,
+ struct workqueue_struct *submit_wq,
unsigned hw_submission, unsigned hang_limit,
long timeout, struct workqueue_struct *timeout_wq,
atomic_t *score, const char *name, struct device *dev)
{
- int i, ret;
+ int i;
sched->ops = ops;
sched->hw_submission_limit = hw_submission;
sched->name = name;
+ if (!submit_wq) {
+ sched->submit_wq = alloc_ordered_workqueue(name, 0);
+ if (!sched->submit_wq)
+ return -ENOMEM;
+
+ sched->alloc_submit_wq = true;
+ } else {
+ sched->submit_wq = submit_wq;
+ sched->alloc_submit_wq = false;
+ }
sched->timeout = timeout;
sched->timeout_wq = timeout_wq ? : system_wq;
sched->hang_limit = hang_limit;
@@ -1100,23 +1105,15 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_COUNT; i++)
drm_sched_rq_init(sched, &sched->sched_rq[i]);
- init_waitqueue_head(&sched->wake_up_worker);
init_waitqueue_head(&sched->job_scheduled);
INIT_LIST_HEAD(&sched->pending_list);
spin_lock_init(&sched->job_list_lock);
atomic_set(&sched->hw_rq_count, 0);
INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
+ INIT_WORK(&sched->work_submit, drm_sched_main);
atomic_set(&sched->_score, 0);
atomic64_set(&sched->job_id_count, 0);
-
- /* Each scheduler will run on a seperate kernel thread */
- sched->thread = kthread_run(drm_sched_main, sched, sched->name);
- if (IS_ERR(sched->thread)) {
- ret = PTR_ERR(sched->thread);
- sched->thread = NULL;
- DRM_DEV_ERROR(sched->dev, "Failed to create scheduler for %s.\n", name);
- return ret;
- }
+ sched->pause_submit = false;
sched->ready = true;
return 0;
@@ -1135,8 +1132,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
struct drm_sched_entity *s_entity;
int i;
- if (sched->thread)
- kthread_stop(sched->thread);
+ drm_sched_submit_stop(sched);
for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
struct drm_sched_rq *rq = &sched->sched_rq[i];
@@ -1159,6 +1155,8 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
/* Confirm no work left behind accessing device structures */
cancel_delayed_work_sync(&sched->work_tdr);
+ if (sched->alloc_submit_wq)
+ destroy_workqueue(sched->submit_wq);
sched->ready = false;
}
EXPORT_SYMBOL(drm_sched_fini);
@@ -1216,7 +1214,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
*/
bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched)
{
- return !!sched->thread;
+ return sched->ready;
}
EXPORT_SYMBOL(drm_sched_submit_ready);
@@ -1228,7 +1226,8 @@ EXPORT_SYMBOL(drm_sched_submit_ready);
*/
void drm_sched_submit_stop(struct drm_gpu_scheduler *sched)
{
- kthread_park(sched->thread);
+ WRITE_ONCE(sched->pause_submit, true);
+ cancel_work_sync(&sched->work_submit);
}
EXPORT_SYMBOL(drm_sched_submit_stop);
@@ -1239,6 +1238,7 @@ EXPORT_SYMBOL(drm_sched_submit_stop);
*/
void drm_sched_submit_start(struct drm_gpu_scheduler *sched)
{
- kthread_unpark(sched->thread);
+ WRITE_ONCE(sched->pause_submit, false);
+ queue_work(sched->submit_wq, &sched->work_submit);
}
EXPORT_SYMBOL(drm_sched_submit_start);
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 06238e6d7f5c..38e092ea41e6 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -388,7 +388,7 @@ v3d_sched_init(struct v3d_dev *v3d)
int ret;
ret = drm_sched_init(&v3d->queue[V3D_BIN].sched,
- &v3d_bin_sched_ops,
+ &v3d_bin_sched_ops, NULL,
hw_jobs_limit, job_hang_limit,
msecs_to_jiffies(hang_limit_ms), NULL,
NULL, "v3d_bin", v3d->drm.dev);
@@ -396,7 +396,7 @@ v3d_sched_init(struct v3d_dev *v3d)
return ret;
ret = drm_sched_init(&v3d->queue[V3D_RENDER].sched,
- &v3d_render_sched_ops,
+ &v3d_render_sched_ops, NULL,
hw_jobs_limit, job_hang_limit,
msecs_to_jiffies(hang_limit_ms), NULL,
NULL, "v3d_render", v3d->drm.dev);
@@ -404,7 +404,7 @@ v3d_sched_init(struct v3d_dev *v3d)
goto fail;
ret = drm_sched_init(&v3d->queue[V3D_TFU].sched,
- &v3d_tfu_sched_ops,
+ &v3d_tfu_sched_ops, NULL,
hw_jobs_limit, job_hang_limit,
msecs_to_jiffies(hang_limit_ms), NULL,
NULL, "v3d_tfu", v3d->drm.dev);
@@ -413,7 +413,7 @@ v3d_sched_init(struct v3d_dev *v3d)
if (v3d_has_csd(v3d)) {
ret = drm_sched_init(&v3d->queue[V3D_CSD].sched,
- &v3d_csd_sched_ops,
+ &v3d_csd_sched_ops, NULL,
hw_jobs_limit, job_hang_limit,
msecs_to_jiffies(hang_limit_ms), NULL,
NULL, "v3d_csd", v3d->drm.dev);
@@ -421,7 +421,7 @@ v3d_sched_init(struct v3d_dev *v3d)
goto fail;
ret = drm_sched_init(&v3d->queue[V3D_CACHE_CLEAN].sched,
- &v3d_cache_clean_sched_ops,
+ &v3d_cache_clean_sched_ops, NULL,
hw_jobs_limit, job_hang_limit,
msecs_to_jiffies(hang_limit_ms), NULL,
NULL, "v3d_cache_clean", v3d->drm.dev);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index f12c5aea5294..95927c52383c 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -473,17 +473,16 @@ struct drm_sched_backend_ops {
* @timeout: the time after which a job is removed from the scheduler.
* @name: name of the ring for which this scheduler is being used.
* @sched_rq: priority wise array of run queues.
- * @wake_up_worker: the wait queue on which the scheduler sleeps until a job
- * is ready to be scheduled.
* @job_scheduled: once @drm_sched_entity_do_release is called the scheduler
* waits on this wait queue until all the scheduled jobs are
* finished.
* @hw_rq_count: the number of jobs currently in the hardware queue.
* @job_id_count: used to assign unique id to the each job.
+ * @submit_wq: workqueue used to queue @work_submit
* @timeout_wq: workqueue used to queue @work_tdr
+ * @work_submit: schedules jobs and cleans up entities
* @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
* timeout interval is over.
- * @thread: the kthread on which the scheduler which run.
* @pending_list: the list of jobs which are currently in the job queue.
* @job_list_lock: lock to protect the pending_list.
* @hang_limit: once the hangs by a job crosses this limit then it is marked
@@ -492,6 +491,8 @@ struct drm_sched_backend_ops {
* @_score: score used when the driver doesn't provide one
* @ready: marks if the underlying HW is ready to work
* @free_guilty: A hit to time out handler to free the guilty job.
+ * @pause_submit: pause queuing of @work_submit on @submit_wq
+ * @alloc_submit_wq: scheduler own allocation of @submit_wq
* @dev: system &struct device
*
* One scheduler is implemented for each hardware ring.
@@ -502,13 +503,13 @@ struct drm_gpu_scheduler {
long timeout;
const char *name;
struct drm_sched_rq sched_rq[DRM_SCHED_PRIORITY_COUNT];
- wait_queue_head_t wake_up_worker;
wait_queue_head_t job_scheduled;
atomic_t hw_rq_count;
atomic64_t job_id_count;
+ struct workqueue_struct *submit_wq;
struct workqueue_struct *timeout_wq;
+ struct work_struct work_submit;
struct delayed_work work_tdr;
- struct task_struct *thread;
struct list_head pending_list;
spinlock_t job_list_lock;
int hang_limit;
@@ -516,11 +517,14 @@ struct drm_gpu_scheduler {
atomic_t _score;
bool ready;
bool free_guilty;
+ bool pause_submit;
+ bool alloc_submit_wq;
struct device *dev;
};
int drm_sched_init(struct drm_gpu_scheduler *sched,
const struct drm_sched_backend_ops *ops,
+ struct workqueue_struct *submit_wq,
uint32_t hw_submission, unsigned hang_limit,
long timeout, struct workqueue_struct *timeout_wq,
atomic_t *score, const char *name, struct device *dev);
--
2.34.1
* [Intel-xe] [PATCH v4 03/10] drm/sched: Move schedule policy to scheduler
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 01/10] drm/sched: Add drm_sched_submit_* helpers Matthew Brost
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread Matthew Brost
@ 2023-09-19 5:01 ` Matthew Brost
2023-09-24 1:18 ` kernel test robot
2023-09-27 12:13 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 04/10] drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy Matthew Brost
` (9 subsequent siblings)
12 siblings, 2 replies; 45+ messages in thread
From: Matthew Brost @ 2023-09-19 5:01 UTC (permalink / raw)
To: dri-devel, intel-xe
Cc: robdclark, lina, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, luben.tuikov, dakr, donald.robson, daniel,
boris.brezillon, airlied, christian.koenig, faith.ekstrand
Rather than a global modparam for the scheduling policy, move the scheduling
policy into the scheduler so users can control each scheduler's policy.
v2:
- s/DRM_SCHED_POLICY_MAX/DRM_SCHED_POLICY_COUNT (Luben)
- Only include policy in scheduler (Luben)
v3:
- use a ternary operator as opposed to an if-control (Luben)
- s/DRM_SCHED_POLICY_DEFAULT/DRM_SCHED_POLICY_UNSET/ (Luben)
- s/default_drm_sched_policy/drm_sched_policy_default/ (Luben)
- Update commit message (Boris)
- Fix v3d build (CI)
- s/bad_policies/drm_sched_policy_mismatch/ (Luben)
- Don't update modparam doc (Luben)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
drivers/gpu/drm/etnaviv/etnaviv_sched.c | 3 ++-
drivers/gpu/drm/lima/lima_sched.c | 3 ++-
drivers/gpu/drm/msm/msm_ringbuffer.c | 3 ++-
drivers/gpu/drm/nouveau/nouveau_sched.c | 3 ++-
drivers/gpu/drm/panfrost/panfrost_job.c | 3 ++-
drivers/gpu/drm/scheduler/sched_entity.c | 24 ++++++++++++++++++----
drivers/gpu/drm/scheduler/sched_main.c | 19 ++++++++++++-----
drivers/gpu/drm/v3d/v3d_sched.c | 15 +++++++++-----
include/drm/gpu_scheduler.h | 20 ++++++++++++------
10 files changed, 69 insertions(+), 25 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 16f3cfe1574a..d937e0c71486 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2283,6 +2283,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
ring->num_hw_submission, 0,
timeout, adev->reset_domain->wq,
ring->sched_score, ring->name,
+ DRM_SCHED_POLICY_UNSET,
adev->dev);
if (r) {
DRM_ERROR("Failed to create scheduler on ring %s.\n",
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 618a804ddc34..15b0e2f1abe5 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -137,7 +137,8 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
msecs_to_jiffies(500), NULL, NULL,
- dev_name(gpu->dev), gpu->dev);
+ dev_name(gpu->dev), DRM_SCHED_POLICY_UNSET,
+ gpu->dev);
if (ret)
return ret;
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index 8d858aed0e56..50c2075228aa 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -491,7 +491,8 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
lima_job_hang_limit,
msecs_to_jiffies(timeout), NULL,
- NULL, name, pipe->ldev->dev);
+ NULL, name, DRM_SCHED_POLICY_UNSET,
+ pipe->ldev->dev);
}
void lima_sched_pipe_fini(struct lima_sched_pipe *pipe)
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
index b8865e61b40f..a1c8834c359d 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -96,7 +96,8 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
num_hw_submissions, 0, sched_timeout,
- NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
+ NULL, NULL, to_msm_bo(ring->bo)->name,
+ DRM_SCHED_POLICY_UNSET, gpu->dev->dev);
if (ret) {
goto fail;
}
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
index d458c2227d4f..f26a814a9920 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -431,7 +431,8 @@ int nouveau_sched_init(struct nouveau_drm *drm)
return drm_sched_init(sched, &nouveau_sched_ops, NULL,
NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
- NULL, NULL, "nouveau_sched", drm->dev->dev);
+ NULL, NULL, "nouveau_sched",
+ DRM_SCHED_POLICY_UNSET, drm->dev->dev);
}
void nouveau_sched_fini(struct nouveau_drm *drm)
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 326ca1ddf1d7..241e62801586 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -835,7 +835,8 @@ int panfrost_job_init(struct panfrost_device *pfdev)
nentries, 0,
msecs_to_jiffies(JOB_TIMEOUT_MS),
pfdev->reset.wq,
- NULL, "pan_js", pfdev->dev);
+ NULL, "pan_js", DRM_SCHED_POLICY_UNSET,
+ pfdev->dev);
if (ret) {
dev_err(pfdev->dev, "Failed to create scheduler: %d.", ret);
goto err_sched;
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index a42763e1429d..cf42e2265d64 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -33,6 +33,20 @@
#define to_drm_sched_job(sched_job) \
container_of((sched_job), struct drm_sched_job, queue_node)
+static bool drm_sched_policy_mismatch(struct drm_gpu_scheduler **sched_list,
+ unsigned int num_sched_list)
+{
+ enum drm_sched_policy sched_policy = sched_list[0]->sched_policy;
+ unsigned int i;
+
+ /* All scheduling policies must match */
+ for (i = 1; i < num_sched_list; ++i)
+ if (sched_policy != sched_list[i]->sched_policy)
+ return true;
+
+ return false;
+}
+
/**
* drm_sched_entity_init - Init a context entity used by scheduler when
* submit to HW ring.
@@ -62,7 +76,8 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
unsigned int num_sched_list,
atomic_t *guilty)
{
- if (!(entity && sched_list && (num_sched_list == 0 || sched_list[0])))
+ if (!(entity && sched_list && (num_sched_list == 0 || sched_list[0])) ||
+ drm_sched_policy_mismatch(sched_list, num_sched_list))
return -EINVAL;
memset(entity, 0, sizeof(struct drm_sched_entity));
@@ -486,7 +501,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
* Update the entity's location in the min heap according to
* the timestamp of the next job, if any.
*/
- if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) {
+ if (entity->rq->sched->sched_policy == DRM_SCHED_POLICY_FIFO) {
struct drm_sched_job *next;
next = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
@@ -558,7 +573,8 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
{
struct drm_sched_entity *entity = sched_job->entity;
- bool first;
+ bool first, fifo = entity->rq->sched->sched_policy ==
+ DRM_SCHED_POLICY_FIFO;
ktime_t submit_ts;
trace_drm_sched_job(sched_job, entity);
@@ -587,7 +603,7 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
drm_sched_rq_add_entity(entity->rq, entity);
spin_unlock(&entity->rq_lock);
- if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
+ if (fifo)
drm_sched_rq_update_fifo(entity, submit_ts);
drm_sched_wakeup_if_can_queue(entity->rq->sched);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index ee6281942e36..f645f32977ed 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -66,14 +66,14 @@
#define to_drm_sched_job(sched_job) \
container_of((sched_job), struct drm_sched_job, queue_node)
-int drm_sched_policy = DRM_SCHED_POLICY_FIFO;
+int drm_sched_policy_default = DRM_SCHED_POLICY_FIFO;
/**
* DOC: sched_policy (int)
* Used to override default entities scheduling policy in a run queue.
*/
MODULE_PARM_DESC(sched_policy, "Specify the scheduling policy for entities on a run-queue, " __stringify(DRM_SCHED_POLICY_RR) " = Round Robin, " __stringify(DRM_SCHED_POLICY_FIFO) " = FIFO (default).");
-module_param_named(sched_policy, drm_sched_policy, int, 0444);
+module_param_named(sched_policy, drm_sched_policy_default, int, 0444);
static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a,
const struct rb_node *b)
@@ -177,7 +177,7 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
if (rq->current_entity == entity)
rq->current_entity = NULL;
- if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
+ if (rq->sched->sched_policy == DRM_SCHED_POLICY_FIFO)
drm_sched_rq_remove_fifo_locked(entity);
spin_unlock(&rq->lock);
@@ -898,7 +898,7 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
/* Kernel run queue has higher priority than normal run queue*/
for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
- entity = drm_sched_policy == DRM_SCHED_POLICY_FIFO ?
+ entity = sched->sched_policy == DRM_SCHED_POLICY_FIFO ?
drm_sched_rq_select_entity_fifo(&sched->sched_rq[i]) :
drm_sched_rq_select_entity_rr(&sched->sched_rq[i]);
if (entity)
@@ -1072,6 +1072,7 @@ static void drm_sched_main(struct work_struct *w)
* used
* @score: optional score atomic shared with other schedulers
* @name: name used for debugging
+ * @sched_policy: scheduling policy
* @dev: target &struct device
*
* Return 0 on success, otherwise error code.
@@ -1081,9 +1082,15 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
struct workqueue_struct *submit_wq,
unsigned hw_submission, unsigned hang_limit,
long timeout, struct workqueue_struct *timeout_wq,
- atomic_t *score, const char *name, struct device *dev)
+ atomic_t *score, const char *name,
+ enum drm_sched_policy sched_policy,
+ struct device *dev)
{
int i;
+
+ if (sched_policy >= DRM_SCHED_POLICY_COUNT)
+ return -EINVAL;
+
sched->ops = ops;
sched->hw_submission_limit = hw_submission;
sched->name = name;
@@ -1102,6 +1109,8 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
sched->hang_limit = hang_limit;
sched->score = score ? score : &sched->_score;
sched->dev = dev;
+ sched->sched_policy = sched_policy == DRM_SCHED_POLICY_UNSET ?
+ drm_sched_policy_default : sched_policy;
for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_COUNT; i++)
drm_sched_rq_init(sched, &sched->sched_rq[i]);
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 38e092ea41e6..dec89c5b8cb1 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -391,7 +391,8 @@ v3d_sched_init(struct v3d_dev *v3d)
&v3d_bin_sched_ops, NULL,
hw_jobs_limit, job_hang_limit,
msecs_to_jiffies(hang_limit_ms), NULL,
- NULL, "v3d_bin", v3d->drm.dev);
+ NULL, "v3d_bin", DRM_SCHED_POLICY_UNSET,
+ v3d->drm.dev);
if (ret)
return ret;
@@ -399,7 +400,8 @@ v3d_sched_init(struct v3d_dev *v3d)
&v3d_render_sched_ops, NULL,
hw_jobs_limit, job_hang_limit,
msecs_to_jiffies(hang_limit_ms), NULL,
- NULL, "v3d_render", v3d->drm.dev);
+ NULL, "v3d_render", DRM_SCHED_POLICY_UNSET,
+ v3d->drm.dev);
if (ret)
goto fail;
@@ -407,7 +409,8 @@ v3d_sched_init(struct v3d_dev *v3d)
&v3d_tfu_sched_ops, NULL,
hw_jobs_limit, job_hang_limit,
msecs_to_jiffies(hang_limit_ms), NULL,
- NULL, "v3d_tfu", v3d->drm.dev);
+ NULL, "v3d_tfu", DRM_SCHED_POLICY_UNSET,
+ v3d->drm.dev);
if (ret)
goto fail;
@@ -416,7 +419,8 @@ v3d_sched_init(struct v3d_dev *v3d)
&v3d_csd_sched_ops, NULL,
hw_jobs_limit, job_hang_limit,
msecs_to_jiffies(hang_limit_ms), NULL,
- NULL, "v3d_csd", v3d->drm.dev);
+ NULL, "v3d_csd", DRM_SCHED_POLICY_UNSET,
+ v3d->drm.dev);
if (ret)
goto fail;
@@ -424,7 +428,8 @@ v3d_sched_init(struct v3d_dev *v3d)
&v3d_cache_clean_sched_ops, NULL,
hw_jobs_limit, job_hang_limit,
msecs_to_jiffies(hang_limit_ms), NULL,
- NULL, "v3d_cache_clean", v3d->drm.dev);
+ NULL, "v3d_cache_clean",
+ DRM_SCHED_POLICY_UNSET, v3d->drm.dev);
if (ret)
goto fail;
}
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 95927c52383c..9f830ff84bad 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -72,11 +72,15 @@ enum drm_sched_priority {
DRM_SCHED_PRIORITY_UNSET = -2
};
-/* Used to chose between FIFO and RR jobs scheduling */
-extern int drm_sched_policy;
-
-#define DRM_SCHED_POLICY_RR 0
-#define DRM_SCHED_POLICY_FIFO 1
+/* Used to choose the default scheduling policy */
+extern int drm_sched_policy_default;
+
+enum drm_sched_policy {
+ DRM_SCHED_POLICY_UNSET,
+ DRM_SCHED_POLICY_RR,
+ DRM_SCHED_POLICY_FIFO,
+ DRM_SCHED_POLICY_COUNT,
+};
/**
* struct drm_sched_entity - A wrapper around a job queue (typically
@@ -489,6 +493,7 @@ struct drm_sched_backend_ops {
* guilty and it will no longer be considered for scheduling.
* @score: score to help loadbalancer pick a idle sched
* @_score: score used when the driver doesn't provide one
+ * @sched_policy: Scheduling policy for the scheduler
* @ready: marks if the underlying HW is ready to work
* @free_guilty: A hit to time out handler to free the guilty job.
* @pause_submit: pause queuing of @work_submit on @submit_wq
@@ -515,6 +520,7 @@ struct drm_gpu_scheduler {
int hang_limit;
atomic_t *score;
atomic_t _score;
+ enum drm_sched_policy sched_policy;
bool ready;
bool free_guilty;
bool pause_submit;
@@ -527,7 +533,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
struct workqueue_struct *submit_wq,
uint32_t hw_submission, unsigned hang_limit,
long timeout, struct workqueue_struct *timeout_wq,
- atomic_t *score, const char *name, struct device *dev);
+ atomic_t *score, const char *name,
+ enum drm_sched_policy sched_policy,
+ struct device *dev);
void drm_sched_fini(struct drm_gpu_scheduler *sched);
int drm_sched_job_init(struct drm_sched_job *job,
--
2.34.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* [Intel-xe] [PATCH v4 04/10] drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
` (2 preceding siblings ...)
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 03/10] drm/sched: Move schedule policy to scheduler Matthew Brost
@ 2023-09-19 5:01 ` Matthew Brost
2023-09-27 14:36 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 05/10] drm/sched: Split free_job into own work item Matthew Brost
` (8 subsequent siblings)
12 siblings, 1 reply; 45+ messages in thread
From: Matthew Brost @ 2023-09-19 5:01 UTC (permalink / raw)
To: dri-devel, intel-xe
Cc: robdclark, lina, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, luben.tuikov, dakr, donald.robson, daniel,
boris.brezillon, airlied, christian.koenig, faith.ekstrand
DRM_SCHED_POLICY_SINGLE_ENTITY creates a 1:1 relationship between the
scheduler and the entity. No priorities or run queues are used in this mode,
which is intended for devices with firmware schedulers.
v2:
- Drop sched / rq union (Luben)
v3:
- Don't pick entity if stopped in drm_sched_select_entity (Danilo)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/scheduler/sched_entity.c | 69 ++++++++++++++++++------
drivers/gpu/drm/scheduler/sched_fence.c | 2 +-
drivers/gpu/drm/scheduler/sched_main.c | 64 +++++++++++++++++++---
include/drm/gpu_scheduler.h | 8 +++
4 files changed, 120 insertions(+), 23 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index cf42e2265d64..437c50867c99 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -83,6 +83,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
memset(entity, 0, sizeof(struct drm_sched_entity));
INIT_LIST_HEAD(&entity->list);
entity->rq = NULL;
+ entity->single_sched = NULL;
entity->guilty = guilty;
entity->num_sched_list = num_sched_list;
entity->priority = priority;
@@ -90,8 +91,17 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
RCU_INIT_POINTER(entity->last_scheduled, NULL);
RB_CLEAR_NODE(&entity->rb_tree_node);
- if(num_sched_list)
- entity->rq = &sched_list[0]->sched_rq[entity->priority];
+ if (num_sched_list) {
+ if (sched_list[0]->sched_policy !=
+ DRM_SCHED_POLICY_SINGLE_ENTITY) {
+ entity->rq = &sched_list[0]->sched_rq[entity->priority];
+ } else {
+ if (num_sched_list != 1 || sched_list[0]->single_entity)
+ return -EINVAL;
+ sched_list[0]->single_entity = entity;
+ entity->single_sched = sched_list[0];
+ }
+ }
init_completion(&entity->entity_idle);
@@ -124,7 +134,8 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
struct drm_gpu_scheduler **sched_list,
unsigned int num_sched_list)
{
- WARN_ON(!num_sched_list || !sched_list);
+ WARN_ON(!num_sched_list || !sched_list ||
+ !!entity->single_sched);
entity->sched_list = sched_list;
entity->num_sched_list = num_sched_list;
@@ -231,13 +242,15 @@ static void drm_sched_entity_kill(struct drm_sched_entity *entity)
{
struct drm_sched_job *job;
struct dma_fence *prev;
+ bool single_entity = !!entity->single_sched;
- if (!entity->rq)
+ if (!entity->rq && !single_entity)
return;
spin_lock(&entity->rq_lock);
entity->stopped = true;
- drm_sched_rq_remove_entity(entity->rq, entity);
+ if (!single_entity)
+ drm_sched_rq_remove_entity(entity->rq, entity);
spin_unlock(&entity->rq_lock);
/* Make sure this entity is not used by the scheduler at the moment */
@@ -259,6 +272,20 @@ static void drm_sched_entity_kill(struct drm_sched_entity *entity)
dma_fence_put(prev);
}
+/**
+ * drm_sched_entity_to_scheduler - Schedule entity to GPU scheduler
+ * @entity: scheduler entity
+ *
+ * Returns GPU scheduler for the entity
+ */
+struct drm_gpu_scheduler *
+drm_sched_entity_to_scheduler(struct drm_sched_entity *entity)
+{
+ bool single_entity = !!entity->single_sched;
+
+ return single_entity ? entity->single_sched : entity->rq->sched;
+}
+
/**
* drm_sched_entity_flush - Flush a context entity
*
@@ -276,11 +303,12 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
struct drm_gpu_scheduler *sched;
struct task_struct *last_user;
long ret = timeout;
+ bool single_entity = !!entity->single_sched;
- if (!entity->rq)
+ if (!entity->rq && !single_entity)
return 0;
- sched = entity->rq->sched;
+ sched = drm_sched_entity_to_scheduler(entity);
/**
* The client will not queue more IBs during this fini, consume existing
* queued IBs or discard them on SIGKILL
@@ -373,7 +401,7 @@ static void drm_sched_entity_wakeup(struct dma_fence *f,
container_of(cb, struct drm_sched_entity, cb);
drm_sched_entity_clear_dep(f, cb);
- drm_sched_wakeup_if_can_queue(entity->rq->sched);
+ drm_sched_wakeup_if_can_queue(drm_sched_entity_to_scheduler(entity));
}
/**
@@ -387,6 +415,8 @@ static void drm_sched_entity_wakeup(struct dma_fence *f,
void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
enum drm_sched_priority priority)
{
+ WARN_ON(!!entity->single_sched);
+
spin_lock(&entity->rq_lock);
entity->priority = priority;
spin_unlock(&entity->rq_lock);
@@ -399,7 +429,7 @@ EXPORT_SYMBOL(drm_sched_entity_set_priority);
*/
static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
{
- struct drm_gpu_scheduler *sched = entity->rq->sched;
+ struct drm_gpu_scheduler *sched = drm_sched_entity_to_scheduler(entity);
struct dma_fence *fence = entity->dependency;
struct drm_sched_fence *s_fence;
@@ -501,7 +531,8 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
* Update the entity's location in the min heap according to
* the timestamp of the next job, if any.
*/
- if (entity->rq->sched->sched_policy == DRM_SCHED_POLICY_FIFO) {
+ if (drm_sched_entity_to_scheduler(entity)->sched_policy ==
+ DRM_SCHED_POLICY_FIFO) {
struct drm_sched_job *next;
next = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
@@ -524,6 +555,8 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
struct drm_gpu_scheduler *sched;
struct drm_sched_rq *rq;
+ WARN_ON(!!entity->single_sched);
+
/* single possible engine and already selected */
if (!entity->sched_list)
return;
@@ -573,12 +606,13 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
{
struct drm_sched_entity *entity = sched_job->entity;
- bool first, fifo = entity->rq->sched->sched_policy ==
- DRM_SCHED_POLICY_FIFO;
+ bool single_entity = !!entity->single_sched;
+ bool first;
ktime_t submit_ts;
trace_drm_sched_job(sched_job, entity);
- atomic_inc(entity->rq->sched->score);
+ if (!single_entity)
+ atomic_inc(entity->rq->sched->score);
WRITE_ONCE(entity->last_user, current->group_leader);
/*
@@ -591,6 +625,10 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
/* first job wakes up scheduler */
if (first) {
+ struct drm_gpu_scheduler *sched =
+ drm_sched_entity_to_scheduler(entity);
+ bool fifo = sched->sched_policy == DRM_SCHED_POLICY_FIFO;
+
/* Add the entity to the run queue */
spin_lock(&entity->rq_lock);
if (entity->stopped) {
@@ -600,13 +638,14 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
return;
}
- drm_sched_rq_add_entity(entity->rq, entity);
+ if (!single_entity)
+ drm_sched_rq_add_entity(entity->rq, entity);
spin_unlock(&entity->rq_lock);
if (fifo)
drm_sched_rq_update_fifo(entity, submit_ts);
- drm_sched_wakeup_if_can_queue(entity->rq->sched);
+ drm_sched_wakeup_if_can_queue(sched);
}
}
EXPORT_SYMBOL(drm_sched_entity_push_job);
diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
index 06cedfe4b486..f6b926f5e188 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -225,7 +225,7 @@ void drm_sched_fence_init(struct drm_sched_fence *fence,
{
unsigned seq;
- fence->sched = entity->rq->sched;
+ fence->sched = drm_sched_entity_to_scheduler(entity);
seq = atomic_inc_return(&entity->fence_seq);
dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
&fence->lock, entity->fence_context, seq);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index f645f32977ed..588c735f7498 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -32,7 +32,8 @@
* backend operations to the scheduler like submitting a job to hardware run queue,
* returning the dependencies of a job etc.
*
- * The organisation of the scheduler is the following:
+ * The organisation of the scheduler is the following for scheduling policies
+ * DRM_SCHED_POLICY_RR and DRM_SCHED_POLICY_FIFO:
*
* 1. Each hw run queue has one scheduler
* 2. Each scheduler has multiple run queues with different priorities
@@ -43,6 +44,23 @@
*
* The jobs in a entity are always scheduled in the order that they were pushed.
*
+ * The organisation of the scheduler is the following for scheduling policy
+ * DRM_SCHED_POLICY_SINGLE_ENTITY:
+ *
+ * 1. One to one relationship between scheduler and entity
+ * 2. No priorities implemented per scheduler (single job queue)
+ * 3. No run queues in the scheduler; jobs are dequeued directly from the entity
+ * 4. The entity maintains a queue of jobs that will be scheduled on the
+ * hardware
+ *
+ * The jobs in an entity are always scheduled in the order that they were pushed
+ * regardless of scheduling policy.
+ *
+ * A policy of DRM_SCHED_POLICY_RR or DRM_SCHED_POLICY_FIFO is expected to be used
+ * when the KMD is scheduling directly on the hardware while a scheduling policy
+ * of DRM_SCHED_POLICY_SINGLE_ENTITY is expected to be used when there is a
+ * firmware scheduler.
+ *
* Note that once a job was taken from the entities queue and pushed to the
* hardware, i.e. the pending queue, the entity must not be referenced anymore
* through the jobs entity pointer.
@@ -96,6 +114,8 @@ static inline void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *enti
void drm_sched_rq_update_fifo(struct drm_sched_entity *entity, ktime_t ts)
{
+ WARN_ON(!!entity->single_sched);
+
/*
* Both locks need to be grabbed, one to protect from entity->rq change
* for entity from within concurrent drm_sched_entity_select_rq and the
@@ -126,6 +146,8 @@ void drm_sched_rq_update_fifo(struct drm_sched_entity *entity, ktime_t ts)
static void drm_sched_rq_init(struct drm_gpu_scheduler *sched,
struct drm_sched_rq *rq)
{
+ WARN_ON(sched->sched_policy == DRM_SCHED_POLICY_SINGLE_ENTITY);
+
spin_lock_init(&rq->lock);
INIT_LIST_HEAD(&rq->entities);
rq->rb_tree_root = RB_ROOT_CACHED;
@@ -144,6 +166,8 @@ static void drm_sched_rq_init(struct drm_gpu_scheduler *sched,
void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
struct drm_sched_entity *entity)
{
+ WARN_ON(!!entity->single_sched);
+
if (!list_empty(&entity->list))
return;
@@ -166,6 +190,8 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
struct drm_sched_entity *entity)
{
+ WARN_ON(!!entity->single_sched);
+
if (list_empty(&entity->list))
return;
@@ -641,7 +667,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
struct drm_sched_entity *entity,
void *owner)
{
- if (!entity->rq)
+ if (!entity->rq && !entity->single_sched)
return -ENOENT;
job->entity = entity;
@@ -674,13 +700,16 @@ void drm_sched_job_arm(struct drm_sched_job *job)
{
struct drm_gpu_scheduler *sched;
struct drm_sched_entity *entity = job->entity;
+ bool single_entity = !!entity->single_sched;
BUG_ON(!entity);
- drm_sched_entity_select_rq(entity);
- sched = entity->rq->sched;
+ if (!single_entity)
+ drm_sched_entity_select_rq(entity);
+ sched = drm_sched_entity_to_scheduler(entity);
job->sched = sched;
- job->s_priority = entity->rq - sched->sched_rq;
+ if (!single_entity)
+ job->s_priority = entity->rq - sched->sched_rq;
job->id = atomic64_inc_return(&sched->job_id_count);
drm_sched_fence_init(job->s_fence, job->entity);
@@ -896,6 +925,14 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
if (!drm_sched_can_queue(sched))
return NULL;
+ if (sched->single_entity) {
+ if (!READ_ONCE(sched->single_entity->stopped) &&
+ drm_sched_entity_is_ready(sched->single_entity))
+ return sched->single_entity;
+
+ return NULL;
+ }
+
/* Kernel run queue has higher priority than normal run queue*/
for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
entity = sched->sched_policy == DRM_SCHED_POLICY_FIFO ?
@@ -1092,6 +1129,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
return -EINVAL;
sched->ops = ops;
+ sched->single_entity = NULL;
sched->hw_submission_limit = hw_submission;
sched->name = name;
if (!submit_wq) {
@@ -1111,7 +1149,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
sched->dev = dev;
sched->sched_policy = sched_policy == DRM_SCHED_POLICY_UNSET ?
drm_sched_policy_default : sched_policy;
- for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_COUNT; i++)
+ for (i = DRM_SCHED_PRIORITY_MIN; sched_policy !=
+ DRM_SCHED_POLICY_SINGLE_ENTITY && i < DRM_SCHED_PRIORITY_COUNT;
+ i++)
drm_sched_rq_init(sched, &sched->sched_rq[i]);
init_waitqueue_head(&sched->job_scheduled);
@@ -1143,7 +1183,15 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
drm_sched_submit_stop(sched);
- for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
+ if (sched->single_entity) {
+ spin_lock(&sched->single_entity->rq_lock);
+ sched->single_entity->stopped = true;
+ spin_unlock(&sched->single_entity->rq_lock);
+ }
+
+ for (i = DRM_SCHED_PRIORITY_COUNT - 1; sched->sched_policy !=
+ DRM_SCHED_POLICY_SINGLE_ENTITY && i >= DRM_SCHED_PRIORITY_MIN;
+ i--) {
struct drm_sched_rq *rq = &sched->sched_rq[i];
spin_lock(&rq->lock);
@@ -1186,6 +1234,8 @@ void drm_sched_increase_karma(struct drm_sched_job *bad)
struct drm_sched_entity *entity;
struct drm_gpu_scheduler *sched = bad->sched;
+ WARN_ON(sched->sched_policy == DRM_SCHED_POLICY_SINGLE_ENTITY);
+
/* don't change @bad's karma if it's from KERNEL RQ,
* because sometimes GPU hang would cause kernel jobs (like VM updating jobs)
* corrupt but keep in mind that kernel jobs always considered good.
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 9f830ff84bad..655675f797ea 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -79,6 +79,7 @@ enum drm_sched_policy {
DRM_SCHED_POLICY_UNSET,
DRM_SCHED_POLICY_RR,
DRM_SCHED_POLICY_FIFO,
+ DRM_SCHED_POLICY_SINGLE_ENTITY,
DRM_SCHED_POLICY_COUNT,
};
@@ -112,6 +113,9 @@ struct drm_sched_entity {
*/
struct drm_sched_rq *rq;
+ /** @single_sched: Single scheduler */
+ struct drm_gpu_scheduler *single_sched;
+
/**
* @sched_list:
*
@@ -473,6 +477,7 @@ struct drm_sched_backend_ops {
* struct drm_gpu_scheduler - scheduler instance-specific data
*
* @ops: backend operations provided by the driver.
+ * @single_entity: Single entity for the scheduler
* @hw_submission_limit: the max size of the hardware queue.
* @timeout: the time after which a job is removed from the scheduler.
* @name: name of the ring for which this scheduler is being used.
@@ -504,6 +509,7 @@ struct drm_sched_backend_ops {
*/
struct drm_gpu_scheduler {
const struct drm_sched_backend_ops *ops;
+ struct drm_sched_entity *single_entity;
uint32_t hw_submission_limit;
long timeout;
const char *name;
@@ -587,6 +593,8 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
struct drm_gpu_scheduler **sched_list,
unsigned int num_sched_list,
atomic_t *guilty);
+struct drm_gpu_scheduler *
+drm_sched_entity_to_scheduler(struct drm_sched_entity *entity);
long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout);
void drm_sched_entity_fini(struct drm_sched_entity *entity);
void drm_sched_entity_destroy(struct drm_sched_entity *entity);
--
2.34.1
* [Intel-xe] [PATCH v4 05/10] drm/sched: Split free_job into own work item
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
` (3 preceding siblings ...)
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 04/10] drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy Matthew Brost
@ 2023-09-19 5:01 ` Matthew Brost
2023-09-28 16:14 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 06/10] drm/sched: Add drm_sched_start_timeout_unlocked helper Matthew Brost
` (7 subsequent siblings)
12 siblings, 1 reply; 45+ messages in thread
From: Matthew Brost @ 2023-09-19 5:01 UTC (permalink / raw)
To: dri-devel, intel-xe
Cc: robdclark, lina, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, luben.tuikov, dakr, donald.robson, daniel,
boris.brezillon, airlied, christian.koenig, faith.ekstrand
Rather than calling free_job and run_job in the same work item, have a
dedicated work item for each. This aligns with the design and intended use
of work queues.
v2:
- Test for DMA_FENCE_FLAG_TIMESTAMP_BIT before setting
timestamp in free_job() work item (Danilo)
v3:
- Drop forward dec of drm_sched_select_entity (Boris)
- Return in drm_sched_run_job_work if entity NULL (Boris)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/scheduler/sched_main.c | 290 +++++++++++++++----------
include/drm/gpu_scheduler.h | 8 +-
2 files changed, 182 insertions(+), 116 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 588c735f7498..1e21d234fb5c 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -213,11 +213,12 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
* drm_sched_rq_select_entity_rr - Select an entity which could provide a job to run
*
* @rq: scheduler run queue to check.
+ * @dequeue: dequeue selected entity
*
* Try to find a ready entity, returns NULL if none found.
*/
static struct drm_sched_entity *
-drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
+drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq, bool dequeue)
{
struct drm_sched_entity *entity;
@@ -227,8 +228,10 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
if (entity) {
list_for_each_entry_continue(entity, &rq->entities, list) {
if (drm_sched_entity_is_ready(entity)) {
- rq->current_entity = entity;
- reinit_completion(&entity->entity_idle);
+ if (dequeue) {
+ rq->current_entity = entity;
+ reinit_completion(&entity->entity_idle);
+ }
spin_unlock(&rq->lock);
return entity;
}
@@ -238,8 +241,10 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
list_for_each_entry(entity, &rq->entities, list) {
if (drm_sched_entity_is_ready(entity)) {
- rq->current_entity = entity;
- reinit_completion(&entity->entity_idle);
+ if (dequeue) {
+ rq->current_entity = entity;
+ reinit_completion(&entity->entity_idle);
+ }
spin_unlock(&rq->lock);
return entity;
}
@@ -257,11 +262,12 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
* drm_sched_rq_select_entity_fifo - Select an entity which provides a job to run
*
* @rq: scheduler run queue to check.
+ * @dequeue: dequeue selected entity
*
* Find oldest waiting ready entity, returns NULL if none found.
*/
static struct drm_sched_entity *
-drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
+drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq, bool dequeue)
{
struct rb_node *rb;
@@ -271,8 +277,10 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
entity = rb_entry(rb, struct drm_sched_entity, rb_tree_node);
if (drm_sched_entity_is_ready(entity)) {
- rq->current_entity = entity;
- reinit_completion(&entity->entity_idle);
+ if (dequeue) {
+ rq->current_entity = entity;
+ reinit_completion(&entity->entity_idle);
+ }
break;
}
}
@@ -282,13 +290,102 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
}
/**
- * drm_sched_submit_queue - scheduler queue submission
+ * drm_sched_run_job_queue - queue job submission
+ * @sched: scheduler instance
+ */
+static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
+{
+ if (!READ_ONCE(sched->pause_submit))
+ queue_work(sched->submit_wq, &sched->work_run_job);
+}
+
+/**
+ * drm_sched_can_queue -- Can we queue more to the hardware?
+ * @sched: scheduler instance
+ *
+ * Return true if we can push more jobs to the hw, otherwise false.
+ */
+static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
+{
+ return atomic_read(&sched->hw_rq_count) <
+ sched->hw_submission_limit;
+}
+
+/**
+ * drm_sched_select_entity - Select next entity to process
+ *
+ * @sched: scheduler instance
+ * @dequeue: dequeue selected entity
+ *
+ * Returns the entity to process or NULL if none are found.
+ */
+static struct drm_sched_entity *
+drm_sched_select_entity(struct drm_gpu_scheduler *sched, bool dequeue)
+{
+ struct drm_sched_entity *entity;
+ int i;
+
+ if (!drm_sched_can_queue(sched))
+ return NULL;
+
+ if (sched->single_entity) {
+ if (!READ_ONCE(sched->single_entity->stopped) &&
+ drm_sched_entity_is_ready(sched->single_entity))
+ return sched->single_entity;
+
+ return NULL;
+ }
+
+ /* Kernel run queue has higher priority than normal run queue*/
+ for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
+ entity = sched->sched_policy == DRM_SCHED_POLICY_FIFO ?
+ drm_sched_rq_select_entity_fifo(&sched->sched_rq[i],
+ dequeue) :
+ drm_sched_rq_select_entity_rr(&sched->sched_rq[i],
+ dequeue);
+ if (entity)
+ break;
+ }
+
+ return entity;
+}
+
+/**
+ * drm_sched_run_job_queue_if_ready - queue job submission if ready
* @sched: scheduler instance
*/
-static void drm_sched_submit_queue(struct drm_gpu_scheduler *sched)
+static void drm_sched_run_job_queue_if_ready(struct drm_gpu_scheduler *sched)
+{
+ if (drm_sched_select_entity(sched, false))
+ drm_sched_run_job_queue(sched);
+}
+
+/**
+ * drm_sched_free_job_queue - queue free job
+ *
+ * @sched: scheduler instance to queue free job
+ */
+static void drm_sched_free_job_queue(struct drm_gpu_scheduler *sched)
{
if (!READ_ONCE(sched->pause_submit))
- queue_work(sched->submit_wq, &sched->work_submit);
+ queue_work(sched->submit_wq, &sched->work_free_job);
+}
+
+/**
+ * drm_sched_free_job_queue_if_ready - queue free job if ready
+ *
+ * @sched: scheduler instance to queue free job
+ */
+static void drm_sched_free_job_queue_if_ready(struct drm_gpu_scheduler *sched)
+{
+ struct drm_sched_job *job;
+
+ spin_lock(&sched->job_list_lock);
+ job = list_first_entry_or_null(&sched->pending_list,
+ struct drm_sched_job, list);
+ if (job && dma_fence_is_signaled(&job->s_fence->finished))
+ drm_sched_free_job_queue(sched);
+ spin_unlock(&sched->job_list_lock);
}
/**
@@ -310,7 +407,7 @@ static void drm_sched_job_done(struct drm_sched_job *s_job, int result)
dma_fence_get(&s_fence->finished);
drm_sched_fence_finished(s_fence, result);
dma_fence_put(&s_fence->finished);
- drm_sched_submit_queue(sched);
+ drm_sched_free_job_queue(sched);
}
/**
@@ -885,18 +982,6 @@ void drm_sched_job_cleanup(struct drm_sched_job *job)
}
EXPORT_SYMBOL(drm_sched_job_cleanup);
-/**
- * drm_sched_can_queue -- Can we queue more to the hardware?
- * @sched: scheduler instance
- *
- * Return true if we can push more jobs to the hw, otherwise false.
- */
-static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
-{
- return atomic_read(&sched->hw_rq_count) <
- sched->hw_submission_limit;
-}
-
/**
* drm_sched_wakeup_if_can_queue - Wake up the scheduler
* @sched: scheduler instance
@@ -906,43 +991,7 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched)
{
if (drm_sched_can_queue(sched))
- drm_sched_submit_queue(sched);
-}
-
-/**
- * drm_sched_select_entity - Select next entity to process
- *
- * @sched: scheduler instance
- *
- * Returns the entity to process or NULL if none are found.
- */
-static struct drm_sched_entity *
-drm_sched_select_entity(struct drm_gpu_scheduler *sched)
-{
- struct drm_sched_entity *entity;
- int i;
-
- if (!drm_sched_can_queue(sched))
- return NULL;
-
- if (sched->single_entity) {
- if (!READ_ONCE(sched->single_entity->stopped) &&
- drm_sched_entity_is_ready(sched->single_entity))
- return sched->single_entity;
-
- return NULL;
- }
-
- /* Kernel run queue has higher priority than normal run queue*/
- for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
- entity = sched->sched_policy == DRM_SCHED_POLICY_FIFO ?
- drm_sched_rq_select_entity_fifo(&sched->sched_rq[i]) :
- drm_sched_rq_select_entity_rr(&sched->sched_rq[i]);
- if (entity)
- break;
- }
-
- return entity;
+ drm_sched_run_job_queue(sched);
}
/**
@@ -974,8 +1023,10 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
typeof(*next), list);
if (next) {
- next->s_fence->scheduled.timestamp =
- job->s_fence->finished.timestamp;
+ if (test_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT,
+ &next->s_fence->scheduled.flags))
+ next->s_fence->scheduled.timestamp =
+ job->s_fence->finished.timestamp;
/* start TO timer for next job */
drm_sched_start_timeout(sched);
}
@@ -1025,74 +1076,84 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
EXPORT_SYMBOL(drm_sched_pick_best);
/**
- * drm_sched_main - main scheduler thread
+ * drm_sched_free_job_work - worker to call free_job
*
- * @param: scheduler instance
+ * @w: free job work
*/
-static void drm_sched_main(struct work_struct *w)
+static void drm_sched_free_job_work(struct work_struct *w)
{
struct drm_gpu_scheduler *sched =
- container_of(w, struct drm_gpu_scheduler, work_submit);
- struct drm_sched_entity *entity;
+ container_of(w, struct drm_gpu_scheduler, work_free_job);
struct drm_sched_job *cleanup_job;
- int r;
if (READ_ONCE(sched->pause_submit))
return;
cleanup_job = drm_sched_get_cleanup_job(sched);
- entity = drm_sched_select_entity(sched);
-
- if (!entity && !cleanup_job)
- return; /* No more work */
-
- if (cleanup_job)
+ if (cleanup_job) {
sched->ops->free_job(cleanup_job);
- if (entity) {
- struct dma_fence *fence;
- struct drm_sched_fence *s_fence;
- struct drm_sched_job *sched_job;
-
- sched_job = drm_sched_entity_pop_job(entity);
- if (!sched_job) {
- complete_all(&entity->entity_idle);
- if (!cleanup_job)
- return; /* No more work */
- goto again;
- }
+ drm_sched_free_job_queue_if_ready(sched);
+ drm_sched_run_job_queue_if_ready(sched);
+ }
+}
+
+/**
+ * drm_sched_run_job_work - worker to call run_job
+ *
+ * @w: run job work
+ */
+static void drm_sched_run_job_work(struct work_struct *w)
+{
+ struct drm_gpu_scheduler *sched =
+ container_of(w, struct drm_gpu_scheduler, work_run_job);
+ struct drm_sched_entity *entity;
+ struct dma_fence *fence;
+ struct drm_sched_fence *s_fence;
+ struct drm_sched_job *sched_job;
+ int r;
- s_fence = sched_job->s_fence;
+ if (READ_ONCE(sched->pause_submit))
+ return;
- atomic_inc(&sched->hw_rq_count);
- drm_sched_job_begin(sched_job);
+ entity = drm_sched_select_entity(sched, true);
+ if (!entity)
+ return;
- trace_drm_run_job(sched_job, entity);
- fence = sched->ops->run_job(sched_job);
+ sched_job = drm_sched_entity_pop_job(entity);
+ if (!sched_job) {
complete_all(&entity->entity_idle);
- drm_sched_fence_scheduled(s_fence, fence);
+ return; /* No more work */
+ }
- if (!IS_ERR_OR_NULL(fence)) {
- /* Drop for original kref_init of the fence */
- dma_fence_put(fence);
+ s_fence = sched_job->s_fence;
- r = dma_fence_add_callback(fence, &sched_job->cb,
- drm_sched_job_done_cb);
- if (r == -ENOENT)
- drm_sched_job_done(sched_job, fence->error);
- else if (r)
- DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n",
- r);
- } else {
- drm_sched_job_done(sched_job, IS_ERR(fence) ?
- PTR_ERR(fence) : 0);
- }
+ atomic_inc(&sched->hw_rq_count);
+ drm_sched_job_begin(sched_job);
+
+ trace_drm_run_job(sched_job, entity);
+ fence = sched->ops->run_job(sched_job);
+ complete_all(&entity->entity_idle);
+ drm_sched_fence_scheduled(s_fence, fence);
- wake_up(&sched->job_scheduled);
+ if (!IS_ERR_OR_NULL(fence)) {
+ /* Drop for original kref_init of the fence */
+ dma_fence_put(fence);
+
+ r = dma_fence_add_callback(fence, &sched_job->cb,
+ drm_sched_job_done_cb);
+ if (r == -ENOENT)
+ drm_sched_job_done(sched_job, fence->error);
+ else if (r)
+ DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n",
+ r);
+ } else {
+ drm_sched_job_done(sched_job, IS_ERR(fence) ?
+ PTR_ERR(fence) : 0);
}
-again:
- drm_sched_submit_queue(sched);
+ wake_up(&sched->job_scheduled);
+ drm_sched_run_job_queue_if_ready(sched);
}
/**
@@ -1159,7 +1220,8 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
spin_lock_init(&sched->job_list_lock);
atomic_set(&sched->hw_rq_count, 0);
INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
- INIT_WORK(&sched->work_submit, drm_sched_main);
+ INIT_WORK(&sched->work_run_job, drm_sched_run_job_work);
+ INIT_WORK(&sched->work_free_job, drm_sched_free_job_work);
atomic_set(&sched->_score, 0);
atomic64_set(&sched->job_id_count, 0);
sched->pause_submit = false;
@@ -1286,7 +1348,8 @@ EXPORT_SYMBOL(drm_sched_submit_ready);
void drm_sched_submit_stop(struct drm_gpu_scheduler *sched)
{
WRITE_ONCE(sched->pause_submit, true);
- cancel_work_sync(&sched->work_submit);
+ cancel_work_sync(&sched->work_run_job);
+ cancel_work_sync(&sched->work_free_job);
}
EXPORT_SYMBOL(drm_sched_submit_stop);
@@ -1298,6 +1361,7 @@ EXPORT_SYMBOL(drm_sched_submit_stop);
void drm_sched_submit_start(struct drm_gpu_scheduler *sched)
{
WRITE_ONCE(sched->pause_submit, false);
- queue_work(sched->submit_wq, &sched->work_submit);
+ queue_work(sched->submit_wq, &sched->work_run_job);
+ queue_work(sched->submit_wq, &sched->work_free_job);
}
EXPORT_SYMBOL(drm_sched_submit_start);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 655675f797ea..7e6c121003ca 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -487,9 +487,10 @@ struct drm_sched_backend_ops {
* finished.
* @hw_rq_count: the number of jobs currently in the hardware queue.
* @job_id_count: used to assign unique id to the each job.
- * @submit_wq: workqueue used to queue @work_submit
+ * @submit_wq: workqueue used to queue @work_run_job and @work_free_job
* @timeout_wq: workqueue used to queue @work_tdr
- * @work_submit: schedules jobs and cleans up entities
+ * @work_run_job: schedules jobs
+ * @work_free_job: cleans up jobs
* @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
* timeout interval is over.
* @pending_list: the list of jobs which are currently in the job queue.
@@ -519,7 +520,8 @@ struct drm_gpu_scheduler {
atomic64_t job_id_count;
struct workqueue_struct *submit_wq;
struct workqueue_struct *timeout_wq;
- struct work_struct work_submit;
+ struct work_struct work_run_job;
+ struct work_struct work_free_job;
struct delayed_work work_tdr;
struct list_head pending_list;
spinlock_t job_list_lock;
--
2.34.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
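The free_job/run_job split in the patch above can be sketched as a small user-space model: each work item handles exactly one unit of its own kind of work and re-queues itself only while more is pending, instead of one combined loop doing both, and nothing is queued while submission is paused. All names here are illustrative stand-ins, not the kernel API:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the run-job/free-job split.
 * The direct calls in queue_run_job()/queue_free_job() stand in
 * for queue_work() on sched->submit_wq. */
struct sched_sim {
	int jobs_to_run;	/* jobs waiting for run_job() */
	int jobs_to_free;	/* finished jobs waiting for free_job() */
	bool pause_submit;	/* mirrors sched->pause_submit */
	int ran, freed;		/* counters for inspection */
};

static void run_job_work(struct sched_sim *s);
static void free_job_work(struct sched_sim *s);

/* queue helpers: skip queuing while submission is paused */
static void queue_run_job(struct sched_sim *s)
{
	if (!s->pause_submit)
		run_job_work(s);
}

static void queue_free_job(struct sched_sim *s)
{
	if (!s->pause_submit)
		free_job_work(s);
}

static void run_job_work(struct sched_sim *s)
{
	if (s->pause_submit || s->jobs_to_run == 0)
		return;
	s->jobs_to_run--;
	s->ran++;
	s->jobs_to_free++;	/* job completes immediately in this model */
	queue_free_job(s);
	if (s->jobs_to_run)	/* ..._queue_if_ready() analogue */
		queue_run_job(s);
}

static void free_job_work(struct sched_sim *s)
{
	if (s->pause_submit || s->jobs_to_free == 0)
		return;
	s->jobs_to_free--;
	s->freed++;
	if (s->jobs_to_free)
		queue_free_job(s);
}
```

The point of the sketch is only the control flow: the two workers never block on each other, which is why run_job and free_job can now execute in parallel as noted in the v2 changelog.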
* [Intel-xe] [PATCH v4 06/10] drm/sched: Add drm_sched_start_timeout_unlocked helper
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
` (4 preceding siblings ...)
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 05/10] drm/sched: Split free_job into own work item Matthew Brost
@ 2023-09-19 5:01 ` Matthew Brost
2023-09-29 21:23 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 07/10] drm/sched: Start submission before TDR in drm_sched_start Matthew Brost
` (6 subsequent siblings)
12 siblings, 1 reply; 45+ messages in thread
From: Matthew Brost @ 2023-09-19 5:01 UTC (permalink / raw)
To: dri-devel, intel-xe
Cc: robdclark, lina, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, luben.tuikov, dakr, donald.robson, daniel,
boris.brezillon, airlied, christian.koenig, faith.ekstrand
Also add a lockdep assert to drm_sched_start_timeout.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/scheduler/sched_main.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 1e21d234fb5c..09ef07b9e9d5 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -431,11 +431,20 @@ static void drm_sched_job_done_cb(struct dma_fence *f, struct dma_fence_cb *cb)
*/
static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
{
+ lockdep_assert_held(&sched->job_list_lock);
+
if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
!list_empty(&sched->pending_list))
queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
}
+static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
+{
+ spin_lock(&sched->job_list_lock);
+ drm_sched_start_timeout(sched);
+ spin_unlock(&sched->job_list_lock);
+}
+
/**
* drm_sched_fault - immediately start timeout handler
*
@@ -548,11 +557,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
spin_unlock(&sched->job_list_lock);
}
- if (status != DRM_GPU_SCHED_STAT_ENODEV) {
- spin_lock(&sched->job_list_lock);
- drm_sched_start_timeout(sched);
- spin_unlock(&sched->job_list_lock);
- }
+ if (status != DRM_GPU_SCHED_STAT_ENODEV)
+ drm_sched_start_timeout_unlocked(sched);
}
/**
@@ -678,11 +684,8 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
drm_sched_job_done(s_job, -ECANCELED);
}
- if (full_recovery) {
- spin_lock(&sched->job_list_lock);
- drm_sched_start_timeout(sched);
- spin_unlock(&sched->job_list_lock);
- }
+ if (full_recovery)
+ drm_sched_start_timeout_unlocked(sched);
drm_sched_submit_start(sched);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
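The locked/unlocked helper pattern in this patch can be illustrated with a minimal user-space sketch, where a plain flag stands in for what lockdep_assert_held() verifies: the core function asserts the lock is held, and a thin `_unlocked` wrapper takes and drops the lock for callers that do not already hold it. All names are invented for the sketch:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy analogue of drm_sched_start_timeout() and its _unlocked
 * wrapper; lock_held plays the role of lockdep's tracking of
 * sched->job_list_lock. */
struct timeout_sim {
	bool lock_held;
	int timer_armed;
};

static void lock_sim(struct timeout_sim *t)
{
	assert(!t->lock_held);	/* no recursive locking in this model */
	t->lock_held = true;
}

static void unlock_sim(struct timeout_sim *t)
{
	t->lock_held = false;
}

/* core helper: caller must hold the lock */
static void start_timeout(struct timeout_sim *t)
{
	assert(t->lock_held);	/* lockdep_assert_held() analogue */
	t->timer_armed++;
}

/* wrapper for callers that do not hold the lock */
static void start_timeout_unlocked(struct timeout_sim *t)
{
	lock_sim(t);
	start_timeout(t);
	unlock_sim(t);
}
```

The payoff in the real code is the same as here: the repeated lock/call/unlock sequence collapses into one helper, and calling the core function without the lock now trips an assertion instead of racing silently.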
* [Intel-xe] [PATCH v4 07/10] drm/sched: Start submission before TDR in drm_sched_start
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
` (5 preceding siblings ...)
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 06/10] drm/sched: Add drm_sched_start_timeout_unlocked helper Matthew Brost
@ 2023-09-19 5:01 ` Matthew Brost
2023-09-29 21:53 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 08/10] drm/sched: Submit job before starting TDR Matthew Brost
` (5 subsequent siblings)
12 siblings, 1 reply; 45+ messages in thread
From: Matthew Brost @ 2023-09-19 5:01 UTC (permalink / raw)
To: dri-devel, intel-xe
Cc: robdclark, lina, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, luben.tuikov, dakr, donald.robson, daniel,
boris.brezillon, airlied, christian.koenig, faith.ekstrand
If the TDR is set to a very small value, it can fire before the
submission is started in drm_sched_start. The submission is expected to
be running when the TDR fires; fix the ordering so this expectation is
always met.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 09ef07b9e9d5..a5cc9b6c2faa 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -684,10 +684,10 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
drm_sched_job_done(s_job, -ECANCELED);
}
+ drm_sched_submit_start(sched);
+
if (full_recovery)
drm_sched_start_timeout_unlocked(sched);
-
- drm_sched_submit_start(sched);
}
EXPORT_SYMBOL(drm_sched_start);
--
2.34.1
* [Intel-xe] [PATCH v4 08/10] drm/sched: Submit job before starting TDR
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
` (6 preceding siblings ...)
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 07/10] drm/sched: Start submission before TDR in drm_sched_start Matthew Brost
@ 2023-09-19 5:01 ` Matthew Brost
2023-09-29 21:58 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 09/10] drm/sched: Add helper to queue TDR immediately for current and future jobs Matthew Brost
` (4 subsequent siblings)
12 siblings, 1 reply; 45+ messages in thread
From: Matthew Brost @ 2023-09-19 5:01 UTC (permalink / raw)
To: dri-devel, intel-xe
Cc: robdclark, lina, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, luben.tuikov, dakr, donald.robson, daniel,
boris.brezillon, airlied, christian.koenig, faith.ekstrand
If the TDR is set to a small enough value, it can fire before a job is
submitted in drm_sched_main. The job should always be submitted before
the TDR fires; fix this ordering.
v2:
- Add to pending list before run_job, start TDR after (Luben, Boris)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/scheduler/sched_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index a5cc9b6c2faa..e8a3e6033f66 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -517,7 +517,6 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
spin_lock(&sched->job_list_lock);
list_add_tail(&s_job->list, &sched->pending_list);
- drm_sched_start_timeout(sched);
spin_unlock(&sched->job_list_lock);
}
@@ -1138,6 +1137,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
fence = sched->ops->run_job(sched_job);
complete_all(&entity->entity_idle);
drm_sched_fence_scheduled(s_fence, fence);
+ drm_sched_start_timeout_unlocked(sched);
if (!IS_ERR_OR_NULL(fence)) {
/* Drop for original kref_init of the fence */
--
2.34.1
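The v2 ordering above can be checked with a toy event log: the job goes on the pending list first, is then handed to run_job(), and only after that is the timeout armed, so even a zero-length timeout can never observe a job that was never submitted. The event names and helpers below are invented for this sketch:

```c
#include <assert.h>
#include <string.h>

enum { MAX_EV = 8 };

/* records the order in which the simplified worker performs its steps */
struct order_sim {
	const char *ev[MAX_EV];
	int n;
};

static void log_ev(struct order_sim *o, const char *e)
{
	o->ev[o->n++] = e;
}

/* position of an event in the log, or -1 if it never happened */
static int ev_index(const struct order_sim *o, const char *e)
{
	for (int i = 0; i < o->n; i++)
		if (strcmp(o->ev[i], e) == 0)
			return i;
	return -1;
}

/* one pass of the (simplified) run-job worker with the fixed order */
static void run_one_job(struct order_sim *o)
{
	log_ev(o, "add_to_pending_list");	/* drm_sched_job_begin() */
	log_ev(o, "run_job");			/* sched->ops->run_job() */
	log_ev(o, "arm_tdr");			/* start timeout, last */
}
```

Asserting on the relative indices is exactly the invariant the patch establishes: pending-list insertion strictly precedes submission, which strictly precedes arming the TDR.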
* [Intel-xe] [PATCH v4 09/10] drm/sched: Add helper to queue TDR immediately for current and future jobs
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
` (7 preceding siblings ...)
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 08/10] drm/sched: Submit job before starting TDR Matthew Brost
@ 2023-09-19 5:01 ` Matthew Brost
2023-09-29 22:44 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 10/10] drm/sched: Update maintainers of GPU scheduler Matthew Brost
` (3 subsequent siblings)
12 siblings, 1 reply; 45+ messages in thread
From: Matthew Brost @ 2023-09-19 5:01 UTC (permalink / raw)
To: dri-devel, intel-xe
Cc: robdclark, lina, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, luben.tuikov, dakr, donald.robson, daniel,
boris.brezillon, airlied, christian.koenig, faith.ekstrand
Add a helper to queue the TDR immediately for current and future jobs.
This will be used in Xe, the new Intel GPU driver, to trigger the TDR to
clean up a drm_scheduler that encounters errors.
v2:
- Drop timeout args, rename function, use mod delayed work (Luben)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/scheduler/sched_main.c | 19 ++++++++++++++++++-
include/drm/gpu_scheduler.h | 1 +
2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index e8a3e6033f66..88ef8be2d3c7 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -435,7 +435,7 @@ static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
!list_empty(&sched->pending_list))
- queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
+ mod_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
}
static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
@@ -445,6 +445,23 @@ static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
spin_unlock(&sched->job_list_lock);
}
+/**
+ * drm_sched_tdr_queue_imm: - immediately start timeout handler including future
+ * jobs
+ *
+ * @sched: scheduler where the timeout handling should be started.
+ *
+ * Start timeout handling immediately for current and future jobs
+ */
+void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched)
+{
+ spin_lock(&sched->job_list_lock);
+ sched->timeout = 0;
+ drm_sched_start_timeout(sched);
+ spin_unlock(&sched->job_list_lock);
+}
+EXPORT_SYMBOL(drm_sched_tdr_queue_imm);
+
/**
* drm_sched_fault - immediately start timeout handler
*
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 7e6c121003ca..27f5778bbd6d 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -568,6 +568,7 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
struct drm_gpu_scheduler **sched_list,
unsigned int num_sched_list);
+void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched);
void drm_sched_job_cleanup(struct drm_sched_job *job);
void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched);
bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched);
--
2.34.1
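The mechanics of the helper above can be sketched in a few lines: zeroing the stored timeout means both the immediately re-armed timer and every future timeout start use a zero delay, which is why the helper covers current and future jobs alike. Names are invented for the sketch:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the timeout bookkeeping; armed_delay records the
 * delay last passed to the mod_delayed_work() analogue. */
struct tdr_sim {
	long timeout;		/* jiffies in the kernel; plain long here */
	long armed_delay;	/* last delay the timer was armed with */
	bool pending_jobs;
};

/* drm_sched_start_timeout() analogue: (re)arm with the stored timeout */
static void start_timeout_sim(struct tdr_sim *t)
{
	if (t->pending_jobs)
		t->armed_delay = t->timeout;	/* mod_delayed_work() */
}

/* drm_sched_tdr_queue_imm() analogue */
static void tdr_queue_imm_sim(struct tdr_sim *t)
{
	t->timeout = 0;		/* current *and* future arms use 0 */
	start_timeout_sim(t);
}
```

This also shows why the patch switches queue_delayed_work() to mod_delayed_work(): an already-armed timer with the old, longer delay must be moved up to fire immediately, not left as queued.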
* [Intel-xe] [PATCH v4 10/10] drm/sched: Update maintainers of GPU scheduler
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
` (8 preceding siblings ...)
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 09/10] drm/sched: Add helper to queue TDR immediately for current and future jobs Matthew Brost
@ 2023-09-19 5:01 ` Matthew Brost
2023-09-19 5:32 ` [Intel-xe] ✗ CI.Patch_applied: failure for DRM scheduler changes for Xe (rev6) Patchwork
` (2 subsequent siblings)
12 siblings, 0 replies; 45+ messages in thread
From: Matthew Brost @ 2023-09-19 5:01 UTC (permalink / raw)
To: dri-devel, intel-xe
Cc: robdclark, lina, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, luben.tuikov, dakr, donald.robson, daniel,
boris.brezillon, airlied, christian.koenig, faith.ekstrand
Add Matthew Brost to the maintainers of the GPU scheduler.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 60c2d97e427b..43c51d1abee5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7134,6 +7134,7 @@ F: drivers/gpu/drm/xlnx/
DRM GPU SCHEDULER
M: Luben Tuikov <luben.tuikov@amd.com>
+M: Matthew Brost <matthew.brost@intel.com>
L: dri-devel@lists.freedesktop.org
S: Maintained
T: git git://anongit.freedesktop.org/drm/drm-misc
--
2.34.1
* [Intel-xe] ✗ CI.Patch_applied: failure for DRM scheduler changes for Xe (rev6)
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
` (9 preceding siblings ...)
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 10/10] drm/sched: Update maintainers of GPU scheduler Matthew Brost
@ 2023-09-19 5:32 ` Patchwork
2023-09-19 11:44 ` [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Danilo Krummrich
2023-09-27 7:33 ` Boris Brezillon
12 siblings, 0 replies; 45+ messages in thread
From: Patchwork @ 2023-09-19 5:32 UTC (permalink / raw)
To: Danilo Krummrich; +Cc: intel-xe
== Series Details ==
Series: DRM scheduler changes for Xe (rev6)
URL : https://patchwork.freedesktop.org/series/121744/
State : failure
== Summary ==
=== Applying kernel patches on branch 'drm-xe-next' with base: ===
Base commit: 22df933d8 drm/xe: Rename exec_queue_kill_compute to xe_vm_remove_compute_exec_queue
=== git am output follows ===
error: patch failed: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c:290
error: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c: patch does not apply
error: patch failed: drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1659
error: drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c: patch does not apply
error: patch failed: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4588
error: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c: patch does not apply
error: patch failed: drivers/gpu/drm/msm/adreno/adreno_device.c:809
error: drivers/gpu/drm/msm/adreno/adreno_device.c: patch does not apply
error: patch failed: drivers/gpu/drm/scheduler/sched_main.c:439
error: drivers/gpu/drm/scheduler/sched_main.c: patch does not apply
error: patch failed: include/drm/gpu_scheduler.h:550
error: include/drm/gpu_scheduler.h: patch does not apply
hint: Use 'git am --show-current-patch' to see the failed patch
Applying: drm/sched: Add drm_sched_submit_* helpers
Patch failed at 0001 drm/sched: Add drm_sched_submit_* helpers
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
* Re: [Intel-xe] [PATCH v4 01/10] drm/sched: Add drm_sched_submit_* helpers
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 01/10] drm/sched: Add drm_sched_submit_* helpers Matthew Brost
@ 2023-09-19 5:58 ` Christian König
2023-09-21 3:41 ` Luben Tuikov
2023-09-27 1:07 ` Luben Tuikov
1 sibling, 1 reply; 45+ messages in thread
From: Christian König @ 2023-09-19 5:58 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, lina, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, luben.tuikov, dakr, donald.robson, daniel,
boris.brezillon, airlied, faith.ekstrand
Am 19.09.23 um 07:01 schrieb Matthew Brost:
> Add scheduler submit ready, stop, and start helpers to hide the
> implementation details of the scheduler from the drivers.
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com> for this one.
No idea when I have time to look into the rest :( But Luben should take
a look.
Regards,
Christian
> ---
> .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 15 +++----
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +++---
> drivers/gpu/drm/msm/adreno/adreno_device.c | 6 ++-
> drivers/gpu/drm/scheduler/sched_main.c | 40 ++++++++++++++++++-
> include/drm/gpu_scheduler.h | 3 ++
> 6 files changed, 60 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
> index 625db444df1c..36a1accbc846 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
> @@ -290,7 +290,7 @@ static int suspend_resume_compute_scheduler(struct amdgpu_device *adev, bool sus
> for (i = 0; i < adev->gfx.num_compute_rings; i++) {
> struct amdgpu_ring *ring = &adev->gfx.compute_ring[i];
>
> - if (!(ring && ring->sched.thread))
> + if (!(ring && drm_sched_submit_ready(&ring->sched)))
> continue;
>
> /* stop secheduler and drain ring. */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index a4faea4fa0b5..fb5dad687168 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -1659,9 +1659,9 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
> for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
> struct amdgpu_ring *ring = adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
> - kthread_park(ring->sched.thread);
> + drm_sched_submit_stop(&ring->sched);
> }
>
> seq_puts(m, "run ib test:\n");
> @@ -1675,9 +1675,9 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
> for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
> struct amdgpu_ring *ring = adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
> - kthread_unpark(ring->sched.thread);
> + drm_sched_submit_start(&ring->sched);
> }
>
> up_write(&adev->reset_domain->sem);
> @@ -1897,7 +1897,8 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val)
>
> ring = adev->rings[val];
>
> - if (!ring || !ring->funcs->preempt_ib || !ring->sched.thread)
> + if (!ring || !ring->funcs->preempt_ib ||
> + !drm_sched_submit_ready(&ring->sched))
> return -EINVAL;
>
> /* the last preemption failed */
> @@ -1915,7 +1916,7 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val)
> goto pro_end;
>
> /* stop the scheduler */
> - kthread_park(ring->sched.thread);
> + drm_sched_submit_stop(&ring->sched);
>
> /* preempt the IB */
> r = amdgpu_ring_preempt_ib(ring);
> @@ -1949,7 +1950,7 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val)
>
> failure:
> /* restart the scheduler */
> - kthread_unpark(ring->sched.thread);
> + drm_sched_submit_start(&ring->sched);
>
> up_read(&adev->reset_domain->sem);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 30c4f5cca02c..e366f61c3aed 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4588,7 +4588,7 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> struct amdgpu_ring *ring = adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
>
> spin_lock(&ring->sched.job_list_lock);
> @@ -4727,7 +4727,7 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> struct amdgpu_ring *ring = adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
>
> /* Clear job fence from fence drv to avoid force_completion
> @@ -5266,7 +5266,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> struct amdgpu_ring *ring = tmp_adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
>
> drm_sched_stop(&ring->sched, job ? &job->base : NULL);
> @@ -5341,7 +5341,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> struct amdgpu_ring *ring = tmp_adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
>
> drm_sched_start(&ring->sched, true);
> @@ -5667,7 +5667,7 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> struct amdgpu_ring *ring = adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
>
> drm_sched_stop(&ring->sched, NULL);
> @@ -5795,7 +5795,7 @@ void amdgpu_pci_resume(struct pci_dev *pdev)
> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> struct amdgpu_ring *ring = adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
>
> drm_sched_start(&ring->sched, true);
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index fa527935ffd4..e046dc5ff72a 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -809,7 +809,8 @@ static void suspend_scheduler(struct msm_gpu *gpu)
> */
> for (i = 0; i < gpu->nr_rings; i++) {
> struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
> - kthread_park(sched->thread);
> +
> + drm_sched_submit_stop(sched);
> }
> }
>
> @@ -819,7 +820,8 @@ static void resume_scheduler(struct msm_gpu *gpu)
>
> for (i = 0; i < gpu->nr_rings; i++) {
> struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
> - kthread_unpark(sched->thread);
> +
> + drm_sched_submit_start(sched);
> }
> }
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 506371c42745..e4fa62abca41 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -439,7 +439,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
> {
> struct drm_sched_job *s_job, *tmp;
>
> - kthread_park(sched->thread);
> + drm_sched_submit_stop(sched);
>
> /*
> * Reinsert back the bad job here - now it's safe as
> @@ -552,7 +552,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
> spin_unlock(&sched->job_list_lock);
> }
>
> - kthread_unpark(sched->thread);
> + drm_sched_submit_start(sched);
> }
> EXPORT_SYMBOL(drm_sched_start);
>
> @@ -1206,3 +1206,39 @@ void drm_sched_increase_karma(struct drm_sched_job *bad)
> }
> }
> EXPORT_SYMBOL(drm_sched_increase_karma);
> +
> +/**
> + * drm_sched_submit_ready - scheduler ready for submission
> + *
> + * @sched: scheduler instance
> + *
> + * Returns true if submission is ready
> + */
> +bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched)
> +{
> + return !!sched->thread;
> +
> +}
> +EXPORT_SYMBOL(drm_sched_submit_ready);
> +
> +/**
> + * drm_sched_submit_stop - stop scheduler submission
> + *
> + * @sched: scheduler instance
> + */
> +void drm_sched_submit_stop(struct drm_gpu_scheduler *sched)
> +{
> + kthread_park(sched->thread);
> +}
> +EXPORT_SYMBOL(drm_sched_submit_stop);
> +
> +/**
> + * drm_sched_submit_start - start scheduler submission
> + *
> + * @sched: scheduler instance
> + */
> +void drm_sched_submit_start(struct drm_gpu_scheduler *sched)
> +{
> + kthread_unpark(sched->thread);
> +}
> +EXPORT_SYMBOL(drm_sched_submit_start);
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index f9544d9b670d..f12c5aea5294 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -550,6 +550,9 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>
> void drm_sched_job_cleanup(struct drm_sched_job *job);
> void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched);
> +bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched);
> +void drm_sched_submit_stop(struct drm_gpu_scheduler *sched);
> +void drm_sched_submit_start(struct drm_gpu_scheduler *sched);
> void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad);
> void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery);
> void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched);
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
` (10 preceding siblings ...)
2023-09-19 5:32 ` [Intel-xe] ✗ CI.Patch_applied: failure for DRM scheduler changes for Xe (rev6) Patchwork
@ 2023-09-19 11:44 ` Danilo Krummrich
2023-09-25 21:47 ` Danilo Krummrich
2023-09-27 7:33 ` Boris Brezillon
12 siblings, 1 reply; 45+ messages in thread
From: Danilo Krummrich @ 2023-09-19 11:44 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, sarah.walker, ketil.johnsen, frank.binns, mcanal,
Liviu.Dudau, luben.tuikov, lina, donald.robson, daniel,
boris.brezillon, airlied, christian.koenig, faith.ekstrand
Hi Matt,
On 9/19/23 07:01, Matthew Brost wrote:
> As a prerequisite to merging the new Intel Xe DRM driver [1] [2], we
> have been asked to merge our common DRM scheduler patches first.
>
> This is a continuation of an RFC [3] with all comments addressed, ready for
> a full review, and hopefully in a state which can be merged in the near
> future. More details of this series can be found in the cover letter of the
> RFC [3].
>
> These changes have been tested with the Xe driver.
>
> v2:
> - Break run job, free job, and process message into their own work items
> - This might break other drivers, as run job and free job can now run in
> parallel; can fix up if needed
>
> v3:
> - Include missing patch 'drm/sched: Add drm_sched_submit_* helpers'
> - Fix issue with setting timestamp too early
> - Don't dequeue jobs for single entity after calling entity fini
> - Flush pending jobs on entity fini
> - Add documentation for entity teardown
> - Add Matthew Brost to maintainers of DRM scheduler
>
> v4:
> - Drop message interface
> - Drop 'Flush pending jobs on entity fini'
> - Drop 'Add documentation for entity teardown'
> - Address all feedback
There is some feedback from V3 that doesn't seem to be addressed yet.
(1) Document tear down of struct drm_gpu_scheduler. [1]
(2) Implement helpers to tear down struct drm_gpu_scheduler. [2]
(3) Document the implications of using a workqueue in terms of free_job() being
or not being part of the fence signaling path. [3]
I think at least (1) and (3) should be part of this series. I think (2) could
also happen subsequently. Christian's idea [2] of how to address this is quite
interesting, but might exceed the scope of this series.
I will try to rebase my Nouveau changes onto your V4 tomorrow for a quick test.
- Danilo
[1] https://lore.kernel.org/all/20230912021615.2086698-1-matthew.brost@intel.com/T/#m2e8c1c1e68e8127d5dd62509b5e424a12217300b
[2] https://lore.kernel.org/all/20230912021615.2086698-1-matthew.brost@intel.com/T/#m16a0d6fa2e617383776515af45d3c6b9db543d47
[3] https://lore.kernel.org/all/20230912021615.2086698-1-matthew.brost@intel.com/T/#m807ff95284089fdb51985f1c187666772314bd8a
>
> Matt
>
> Matthew Brost (10):
> drm/sched: Add drm_sched_submit_* helpers
> drm/sched: Convert drm scheduler to use a work queue rather than
> kthread
> drm/sched: Move schedule policy to scheduler
> drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy
> drm/sched: Split free_job into own work item
> drm/sched: Add drm_sched_start_timeout_unlocked helper
> drm/sched: Start submission before TDR in drm_sched_start
> drm/sched: Submit job before starting TDR
> drm/sched: Add helper to queue TDR immediately for current and future
> jobs
> drm/sched: Update maintainers of GPU scheduler
>
> MAINTAINERS | 1 +
> .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 15 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +-
> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 5 +-
> drivers/gpu/drm/lima/lima_sched.c | 5 +-
> drivers/gpu/drm/msm/adreno/adreno_device.c | 6 +-
> drivers/gpu/drm/msm/msm_ringbuffer.c | 5 +-
> drivers/gpu/drm/nouveau/nouveau_sched.c | 5 +-
> drivers/gpu/drm/panfrost/panfrost_job.c | 5 +-
> drivers/gpu/drm/scheduler/sched_entity.c | 85 ++-
> drivers/gpu/drm/scheduler/sched_fence.c | 2 +-
> drivers/gpu/drm/scheduler/sched_main.c | 491 ++++++++++++------
> drivers/gpu/drm/v3d/v3d_sched.c | 25 +-
> include/drm/gpu_scheduler.h | 48 +-
> 15 files changed, 495 insertions(+), 220 deletions(-)
>
* Re: [Intel-xe] [PATCH v4 01/10] drm/sched: Add drm_sched_submit_* helpers
2023-09-19 5:58 ` Christian König
@ 2023-09-21 3:41 ` Luben Tuikov
0 siblings, 0 replies; 45+ messages in thread
From: Luben Tuikov @ 2023-09-21 3:41 UTC (permalink / raw)
To: Christian König, Matthew Brost, dri-devel, intel-xe
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, boris.brezillon, dakr, donald.robson, daniel, lina,
airlied, faith.ekstrand
On 2023-09-19 01:58, Christian König wrote:
> On 19.09.23 at 07:01, Matthew Brost wrote:
>> Add scheduler submit ready, stop, and start helpers to hide the
>> implementation details of the scheduler from the drivers.
>>
>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>
> Reviewed-by: Christian König <christian.koenig@amd.com> for this one.
>
> No idea when I have time to look into the rest :( But Luben should take
> a look.
Hi Christian,
Yes, I'll finish up with v3 and v4 tomorrow morning and afternoon.
Regards,
Luben
>
> Regards,
> Christian
>
--
Regards,
Luben
* Re: [Intel-xe] [PATCH v4 03/10] drm/sched: Move schedule policy to scheduler
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 03/10] drm/sched: Move schedule policy to scheduler Matthew Brost
@ 2023-09-24 1:18 ` kernel test robot
2023-09-27 12:13 ` Luben Tuikov
1 sibling, 0 replies; 45+ messages in thread
From: kernel test robot @ 2023-09-24 1:18 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, mcanal, sarah.walker, ketil.johnsen, lina,
oe-kbuild-all, Liviu.Dudau, boris.brezillon, luben.tuikov, dakr,
donald.robson, christian.koenig, faith.ekstrand
Hi Matthew,
kernel test robot noticed the following build warnings:
[auto build test WARNING on drm/drm-next]
[also build test WARNING on drm-exynos/exynos-drm-next drm-intel/for-linux-next drm-intel/for-linux-next-fixes drm-tip/drm-tip linus/master v6.6-rc2 next-20230921]
[cannot apply to drm-misc/drm-misc-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Matthew-Brost/drm-sched-Add-drm_sched_submit_-helpers/20230919-130353
base: git://anongit.freedesktop.org/drm/drm drm-next
patch link: https://lore.kernel.org/r/20230919050155.2647172-4-matthew.brost%40intel.com
patch subject: [PATCH v4 03/10] drm/sched: Move schedule policy to scheduler
config: i386-randconfig-063-20230924 (https://download.01.org/0day-ci/archive/20230924/202309240829.jXx6CtQ0-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230924/202309240829.jXx6CtQ0-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309240829.jXx6CtQ0-lkp@intel.com/
sparse warnings: (new ones prefixed by >>)
>> drivers/gpu/drm/scheduler/sched_main.c:69:5: sparse: sparse: symbol 'drm_sched_policy_default' was not declared. Should it be static?
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe
2023-09-19 11:44 ` [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Danilo Krummrich
@ 2023-09-25 21:47 ` Danilo Krummrich
0 siblings, 0 replies; 45+ messages in thread
From: Danilo Krummrich @ 2023-09-25 21:47 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, sarah.walker, ketil.johnsen, frank.binns, mcanal,
Liviu.Dudau, luben.tuikov, lina, donald.robson, daniel,
boris.brezillon, airlied, christian.koenig, faith.ekstrand
On 9/19/23 13:44, Danilo Krummrich wrote:
> Hi Matt,
>
> On 9/19/23 07:01, Matthew Brost wrote:
>> As a prerequisite to merging the new Intel Xe DRM driver [1] [2], we
>> have been asked to merge our common DRM scheduler patches first.
>>
>> This is a continuation of an RFC [3] with all comments addressed, ready for
>> a full review, and hopefully in a state which can be merged in the near
>> future. More details of this series can be found in the cover letter of the
>> RFC [3].
>>
>> These changes have been tested with the Xe driver.
>>
>> v2:
>> - Break run job, free job, and process message into their own work items
>> - This might break other drivers, as run job and free job can now run in
>> parallel; can fix up if needed
>>
>> v3:
>> - Include missing patch 'drm/sched: Add drm_sched_submit_* helpers'
>> - Fix issue with setting timestamp too early
>> - Don't dequeue jobs for single entity after calling entity fini
>> - Flush pending jobs on entity fini
>> - Add documentation for entity teardown
>> - Add Matthew Brost to maintainers of DRM scheduler
>>
>> v4:
>> - Drop message interface
>> - Drop 'Flush pending jobs on entity fini'
>> - Drop 'Add documentation for entity teardown'
>> - Address all feedback
>
> There is some feedback from V3 that doesn't seem to be addressed yet.
>
> (1) Document tear down of struct drm_gpu_scheduler. [1]
> (2) Implement helpers to tear down struct drm_gpu_scheduler. [2]
> (3) Document the implications of using a workqueue in terms of free_job() being
> or not being part of the fence signaling path. [3]
>
> I think at least (1) and (3) should be part of this series. I think (2) could
> also happen subsequently. Christian's idea [2] of how to address this is quite
> interesting, but might exceed the scope of this series.
>
> I will try to rebase my Nouveau changes onto your V4 tomorrow for a quick test.
Tested-by: Danilo Krummrich <dakr@redhat.com>
>
> - Danilo
>
> [1] https://lore.kernel.org/all/20230912021615.2086698-1-matthew.brost@intel.com/T/#m2e8c1c1e68e8127d5dd62509b5e424a12217300b
> [2] https://lore.kernel.org/all/20230912021615.2086698-1-matthew.brost@intel.com/T/#m16a0d6fa2e617383776515af45d3c6b9db543d47
> [3] https://lore.kernel.org/all/20230912021615.2086698-1-matthew.brost@intel.com/T/#m807ff95284089fdb51985f1c187666772314bd8a
>
>>
>> Matt
>>
>> Matthew Brost (10):
>> drm/sched: Add drm_sched_submit_* helpers
>> drm/sched: Convert drm scheduler to use a work queue rather than
>> kthread
>> drm/sched: Move schedule policy to scheduler
>> drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy
>> drm/sched: Split free_job into own work item
>> drm/sched: Add drm_sched_start_timeout_unlocked helper
>> drm/sched: Start submission before TDR in drm_sched_start
>> drm/sched: Submit job before starting TDR
>> drm/sched: Add helper to queue TDR immediately for current and future
>> jobs
>> drm/sched: Update maintainers of GPU scheduler
>>
>> MAINTAINERS | 1 +
>> .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 +-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 15 +-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +-
>> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 5 +-
>> drivers/gpu/drm/lima/lima_sched.c | 5 +-
>> drivers/gpu/drm/msm/adreno/adreno_device.c | 6 +-
>> drivers/gpu/drm/msm/msm_ringbuffer.c | 5 +-
>> drivers/gpu/drm/nouveau/nouveau_sched.c | 5 +-
>> drivers/gpu/drm/panfrost/panfrost_job.c | 5 +-
>> drivers/gpu/drm/scheduler/sched_entity.c | 85 ++-
>> drivers/gpu/drm/scheduler/sched_fence.c | 2 +-
>> drivers/gpu/drm/scheduler/sched_main.c | 491 ++++++++++++------
>> drivers/gpu/drm/v3d/v3d_sched.c | 25 +-
>> include/drm/gpu_scheduler.h | 48 +-
>> 15 files changed, 495 insertions(+), 220 deletions(-)
>>
* Re: [Intel-xe] [PATCH v4 01/10] drm/sched: Add drm_sched_submit_* helpers
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 01/10] drm/sched: Add drm_sched_submit_* helpers Matthew Brost
2023-09-19 5:58 ` Christian König
@ 2023-09-27 1:07 ` Luben Tuikov
1 sibling, 0 replies; 45+ messages in thread
From: Luben Tuikov @ 2023-09-27 1:07 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, boris.brezillon, dakr, donald.robson, daniel, lina,
airlied, christian.koenig, faith.ekstrand
On 2023-09-19 01:01, Matthew Brost wrote:
> Add scheduler submit ready, stop, and start helpers to hide the
> implementation details of the scheduler from the drivers.
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
> .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 15 +++----
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +++---
> drivers/gpu/drm/msm/adreno/adreno_device.c | 6 ++-
> drivers/gpu/drm/scheduler/sched_main.c | 40 ++++++++++++++++++-
> include/drm/gpu_scheduler.h | 3 ++
> 6 files changed, 60 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
> index 625db444df1c..36a1accbc846 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
> @@ -290,7 +290,7 @@ static int suspend_resume_compute_scheduler(struct amdgpu_device *adev, bool sus
> for (i = 0; i < adev->gfx.num_compute_rings; i++) {
> struct amdgpu_ring *ring = &adev->gfx.compute_ring[i];
>
> - if (!(ring && ring->sched.thread))
> + if (!(ring && drm_sched_submit_ready(&ring->sched)))
> continue;
>
> /* stop secheduler and drain ring. */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index a4faea4fa0b5..fb5dad687168 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -1659,9 +1659,9 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
> for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
> struct amdgpu_ring *ring = adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
> - kthread_park(ring->sched.thread);
> + drm_sched_submit_stop(&ring->sched);
> }
>
> seq_puts(m, "run ib test:\n");
> @@ -1675,9 +1675,9 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
> for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
> struct amdgpu_ring *ring = adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
> - kthread_unpark(ring->sched.thread);
> + drm_sched_submit_start(&ring->sched);
> }
>
> up_write(&adev->reset_domain->sem);
> @@ -1897,7 +1897,8 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val)
>
> ring = adev->rings[val];
>
> - if (!ring || !ring->funcs->preempt_ib || !ring->sched.thread)
> + if (!ring || !ring->funcs->preempt_ib ||
> + !drm_sched_submit_ready(&ring->sched))
> return -EINVAL;
>
> /* the last preemption failed */
> @@ -1915,7 +1916,7 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val)
> goto pro_end;
>
> /* stop the scheduler */
> - kthread_park(ring->sched.thread);
> + drm_sched_submit_stop(&ring->sched);
>
> /* preempt the IB */
> r = amdgpu_ring_preempt_ib(ring);
> @@ -1949,7 +1950,7 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val)
>
> failure:
> /* restart the scheduler */
> - kthread_unpark(ring->sched.thread);
> + drm_sched_submit_start(&ring->sched);
>
> up_read(&adev->reset_domain->sem);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 30c4f5cca02c..e366f61c3aed 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4588,7 +4588,7 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> struct amdgpu_ring *ring = adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
>
> spin_lock(&ring->sched.job_list_lock);
> @@ -4727,7 +4727,7 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> struct amdgpu_ring *ring = adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
>
> /* Clear job fence from fence drv to avoid force_completion
> @@ -5266,7 +5266,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> struct amdgpu_ring *ring = tmp_adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
>
> drm_sched_stop(&ring->sched, job ? &job->base : NULL);
> @@ -5341,7 +5341,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> struct amdgpu_ring *ring = tmp_adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
>
> drm_sched_start(&ring->sched, true);
> @@ -5667,7 +5667,7 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> struct amdgpu_ring *ring = adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
>
> drm_sched_stop(&ring->sched, NULL);
> @@ -5795,7 +5795,7 @@ void amdgpu_pci_resume(struct pci_dev *pdev)
> for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> struct amdgpu_ring *ring = adev->rings[i];
>
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_submit_ready(&ring->sched))
> continue;
>
> drm_sched_start(&ring->sched, true);
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index fa527935ffd4..e046dc5ff72a 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -809,7 +809,8 @@ static void suspend_scheduler(struct msm_gpu *gpu)
> */
> for (i = 0; i < gpu->nr_rings; i++) {
> struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
> - kthread_park(sched->thread);
> +
> + drm_sched_submit_stop(sched);
> }
> }
>
> @@ -819,7 +820,8 @@ static void resume_scheduler(struct msm_gpu *gpu)
>
> for (i = 0; i < gpu->nr_rings; i++) {
> struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
> - kthread_unpark(sched->thread);
> +
> + drm_sched_submit_start(sched);
> }
> }
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 506371c42745..e4fa62abca41 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -439,7 +439,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
> {
> struct drm_sched_job *s_job, *tmp;
>
> - kthread_park(sched->thread);
> + drm_sched_submit_stop(sched);
>
> /*
> * Reinsert back the bad job here - now it's safe as
> @@ -552,7 +552,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
> spin_unlock(&sched->job_list_lock);
> }
>
> - kthread_unpark(sched->thread);
> + drm_sched_submit_start(sched);
> }
> EXPORT_SYMBOL(drm_sched_start);
>
> @@ -1206,3 +1206,39 @@ void drm_sched_increase_karma(struct drm_sched_job *bad)
> }
> }
> EXPORT_SYMBOL(drm_sched_increase_karma);
> +
> +/**
> + * drm_sched_submit_ready - scheduler ready for submission
"Is the scheduler ready for submission" is so much more clear
and approachable. Let's have that go in the kernel, yes?
> + *
> + * @sched: scheduler instance
> + *
> + * Returns true if submission is ready
> + */
> +bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched)
> +{
> + return !!sched->thread;
> +
> +}
Remove the extra white line after the return-statement.
(Please run your patches through checkpatch.pl to catch those.)
With these two changes this patch is:
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
> +EXPORT_SYMBOL(drm_sched_submit_ready);
> +
> +/**
> + * drm_sched_submit_stop - stop scheduler submission
> + *
> + * @sched: scheduler instance
> + */
> +void drm_sched_submit_stop(struct drm_gpu_scheduler *sched)
> +{
> + kthread_park(sched->thread);
> +}
> +EXPORT_SYMBOL(drm_sched_submit_stop);
> +
> +/**
> + * drm_sched_submit_start - start scheduler submission
> + *
> + * @sched: scheduler instance
> + */
> +void drm_sched_submit_start(struct drm_gpu_scheduler *sched)
> +{
> + kthread_unpark(sched->thread);
> +}
> +EXPORT_SYMBOL(drm_sched_submit_start);
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index f9544d9b670d..f12c5aea5294 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -550,6 +550,9 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>
> void drm_sched_job_cleanup(struct drm_sched_job *job);
> void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched);
> +bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched);
> +void drm_sched_submit_stop(struct drm_gpu_scheduler *sched);
> +void drm_sched_submit_start(struct drm_gpu_scheduler *sched);
> void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad);
> void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery);
> void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched);
--
Regards,
Luben
* Re: [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread Matthew Brost
@ 2023-09-27 3:32 ` Luben Tuikov
2023-10-05 3:33 ` Matthew Brost
0 siblings, 1 reply; 45+ messages in thread
From: Luben Tuikov @ 2023-09-27 3:32 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, boris.brezillon, dakr, donald.robson, daniel, lina,
airlied, christian.koenig, faith.ekstrand
Hi,
On 2023-09-19 01:01, Matthew Brost wrote:
> In XE, the new Intel GPU driver, a choice has been made to have a 1 to 1
> mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
> seems a bit odd, but let us explain the reasoning below.
>
> 1. In XE the submission order from multiple drm_sched_entity is not
> guaranteed to match the completion order even when targeting the same
> hardware engine. This is because in XE we have a firmware scheduler, the
> GuC, which is allowed to reorder, timeslice, and preempt submissions. If
> a shared drm_gpu_scheduler is used across multiple drm_sched_entity, the
> TDR falls apart as the TDR expects submission order == completion order.
> Using a dedicated drm_gpu_scheduler per drm_sched_entity solves this
> problem.
>
> 2. In XE, submissions are done by programming a ring buffer (circular
> buffer). A drm_gpu_scheduler provides a limit on the number of in-flight
> jobs; if that limit is set to RING_SIZE / MAX_SIZE_PER_JOB, we get flow
> control on the ring for free.
>
> A problem with this design is that currently a drm_gpu_scheduler uses a
> kthread for submission / job cleanup. This doesn't scale if a large
> number of drm_gpu_schedulers are used. To work around the scaling issue,
> use a worker rather than a kthread for submission / job cleanup.
>
> v2:
> - (Rob Clark) Fix msm build
> - Pass in run work queue
> v3:
> - (Boris) don't have loop in worker
> v4:
> - (Tvrtko) break out submit ready, stop, start helpers into own patch
> v5:
> - (Boris) default to ordered work queue
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +-
> drivers/gpu/drm/lima/lima_sched.c | 2 +-
> drivers/gpu/drm/msm/msm_ringbuffer.c | 2 +-
> drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +-
> drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
> drivers/gpu/drm/scheduler/sched_main.c | 118 ++++++++++-----------
> drivers/gpu/drm/v3d/v3d_sched.c | 10 +-
> include/drm/gpu_scheduler.h | 14 ++-
> 9 files changed, 79 insertions(+), 75 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index e366f61c3aed..16f3cfe1574a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2279,7 +2279,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
> break;
> }
>
> - r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
> + r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL,
> ring->num_hw_submission, 0,
> timeout, adev->reset_domain->wq,
> ring->sched_score, ring->name,
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index 345fec6cb1a4..618a804ddc34 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
> {
> int ret;
>
> - ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops,
> + ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
> etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
> msecs_to_jiffies(500), NULL, NULL,
> dev_name(gpu->dev), gpu->dev);
> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> index ffd91a5ee299..8d858aed0e56 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
>
> INIT_WORK(&pipe->recover_work, lima_sched_recover_work);
>
> - return drm_sched_init(&pipe->base, &lima_sched_ops, 1,
> + return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
> lima_job_hang_limit,
> msecs_to_jiffies(timeout), NULL,
> NULL, name, pipe->ldev->dev);
> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
> index 40c0bc35a44c..b8865e61b40f 100644
> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
> @@ -94,7 +94,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
> /* currently managing hangcheck ourselves: */
> sched_timeout = MAX_SCHEDULE_TIMEOUT;
>
> - ret = drm_sched_init(&ring->sched, &msm_sched_ops,
> + ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
> num_hw_submissions, 0, sched_timeout,
> NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
checkpatch.pl complains here about unmatched open parens.
> if (ret) {
> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
> index 88217185e0f3..d458c2227d4f 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
> @@ -429,7 +429,7 @@ int nouveau_sched_init(struct nouveau_drm *drm)
> if (!drm->sched_wq)
> return -ENOMEM;
>
> - return drm_sched_init(sched, &nouveau_sched_ops,
> + return drm_sched_init(sched, &nouveau_sched_ops, NULL,
> NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
> NULL, NULL, "nouveau_sched", drm->dev->dev);
> }
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 033f5e684707..326ca1ddf1d7 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -831,7 +831,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
> js->queue[j].fence_context = dma_fence_context_alloc(1);
>
> ret = drm_sched_init(&js->queue[j].sched,
> - &panfrost_sched_ops,
> + &panfrost_sched_ops, NULL,
> nentries, 0,
> msecs_to_jiffies(JOB_TIMEOUT_MS),
> pfdev->reset.wq,
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index e4fa62abca41..ee6281942e36 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -48,7 +48,6 @@
> * through the jobs entity pointer.
> */
>
> -#include <linux/kthread.h>
> #include <linux/wait.h>
> #include <linux/sched.h>
> #include <linux/completion.h>
> @@ -256,6 +255,16 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
> return rb ? rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL;
> }
>
> +/**
> + * drm_sched_submit_queue - scheduler queue submission
There is no verb in the description, and it is not clear what
this function does unless one reads the code. Given that this
is kernel-doc, it should be clearer. Something like "queue
scheduler work to be executed", or something to that effect.
Coming back to this after reading the patch below, it was somewhat
unclear what drm_sched_submit_queue() does, since while reading,
"submit" was being read by my mind as an adjective, as opposed
to a verb. Perhaps something like:
drm_sched_queue_submit(), or
drm_sched_queue_exec(), or
drm_sched_queue_push(), or something to that effect. You pick. :-)
Note that it doesn't have to be 100% reflective of the fact that
we're putting this on a workqueue and it would be executed sooner
or later, so long as it conveys the fact that we're executing this
scheduler queue.
> + * @sched: scheduler instance
> + */
> +static void drm_sched_submit_queue(struct drm_gpu_scheduler *sched)
> +{
> + if (!READ_ONCE(sched->pause_submit))
> + queue_work(sched->submit_wq, &sched->work_submit);
> +}
> +
> /**
> * drm_sched_job_done - complete a job
> * @s_job: pointer to the job which is done
> @@ -275,7 +284,7 @@ static void drm_sched_job_done(struct drm_sched_job *s_job, int result)
> dma_fence_get(&s_fence->finished);
> drm_sched_fence_finished(s_fence, result);
> dma_fence_put(&s_fence->finished);
> - wake_up_interruptible(&sched->wake_up_worker);
> + drm_sched_submit_queue(sched);
> }
>
> /**
> @@ -868,7 +877,7 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
> void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched)
> {
> if (drm_sched_can_queue(sched))
> - wake_up_interruptible(&sched->wake_up_worker);
> + drm_sched_submit_queue(sched);
> }
>
> /**
> @@ -978,61 +987,42 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
> }
> EXPORT_SYMBOL(drm_sched_pick_best);
>
> -/**
> - * drm_sched_blocked - check if the scheduler is blocked
> - *
> - * @sched: scheduler instance
> - *
> - * Returns true if blocked, otherwise false.
> - */
> -static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
> -{
> - if (kthread_should_park()) {
> - kthread_parkme();
> - return true;
> - }
> -
> - return false;
> -}
> -
> /**
> * drm_sched_main - main scheduler thread
> *
> * @param: scheduler instance
> - *
> - * Returns 0.
> */
> -static int drm_sched_main(void *param)
> +static void drm_sched_main(struct work_struct *w)
> {
> - struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
> + struct drm_gpu_scheduler *sched =
> + container_of(w, struct drm_gpu_scheduler, work_submit);
> + struct drm_sched_entity *entity;
> + struct drm_sched_job *cleanup_job;
> int r;
>
> - sched_set_fifo_low(current);
> + if (READ_ONCE(sched->pause_submit))
> + return;
>
> - while (!kthread_should_stop()) {
> - struct drm_sched_entity *entity = NULL;
> - struct drm_sched_fence *s_fence;
> - struct drm_sched_job *sched_job;
> - struct dma_fence *fence;
> - struct drm_sched_job *cleanup_job = NULL;
> + cleanup_job = drm_sched_get_cleanup_job(sched);
> + entity = drm_sched_select_entity(sched);
>
> - wait_event_interruptible(sched->wake_up_worker,
> - (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
> - (!drm_sched_blocked(sched) &&
> - (entity = drm_sched_select_entity(sched))) ||
> - kthread_should_stop());
> + if (!entity && !cleanup_job)
> + return; /* No more work */
>
> - if (cleanup_job)
> - sched->ops->free_job(cleanup_job);
> + if (cleanup_job)
> + sched->ops->free_job(cleanup_job);
>
> - if (!entity)
> - continue;
> + if (entity) {
> + struct dma_fence *fence;
> + struct drm_sched_fence *s_fence;
> + struct drm_sched_job *sched_job;
>
> sched_job = drm_sched_entity_pop_job(entity);
> -
> if (!sched_job) {
> complete_all(&entity->entity_idle);
> - continue;
> + if (!cleanup_job)
> + return; /* No more work */
> + goto again;
> }
>
> s_fence = sched_job->s_fence;
> @@ -1063,7 +1053,9 @@ static int drm_sched_main(void *param)
>
> wake_up(&sched->job_scheduled);
> }
> - return 0;
> +
> +again:
> + drm_sched_submit_queue(sched);
> }
>
> /**
> @@ -1071,6 +1063,8 @@ static int drm_sched_main(void *param)
> *
> * @sched: scheduler instance
> * @ops: backend operations for this scheduler
> + * @submit_wq: workqueue to use for submission. If NULL, an ordered wq is
> + * allocated and used
> * @hw_submission: number of hw submissions that can be in flight
> * @hang_limit: number of times to allow a job to hang before dropping it
> * @timeout: timeout value in jiffies for the scheduler
> @@ -1084,14 +1078,25 @@ static int drm_sched_main(void *param)
> */
> int drm_sched_init(struct drm_gpu_scheduler *sched,
> const struct drm_sched_backend_ops *ops,
> + struct workqueue_struct *submit_wq,
> unsigned hw_submission, unsigned hang_limit,
> long timeout, struct workqueue_struct *timeout_wq,
> atomic_t *score, const char *name, struct device *dev)
> {
> - int i, ret;
> + int i;
> sched->ops = ops;
> sched->hw_submission_limit = hw_submission;
> sched->name = name;
> + if (!submit_wq) {
> + sched->submit_wq = alloc_ordered_workqueue(name, 0);
> + if (!sched->submit_wq)
> + return -ENOMEM;
> +
> + sched->alloc_submit_wq = true;
> + } else {
> + sched->submit_wq = submit_wq;
> + sched->alloc_submit_wq = false;
> + }
This if-conditional, I would've written:
if (submit_wq) {
sched->submit_wq = submit_wq;
sched->alloc_submit_wq = false;
} else {
sched->submit_wq = alloc_ordered_workqueue(name, 0);
if (!sched->submit_wq)
return -ENOMEM;
sched->alloc_submit_wq = true;
}
It's easier to understand a test for the positive case than for the negative one.
> sched->timeout = timeout;
> sched->timeout_wq = timeout_wq ? : system_wq;
> sched->hang_limit = hang_limit;
> @@ -1100,23 +1105,15 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_COUNT; i++)
> drm_sched_rq_init(sched, &sched->sched_rq[i]);
>
> - init_waitqueue_head(&sched->wake_up_worker);
> init_waitqueue_head(&sched->job_scheduled);
> INIT_LIST_HEAD(&sched->pending_list);
> spin_lock_init(&sched->job_list_lock);
> atomic_set(&sched->hw_rq_count, 0);
> INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> + INIT_WORK(&sched->work_submit, drm_sched_main);
> atomic_set(&sched->_score, 0);
> atomic64_set(&sched->job_id_count, 0);
> -
> - /* Each scheduler will run on a seperate kernel thread */
> - sched->thread = kthread_run(drm_sched_main, sched, sched->name);
> - if (IS_ERR(sched->thread)) {
> - ret = PTR_ERR(sched->thread);
> - sched->thread = NULL;
> - DRM_DEV_ERROR(sched->dev, "Failed to create scheduler for %s.\n", name);
> - return ret;
> - }
> + sched->pause_submit = false;
>
> sched->ready = true;
> return 0;
> @@ -1135,8 +1132,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
> struct drm_sched_entity *s_entity;
> int i;
>
> - if (sched->thread)
> - kthread_stop(sched->thread);
> + drm_sched_submit_stop(sched);
>
> for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
> struct drm_sched_rq *rq = &sched->sched_rq[i];
> @@ -1159,6 +1155,8 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
> /* Confirm no work left behind accessing device structures */
> cancel_delayed_work_sync(&sched->work_tdr);
>
> + if (sched->alloc_submit_wq)
> + destroy_workqueue(sched->submit_wq);
> sched->ready = false;
> }
> EXPORT_SYMBOL(drm_sched_fini);
> @@ -1216,7 +1214,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
> */
> bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched)
> {
> - return !!sched->thread;
> + return sched->ready;
>
> }
> EXPORT_SYMBOL(drm_sched_submit_ready);
> @@ -1228,7 +1226,8 @@ EXPORT_SYMBOL(drm_sched_submit_ready);
> */
> void drm_sched_submit_stop(struct drm_gpu_scheduler *sched)
> {
> - kthread_park(sched->thread);
> + WRITE_ONCE(sched->pause_submit, true);
> + cancel_work_sync(&sched->work_submit);
> }
> EXPORT_SYMBOL(drm_sched_submit_stop);
>
> @@ -1239,6 +1238,7 @@ EXPORT_SYMBOL(drm_sched_submit_stop);
> */
> void drm_sched_submit_start(struct drm_gpu_scheduler *sched)
> {
> - kthread_unpark(sched->thread);
> + WRITE_ONCE(sched->pause_submit, false);
> + queue_work(sched->submit_wq, &sched->work_submit);
> }
> EXPORT_SYMBOL(drm_sched_submit_start);
> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> index 06238e6d7f5c..38e092ea41e6 100644
> --- a/drivers/gpu/drm/v3d/v3d_sched.c
> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
> @@ -388,7 +388,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> int ret;
>
> ret = drm_sched_init(&v3d->queue[V3D_BIN].sched,
> - &v3d_bin_sched_ops,
> + &v3d_bin_sched_ops, NULL,
> hw_jobs_limit, job_hang_limit,
> msecs_to_jiffies(hang_limit_ms), NULL,
> NULL, "v3d_bin", v3d->drm.dev);
> @@ -396,7 +396,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> return ret;
>
> ret = drm_sched_init(&v3d->queue[V3D_RENDER].sched,
> - &v3d_render_sched_ops,
> + &v3d_render_sched_ops, NULL,
> hw_jobs_limit, job_hang_limit,
> msecs_to_jiffies(hang_limit_ms), NULL,
> NULL, "v3d_render", v3d->drm.dev);
> @@ -404,7 +404,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> goto fail;
>
> ret = drm_sched_init(&v3d->queue[V3D_TFU].sched,
> - &v3d_tfu_sched_ops,
> + &v3d_tfu_sched_ops, NULL,
> hw_jobs_limit, job_hang_limit,
> msecs_to_jiffies(hang_limit_ms), NULL,
> NULL, "v3d_tfu", v3d->drm.dev);
> @@ -413,7 +413,7 @@ v3d_sched_init(struct v3d_dev *v3d)
>
> if (v3d_has_csd(v3d)) {
> ret = drm_sched_init(&v3d->queue[V3D_CSD].sched,
> - &v3d_csd_sched_ops,
> + &v3d_csd_sched_ops, NULL,
> hw_jobs_limit, job_hang_limit,
> msecs_to_jiffies(hang_limit_ms), NULL,
> NULL, "v3d_csd", v3d->drm.dev);
> @@ -421,7 +421,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> goto fail;
>
> ret = drm_sched_init(&v3d->queue[V3D_CACHE_CLEAN].sched,
> - &v3d_cache_clean_sched_ops,
> + &v3d_cache_clean_sched_ops, NULL,
> hw_jobs_limit, job_hang_limit,
> msecs_to_jiffies(hang_limit_ms), NULL,
> NULL, "v3d_cache_clean", v3d->drm.dev);
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index f12c5aea5294..95927c52383c 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -473,17 +473,16 @@ struct drm_sched_backend_ops {
> * @timeout: the time after which a job is removed from the scheduler.
> * @name: name of the ring for which this scheduler is being used.
> * @sched_rq: priority wise array of run queues.
> - * @wake_up_worker: the wait queue on which the scheduler sleeps until a job
> - * is ready to be scheduled.
> * @job_scheduled: once @drm_sched_entity_do_release is called the scheduler
> * waits on this wait queue until all the scheduled jobs are
> * finished.
> * @hw_rq_count: the number of jobs currently in the hardware queue.
> * @job_id_count: used to assign unique id to the each job.
> + * @submit_wq: workqueue used to queue @work_submit
> * @timeout_wq: workqueue used to queue @work_tdr
> + * @work_submit: schedules jobs and cleans up entities
> * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
> * timeout interval is over.
> - * @thread: the kthread on which the scheduler which run.
> * @pending_list: the list of jobs which are currently in the job queue.
> * @job_list_lock: lock to protect the pending_list.
> * @hang_limit: once the hangs by a job crosses this limit then it is marked
> @@ -492,6 +491,8 @@ struct drm_sched_backend_ops {
> * @_score: score used when the driver doesn't provide one
> * @ready: marks if the underlying HW is ready to work
> * @free_guilty: A hit to time out handler to free the guilty job.
> + * @pause_submit: pause queuing of @work_submit on @submit_wq
> + * @alloc_submit_wq: scheduler own allocation of @submit_wq
> * @dev: system &struct device
> *
> * One scheduler is implemented for each hardware ring.
> @@ -502,13 +503,13 @@ struct drm_gpu_scheduler {
> long timeout;
> const char *name;
> struct drm_sched_rq sched_rq[DRM_SCHED_PRIORITY_COUNT];
> - wait_queue_head_t wake_up_worker;
> wait_queue_head_t job_scheduled;
> atomic_t hw_rq_count;
> atomic64_t job_id_count;
> + struct workqueue_struct *submit_wq;
> struct workqueue_struct *timeout_wq;
> + struct work_struct work_submit;
> struct delayed_work work_tdr;
> - struct task_struct *thread;
> struct list_head pending_list;
> spinlock_t job_list_lock;
> int hang_limit;
> @@ -516,11 +517,14 @@ struct drm_gpu_scheduler {
> atomic_t _score;
> bool ready;
> bool free_guilty;
> + bool pause_submit;
> + bool alloc_submit_wq;
Please rename it to what it actually describes:
alloc_submit_wq --> own_submit_wq
to mean "do we own the submit wq". Then the check becomes
the intuitive,
if (sched->own_submit_wq)
destroy_workqueue(sched->submit_wq);
> struct device *dev;
> };
>
> int drm_sched_init(struct drm_gpu_scheduler *sched,
> const struct drm_sched_backend_ops *ops,
> + struct workqueue_struct *submit_wq,
> uint32_t hw_submission, unsigned hang_limit,
> long timeout, struct workqueue_struct *timeout_wq,
> atomic_t *score, const char *name, struct device *dev);
This is a good patch.
--
Regards,
Luben
* Re: [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
` (11 preceding siblings ...)
2023-09-19 11:44 ` [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Danilo Krummrich
@ 2023-09-27 7:33 ` Boris Brezillon
12 siblings, 0 replies; 45+ messages in thread
From: Boris Brezillon @ 2023-09-27 7:33 UTC (permalink / raw)
To: Matthew Brost
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, dri-devel, christian.koenig, luben.tuikov, dakr,
donald.robson, daniel, lina, airlied, intel-xe, faith.ekstrand
On Mon, 18 Sep 2023 22:01:45 -0700
Matthew Brost <matthew.brost@intel.com> wrote:
> As a prerequisite to merging the new Intel Xe DRM driver [1] [2], we
> have been asked to merge our common DRM scheduler patches first.
>
> This is a continuation of an RFC [3] with all comments addressed, ready
> for a full review, and hopefully in a state which can be merged in the
> near future. More details of this series can be found in the cover
> letter of the RFC [3].
>
> These changes have been tested with the Xe driver.
>
> v2:
> - Break run job, free job, and process message in own work items
> - This might break other drivers as run job and free job now can run in
> parallel, can fix up if needed
>
> v3:
> - Include missing patch 'drm/sched: Add drm_sched_submit_* helpers'
> - Fix issue with setting timestamp too early
> - Don't dequeue jobs for single entity after calling entity fini
> - Flush pending jobs on entity fini
> - Add documentation for entity teardown
> - Add Matthew Brost to maintainers of DRM scheduler
>
> v4:
> - Drop message interface
> - Drop 'Flush pending jobs on entity fini'
> - Drop 'Add documentation for entity teardown'
> - Address all feedback
>
> Matt
>
> Matthew Brost (10):
> drm/sched: Add drm_sched_submit_* helpers
> drm/sched: Convert drm scheduler to use a work queue rather than
> kthread
> drm/sched: Move schedule policy to scheduler
> drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy
> drm/sched: Split free_job into own work item
> drm/sched: Add drm_sched_start_timeout_unlocked helper
> drm/sched: Start submission before TDR in drm_sched_start
> drm/sched: Submit job before starting TDR
> drm/sched: Add helper to queue TDR immediately for current and future
> jobs
> drm/sched: Update maintainers of GPU scheduler
Tested-by: Boris Brezillon <boris.brezillon@collabora.com>
>
> MAINTAINERS | 1 +
> .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 15 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +-
> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 5 +-
> drivers/gpu/drm/lima/lima_sched.c | 5 +-
> drivers/gpu/drm/msm/adreno/adreno_device.c | 6 +-
> drivers/gpu/drm/msm/msm_ringbuffer.c | 5 +-
> drivers/gpu/drm/nouveau/nouveau_sched.c | 5 +-
> drivers/gpu/drm/panfrost/panfrost_job.c | 5 +-
> drivers/gpu/drm/scheduler/sched_entity.c | 85 ++-
> drivers/gpu/drm/scheduler/sched_fence.c | 2 +-
> drivers/gpu/drm/scheduler/sched_main.c | 491 ++++++++++++------
> drivers/gpu/drm/v3d/v3d_sched.c | 25 +-
> include/drm/gpu_scheduler.h | 48 +-
> 15 files changed, 495 insertions(+), 220 deletions(-)
>
* Re: [Intel-xe] [PATCH v4 03/10] drm/sched: Move schedule policy to scheduler
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 03/10] drm/sched: Move schedule policy to scheduler Matthew Brost
2023-09-24 1:18 ` kernel test robot
@ 2023-09-27 12:13 ` Luben Tuikov
1 sibling, 0 replies; 45+ messages in thread
From: Luben Tuikov @ 2023-09-27 12:13 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, boris.brezillon, dakr, donald.robson, daniel, lina,
airlied, christian.koenig, faith.ekstrand
Hi,
On 2023-09-19 01:01, Matthew Brost wrote:
> Rather than a global modparam for scheduling policy, move the scheduling
> policy to scheduler so user can control each scheduler policy.
>
> v2:
> - s/DRM_SCHED_POLICY_MAX/DRM_SCHED_POLICY_COUNT (Luben)
> - Only include policy in scheduler (Luben)
> v3:
> - use a ternary operator as opposed to an if-control (Luben)
> - s/DRM_SCHED_POLICY_DEFAULT/DRM_SCHED_POLICY_UNSET/ (Luben)
> - s/default_drm_sched_policy/drm_sched_policy_default/ (Luben)
> - Update commit message (Boris)
> - Fix v3d build (CI)
> - s/bad_policies/drm_sched_policy_mismatch/ (Luben)
> - Don't update modparam doc (Luben)
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 3 ++-
> drivers/gpu/drm/lima/lima_sched.c | 3 ++-
> drivers/gpu/drm/msm/msm_ringbuffer.c | 3 ++-
> drivers/gpu/drm/nouveau/nouveau_sched.c | 3 ++-
> drivers/gpu/drm/panfrost/panfrost_job.c | 3 ++-
> drivers/gpu/drm/scheduler/sched_entity.c | 24 ++++++++++++++++++----
> drivers/gpu/drm/scheduler/sched_main.c | 19 ++++++++++++-----
> drivers/gpu/drm/v3d/v3d_sched.c | 15 +++++++++-----
> include/drm/gpu_scheduler.h | 20 ++++++++++++------
> 10 files changed, 69 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 16f3cfe1574a..d937e0c71486 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2283,6 +2283,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
> ring->num_hw_submission, 0,
> timeout, adev->reset_domain->wq,
> ring->sched_score, ring->name,
> + DRM_SCHED_POLICY_UNSET,
> adev->dev);
> if (r) {
> DRM_ERROR("Failed to create scheduler on ring %s.\n",
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index 618a804ddc34..15b0e2f1abe5 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -137,7 +137,8 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
> ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
> etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
> msecs_to_jiffies(500), NULL, NULL,
> - dev_name(gpu->dev), gpu->dev);
> + dev_name(gpu->dev), DRM_SCHED_POLICY_UNSET,
> + gpu->dev);
> if (ret)
> return ret;
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> index 8d858aed0e56..50c2075228aa 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -491,7 +491,8 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
> return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
> lima_job_hang_limit,
> msecs_to_jiffies(timeout), NULL,
> - NULL, name, pipe->ldev->dev);
> + NULL, name, DRM_SCHED_POLICY_UNSET,
> + pipe->ldev->dev);
> }
>
> void lima_sched_pipe_fini(struct lima_sched_pipe *pipe)
> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
> index b8865e61b40f..a1c8834c359d 100644
> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
> @@ -96,7 +96,8 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
>
> ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
> num_hw_submissions, 0, sched_timeout,
> - NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
> + NULL, NULL, to_msm_bo(ring->bo)->name,
> + DRM_SCHED_POLICY_UNSET, gpu->dev->dev);
Align to the open brace. This should come from the fixes to patch 2.
(Please run your patches through scripts/checkpatch.pl to catch common issues.)
With this fix, this patch is:
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
--
Regards,
Luben
> if (ret) {
> goto fail;
> }
> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
> index d458c2227d4f..f26a814a9920 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
> @@ -431,7 +431,8 @@ int nouveau_sched_init(struct nouveau_drm *drm)
>
> return drm_sched_init(sched, &nouveau_sched_ops, NULL,
> NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
> - NULL, NULL, "nouveau_sched", drm->dev->dev);
> + NULL, NULL, "nouveau_sched",
> + DRM_SCHED_POLICY_UNSET, drm->dev->dev);
> }
>
> void nouveau_sched_fini(struct nouveau_drm *drm)
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 326ca1ddf1d7..241e62801586 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -835,7 +835,8 @@ int panfrost_job_init(struct panfrost_device *pfdev)
> nentries, 0,
> msecs_to_jiffies(JOB_TIMEOUT_MS),
> pfdev->reset.wq,
> - NULL, "pan_js", pfdev->dev);
> + NULL, "pan_js", DRM_SCHED_POLICY_UNSET,
> + pfdev->dev);
> if (ret) {
> dev_err(pfdev->dev, "Failed to create scheduler: %d.", ret);
> goto err_sched;
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index a42763e1429d..cf42e2265d64 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -33,6 +33,20 @@
> #define to_drm_sched_job(sched_job) \
> container_of((sched_job), struct drm_sched_job, queue_node)
>
> +static bool drm_sched_policy_mismatch(struct drm_gpu_scheduler **sched_list,
> + unsigned int num_sched_list)
> +{
> + enum drm_sched_policy sched_policy = sched_list[0]->sched_policy;
> + unsigned int i;
> +
> + /* All schedule policies must match */
> + for (i = 1; i < num_sched_list; ++i)
> + if (sched_policy != sched_list[i]->sched_policy)
> + return true;
> +
> + return false;
> +}
> +
> /**
> * drm_sched_entity_init - Init a context entity used by scheduler when
> * submit to HW ring.
> @@ -62,7 +76,8 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> unsigned int num_sched_list,
> atomic_t *guilty)
> {
> - if (!(entity && sched_list && (num_sched_list == 0 || sched_list[0])))
> + if (!(entity && sched_list && (num_sched_list == 0 || sched_list[0])) ||
> + drm_sched_policy_mismatch(sched_list, num_sched_list))
> return -EINVAL;
>
> memset(entity, 0, sizeof(struct drm_sched_entity));
> @@ -486,7 +501,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
> * Update the entity's location in the min heap according to
> * the timestamp of the next job, if any.
> */
> - if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) {
> + if (entity->rq->sched->sched_policy == DRM_SCHED_POLICY_FIFO) {
> struct drm_sched_job *next;
>
> next = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
> @@ -558,7 +573,8 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> {
> struct drm_sched_entity *entity = sched_job->entity;
> - bool first;
> + bool first, fifo = entity->rq->sched->sched_policy ==
> + DRM_SCHED_POLICY_FIFO;
> ktime_t submit_ts;
>
> trace_drm_sched_job(sched_job, entity);
> @@ -587,7 +603,7 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> drm_sched_rq_add_entity(entity->rq, entity);
> spin_unlock(&entity->rq_lock);
>
> - if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
> + if (fifo)
> drm_sched_rq_update_fifo(entity, submit_ts);
>
> drm_sched_wakeup_if_can_queue(entity->rq->sched);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index ee6281942e36..f645f32977ed 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -66,14 +66,14 @@
> #define to_drm_sched_job(sched_job) \
> container_of((sched_job), struct drm_sched_job, queue_node)
>
> -int drm_sched_policy = DRM_SCHED_POLICY_FIFO;
> +int drm_sched_policy_default = DRM_SCHED_POLICY_FIFO;
>
> /**
> * DOC: sched_policy (int)
> * Used to override default entities scheduling policy in a run queue.
> */
> MODULE_PARM_DESC(sched_policy, "Specify the scheduling policy for entities on a run-queue, " __stringify(DRM_SCHED_POLICY_RR) " = Round Robin, " __stringify(DRM_SCHED_POLICY_FIFO) " = FIFO (default).");
> -module_param_named(sched_policy, drm_sched_policy, int, 0444);
> +module_param_named(sched_policy, drm_sched_policy_default, int, 0444);
>
> static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a,
> const struct rb_node *b)
> @@ -177,7 +177,7 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
> if (rq->current_entity == entity)
> rq->current_entity = NULL;
>
> - if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
> + if (rq->sched->sched_policy == DRM_SCHED_POLICY_FIFO)
> drm_sched_rq_remove_fifo_locked(entity);
>
> spin_unlock(&rq->lock);
> @@ -898,7 +898,7 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>
> /* Kernel run queue has higher priority than normal run queue*/
> for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
> - entity = drm_sched_policy == DRM_SCHED_POLICY_FIFO ?
> + entity = sched->sched_policy == DRM_SCHED_POLICY_FIFO ?
> drm_sched_rq_select_entity_fifo(&sched->sched_rq[i]) :
> drm_sched_rq_select_entity_rr(&sched->sched_rq[i]);
> if (entity)
> @@ -1072,6 +1072,7 @@ static void drm_sched_main(struct work_struct *w)
> * used
> * @score: optional score atomic shared with other schedulers
> * @name: name used for debugging
> + * @sched_policy: schedule policy
> * @dev: target &struct device
> *
> * Return 0 on success, otherwise error code.
> @@ -1081,9 +1082,15 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> struct workqueue_struct *submit_wq,
> unsigned hw_submission, unsigned hang_limit,
> long timeout, struct workqueue_struct *timeout_wq,
> - atomic_t *score, const char *name, struct device *dev)
> + atomic_t *score, const char *name,
> + enum drm_sched_policy sched_policy,
> + struct device *dev)
> {
> int i;
> +
> + if (sched_policy >= DRM_SCHED_POLICY_COUNT)
> + return -EINVAL;
> +
> sched->ops = ops;
> sched->hw_submission_limit = hw_submission;
> sched->name = name;
> @@ -1102,6 +1109,8 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> sched->hang_limit = hang_limit;
> sched->score = score ? score : &sched->_score;
> sched->dev = dev;
> + sched->sched_policy = sched_policy == DRM_SCHED_POLICY_UNSET ?
> + drm_sched_policy_default : sched_policy;
> for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_COUNT; i++)
> drm_sched_rq_init(sched, &sched->sched_rq[i]);
>
> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> index 38e092ea41e6..dec89c5b8cb1 100644
> --- a/drivers/gpu/drm/v3d/v3d_sched.c
> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
> @@ -391,7 +391,8 @@ v3d_sched_init(struct v3d_dev *v3d)
> &v3d_bin_sched_ops, NULL,
> hw_jobs_limit, job_hang_limit,
> msecs_to_jiffies(hang_limit_ms), NULL,
> - NULL, "v3d_bin", v3d->drm.dev);
> + NULL, "v3d_bin", DRM_SCHED_POLICY_UNSET,
> + v3d->drm.dev);
> if (ret)
> return ret;
>
> @@ -399,7 +400,8 @@ v3d_sched_init(struct v3d_dev *v3d)
> &v3d_render_sched_ops, NULL,
> hw_jobs_limit, job_hang_limit,
> msecs_to_jiffies(hang_limit_ms), NULL,
> - NULL, "v3d_render", v3d->drm.dev);
> + NULL, "v3d_render", DRM_SCHED_POLICY_UNSET,
> + v3d->drm.dev);
> if (ret)
> goto fail;
>
> @@ -407,7 +409,8 @@ v3d_sched_init(struct v3d_dev *v3d)
> &v3d_tfu_sched_ops, NULL,
> hw_jobs_limit, job_hang_limit,
> msecs_to_jiffies(hang_limit_ms), NULL,
> - NULL, "v3d_tfu", v3d->drm.dev);
> + NULL, "v3d_tfu", DRM_SCHED_POLICY_UNSET,
> + v3d->drm.dev);
> if (ret)
> goto fail;
>
> @@ -416,7 +419,8 @@ v3d_sched_init(struct v3d_dev *v3d)
> &v3d_csd_sched_ops, NULL,
> hw_jobs_limit, job_hang_limit,
> msecs_to_jiffies(hang_limit_ms), NULL,
> - NULL, "v3d_csd", v3d->drm.dev);
> + NULL, "v3d_csd", DRM_SCHED_POLICY_UNSET,
> + v3d->drm.dev);
> if (ret)
> goto fail;
>
> @@ -424,7 +428,8 @@ v3d_sched_init(struct v3d_dev *v3d)
> &v3d_cache_clean_sched_ops, NULL,
> hw_jobs_limit, job_hang_limit,
> msecs_to_jiffies(hang_limit_ms), NULL,
> - NULL, "v3d_cache_clean", v3d->drm.dev);
> + NULL, "v3d_cache_clean",
> + DRM_SCHED_POLICY_UNSET, v3d->drm.dev);
> if (ret)
> goto fail;
> }
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 95927c52383c..9f830ff84bad 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -72,11 +72,15 @@ enum drm_sched_priority {
> DRM_SCHED_PRIORITY_UNSET = -2
> };
>
> -/* Used to chose between FIFO and RR jobs scheduling */
> -extern int drm_sched_policy;
> -
> -#define DRM_SCHED_POLICY_RR 0
> -#define DRM_SCHED_POLICY_FIFO 1
> +/* Used to choose the default scheduling policy */
> +extern int drm_sched_policy_default;
> +
> +enum drm_sched_policy {
> + DRM_SCHED_POLICY_UNSET,
> + DRM_SCHED_POLICY_RR,
> + DRM_SCHED_POLICY_FIFO,
> + DRM_SCHED_POLICY_COUNT,
> +};
>
> /**
> * struct drm_sched_entity - A wrapper around a job queue (typically
> @@ -489,6 +493,7 @@ struct drm_sched_backend_ops {
> * guilty and it will no longer be considered for scheduling.
> * @score: score to help loadbalancer pick a idle sched
> * @_score: score used when the driver doesn't provide one
> + * @sched_policy: Schedule policy for scheduler
> * @ready: marks if the underlying HW is ready to work
> * @free_guilty: A hit to time out handler to free the guilty job.
> * @pause_submit: pause queuing of @work_submit on @submit_wq
> @@ -515,6 +520,7 @@ struct drm_gpu_scheduler {
> int hang_limit;
> atomic_t *score;
> atomic_t _score;
> + enum drm_sched_policy sched_policy;
> bool ready;
> bool free_guilty;
> bool pause_submit;
> @@ -527,7 +533,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> struct workqueue_struct *submit_wq,
> uint32_t hw_submission, unsigned hang_limit,
> long timeout, struct workqueue_struct *timeout_wq,
> - atomic_t *score, const char *name, struct device *dev);
> + atomic_t *score, const char *name,
> + enum drm_sched_policy sched_policy,
> + struct device *dev);
>
> void drm_sched_fini(struct drm_gpu_scheduler *sched);
> int drm_sched_job_init(struct drm_sched_job *job,
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [Intel-xe] [PATCH v4 04/10] drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 04/10] drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy Matthew Brost
@ 2023-09-27 14:36 ` Luben Tuikov
2023-10-05 4:02 ` Matthew Brost
0 siblings, 1 reply; 45+ messages in thread
From: Luben Tuikov @ 2023-09-27 14:36 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, boris.brezillon, dakr, donald.robson, daniel, lina,
airlied, christian.koenig, faith.ekstrand
Hi,
On 2023-09-19 01:01, Matthew Brost wrote:
> DRM_SCHED_POLICY_SINGLE_ENTITY creates a 1 to 1 relationship between
> scheduler and entity. No priorities or run queue used in this mode.
> Intended for devices with firmware schedulers.
>
> v2:
> - Drop sched / rq union (Luben)
> v3:
> - Don't pick entity if stopped in drm_sched_select_entity (Danilo)
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/scheduler/sched_entity.c | 69 ++++++++++++++++++------
> drivers/gpu/drm/scheduler/sched_fence.c | 2 +-
> drivers/gpu/drm/scheduler/sched_main.c | 64 +++++++++++++++++++---
> include/drm/gpu_scheduler.h | 8 +++
> 4 files changed, 120 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index cf42e2265d64..437c50867c99 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -83,6 +83,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> memset(entity, 0, sizeof(struct drm_sched_entity));
> INIT_LIST_HEAD(&entity->list);
> entity->rq = NULL;
> + entity->single_sched = NULL;
> entity->guilty = guilty;
> entity->num_sched_list = num_sched_list;
> entity->priority = priority;
> @@ -90,8 +91,17 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> RCU_INIT_POINTER(entity->last_scheduled, NULL);
> RB_CLEAR_NODE(&entity->rb_tree_node);
>
> - if(num_sched_list)
> - entity->rq = &sched_list[0]->sched_rq[entity->priority];
> + if (num_sched_list) {
> + if (sched_list[0]->sched_policy !=
> + DRM_SCHED_POLICY_SINGLE_ENTITY) {
> + entity->rq = &sched_list[0]->sched_rq[entity->priority];
> + } else {
> + if (num_sched_list != 1 || sched_list[0]->single_entity)
> + return -EINVAL;
> + sched_list[0]->single_entity = entity;
> + entity->single_sched = sched_list[0];
> + }
> + }
So much (checking for) negativity...:-)
Perhaps the simplified form below?
if (num_sched_list) {
if (sched_list[0]->sched_policy !=
DRM_SCHED_POLICY_SINGLE_ENTITY) {
entity->rq = &sched_list[0]->sched_rq[entity->priority];
} else if (num_sched_list == 1 && !sched_list[0]->single_entity) {
sched_list[0]->single_entity = entity;
entity->single_sched = sched_list[0];
} else {
return -EINVAL;
}
}
>
> init_completion(&entity->entity_idle);
>
> @@ -124,7 +134,8 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> struct drm_gpu_scheduler **sched_list,
> unsigned int num_sched_list)
> {
> - WARN_ON(!num_sched_list || !sched_list);
> + WARN_ON(!num_sched_list || !sched_list ||
> + !!entity->single_sched);
>
> entity->sched_list = sched_list;
> entity->num_sched_list = num_sched_list;
> @@ -231,13 +242,15 @@ static void drm_sched_entity_kill(struct drm_sched_entity *entity)
> {
> struct drm_sched_job *job;
> struct dma_fence *prev;
> + bool single_entity = !!entity->single_sched;
>
> - if (!entity->rq)
> + if (!entity->rq && !single_entity)
> return;
>
> spin_lock(&entity->rq_lock);
> entity->stopped = true;
> - drm_sched_rq_remove_entity(entity->rq, entity);
> + if (!single_entity)
> + drm_sched_rq_remove_entity(entity->rq, entity);
> spin_unlock(&entity->rq_lock);
>
> /* Make sure this entity is not used by the scheduler at the moment */
> @@ -259,6 +272,20 @@ static void drm_sched_entity_kill(struct drm_sched_entity *entity)
> dma_fence_put(prev);
> }
>
> +/**
> + * drm_sched_entity_to_scheduler - Schedule entity to GPU scheduler
Please use verbs. Please?
Fix:
/**
* drm_sched_entity_to_scheduler - Map a schedule entity to a GPU scheduler
> + * @entity: scheduler entity
> + *
> + * Returns GPU scheduler for the entity
Fix:
* Given an entity, return its GPU scheduler.
> + */
> +struct drm_gpu_scheduler *
> +drm_sched_entity_to_scheduler(struct drm_sched_entity *entity)
> +{
> + bool single_entity = !!entity->single_sched;
> +
> + return single_entity ? entity->single_sched : entity->rq->sched;
> +}
> +
> /**
> * drm_sched_entity_flush - Flush a context entity
> *
> @@ -276,11 +303,12 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
> struct drm_gpu_scheduler *sched;
> struct task_struct *last_user;
> long ret = timeout;
> + bool single_entity = !!entity->single_sched;
>
> - if (!entity->rq)
> + if (!entity->rq && !single_entity)
> return 0;
>
> - sched = entity->rq->sched;
> + sched = drm_sched_entity_to_scheduler(entity);
> /**
> * The client will not queue more IBs during this fini, consume existing
> * queued IBs or discard them on SIGKILL
> @@ -373,7 +401,7 @@ static void drm_sched_entity_wakeup(struct dma_fence *f,
> container_of(cb, struct drm_sched_entity, cb);
>
> drm_sched_entity_clear_dep(f, cb);
> - drm_sched_wakeup_if_can_queue(entity->rq->sched);
> + drm_sched_wakeup_if_can_queue(drm_sched_entity_to_scheduler(entity));
> }
>
> /**
> @@ -387,6 +415,8 @@ static void drm_sched_entity_wakeup(struct dma_fence *f,
> void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> enum drm_sched_priority priority)
> {
> + WARN_ON(!!entity->single_sched);
> +
> spin_lock(&entity->rq_lock);
> entity->priority = priority;
> spin_unlock(&entity->rq_lock);
> @@ -399,7 +429,7 @@ EXPORT_SYMBOL(drm_sched_entity_set_priority);
> */
> static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
> {
> - struct drm_gpu_scheduler *sched = entity->rq->sched;
> + struct drm_gpu_scheduler *sched = drm_sched_entity_to_scheduler(entity);
> struct dma_fence *fence = entity->dependency;
> struct drm_sched_fence *s_fence;
>
> @@ -501,7 +531,8 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
> * Update the entity's location in the min heap according to
> * the timestamp of the next job, if any.
> */
> - if (entity->rq->sched->sched_policy == DRM_SCHED_POLICY_FIFO) {
> + if (drm_sched_entity_to_scheduler(entity)->sched_policy ==
> + DRM_SCHED_POLICY_FIFO) {
> struct drm_sched_job *next;
>
> next = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
> @@ -524,6 +555,8 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> struct drm_gpu_scheduler *sched;
> struct drm_sched_rq *rq;
>
> + WARN_ON(!!entity->single_sched);
> +
> /* single possible engine and already selected */
> if (!entity->sched_list)
> return;
> @@ -573,12 +606,13 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> {
> struct drm_sched_entity *entity = sched_job->entity;
> - bool first, fifo = entity->rq->sched->sched_policy ==
> - DRM_SCHED_POLICY_FIFO;
> + bool single_entity = !!entity->single_sched;
> + bool first;
> ktime_t submit_ts;
>
> trace_drm_sched_job(sched_job, entity);
> - atomic_inc(entity->rq->sched->score);
> + if (!single_entity)
> + atomic_inc(entity->rq->sched->score);
> WRITE_ONCE(entity->last_user, current->group_leader);
>
> /*
> @@ -591,6 +625,10 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
>
> /* first job wakes up scheduler */
> if (first) {
> + struct drm_gpu_scheduler *sched =
> + drm_sched_entity_to_scheduler(entity);
> + bool fifo = sched->sched_policy == DRM_SCHED_POLICY_FIFO;
> +
> /* Add the entity to the run queue */
> spin_lock(&entity->rq_lock);
> if (entity->stopped) {
> @@ -600,13 +638,14 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> return;
> }
>
> - drm_sched_rq_add_entity(entity->rq, entity);
> + if (!single_entity)
> + drm_sched_rq_add_entity(entity->rq, entity);
> spin_unlock(&entity->rq_lock);
>
> if (fifo)
> drm_sched_rq_update_fifo(entity, submit_ts);
>
> - drm_sched_wakeup_if_can_queue(entity->rq->sched);
> + drm_sched_wakeup_if_can_queue(sched);
> }
> }
> EXPORT_SYMBOL(drm_sched_entity_push_job);
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> index 06cedfe4b486..f6b926f5e188 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -225,7 +225,7 @@ void drm_sched_fence_init(struct drm_sched_fence *fence,
> {
> unsigned seq;
>
> - fence->sched = entity->rq->sched;
> + fence->sched = drm_sched_entity_to_scheduler(entity);
> seq = atomic_inc_return(&entity->fence_seq);
> dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
> &fence->lock, entity->fence_context, seq);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index f645f32977ed..588c735f7498 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -32,7 +32,8 @@
> * backend operations to the scheduler like submitting a job to hardware run queue,
> * returning the dependencies of a job etc.
> *
> - * The organisation of the scheduler is the following:
> + * The organisation of the scheduler is the following for scheduling policies
> + * DRM_SCHED_POLICY_RR and DRM_SCHED_POLICY_FIFO:
Yes, so this was badly written to begin with. If we're adding more information,
I'd write:
* For scheduling policies DRM_SCHED_POLICY_RR and DRM_SCHED_POLICY_FIFO,
* the scheduler organization is,
> *
> * 1. Each hw run queue has one scheduler
> * 2. Each scheduler has multiple run queues with different priorities
> @@ -43,6 +44,23 @@
> *
> * The jobs in a entity are always scheduled in the order that they were pushed.
> *
> + * The organisation of the scheduler is the following for scheduling policy
> + * DRM_SCHED_POLICY_SINGLE_ENTITY:
Remember, it's a list, on large enough scale, thus,
* For DRM_SCHED_POLICY_SINGLE_ENTITY, the organization of the scheduler is,
> + *
> + * 1. One to one relationship between scheduler and entity
> + * 2. No priorities implemented per scheduler (single job queue)
> + * 3. No run queues in scheduler rather jobs are directly dequeued from entity
> + * 4. The entity maintains a queue of jobs that will be scheduled on the
> + * hardware
Good! But please fix,
4. The entity maintains a queue of jobs that will be scheduled _to_ the hardware.
> + *
> + * The jobs in an entity are always scheduled in the order that they were
> + * pushed, regardless of scheduling policy.
Please add here,
Single-entity scheduling is essentially a FIFO for jobs.
> + *
> + * A policy of DRM_SCHED_POLICY_RR or DRM_SCHED_POLICY_FIFO is expected to used
"... is expected to _be_ used ..."
> + * when the KMD is scheduling directly on the hardware while a scheduling policy
I'd spell out "kernel-mode driver" since it makes it terse when reading a processed
DOC format, and having a three-letter abbreviation spelled out makes for an easier
reading experience. (There are too many three-letter abbreviations as is...)
"... directly _to_ the hardware ..." since, ultimately, the DRM scheduler just
pushes jobs to be executed to the hardware by the hardware and doesn't support
or control hardware preemption of jobs _on_ the hardware. (See what I did there? :-) )
> + * of DRM_SCHED_POLICY_SINGLE_ENTITY is expected to be used when there is a
> + * firmware scheduler.
> + *
Yeah, so that's a good explanation--thanks for writing this.
> * Note that once a job was taken from the entities queue and pushed to the
Please only use present tense in software documentation. No past, future, or
perfect tenses please.
* Note that once a job _is_ taken from the entities queue and pushed to the
> * hardware, i.e. the pending queue, the entity must not be referenced anymore
> * through the jobs entity pointer.
Yeah, another good explanation--thanks for including this.
> @@ -96,6 +114,8 @@ static inline void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *enti
>
> void drm_sched_rq_update_fifo(struct drm_sched_entity *entity, ktime_t ts)
> {
> + WARN_ON(!!entity->single_sched);
> +
> /*
> * Both locks need to be grabbed, one to protect from entity->rq change
> * for entity from within concurrent drm_sched_entity_select_rq and the
> @@ -126,6 +146,8 @@ void drm_sched_rq_update_fifo(struct drm_sched_entity *entity, ktime_t ts)
> static void drm_sched_rq_init(struct drm_gpu_scheduler *sched,
> struct drm_sched_rq *rq)
> {
> + WARN_ON(sched->sched_policy == DRM_SCHED_POLICY_SINGLE_ENTITY);
> +
> spin_lock_init(&rq->lock);
> INIT_LIST_HEAD(&rq->entities);
> rq->rb_tree_root = RB_ROOT_CACHED;
> @@ -144,6 +166,8 @@ static void drm_sched_rq_init(struct drm_gpu_scheduler *sched,
> void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
> struct drm_sched_entity *entity)
> {
> + WARN_ON(!!entity->single_sched);
> +
> if (!list_empty(&entity->list))
> return;
>
> @@ -166,6 +190,8 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
> void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
> struct drm_sched_entity *entity)
> {
> + WARN_ON(!!entity->single_sched);
> +
> if (list_empty(&entity->list))
> return;
>
> @@ -641,7 +667,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> struct drm_sched_entity *entity,
> void *owner)
> {
> - if (!entity->rq)
> + if (!entity->rq && !entity->single_sched)
> return -ENOENT;
>
> job->entity = entity;
> @@ -674,13 +700,16 @@ void drm_sched_job_arm(struct drm_sched_job *job)
> {
> struct drm_gpu_scheduler *sched;
> struct drm_sched_entity *entity = job->entity;
> + bool single_entity = !!entity->single_sched;
>
> BUG_ON(!entity);
> - drm_sched_entity_select_rq(entity);
> - sched = entity->rq->sched;
> + if (!single_entity)
> + drm_sched_entity_select_rq(entity);
> + sched = drm_sched_entity_to_scheduler(entity);
So here, I wonder, and I've a tiny exploratory request:
Could we "fake" an rq for the single-entity and thus remove (become unnecessary)
all those "if (single-entity)" and "if (!single-entity)".
If we keep adding code peppered with if () everywhere, over the years it'll become
hard to read. However, if we use maps to achieve choice and selection, such as entity->rq,
then you'd not need much of the "if (single-entity)" and "if (!single-entity)",
and the code would naturally stay mostly the same and the sched selection would
still be abstracted out via the entity->rq.
What do you think?
>
> job->sched = sched;
> - job->s_priority = entity->rq - sched->sched_rq;
> + if (!single_entity)
> + job->s_priority = entity->rq - sched->sched_rq;
> job->id = atomic64_inc_return(&sched->job_id_count);
>
> drm_sched_fence_init(job->s_fence, job->entity);
> @@ -896,6 +925,14 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
> if (!drm_sched_can_queue(sched))
> return NULL;
>
> + if (sched->single_entity) {
> + if (!READ_ONCE(sched->single_entity->stopped) &&
> + drm_sched_entity_is_ready(sched->single_entity))
> + return sched->single_entity;
> +
> + return NULL;
> + }
> +
> /* Kernel run queue has higher priority than normal run queue*/
> for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
> entity = sched->sched_policy == DRM_SCHED_POLICY_FIFO ?
> @@ -1092,6 +1129,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> return -EINVAL;
>
> sched->ops = ops;
> + sched->single_entity = NULL;
> sched->hw_submission_limit = hw_submission;
> sched->name = name;
> if (!submit_wq) {
> @@ -1111,7 +1149,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> sched->dev = dev;
> sched->sched_policy = sched_policy == DRM_SCHED_POLICY_UNSET ?
> drm_sched_policy_default : sched_policy;
> - for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_COUNT; i++)
> + for (i = DRM_SCHED_PRIORITY_MIN; sched_policy !=
> + DRM_SCHED_POLICY_SINGLE_ENTITY && i < DRM_SCHED_PRIORITY_COUNT;
> + i++)
So, "sched_policy != DRM_SCHED_POLICY_SINGLE_ENTITY" is loop-invariant:
it never changes across iterations, so it can never cause the loop to
exit partway through. It's just a gate to executing the loop at all, and
I am used to seeing only conditions that actually vary in the for-loop
conditional.
I wonder if it is clearer to just say what is meant:
if (sched_policy != DRM_SCHED_POLICY_SINGLE_ENTITY) {
for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_COUNT; i++)
...
}
On a larger scheme of things, I believe it is a bit presumptuous to say:
struct drm_gpu_scheduler {
...
struct drm_sched_rq sched_rq[DRM_SCHED_PRIORITY_COUNT];
...
};
I mean, why does a scheduler have to implement all those priorities? Maybe it
wants to implement only one. :-)
Perhaps we can have,
struct drm_gpu_scheduler {
...
u32 num_rqs;
struct drm_sched_rq *sched_rq;
...
};
Which might make it easier to fake out an rq for single-entity and then leave
the code mostly intact, while also implementing single-entity.
It's not a gating issue, but perhaps it would create a cleaner code in the long
run? Maybe we should explore this?
> drm_sched_rq_init(sched, &sched->sched_rq[i]);
>
> init_waitqueue_head(&sched->job_scheduled);
> @@ -1143,7 +1183,15 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
>
> drm_sched_submit_stop(sched);
>
> - for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
> + if (sched->single_entity) {
> + spin_lock(&sched->single_entity->rq_lock);
> + sched->single_entity->stopped = true;
> + spin_unlock(&sched->single_entity->rq_lock);
> + }
> +
> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; sched->sched_policy !=
> + DRM_SCHED_POLICY_SINGLE_ENTITY && i >= DRM_SCHED_PRIORITY_MIN;
> + i--) {
> struct drm_sched_rq *rq = &sched->sched_rq[i];
Same sentiment here, as above.
--
Regards,
Luben
>
> spin_lock(&rq->lock);
> @@ -1186,6 +1234,8 @@ void drm_sched_increase_karma(struct drm_sched_job *bad)
> struct drm_sched_entity *entity;
> struct drm_gpu_scheduler *sched = bad->sched;
>
> + WARN_ON(sched->sched_policy == DRM_SCHED_POLICY_SINGLE_ENTITY);
> +
> /* don't change @bad's karma if it's from KERNEL RQ,
> * because sometimes GPU hang would cause kernel jobs (like VM updating jobs)
> * corrupt but keep in mind that kernel jobs always considered good.
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 9f830ff84bad..655675f797ea 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -79,6 +79,7 @@ enum drm_sched_policy {
> DRM_SCHED_POLICY_UNSET,
> DRM_SCHED_POLICY_RR,
> DRM_SCHED_POLICY_FIFO,
> + DRM_SCHED_POLICY_SINGLE_ENTITY,
> DRM_SCHED_POLICY_COUNT,
> };
>
> @@ -112,6 +113,9 @@ struct drm_sched_entity {
> */
> struct drm_sched_rq *rq;
>
> + /** @single_sched: Single scheduler */
> + struct drm_gpu_scheduler *single_sched;
> +
> /**
> * @sched_list:
> *
> @@ -473,6 +477,7 @@ struct drm_sched_backend_ops {
> * struct drm_gpu_scheduler - scheduler instance-specific data
> *
> * @ops: backend operations provided by the driver.
> + * @single_entity: Single entity for the scheduler
> * @hw_submission_limit: the max size of the hardware queue.
> * @timeout: the time after which a job is removed from the scheduler.
> * @name: name of the ring for which this scheduler is being used.
> @@ -504,6 +509,7 @@ struct drm_sched_backend_ops {
> */
> struct drm_gpu_scheduler {
> const struct drm_sched_backend_ops *ops;
> + struct drm_sched_entity *single_entity;
> uint32_t hw_submission_limit;
> long timeout;
> const char *name;
> @@ -587,6 +593,8 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> struct drm_gpu_scheduler **sched_list,
> unsigned int num_sched_list,
> atomic_t *guilty);
> +struct drm_gpu_scheduler *
> +drm_sched_entity_to_scheduler(struct drm_sched_entity *entity);
> long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout);
> void drm_sched_entity_fini(struct drm_sched_entity *entity);
> void drm_sched_entity_destroy(struct drm_sched_entity *entity);
* Re: [Intel-xe] [PATCH v4 05/10] drm/sched: Split free_job into own work item
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 05/10] drm/sched: Split free_job into own work item Matthew Brost
@ 2023-09-28 16:14 ` Luben Tuikov
2023-10-05 4:06 ` Matthew Brost
0 siblings, 1 reply; 45+ messages in thread
From: Luben Tuikov @ 2023-09-28 16:14 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, boris.brezillon, dakr, donald.robson, daniel, lina,
airlied, christian.koenig, faith.ekstrand
On 2023-09-19 01:01, Matthew Brost wrote:
> Rather than call free_job and run_job in same work item have a dedicated
> work item for each. This aligns with the design and intended use of work
> queues.
>
> v2:
> - Test for DMA_FENCE_FLAG_TIMESTAMP_BIT before setting
> timestamp in free_job() work item (Danilo)
> v3:
> - Drop forward dec of drm_sched_select_entity (Boris)
> - Return in drm_sched_run_job_work if entity NULL (Boris)
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 290 +++++++++++++++----------
> include/drm/gpu_scheduler.h | 8 +-
> 2 files changed, 182 insertions(+), 116 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 588c735f7498..1e21d234fb5c 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -213,11 +213,12 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
> * drm_sched_rq_select_entity_rr - Select an entity which could provide a job to run
> *
> * @rq: scheduler run queue to check.
> + * @dequeue: dequeue selected entity
Change this to "peek" as indicated below.
> *
> * Try to find a ready entity, returns NULL if none found.
> */
> static struct drm_sched_entity *
> -drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
> +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq, bool dequeue)
> {
> struct drm_sched_entity *entity;
>
> @@ -227,8 +228,10 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
> if (entity) {
> list_for_each_entry_continue(entity, &rq->entities, list) {
> if (drm_sched_entity_is_ready(entity)) {
> - rq->current_entity = entity;
> - reinit_completion(&entity->entity_idle);
> + if (dequeue) {
> + rq->current_entity = entity;
> + reinit_completion(&entity->entity_idle);
> + }
Please rename "dequeue" or invert its logic, as from this patch it seems that
it is hiding (gating out) current behaviour.
Ideally, I'd prefer it be inverted, so that current behaviour, i.e. what people
are used to the rq_select_entity_*() to do, is default--preserved.
Perhaps use "peek" as the name of this new variable, to indicate that
we're not setting it to be the current entity.
I prefer "peek" to the alternatives, as it tells me "Hey, I'm only
peeking at the rq and not really doing the default behaviour you're
used to." So, probably use "peek". ("Peek" also has historical
significance...)
> spin_unlock(&rq->lock);
> return entity;
> }
> @@ -238,8 +241,10 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
> list_for_each_entry(entity, &rq->entities, list) {
>
> if (drm_sched_entity_is_ready(entity)) {
> - rq->current_entity = entity;
> - reinit_completion(&entity->entity_idle);
> + if (dequeue) {
if (!peek) {
> + rq->current_entity = entity;
> + reinit_completion(&entity->entity_idle);
> + }
> spin_unlock(&rq->lock);
> return entity;
> }
> @@ -257,11 +262,12 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
> * drm_sched_rq_select_entity_fifo - Select an entity which provides a job to run
> *
> * @rq: scheduler run queue to check.
> + * @dequeue: dequeue selected entity
* @peek: Just find, don't set to current.
> *
* Find oldest waiting ready entity, returns NULL if none found.
*/
> static struct drm_sched_entity *
> -drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
> +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq, bool dequeue)
> {
> struct rb_node *rb;
>
> @@ -271,8 +277,10 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>
> entity = rb_entry(rb, struct drm_sched_entity, rb_tree_node);
> if (drm_sched_entity_is_ready(entity)) {
> - rq->current_entity = entity;
> - reinit_completion(&entity->entity_idle);
> + if (dequeue) {
if (!peek) {
> + rq->current_entity = entity;
> + reinit_completion(&entity->entity_idle);
> + }
> break;
> }
> }
> @@ -282,13 +290,102 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
> }
>
> /**
> - * drm_sched_submit_queue - scheduler queue submission
> + * drm_sched_run_job_queue - queue job submission
> + * @sched: scheduler instance
> + */
Perhaps it would be clearer to a DOC reader if there were verbs
in this function comment? I feel this was mentioned in the review
to patch 2...
> +static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
> +{
> + if (!READ_ONCE(sched->pause_submit))
> + queue_work(sched->submit_wq, &sched->work_run_job);
> +}
> +
> +/**
> + * drm_sched_can_queue -- Can we queue more to the hardware?
> + * @sched: scheduler instance
> + *
> + * Return true if we can push more jobs to the hw, otherwise false.
> + */
> +static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
> +{
> + return atomic_read(&sched->hw_rq_count) <
> + sched->hw_submission_limit;
> +}
> +
> +/**
> + * drm_sched_select_entity - Select next entity to process
> + *
> + * @sched: scheduler instance
> + * @dequeue: dequeue selected entity
When I see "dequeue" I'm thinking "list_del()". Let's
use "peek" here as mentioned above.
> + *
> + * Returns the entity to process or NULL if none are found.
> + */
> +static struct drm_sched_entity *
> +drm_sched_select_entity(struct drm_gpu_scheduler *sched, bool dequeue)
drm_sched_select_entity(struct drm_gpu_scheduler *sched, bool peek)
> +{
> + struct drm_sched_entity *entity;
> + int i;
> +
> + if (!drm_sched_can_queue(sched))
> + return NULL;
> +
> + if (sched->single_entity) {
> + if (!READ_ONCE(sched->single_entity->stopped) &&
> + drm_sched_entity_is_ready(sched->single_entity))
> + return sched->single_entity;
> +
> + return NULL;
> + }
> +
> + /* Kernel run queue has higher priority than normal run queue*/
> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
> + entity = sched->sched_policy == DRM_SCHED_POLICY_FIFO ?
> + drm_sched_rq_select_entity_fifo(&sched->sched_rq[i],
> + dequeue) :
> + drm_sched_rq_select_entity_rr(&sched->sched_rq[i],
> + dequeue);
> + if (entity)
> + break;
> + }
> +
> + return entity;
> +}
> +
> +/**
> + * drm_sched_run_job_queue_if_ready - queue job submission if ready
> * @sched: scheduler instance
> */
> -static void drm_sched_submit_queue(struct drm_gpu_scheduler *sched)
> +static void drm_sched_run_job_queue_if_ready(struct drm_gpu_scheduler *sched)
> +{
> + if (drm_sched_select_entity(sched, false))
> + drm_sched_run_job_queue(sched);
> +}
> +
> +/**
> + * drm_sched_free_job_queue - queue free job
* drm_sched_free_job_queue - enqueue free-job work
> + *
> + * @sched: scheduler instance to queue free job
* @sched: scheduler instance to queue free job work for
> + */
> +static void drm_sched_free_job_queue(struct drm_gpu_scheduler *sched)
> {
> if (!READ_ONCE(sched->pause_submit))
> - queue_work(sched->submit_wq, &sched->work_submit);
> + queue_work(sched->submit_wq, &sched->work_free_job);
> +}
> +
> +/**
> + * drm_sched_free_job_queue_if_ready - queue free job if ready
* drm_sched_free_job_queue_if_ready - enqueue free-job work if ready
> + *
> + * @sched: scheduler instance to queue free job
> + */
> +static void drm_sched_free_job_queue_if_ready(struct drm_gpu_scheduler *sched)
> +{
> + struct drm_sched_job *job;
> +
> + spin_lock(&sched->job_list_lock);
> + job = list_first_entry_or_null(&sched->pending_list,
> + struct drm_sched_job, list);
> + if (job && dma_fence_is_signaled(&job->s_fence->finished))
> + drm_sched_free_job_queue(sched);
> + spin_unlock(&sched->job_list_lock);
> }
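For what it's worth, the logic above is just a locked peek: look only at the head of pending_list and enqueue free-job work if that job's finished fence has signaled; later entries don't matter because jobs retire in submission order. A rough userspace model of the decision (names invented for illustration, not the kernel API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pending-list node; only the signaled state matters here. */
struct job_model {
	bool finished_signaled;
	struct job_model *next;
};

/* Mirrors the decision in drm_sched_free_job_queue_if_ready(): peek at
 * the list head and report whether free-job work should be queued. The
 * list is not modified; the actual dequeue happens later, in the
 * free-job worker itself. */
static bool should_queue_free_job(const struct job_model *pending_head)
{
	return pending_head && pending_head->finished_signaled;
}
```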
>
> /**
> @@ -310,7 +407,7 @@ static void drm_sched_job_done(struct drm_sched_job *s_job, int result)
> dma_fence_get(&s_fence->finished);
> drm_sched_fence_finished(s_fence, result);
> dma_fence_put(&s_fence->finished);
> - drm_sched_submit_queue(sched);
> + drm_sched_free_job_queue(sched);
> }
>
> /**
> @@ -885,18 +982,6 @@ void drm_sched_job_cleanup(struct drm_sched_job *job)
> }
> EXPORT_SYMBOL(drm_sched_job_cleanup);
>
> -/**
> - * drm_sched_can_queue -- Can we queue more to the hardware?
> - * @sched: scheduler instance
> - *
> - * Return true if we can push more jobs to the hw, otherwise false.
> - */
> -static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
> -{
> - return atomic_read(&sched->hw_rq_count) <
> - sched->hw_submission_limit;
> -}
> -
> /**
> * drm_sched_wakeup_if_can_queue - Wake up the scheduler
> * @sched: scheduler instance
> @@ -906,43 +991,7 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
> void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched)
> {
> if (drm_sched_can_queue(sched))
> - drm_sched_submit_queue(sched);
> -}
> -
> -/**
> - * drm_sched_select_entity - Select next entity to process
> - *
> - * @sched: scheduler instance
> - *
> - * Returns the entity to process or NULL if none are found.
> - */
> -static struct drm_sched_entity *
> -drm_sched_select_entity(struct drm_gpu_scheduler *sched)
> -{
> - struct drm_sched_entity *entity;
> - int i;
> -
> - if (!drm_sched_can_queue(sched))
> - return NULL;
> -
> - if (sched->single_entity) {
> - if (!READ_ONCE(sched->single_entity->stopped) &&
> - drm_sched_entity_is_ready(sched->single_entity))
> - return sched->single_entity;
> -
> - return NULL;
> - }
> -
> - /* Kernel run queue has higher priority than normal run queue*/
> - for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
> - entity = sched->sched_policy == DRM_SCHED_POLICY_FIFO ?
> - drm_sched_rq_select_entity_fifo(&sched->sched_rq[i]) :
> - drm_sched_rq_select_entity_rr(&sched->sched_rq[i]);
> - if (entity)
> - break;
> - }
> -
> - return entity;
> + drm_sched_run_job_queue(sched);
> }
>
> /**
> @@ -974,8 +1023,10 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
> typeof(*next), list);
>
> if (next) {
> - next->s_fence->scheduled.timestamp =
> - job->s_fence->finished.timestamp;
> + if (test_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT,
> + &next->s_fence->scheduled.flags))
> + next->s_fence->scheduled.timestamp =
> + job->s_fence->finished.timestamp;
> /* start TO timer for next job */
> drm_sched_start_timeout(sched);
> }
> @@ -1025,74 +1076,84 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
> EXPORT_SYMBOL(drm_sched_pick_best);
>
> /**
> - * drm_sched_main - main scheduler thread
> + * drm_sched_free_job_work - worker to call free_job
> *
> - * @param: scheduler instance
> + * @w: free job work
> */
> -static void drm_sched_main(struct work_struct *w)
> +static void drm_sched_free_job_work(struct work_struct *w)
> {
> struct drm_gpu_scheduler *sched =
> - container_of(w, struct drm_gpu_scheduler, work_submit);
> - struct drm_sched_entity *entity;
> + container_of(w, struct drm_gpu_scheduler, work_free_job);
> struct drm_sched_job *cleanup_job;
> - int r;
>
> if (READ_ONCE(sched->pause_submit))
> return;
>
> cleanup_job = drm_sched_get_cleanup_job(sched);
> - entity = drm_sched_select_entity(sched);
> -
> - if (!entity && !cleanup_job)
> - return; /* No more work */
> -
> - if (cleanup_job)
> + if (cleanup_job) {
> sched->ops->free_job(cleanup_job);
>
> - if (entity) {
> - struct dma_fence *fence;
> - struct drm_sched_fence *s_fence;
> - struct drm_sched_job *sched_job;
> -
> - sched_job = drm_sched_entity_pop_job(entity);
> - if (!sched_job) {
> - complete_all(&entity->entity_idle);
> - if (!cleanup_job)
> - return; /* No more work */
> - goto again;
> - }
> + drm_sched_free_job_queue_if_ready(sched);
> + drm_sched_run_job_queue_if_ready(sched);
> + }
> +}
> +
> +/**
> + * drm_sched_run_job_work - worker to call run_job
> + *
> + * @w: run job work
> + */
> +static void drm_sched_run_job_work(struct work_struct *w)
> +{
> + struct drm_gpu_scheduler *sched =
> + container_of(w, struct drm_gpu_scheduler, work_run_job);
> + struct drm_sched_entity *entity;
> + struct dma_fence *fence;
> + struct drm_sched_fence *s_fence;
> + struct drm_sched_job *sched_job;
> + int r;
>
> - s_fence = sched_job->s_fence;
> + if (READ_ONCE(sched->pause_submit))
> + return;
>
> - atomic_inc(&sched->hw_rq_count);
> - drm_sched_job_begin(sched_job);
> + entity = drm_sched_select_entity(sched, true);
> + if (!entity)
> + return;
>
> - trace_drm_run_job(sched_job, entity);
> - fence = sched->ops->run_job(sched_job);
> + sched_job = drm_sched_entity_pop_job(entity);
> + if (!sched_job) {
> complete_all(&entity->entity_idle);
> - drm_sched_fence_scheduled(s_fence, fence);
> + return; /* No more work */
> + }
>
> - if (!IS_ERR_OR_NULL(fence)) {
> - /* Drop for original kref_init of the fence */
> - dma_fence_put(fence);
> + s_fence = sched_job->s_fence;
>
> - r = dma_fence_add_callback(fence, &sched_job->cb,
> - drm_sched_job_done_cb);
> - if (r == -ENOENT)
> - drm_sched_job_done(sched_job, fence->error);
> - else if (r)
> - DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n",
> - r);
> - } else {
> - drm_sched_job_done(sched_job, IS_ERR(fence) ?
> - PTR_ERR(fence) : 0);
> - }
> + atomic_inc(&sched->hw_rq_count);
> + drm_sched_job_begin(sched_job);
> +
> + trace_drm_run_job(sched_job, entity);
> + fence = sched->ops->run_job(sched_job);
> + complete_all(&entity->entity_idle);
> + drm_sched_fence_scheduled(s_fence, fence);
>
> - wake_up(&sched->job_scheduled);
> + if (!IS_ERR_OR_NULL(fence)) {
> + /* Drop for original kref_init of the fence */
> + dma_fence_put(fence);
> +
> + r = dma_fence_add_callback(fence, &sched_job->cb,
> + drm_sched_job_done_cb);
> + if (r == -ENOENT)
> + drm_sched_job_done(sched_job, fence->error);
> + else if (r)
> + DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n",
> + r);
Please align "r);" with the opening parenthesis on the previous line. If you're using Emacs
with sane Linux settings, press the "Tab" key anywhere on the line to indent it.
(It should run c-indent-line-or-region, usually in leading-tabs-only mode. Pressing
it again, over and over, on an already indented line does nothing. Column indenting, say
for 2D/3D/etc. array initializers, usually happens using spaces, which is portable.
Also please take an overview with "scripts/checkpatch.pl --strict".)
Note that the line-length limit was bumped to 100 columns in the Linux kernel, so you
can also put the 'r' on the same line without style problems.
> + } else {
> + drm_sched_job_done(sched_job, IS_ERR(fence) ?
> + PTR_ERR(fence) : 0);
> }
>
> -again:
> - drm_sched_submit_queue(sched);
> + wake_up(&sched->job_scheduled);
> + drm_sched_run_job_queue_if_ready(sched);
> }
>
> /**
> @@ -1159,7 +1220,8 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> spin_lock_init(&sched->job_list_lock);
> atomic_set(&sched->hw_rq_count, 0);
> INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> - INIT_WORK(&sched->work_submit, drm_sched_main);
> + INIT_WORK(&sched->work_run_job, drm_sched_run_job_work);
> + INIT_WORK(&sched->work_free_job, drm_sched_free_job_work);
> atomic_set(&sched->_score, 0);
> atomic64_set(&sched->job_id_count, 0);
> sched->pause_submit = false;
> @@ -1286,7 +1348,8 @@ EXPORT_SYMBOL(drm_sched_submit_ready);
> void drm_sched_submit_stop(struct drm_gpu_scheduler *sched)
> {
> WRITE_ONCE(sched->pause_submit, true);
> - cancel_work_sync(&sched->work_submit);
> + cancel_work_sync(&sched->work_run_job);
> + cancel_work_sync(&sched->work_free_job);
> }
> EXPORT_SYMBOL(drm_sched_submit_stop);
>
> @@ -1298,6 +1361,7 @@ EXPORT_SYMBOL(drm_sched_submit_stop);
> void drm_sched_submit_start(struct drm_gpu_scheduler *sched)
> {
> WRITE_ONCE(sched->pause_submit, false);
> - queue_work(sched->submit_wq, &sched->work_submit);
> + queue_work(sched->submit_wq, &sched->work_run_job);
> + queue_work(sched->submit_wq, &sched->work_free_job);
> }
> EXPORT_SYMBOL(drm_sched_submit_start);
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 655675f797ea..7e6c121003ca 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -487,9 +487,10 @@ struct drm_sched_backend_ops {
> * finished.
> * @hw_rq_count: the number of jobs currently in the hardware queue.
> * @job_id_count: used to assign unique id to the each job.
> - * @submit_wq: workqueue used to queue @work_submit
> + * @submit_wq: workqueue used to queue @work_run_job and @work_free_job
> * @timeout_wq: workqueue used to queue @work_tdr
> - * @work_submit: schedules jobs and cleans up entities
> + * @work_run_job: schedules jobs
> + * @work_free_job: cleans up jobs
> * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
> * timeout interval is over.
> * @pending_list: the list of jobs which are currently in the job queue.
> @@ -519,7 +520,8 @@ struct drm_gpu_scheduler {
> atomic64_t job_id_count;
> struct workqueue_struct *submit_wq;
> struct workqueue_struct *timeout_wq;
> - struct work_struct work_submit;
> + struct work_struct work_run_job;
> + struct work_struct work_free_job;
> struct delayed_work work_tdr;
> struct list_head pending_list;
> spinlock_t job_list_lock;
Yeah, so this is a good patch. Thanks for doing this!
--
Regards,
Luben
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [Intel-xe] [PATCH v4 06/10] drm/sched: Add drm_sched_start_timeout_unlocked helper
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 06/10] drm/sched: Add drm_sched_start_timeout_unlocked helper Matthew Brost
@ 2023-09-29 21:23 ` Luben Tuikov
0 siblings, 0 replies; 45+ messages in thread
From: Luben Tuikov @ 2023-09-29 21:23 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, boris.brezillon, dakr, donald.robson, daniel, lina,
airlied, christian.koenig, faith.ekstrand
Hi,
On 2023-09-19 01:01, Matthew Brost wrote:
> Also add a lockdep assert to drm_sched_start_timeout.
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Thanks for this patch!
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 23 +++++++++++++----------
> 1 file changed, 13 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 1e21d234fb5c..09ef07b9e9d5 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -431,11 +431,20 @@ static void drm_sched_job_done_cb(struct dma_fence *f, struct dma_fence_cb *cb)
> */
> static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
> {
> + lockdep_assert_held(&sched->job_list_lock);
> +
> if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> !list_empty(&sched->pending_list))
> queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
> }
>
> +static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
> +{
> + spin_lock(&sched->job_list_lock);
> + drm_sched_start_timeout(sched);
> + spin_unlock(&sched->job_list_lock);
> +}
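The hunk above is a common kernel idiom: the core helper asserts (via lockdep) that the caller holds job_list_lock, and a thin _unlocked wrapper exists for call sites that don't. A toy model of the pair, with a bool standing in for both the spinlock and the lockdep check (illustrative only, not the kernel API):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: 'held' stands in for both spin_lock() and
 * lockdep_assert_held() on sched->job_list_lock. */
struct timeout_model {
	bool held;
	int timer_starts;
};

/* Mirrors drm_sched_start_timeout(): may only run with the lock held. */
static void start_timeout(struct timeout_model *m)
{
	assert(m->held);	/* lockdep_assert_held() in the real code */
	m->timer_starts++;
}

/* Mirrors drm_sched_start_timeout_unlocked(): takes the lock itself. */
static void start_timeout_unlocked(struct timeout_model *m)
{
	assert(!m->held);	/* taking it twice would deadlock a real spinlock */
	m->held = true;
	start_timeout(m);
	m->held = false;
}
```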
> +
> /**
> * drm_sched_fault - immediately start timeout handler
> *
> @@ -548,11 +557,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
> spin_unlock(&sched->job_list_lock);
> }
>
> - if (status != DRM_GPU_SCHED_STAT_ENODEV) {
> - spin_lock(&sched->job_list_lock);
> - drm_sched_start_timeout(sched);
> - spin_unlock(&sched->job_list_lock);
> - }
> + if (status != DRM_GPU_SCHED_STAT_ENODEV)
> + drm_sched_start_timeout_unlocked(sched);
> }
>
> /**
> @@ -678,11 +684,8 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
> drm_sched_job_done(s_job, -ECANCELED);
> }
>
> - if (full_recovery) {
> - spin_lock(&sched->job_list_lock);
> - drm_sched_start_timeout(sched);
> - spin_unlock(&sched->job_list_lock);
> - }
> + if (full_recovery)
> + drm_sched_start_timeout_unlocked(sched);
>
> drm_sched_submit_start(sched);
> }
--
Regards,
Luben
* Re: [Intel-xe] [PATCH v4 07/10] drm/sched: Start submission before TDR in drm_sched_start
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 07/10] drm/sched: Start submission before TDR in drm_sched_start Matthew Brost
@ 2023-09-29 21:53 ` Luben Tuikov
2023-09-30 19:48 ` Luben Tuikov
0 siblings, 1 reply; 45+ messages in thread
From: Luben Tuikov @ 2023-09-29 21:53 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, boris.brezillon, dakr, donald.robson, daniel, lina,
airlied, christian.koenig, faith.ekstrand
Hi,
On 2023-09-19 01:01, Matthew Brost wrote:
> If the TDR is set to a very small value it can fire before the
> submission is started in the function drm_sched_start. The submission is
> expected to be running when the TDR fires; fix this ordering so this
> expectation is always met.
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 09ef07b9e9d5..a5cc9b6c2faa 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -684,10 +684,10 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
> drm_sched_job_done(s_job, -ECANCELED);
> }
>
> + drm_sched_submit_start(sched);
> +
> if (full_recovery)
> drm_sched_start_timeout_unlocked(sched);
> -
> - drm_sched_submit_start(sched);
> }
> EXPORT_SYMBOL(drm_sched_start);
No.
A timeout timer should be started before we submit anything down to the hardware.
See Message-ID: <ed3aca10-8a9f-4698-92f4-21558fa6cfe3@amd.com>,
and Message-ID: <8e5eab14-9e55-42c9-b6ea-02fcc591266d@amd.com>.
You shouldn't start TDR at an arbitrarily late time after job
submission to the hardware. To close this, the timer is started
before jobs are submitted to the hardware.
One possibility is to increase the timeout timer value.
--
Regards,
Luben
* Re: [Intel-xe] [PATCH v4 08/10] drm/sched: Submit job before starting TDR
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 08/10] drm/sched: Submit job before starting TDR Matthew Brost
@ 2023-09-29 21:58 ` Luben Tuikov
2023-10-05 4:11 ` Matthew Brost
0 siblings, 1 reply; 45+ messages in thread
From: Luben Tuikov @ 2023-09-29 21:58 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, boris.brezillon, dakr, donald.robson, daniel, lina,
airlied, christian.koenig, faith.ekstrand
Hi,
On 2023-09-19 01:01, Matthew Brost wrote:
> If the TDR is set to a value, it can fire before a job is submitted in
> drm_sched_main. The job should always be submitted before the TDR
> fires; fix this ordering.
>
> v2:
> - Add to pending list before run_job, start TDR after (Luben, Boris)
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index a5cc9b6c2faa..e8a3e6033f66 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -517,7 +517,6 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>
> spin_lock(&sched->job_list_lock);
> list_add_tail(&s_job->list, &sched->pending_list);
> - drm_sched_start_timeout(sched);
> spin_unlock(&sched->job_list_lock);
> }
>
> @@ -1138,6 +1137,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
> fence = sched->ops->run_job(sched_job);
> complete_all(&entity->entity_idle);
> drm_sched_fence_scheduled(s_fence, fence);
> + drm_sched_start_timeout_unlocked(sched);
>
> if (!IS_ERR_OR_NULL(fence)) {
> /* Drop for original kref_init of the fence */
No.
See Message-ID: <ed3aca10-8a9f-4698-92f4-21558fa6cfe3@amd.com>,
and Message-ID: <8e5eab14-9e55-42c9-b6ea-02fcc591266d@amd.com>,
and Message-ID: <24bc965f-61fb-4b92-9afa-360ca85a53af@amd.com>.
--
Regards,
Luben
* Re: [Intel-xe] [PATCH v4 09/10] drm/sched: Add helper to queue TDR immediately for current and future jobs
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 09/10] drm/sched: Add helper to queue TDR immediately for current and future jobs Matthew Brost
@ 2023-09-29 22:44 ` Luben Tuikov
2023-10-05 3:22 ` Matthew Brost
0 siblings, 1 reply; 45+ messages in thread
From: Luben Tuikov @ 2023-09-29 22:44 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, boris.brezillon, dakr, donald.robson, daniel, lina,
airlied, christian.koenig, faith.ekstrand
On 2023-09-19 01:01, Matthew Brost wrote:
> Add helper to queue TDR immediately for current and future jobs. This
> will be used in XE, new Intel GPU driver, to trigger the TDR to cleanup
Please use present tense, "is", in code, comments, commits, etc.
Is it "XE" or is it "Xe"? I always thought it was "Xe".
This is used in Xe, a new Intel GPU driver, to trigger a TDR to clean up
Code, comments, commits, etc., immediately become history, and it's a bit
ambitious to use future tense in something which immediately becomes
history. It's much better to describe what is happening now; the patch
in question (any patch, for that matter) is considered part of the
"now"/current state as well.
> a drm_scheduler that encounter error[.]
> v2:
> - Drop timeout args, rename function, use mod delayed work (Luben)
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 19 ++++++++++++++++++-
> include/drm/gpu_scheduler.h | 1 +
> 2 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index e8a3e6033f66..88ef8be2d3c7 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -435,7 +435,7 @@ static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>
> if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> !list_empty(&sched->pending_list))
> - queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
> + mod_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
> }
>
> static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
> @@ -445,6 +445,23 @@ static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
> spin_unlock(&sched->job_list_lock);
> }
>
> +/**
> + * drm_sched_tdr_queue_imm: - immediately start timeout handler including future
> + * jobs
Let's not mention "including future jobs", since we don't know the future.
But we can sneak "jobs" into the description like this:
* drm_sched_tdr_queue_imm - immediately start job timeout handler
:-)
> + *
> + * @sched: scheduler where the timeout handling should be started.
"where" --> "for which"
The former denotes a location, like in space-time, and the latter
denotes an object, like a scheduler, a spaceship, a bicycle, etc.
> + *
> + * Start timeout handling immediately for current and future jobs
* Start timeout handling immediately for the named scheduler.
> + */
> +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched)
> +{
> + spin_lock(&sched->job_list_lock);
> + sched->timeout = 0;
> + drm_sched_start_timeout(sched);
> + spin_unlock(&sched->job_list_lock);
> +}
> +EXPORT_SYMBOL(drm_sched_tdr_queue_imm);
> +
> /**
> * drm_sched_fault - immediately start timeout handler
> *
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 7e6c121003ca..27f5778bbd6d 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -568,6 +568,7 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> struct drm_gpu_scheduler **sched_list,
> unsigned int num_sched_list);
>
> +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched);
> void drm_sched_job_cleanup(struct drm_sched_job *job);
> void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched);
> bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched);
Looks good!
Fix the above, for an immediate R-B. :-)
--
Regards,
Luben
* Re: [Intel-xe] [PATCH v4 07/10] drm/sched: Start submission before TDR in drm_sched_start
2023-09-29 21:53 ` Luben Tuikov
@ 2023-09-30 19:48 ` Luben Tuikov
2023-10-05 3:11 ` Matthew Brost
0 siblings, 1 reply; 45+ messages in thread
From: Luben Tuikov @ 2023-09-30 19:48 UTC (permalink / raw)
To: Matthew Brost, dri-devel, intel-xe
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, boris.brezillon, dakr, donald.robson, daniel, lina,
airlied, christian.koenig, faith.ekstrand
On 2023-09-29 17:53, Luben Tuikov wrote:
> Hi,
>
> On 2023-09-19 01:01, Matthew Brost wrote:
>> If the TDR is set to a very small value it can fire before the
>> submission is started in the function drm_sched_start. The submission is
>> expected to be running when the TDR fires; fix this ordering so this
>> expectation is always met.
>>
>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>> ---
>> drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 09ef07b9e9d5..a5cc9b6c2faa 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -684,10 +684,10 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>> drm_sched_job_done(s_job, -ECANCELED);
>> }
>>
>> + drm_sched_submit_start(sched);
>> +
>> if (full_recovery)
>> drm_sched_start_timeout_unlocked(sched);
>> -
>> - drm_sched_submit_start(sched);
>> }
>> EXPORT_SYMBOL(drm_sched_start);
>
> No.
>
> A timeout timer should be started before we submit anything down to the hardware.
> See Message-ID: <ed3aca10-8a9f-4698-92f4-21558fa6cfe3@amd.com>,
> and Message-ID: <8e5eab14-9e55-42c9-b6ea-02fcc591266d@amd.com>.
>
> You shouldn't start TDR at an arbitrarily late time after job
> submission to the hardware. To close this, the timer is started
> before jobs are submitted to the hardware.
>
> One possibility is to increase the timeout timer value.
If we went with this general change as we see here and in the subsequent patch--starting
the TDR _after_ submitting jobs for execution to the hardware--this is what generally happens,
1. submit one or many jobs for execution;
2. one or many jobs may execute, complete, hang, etc.;
3. at some arbitrary time in the future, start TDR.
Which means that the timeout doesn't necessarily track the time allotted for a job to finish
executing in the hardware. It ends up larger than intended.
--
Regards,
Luben
* Re: [Intel-xe] [PATCH v4 07/10] drm/sched: Start submission before TDR in drm_sched_start
2023-09-30 19:48 ` Luben Tuikov
@ 2023-10-05 3:11 ` Matthew Brost
2023-10-05 3:18 ` Luben Tuikov
0 siblings, 1 reply; 45+ messages in thread
From: Matthew Brost @ 2023-10-05 3:11 UTC (permalink / raw)
To: Luben Tuikov
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, dri-devel, christian.koenig, boris.brezillon, dakr,
donald.robson, daniel, lina, airlied, intel-xe, faith.ekstrand
On Sat, Sep 30, 2023 at 03:48:07PM -0400, Luben Tuikov wrote:
> On 2023-09-29 17:53, Luben Tuikov wrote:
> > Hi,
> >
> > On 2023-09-19 01:01, Matthew Brost wrote:
> >> If the TDR is set to a very small value it can fire before the
> >> submission is started in the function drm_sched_start. The submission is
> >> expected to be running when the TDR fires; fix this ordering so this
> >> expectation is always met.
> >>
> >> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> >> ---
> >> drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
> >> 1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >> index 09ef07b9e9d5..a5cc9b6c2faa 100644
> >> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >> @@ -684,10 +684,10 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
> >> drm_sched_job_done(s_job, -ECANCELED);
> >> }
> >>
> >> + drm_sched_submit_start(sched);
> >> +
> >> if (full_recovery)
> >> drm_sched_start_timeout_unlocked(sched);
> >> -
> >> - drm_sched_submit_start(sched);
> >> }
> >> EXPORT_SYMBOL(drm_sched_start);
> >
> > No.
> >
I don't think we will ever agree on this, but I pulled this patch and
the next out of Xe. It seems to work without these changes; I believe I
understand why, and think it should actually work without this change. If
for some reason it didn't work, I know how I can work around this in the
Xe submission backend.
With this, I will drop these in the next rev.
But more on why I disagree below...
> > A timeout timer should be started before we submit anything down to the hardware.
> > See Message-ID: <ed3aca10-8a9f-4698-92f4-21558fa6cfe3@amd.com>,
> > and Message-ID: <8e5eab14-9e55-42c9-b6ea-02fcc591266d@amd.com>.
> >
> > You shouldn't start TDR at an arbitrarily late time after job
> > submission to the hardware. To close this, the timer is started
> > before jobs are submitted to the hardware.
> >
> > One possibility is to increase the timeout timer value.
No matter what the timeout value is there will always be a race of TDR
firing before run_job() is called.
>
> If we went with this general change as we see here and in the subsequent patch--starting
> the TDR _after_ submitting jobs for execution to the hardware--this is what generally happens,
> 1. submit one or many jobs for execution;
> 2. one or many jobs may execute, complete, hang, etc.;
> 3. at some arbitrary time in the future, start TDR.
> Which means that the timeout doesn't necessarily track the time allotted for a job to finish
> executing in the hardware. It ends up larger than intended.
Yes, conversely it can be smaller the way it is coded now. Kinda just a
matter of opinion on which one to prefer.
Matt
> --
> Regards,
> Luben
>
* Re: [Intel-xe] [PATCH v4 07/10] drm/sched: Start submission before TDR in drm_sched_start
2023-10-05 3:11 ` Matthew Brost
@ 2023-10-05 3:18 ` Luben Tuikov
0 siblings, 0 replies; 45+ messages in thread
From: Luben Tuikov @ 2023-10-05 3:18 UTC (permalink / raw)
To: Matthew Brost
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, dri-devel, christian.koenig, boris.brezillon, dakr,
donald.robson, daniel, lina, airlied, intel-xe, faith.ekstrand
On 2023-10-04 23:11, Matthew Brost wrote:
> On Sat, Sep 30, 2023 at 03:48:07PM -0400, Luben Tuikov wrote:
>> On 2023-09-29 17:53, Luben Tuikov wrote:
>>> Hi,
>>>
>>> On 2023-09-19 01:01, Matthew Brost wrote:
>>>> If the TDR is set to a very small value it can fire before the
>>>> submission is started in the function drm_sched_start. The submission is
>>>> expected to be running when the TDR fires; fix this ordering so this
>>>> expectation is always met.
>>>>
>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>> ---
>>>> drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 09ef07b9e9d5..a5cc9b6c2faa 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -684,10 +684,10 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>>>> drm_sched_job_done(s_job, -ECANCELED);
>>>> }
>>>>
>>>> + drm_sched_submit_start(sched);
>>>> +
>>>> if (full_recovery)
>>>> drm_sched_start_timeout_unlocked(sched);
>>>> -
>>>> - drm_sched_submit_start(sched);
>>>> }
>>>> EXPORT_SYMBOL(drm_sched_start);
>>>
>>> No.
>>>
>
> I don't think we will ever agree on this but I pulled out this patch and
> the next in Xe. It seems to work without these changes, I believe
> understand why and think it should actually work without this change. If
> for some reason it didn't work, I know how I can work around this in the
> Xe submission backend.
>
> With this, I will drop these in the next rev.
>
> But more on why I disagree below...
>
>>> A timeout timer should be started before we submit anything down to the hardware.
>>> See Message-ID: <ed3aca10-8a9f-4698-92f4-21558fa6cfe3@amd.com>,
>>> and Message-ID: <8e5eab14-9e55-42c9-b6ea-02fcc591266d@amd.com>.
>>>
>>> You shouldn't start TDR at an arbitrarily late time after job
>>> submission to the hardware. To close this, the timer is started
>>> before jobs are submitted to the hardware.
>>>
>>> One possibility is to increase the timeout timer value.
>
> No matter what the timeout value is there will always be a race of TDR
> firing before run_job() is called.
It's not a "race".
In all software and firmware I've seen, a timeout timer is started _before_
a command is submitted to firmware or hardware, respectively.
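To make the budget argument concrete with a toy timeline (pure illustration, not kernel code): if a one-shot watchdog is armed at time 'armed' with period 'timeout', it fires at armed + timeout, so a job submitted at time 'submitted' that hangs is allowed to run for armed + timeout - submitted ticks.

```c
#include <assert.h>

/* Effective run budget of a hung job under a one-shot watchdog. */
static int effective_budget(int submitted, int armed, int timeout)
{
	return armed + timeout - submitted;
}
```

Arming before submission keeps the budget at or below the configured timeout; arming some arbitrary delay after submission inflates the budget by exactly that delay, which is the point above.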
>
>>
>> If we went with this general change as we see here and in the subsequent patch--starting
>> the TDR _after_ submitting jobs for execution to the hardware--this is what generally happens,
>> 1. submit one or many jobs for execution;
>> 2. one or many jobs may execute, complete, hang, etc.;
>> 3. at some arbitrary time in the future, start TDR.
>> Which means that the timeout doesn't necessarily track the time allotted for a job to finish
>> executing in the hardware. It ends up larger than intended.
>
> Yes, conversely it can be smaller the way it is coded now. Kinda just a
> matter of opinion on which one to prefer.
It should be large enough to contain the command/task/job making it to the hardware.
We want to make sure there's no runaway job, _for_ the amount of time allotted
to each job.
--
Regards,
Luben
* Re: [Intel-xe] [PATCH v4 09/10] drm/sched: Add helper to queue TDR immediately for current and future jobs
2023-09-29 22:44 ` Luben Tuikov
@ 2023-10-05 3:22 ` Matthew Brost
0 siblings, 0 replies; 45+ messages in thread
From: Matthew Brost @ 2023-10-05 3:22 UTC (permalink / raw)
To: Luben Tuikov
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, dri-devel, christian.koenig, boris.brezillon, dakr,
donald.robson, daniel, lina, airlied, intel-xe, faith.ekstrand
On Fri, Sep 29, 2023 at 06:44:53PM -0400, Luben Tuikov wrote:
> On 2023-09-19 01:01, Matthew Brost wrote:
> > Add helper to queue TDR immediately for current and future jobs. This
> > will be used in XE, new Intel GPU driver, to trigger the TDR to cleanup
>
> Please use present tense, "is", in code, comments, commits, etc.
>
> Is it "XE" or is it "Xe"? I always thought it was "Xe".
>
Yeah, it should be 'Xe'.
> This is used in Xe, a new Intel GPU driver, to trigger a TDR to clean up
>
Will fix.
> Code, comments, commits, etc., immediately become history, and it's a bit
> ambitious to use future tense in something which immediately becomes
> history. It's much better to describe what is happening now, including the patch
> in question (any patch, ftm) is considered "now"/"current state" as well.
>
Got it.
> > a drm_scheduler that encounter error[.]
> >
> > v2:
> > - Drop timeout args, rename function, use mod delayed work (Luben)
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/scheduler/sched_main.c | 19 ++++++++++++++++++-
> > include/drm/gpu_scheduler.h | 1 +
> > 2 files changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index e8a3e6033f66..88ef8be2d3c7 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -435,7 +435,7 @@ static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
> >
> > if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> > !list_empty(&sched->pending_list))
> > - queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
> > + mod_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
> > }
> >
> > static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
> > @@ -445,6 +445,23 @@ static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
> > spin_unlock(&sched->job_list_lock);
> > }
> >
> > +/**
> > + * drm_sched_tdr_queue_imm: - immediately start timeout handler including future
> > + * jobs
>
> Let's not mention "including future jobs", since we don't know the future.
> But we can sneak "jobs" into the description like this:
>
> * drm_sched_tdr_queue_imm - immediately start job timeout handler
>
> :-)
>
Will change.
> > + *
> > + * @sched: scheduler where the timeout handling should be started.
>
> "where" --> "for which"
> The former denotes a location, like in space-time, and the latter
> denotes an object, like a scheduler, a spaceship, a bicycle, etc.
>
+1
> > + *
> > + * Start timeout handling immediately for current and future jobs
>
> * Start timeout handling immediately for the named scheduler.
>
+1
> > + */
> > +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched)
> > +{
> > + spin_lock(&sched->job_list_lock);
> > + sched->timeout = 0;
> > + drm_sched_start_timeout(sched);
> > + spin_unlock(&sched->job_list_lock);
> > +}
> > +EXPORT_SYMBOL(drm_sched_tdr_queue_imm);
> > +
> > /**
> > * drm_sched_fault - immediately start timeout handler
> > *
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 7e6c121003ca..27f5778bbd6d 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -568,6 +568,7 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> > struct drm_gpu_scheduler **sched_list,
> > unsigned int num_sched_list);
> >
> > +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched);
> > void drm_sched_job_cleanup(struct drm_sched_job *job);
> > void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched);
> > bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched);
>
> Looks good!
>
> Fix the above, for an immediate R-B. :-)
Thanks for the review, will fix all of this.
Matt
> --
> Regards,
> Luben
>
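For readers following the `queue_delayed_work()` to `mod_delayed_work()` change above: the difference matters because `queue_delayed_work()` is a no-op when the work is already pending, so an already-armed TDR could not be pulled forward by `drm_sched_tdr_queue_imm()` setting `sched->timeout = 0`. A toy model of the two semantics (hypothetical stand-ins, not the kernel API):

```c
/* Toy model of delayed work: queue_delayed() keeps an existing expiry,
 * mod_delayed() always resets it -- which is what lets an immediate
 * (delay == 0) TDR request override a pending long timeout. */
#include <stdbool.h>

struct toy_dwork {
	bool pending;
	long expires;
};

static bool toy_queue_delayed(struct toy_dwork *w, long now, long delay)
{
	if (w->pending)
		return false;		/* already queued: old expiry wins */
	w->pending = true;
	w->expires = now + delay;
	return true;
}

static void toy_mod_delayed(struct toy_dwork *w, long now, long delay)
{
	w->pending = true;
	w->expires = now + delay;	/* always reset the expiry */
}
```

So with the `mod_delayed_work()` variant, zeroing the timeout and re-calling `drm_sched_start_timeout()` fires the handler right away even if the timer was already running.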
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread
2023-09-27 3:32 ` Luben Tuikov
@ 2023-10-05 3:33 ` Matthew Brost
2023-10-05 4:13 ` Luben Tuikov
0 siblings, 1 reply; 45+ messages in thread
From: Matthew Brost @ 2023-10-05 3:33 UTC (permalink / raw)
To: Luben Tuikov
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, dri-devel, christian.koenig, boris.brezillon, dakr,
donald.robson, daniel, lina, airlied, intel-xe, faith.ekstrand
On Tue, Sep 26, 2023 at 11:32:10PM -0400, Luben Tuikov wrote:
> Hi,
>
> On 2023-09-19 01:01, Matthew Brost wrote:
> > In XE, the new Intel GPU driver, a choice has made to have a 1 to 1
> > mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
> > seems a bit odd but let us explain the reasoning below.
> >
> > 1. In XE the submission order from multiple drm_sched_entity is not
> > guaranteed to be the same completion even if targeting the same hardware
> > engine. This is because in XE we have a firmware scheduler, the GuC,
> > which allowed to reorder, timeslice, and preempt submissions. If a using
> > shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
> > apart as the TDR expects submission order == completion order. Using a
> > dedicated drm_gpu_scheduler per drm_sched_entity solve this problem.
> >
> > 2. In XE submissions are done via programming a ring buffer (circular
> > buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the
> > limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
> > control on the ring for free.
> >
> > A problem with this design is currently a drm_gpu_scheduler uses a
> > kthread for submission / job cleanup. This doesn't scale if a large
> > number of drm_gpu_scheduler are used. To work around the scaling issue,
> > use a worker rather than kthread for submission / job cleanup.
> >
> > v2:
> > - (Rob Clark) Fix msm build
> > - Pass in run work queue
> > v3:
> > - (Boris) don't have loop in worker
> > v4:
> > - (Tvrtko) break out submit ready, stop, start helpers into own patch
> > v5:
> > - (Boris) default to ordered work queue
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
> > drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +-
> > drivers/gpu/drm/lima/lima_sched.c | 2 +-
> > drivers/gpu/drm/msm/msm_ringbuffer.c | 2 +-
> > drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +-
> > drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
> > drivers/gpu/drm/scheduler/sched_main.c | 118 ++++++++++-----------
> > drivers/gpu/drm/v3d/v3d_sched.c | 10 +-
> > include/drm/gpu_scheduler.h | 14 ++-
> > 9 files changed, 79 insertions(+), 75 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index e366f61c3aed..16f3cfe1574a 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -2279,7 +2279,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
> > break;
> > }
> >
> > - r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
> > + r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL,
> > ring->num_hw_submission, 0,
> > timeout, adev->reset_domain->wq,
> > ring->sched_score, ring->name,
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > index 345fec6cb1a4..618a804ddc34 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > @@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
> > {
> > int ret;
> >
> > - ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops,
> > + ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
> > etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
> > msecs_to_jiffies(500), NULL, NULL,
> > dev_name(gpu->dev), gpu->dev);
> > diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> > index ffd91a5ee299..8d858aed0e56 100644
> > --- a/drivers/gpu/drm/lima/lima_sched.c
> > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > @@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
> >
> > INIT_WORK(&pipe->recover_work, lima_sched_recover_work);
> >
> > - return drm_sched_init(&pipe->base, &lima_sched_ops, 1,
> > + return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
> > lima_job_hang_limit,
> > msecs_to_jiffies(timeout), NULL,
> > NULL, name, pipe->ldev->dev);
> > diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
> > index 40c0bc35a44c..b8865e61b40f 100644
> > --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
> > +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
> > @@ -94,7 +94,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
> > /* currently managing hangcheck ourselves: */
> > sched_timeout = MAX_SCHEDULE_TIMEOUT;
> >
> > - ret = drm_sched_init(&ring->sched, &msm_sched_ops,
> > + ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
> > num_hw_submissions, 0, sched_timeout,
> > NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
>
> checkpatch.pl complains here about unmatched open parens.
>
Will fix and run checkpatch before posting next rev.
> > if (ret) {
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
> > index 88217185e0f3..d458c2227d4f 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
> > @@ -429,7 +429,7 @@ int nouveau_sched_init(struct nouveau_drm *drm)
> > if (!drm->sched_wq)
> > return -ENOMEM;
> >
> > - return drm_sched_init(sched, &nouveau_sched_ops,
> > + return drm_sched_init(sched, &nouveau_sched_ops, NULL,
> > NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
> > NULL, NULL, "nouveau_sched", drm->dev->dev);
> > }
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index 033f5e684707..326ca1ddf1d7 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -831,7 +831,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
> > js->queue[j].fence_context = dma_fence_context_alloc(1);
> >
> > ret = drm_sched_init(&js->queue[j].sched,
> > - &panfrost_sched_ops,
> > + &panfrost_sched_ops, NULL,
> > nentries, 0,
> > msecs_to_jiffies(JOB_TIMEOUT_MS),
> > pfdev->reset.wq,
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index e4fa62abca41..ee6281942e36 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -48,7 +48,6 @@
> > * through the jobs entity pointer.
> > */
> >
> > -#include <linux/kthread.h>
> > #include <linux/wait.h>
> > #include <linux/sched.h>
> > #include <linux/completion.h>
> > @@ -256,6 +255,16 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
> > return rb ? rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL;
> > }
> >
> > +/**
> > + * drm_sched_submit_queue - scheduler queue submission
>
> There is no verb in the description, and is not clear what
> this function does unless one reads the code. Given that this
> is DOC, this should be clearer here. Something like "queue
> scheduler work to be executed" or something to that effect.
>
Will fix.
> Coming back to this from reading the patch below, it was somewhat
> unclear what "drm_sched_submit_queue()" does, since when reading
> below, "submit" was being read by my mind as an adjective, as opposed
> to a verb. Perhaps something like:
>
> drm_sched_queue_submit(), or
> drm_sched_queue_exec(), or
> drm_sched_queue_push(), or something to that effect. You pick. :-)
>
I prefer the names as is. In this patch we have:
drm_sched_submit_queue()
drm_sched_submit_start()
drm_sched_submit_stop()
drm_sched_submit_ready()
I like that all of these functions start with 'drm_sched_submit', which
makes it easy to search for the functions that touch the DRM scheduler
submission state.
With slightly better documentation, are you fine with the names as is?
> Note that it doesn't have to be 100% reflective of the fact that
> we're putting this on a workqueue and it would be executed sooner
> or later, so long as it conveys the fact that we're executing this
> scheduler queue.
>
> > + * @sched: scheduler instance
> > + */
> > +static void drm_sched_submit_queue(struct drm_gpu_scheduler *sched)
> > +{
> > + if (!READ_ONCE(sched->pause_submit))
> > + queue_work(sched->submit_wq, &sched->work_submit);
> > +}
> > +
> > /**
> > * drm_sched_job_done - complete a job
> > * @s_job: pointer to the job which is done
> > @@ -275,7 +284,7 @@ static void drm_sched_job_done(struct drm_sched_job *s_job, int result)
> > dma_fence_get(&s_fence->finished);
> > drm_sched_fence_finished(s_fence, result);
> > dma_fence_put(&s_fence->finished);
> > - wake_up_interruptible(&sched->wake_up_worker);
> > + drm_sched_submit_queue(sched);
> > }
> >
> > /**
> > @@ -868,7 +877,7 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
> > void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched)
> > {
> > if (drm_sched_can_queue(sched))
> > - wake_up_interruptible(&sched->wake_up_worker);
> > + drm_sched_submit_queue(sched);
> > }
> >
> > /**
> > @@ -978,61 +987,42 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
> > }
> > EXPORT_SYMBOL(drm_sched_pick_best);
> >
> > -/**
> > - * drm_sched_blocked - check if the scheduler is blocked
> > - *
> > - * @sched: scheduler instance
> > - *
> > - * Returns true if blocked, otherwise false.
> > - */
> > -static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
> > -{
> > - if (kthread_should_park()) {
> > - kthread_parkme();
> > - return true;
> > - }
> > -
> > - return false;
> > -}
> > -
> > /**
> > * drm_sched_main - main scheduler thread
> > *
> > * @param: scheduler instance
> > - *
> > - * Returns 0.
> > */
> > -static int drm_sched_main(void *param)
> > +static void drm_sched_main(struct work_struct *w)
> > {
> > - struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
> > + struct drm_gpu_scheduler *sched =
> > + container_of(w, struct drm_gpu_scheduler, work_submit);
> > + struct drm_sched_entity *entity;
> > + struct drm_sched_job *cleanup_job;
> > int r;
> >
> > - sched_set_fifo_low(current);
> > + if (READ_ONCE(sched->pause_submit))
> > + return;
> >
> > - while (!kthread_should_stop()) {
> > - struct drm_sched_entity *entity = NULL;
> > - struct drm_sched_fence *s_fence;
> > - struct drm_sched_job *sched_job;
> > - struct dma_fence *fence;
> > - struct drm_sched_job *cleanup_job = NULL;
> > + cleanup_job = drm_sched_get_cleanup_job(sched);
> > + entity = drm_sched_select_entity(sched);
> >
> > - wait_event_interruptible(sched->wake_up_worker,
> > - (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
> > - (!drm_sched_blocked(sched) &&
> > - (entity = drm_sched_select_entity(sched))) ||
> > - kthread_should_stop());
> > + if (!entity && !cleanup_job)
> > + return; /* No more work */
> >
> > - if (cleanup_job)
> > - sched->ops->free_job(cleanup_job);
> > + if (cleanup_job)
> > + sched->ops->free_job(cleanup_job);
> >
> > - if (!entity)
> > - continue;
> > + if (entity) {
> > + struct dma_fence *fence;
> > + struct drm_sched_fence *s_fence;
> > + struct drm_sched_job *sched_job;
> >
> > sched_job = drm_sched_entity_pop_job(entity);
> > -
> > if (!sched_job) {
> > complete_all(&entity->entity_idle);
> > - continue;
> > + if (!cleanup_job)
> > + return; /* No more work */
> > + goto again;
> > }
> >
> > s_fence = sched_job->s_fence;
> > @@ -1063,7 +1053,9 @@ static int drm_sched_main(void *param)
> >
> > wake_up(&sched->job_scheduled);
> > }
> > - return 0;
> > +
> > +again:
> > + drm_sched_submit_queue(sched);
> > }
> >
> > /**
> > @@ -1071,6 +1063,8 @@ static int drm_sched_main(void *param)
> > *
> > * @sched: scheduler instance
> > * @ops: backend operations for this scheduler
> > + * @submit_wq: workqueue to use for submission. If NULL, an ordered wq is
> > + * allocated and used
> > * @hw_submission: number of hw submissions that can be in flight
> > * @hang_limit: number of times to allow a job to hang before dropping it
> > * @timeout: timeout value in jiffies for the scheduler
> > @@ -1084,14 +1078,25 @@ static int drm_sched_main(void *param)
> > */
> > int drm_sched_init(struct drm_gpu_scheduler *sched,
> > const struct drm_sched_backend_ops *ops,
> > + struct workqueue_struct *submit_wq,
> > unsigned hw_submission, unsigned hang_limit,
> > long timeout, struct workqueue_struct *timeout_wq,
> > atomic_t *score, const char *name, struct device *dev)
> > {
> > - int i, ret;
> > + int i;
> > sched->ops = ops;
> > sched->hw_submission_limit = hw_submission;
> > sched->name = name;
> > + if (!submit_wq) {
> > + sched->submit_wq = alloc_ordered_workqueue(name, 0);
> > + if (!sched->submit_wq)
> > + return -ENOMEM;
> > +
> > + sched->alloc_submit_wq = true;
> > + } else {
> > + sched->submit_wq = submit_wq;
> > + sched->alloc_submit_wq = false;
> > + }
>
> This if-conditional, I would've written:
>
> if (submit_wq) {
> sched->submit_wq = submit_wq;
> sched->alloc_submit_wq = false;
> } else {
> sched->submit_wq = alloc_ordered_workqueue(name, 0);
> if (!sched->submit_wq)
> return -ENOMEM;
>
> sched->alloc_submit_wq = true;
> }
>
> It's easier to understand testing for positivity, than negativity.
>
+1, will do this all in future patches.
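The ownership rule behind `alloc_submit_wq` can be sketched with hypothetical userspace stand-ins for `alloc_ordered_workqueue()`/`destroy_workqueue()` (toy types, not the kernel ones):

```c
/* Sketch of the workqueue ownership rule discussed above: the scheduler
 * only destroys a submit workqueue it allocated itself, never one a
 * caller passed in. */
#include <stdbool.h>
#include <stdlib.h>

struct toy_wq { int dummy; };

struct toy_sched {
	struct toy_wq *submit_wq;
	bool own_submit_wq;	/* Luben's suggested name */
};

static int toy_sched_init(struct toy_sched *s, struct toy_wq *submit_wq)
{
	if (submit_wq) {			/* caller-provided: borrowed */
		s->submit_wq = submit_wq;
		s->own_submit_wq = false;
	} else {				/* fall back to our own wq */
		s->submit_wq = malloc(sizeof(*s->submit_wq));
		if (!s->submit_wq)
			return -1;		/* stands in for -ENOMEM */
		s->own_submit_wq = true;
	}
	return 0;
}

static void toy_sched_fini(struct toy_sched *s)
{
	if (s->own_submit_wq)
		free(s->submit_wq);	/* only free what we allocated */
	s->submit_wq = NULL;
}
```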
>
> > sched->timeout = timeout;
> > sched->timeout_wq = timeout_wq ? : system_wq;
> > sched->hang_limit = hang_limit;
> > @@ -1100,23 +1105,15 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> > for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_COUNT; i++)
> > drm_sched_rq_init(sched, &sched->sched_rq[i]);
> >
> > - init_waitqueue_head(&sched->wake_up_worker);
> > init_waitqueue_head(&sched->job_scheduled);
> > INIT_LIST_HEAD(&sched->pending_list);
> > spin_lock_init(&sched->job_list_lock);
> > atomic_set(&sched->hw_rq_count, 0);
> > INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> > + INIT_WORK(&sched->work_submit, drm_sched_main);
> > atomic_set(&sched->_score, 0);
> > atomic64_set(&sched->job_id_count, 0);
> > -
> > - /* Each scheduler will run on a seperate kernel thread */
> > - sched->thread = kthread_run(drm_sched_main, sched, sched->name);
> > - if (IS_ERR(sched->thread)) {
> > - ret = PTR_ERR(sched->thread);
> > - sched->thread = NULL;
> > - DRM_DEV_ERROR(sched->dev, "Failed to create scheduler for %s.\n", name);
> > - return ret;
> > - }
> > + sched->pause_submit = false;
> >
> > sched->ready = true;
> > return 0;
> > @@ -1135,8 +1132,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
> > struct drm_sched_entity *s_entity;
> > int i;
> >
> > - if (sched->thread)
> > - kthread_stop(sched->thread);
> > + drm_sched_submit_stop(sched);
> >
> > for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
> > struct drm_sched_rq *rq = &sched->sched_rq[i];
> > @@ -1159,6 +1155,8 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
> > /* Confirm no work left behind accessing device structures */
> > cancel_delayed_work_sync(&sched->work_tdr);
> >
> > + if (sched->alloc_submit_wq)
> > + destroy_workqueue(sched->submit_wq);
> > sched->ready = false;
> > }
> > EXPORT_SYMBOL(drm_sched_fini);
> > @@ -1216,7 +1214,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
> > */
> > bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched)
> > {
> > - return !!sched->thread;
> > + return sched->ready;
> >
> > }
> > EXPORT_SYMBOL(drm_sched_submit_ready);
> > @@ -1228,7 +1226,8 @@ EXPORT_SYMBOL(drm_sched_submit_ready);
> > */
> > void drm_sched_submit_stop(struct drm_gpu_scheduler *sched)
> > {
> > - kthread_park(sched->thread);
> > + WRITE_ONCE(sched->pause_submit, true);
> > + cancel_work_sync(&sched->work_submit);
> > }
> > EXPORT_SYMBOL(drm_sched_submit_stop);
> >
> > @@ -1239,6 +1238,7 @@ EXPORT_SYMBOL(drm_sched_submit_stop);
> > */
> > void drm_sched_submit_start(struct drm_gpu_scheduler *sched)
> > {
> > - kthread_unpark(sched->thread);
> > + WRITE_ONCE(sched->pause_submit, false);
> > + queue_work(sched->submit_wq, &sched->work_submit);
> > }
> > EXPORT_SYMBOL(drm_sched_submit_start);
> > diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
> > index 06238e6d7f5c..38e092ea41e6 100644
> > --- a/drivers/gpu/drm/v3d/v3d_sched.c
> > +++ b/drivers/gpu/drm/v3d/v3d_sched.c
> > @@ -388,7 +388,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> > int ret;
> >
> > ret = drm_sched_init(&v3d->queue[V3D_BIN].sched,
> > - &v3d_bin_sched_ops,
> > + &v3d_bin_sched_ops, NULL,
> > hw_jobs_limit, job_hang_limit,
> > msecs_to_jiffies(hang_limit_ms), NULL,
> > NULL, "v3d_bin", v3d->drm.dev);
> > @@ -396,7 +396,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> > return ret;
> >
> > ret = drm_sched_init(&v3d->queue[V3D_RENDER].sched,
> > - &v3d_render_sched_ops,
> > + &v3d_render_sched_ops, NULL,
> > hw_jobs_limit, job_hang_limit,
> > msecs_to_jiffies(hang_limit_ms), NULL,
> > NULL, "v3d_render", v3d->drm.dev);
> > @@ -404,7 +404,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> > goto fail;
> >
> > ret = drm_sched_init(&v3d->queue[V3D_TFU].sched,
> > - &v3d_tfu_sched_ops,
> > + &v3d_tfu_sched_ops, NULL,
> > hw_jobs_limit, job_hang_limit,
> > msecs_to_jiffies(hang_limit_ms), NULL,
> > NULL, "v3d_tfu", v3d->drm.dev);
> > @@ -413,7 +413,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> >
> > if (v3d_has_csd(v3d)) {
> > ret = drm_sched_init(&v3d->queue[V3D_CSD].sched,
> > - &v3d_csd_sched_ops,
> > + &v3d_csd_sched_ops, NULL,
> > hw_jobs_limit, job_hang_limit,
> > msecs_to_jiffies(hang_limit_ms), NULL,
> > NULL, "v3d_csd", v3d->drm.dev);
> > @@ -421,7 +421,7 @@ v3d_sched_init(struct v3d_dev *v3d)
> > goto fail;
> >
> > ret = drm_sched_init(&v3d->queue[V3D_CACHE_CLEAN].sched,
> > - &v3d_cache_clean_sched_ops,
> > + &v3d_cache_clean_sched_ops, NULL,
> > hw_jobs_limit, job_hang_limit,
> > msecs_to_jiffies(hang_limit_ms), NULL,
> > NULL, "v3d_cache_clean", v3d->drm.dev);
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index f12c5aea5294..95927c52383c 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -473,17 +473,16 @@ struct drm_sched_backend_ops {
> > * @timeout: the time after which a job is removed from the scheduler.
> > * @name: name of the ring for which this scheduler is being used.
> > * @sched_rq: priority wise array of run queues.
> > - * @wake_up_worker: the wait queue on which the scheduler sleeps until a job
> > - * is ready to be scheduled.
> > * @job_scheduled: once @drm_sched_entity_do_release is called the scheduler
> > * waits on this wait queue until all the scheduled jobs are
> > * finished.
> > * @hw_rq_count: the number of jobs currently in the hardware queue.
> > * @job_id_count: used to assign unique id to the each job.
> > + * @submit_wq: workqueue used to queue @work_submit
> > * @timeout_wq: workqueue used to queue @work_tdr
> > + * @work_submit: schedules jobs and cleans up entities
> > * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
> > * timeout interval is over.
> > - * @thread: the kthread on which the scheduler which run.
> > * @pending_list: the list of jobs which are currently in the job queue.
> > * @job_list_lock: lock to protect the pending_list.
> > * @hang_limit: once the hangs by a job crosses this limit then it is marked
> > @@ -492,6 +491,8 @@ struct drm_sched_backend_ops {
> > * @_score: score used when the driver doesn't provide one
> > * @ready: marks if the underlying HW is ready to work
> > * @free_guilty: A hit to time out handler to free the guilty job.
> > + * @pause_submit: pause queuing of @work_submit on @submit_wq
> > + * @alloc_submit_wq: scheduler own allocation of @submit_wq
> > * @dev: system &struct device
> > *
> > * One scheduler is implemented for each hardware ring.
> > @@ -502,13 +503,13 @@ struct drm_gpu_scheduler {
> > long timeout;
> > const char *name;
> > struct drm_sched_rq sched_rq[DRM_SCHED_PRIORITY_COUNT];
> > - wait_queue_head_t wake_up_worker;
> > wait_queue_head_t job_scheduled;
> > atomic_t hw_rq_count;
> > atomic64_t job_id_count;
> > + struct workqueue_struct *submit_wq;
> > struct workqueue_struct *timeout_wq;
> > + struct work_struct work_submit;
> > struct delayed_work work_tdr;
> > - struct task_struct *thread;
> > struct list_head pending_list;
> > spinlock_t job_list_lock;
> > int hang_limit;
> > @@ -516,11 +517,14 @@ struct drm_gpu_scheduler {
> > atomic_t _score;
> > bool ready;
> > bool free_guilty;
> > + bool pause_submit;
> > + bool alloc_submit_wq;
>
> Please rename it to what it actually describes:
>
> alloc_submit_wq --> own_submit_wq
>
> to mean "do we own the submit wq". Then the check becomes
> the intuitive,
> if (sched->own_submit_wq)
> destroy_workqueue(sched->submit_wq);
>
Got it, agree.
> > struct device *dev;
> > };
> >
> > int drm_sched_init(struct drm_gpu_scheduler *sched,
> > const struct drm_sched_backend_ops *ops,
> > + struct workqueue_struct *submit_wq,
> > uint32_t hw_submission, unsigned hang_limit,
> > long timeout, struct workqueue_struct *timeout_wq,
> > atomic_t *score, const char *name, struct device *dev);
>
> This is a good patch.
Thanks for the review.
Matt
> --
> Regards,
> Luben
>
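The `pause_submit` gate that replaces `kthread_park()`/`kthread_unpark()` in the patch above can be modeled in userspace (C11 atomics standing in for READ_ONCE/WRITE_ONCE, a plain flag standing in for work on `submit_wq`):

```c
/* Toy model of the pause gate: submit_stop() flips pause_submit before
 * cancelling, so a concurrent drm_sched_job_done()-style caller cannot
 * re-queue the submit work while the scheduler is stopped. */
#include <stdatomic.h>
#include <stdbool.h>

struct toy_sched {
	atomic_bool pause_submit;
	bool work_queued;	/* stand-in for work pending on submit_wq */
};

static void toy_submit_queue(struct toy_sched *s)
{
	if (!atomic_load(&s->pause_submit))	/* READ_ONCE(pause_submit) */
		s->work_queued = true;		/* queue_work() */
}

static void toy_submit_stop(struct toy_sched *s)
{
	atomic_store(&s->pause_submit, true);	/* WRITE_ONCE(..., true) */
	s->work_queued = false;			/* cancel_work_sync() */
}

static void toy_submit_start(struct toy_sched *s)
{
	atomic_store(&s->pause_submit, false);
	s->work_queued = true;			/* queue_work() kick */
}
```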
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [Intel-xe] [PATCH v4 04/10] drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy
2023-09-27 14:36 ` Luben Tuikov
@ 2023-10-05 4:02 ` Matthew Brost
0 siblings, 0 replies; 45+ messages in thread
From: Matthew Brost @ 2023-10-05 4:02 UTC (permalink / raw)
To: Luben Tuikov
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, dri-devel, christian.koenig, boris.brezillon, dakr,
donald.robson, daniel, lina, airlied, intel-xe, faith.ekstrand
On Wed, Sep 27, 2023 at 10:36:49AM -0400, Luben Tuikov wrote:
> Hi,
>
> On 2023-09-19 01:01, Matthew Brost wrote:
> > DRM_SCHED_POLICY_SINGLE_ENTITY creates a 1 to 1 relationship between
> > scheduler and entity. No priorities or run queue used in this mode.
> > Intended for devices with firmware schedulers.
> >
> > v2:
> > - Drop sched / rq union (Luben)
> > v3:
> > - Don't pick entity if stopped in drm_sched_select_entity (Danilo)
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/scheduler/sched_entity.c | 69 ++++++++++++++++++------
> > drivers/gpu/drm/scheduler/sched_fence.c | 2 +-
> > drivers/gpu/drm/scheduler/sched_main.c | 64 +++++++++++++++++++---
> > include/drm/gpu_scheduler.h | 8 +++
> > 4 files changed, 120 insertions(+), 23 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index cf42e2265d64..437c50867c99 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -83,6 +83,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> > memset(entity, 0, sizeof(struct drm_sched_entity));
> > INIT_LIST_HEAD(&entity->list);
> > entity->rq = NULL;
> > + entity->single_sched = NULL;
> > entity->guilty = guilty;
> > entity->num_sched_list = num_sched_list;
> > entity->priority = priority;
> > @@ -90,8 +91,17 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> > RCU_INIT_POINTER(entity->last_scheduled, NULL);
> > RB_CLEAR_NODE(&entity->rb_tree_node);
> >
> > - if(num_sched_list)
> > - entity->rq = &sched_list[0]->sched_rq[entity->priority];
> > + if (num_sched_list) {
> > + if (sched_list[0]->sched_policy !=
> > + DRM_SCHED_POLICY_SINGLE_ENTITY) {
> > + entity->rq = &sched_list[0]->sched_rq[entity->priority];
> > + } else {
> > + if (num_sched_list != 1 || sched_list[0]->single_entity)
> > + return -EINVAL;
> > + sched_list[0]->single_entity = entity;
> > + entity->single_sched = sched_list[0];
> > + }
> > + }
>
> So much (checking for) negativity...:-)
> Perhaps the simplified form below?
>
> if (num_sched_list) {
> if (sched_list[0]->sched_policy !=
> DRM_SCHED_POLICY_SINGLE_ENTITY) {
> entity->rq = &sched_list[0]->sched_rq[entity->priority];
> } else if (num_sched_list == 1 && !sched_list[0]->single_entity) {
> sched_list[0]->single_entity = entity;
> entity->single_sched = sched_list[0];
> } else {
> return -EINVAL;
> }
> }
>
Will change.
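For what it's worth, the simplified branching can be exercised with toy stand-in types (hypothetical, not the kernel structs): a SINGLE_ENTITY scheduler takes exactly one scheduler in the list and binds at most one entity.

```c
/* Sketch of the simplified drm_sched_entity_init() branching suggested
 * above: default policy picks a run queue, SINGLE_ENTITY requires a
 * one-element list and an unbound scheduler, anything else is -EINVAL. */
#include <stddef.h>

enum toy_policy { TOY_POLICY_DEFAULT, TOY_POLICY_SINGLE_ENTITY };

struct toy_sched {
	enum toy_policy policy;
	struct toy_entity *single_entity;
};

struct toy_entity {
	struct toy_sched *single_sched;
	int rq;				/* -1: no run queue selected */
};

static int toy_entity_init(struct toy_entity *e,
			   struct toy_sched **list, unsigned int n)
{
	e->single_sched = NULL;
	e->rq = -1;

	if (n) {
		if (list[0]->policy != TOY_POLICY_SINGLE_ENTITY) {
			e->rq = 0;			/* pick a run queue */
		} else if (n == 1 && !list[0]->single_entity) {
			list[0]->single_entity = e;	/* 1:1 binding */
			e->single_sched = list[0];
		} else {
			return -22;			/* -EINVAL */
		}
	}
	return 0;
}
```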
> >
> > init_completion(&entity->entity_idle);
> >
> > @@ -124,7 +134,8 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> > struct drm_gpu_scheduler **sched_list,
> > unsigned int num_sched_list)
> > {
> > - WARN_ON(!num_sched_list || !sched_list);
> > + WARN_ON(!num_sched_list || !sched_list ||
> > + !!entity->single_sched);
> >
> > entity->sched_list = sched_list;
> > entity->num_sched_list = num_sched_list;
> > @@ -231,13 +242,15 @@ static void drm_sched_entity_kill(struct drm_sched_entity *entity)
> > {
> > struct drm_sched_job *job;
> > struct dma_fence *prev;
> > + bool single_entity = !!entity->single_sched;
> >
> > - if (!entity->rq)
> > + if (!entity->rq && !single_entity)
> > return;
> >
> > spin_lock(&entity->rq_lock);
> > entity->stopped = true;
> > - drm_sched_rq_remove_entity(entity->rq, entity);
> > + if (!single_entity)
> > + drm_sched_rq_remove_entity(entity->rq, entity);
> > spin_unlock(&entity->rq_lock);
> >
> > /* Make sure this entity is not used by the scheduler at the moment */
> > @@ -259,6 +272,20 @@ static void drm_sched_entity_kill(struct drm_sched_entity *entity)
> > dma_fence_put(prev);
> > }
> >
> > +/**
> > + * drm_sched_entity_to_scheduler - Schedule entity to GPU scheduler
>
> Please use verbs. Please?
>
> Fix:
> /**
> * drm_sched_entity_to_scheduler - Map a schedule entity to a GPU scheduler
>
> > + * @entity: scheduler entity
> > + *
> > + * Returns GPU scheduler for the entity
>
> Fix:
> * Given an entity, return its GPU scheduler.
>
Yep.
> > + */
> > +struct drm_gpu_scheduler *
> > +drm_sched_entity_to_scheduler(struct drm_sched_entity *entity)
> > +{
> > + bool single_entity = !!entity->single_sched;
> > +
> > + return single_entity ? entity->single_sched : entity->rq->sched;
> > +}
> > +
> > /**
> > * drm_sched_entity_flush - Flush a context entity
> > *
> > @@ -276,11 +303,12 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
> > struct drm_gpu_scheduler *sched;
> > struct task_struct *last_user;
> > long ret = timeout;
> > + bool single_entity = !!entity->single_sched;
> >
> > - if (!entity->rq)
> > + if (!entity->rq && !single_entity)
> > return 0;
> >
> > - sched = entity->rq->sched;
> > + sched = drm_sched_entity_to_scheduler(entity);
> > /**
> > * The client will not queue more IBs during this fini, consume existing
> > * queued IBs or discard them on SIGKILL
> > @@ -373,7 +401,7 @@ static void drm_sched_entity_wakeup(struct dma_fence *f,
> > container_of(cb, struct drm_sched_entity, cb);
> >
> > drm_sched_entity_clear_dep(f, cb);
> > - drm_sched_wakeup_if_can_queue(entity->rq->sched);
> > + drm_sched_wakeup_if_can_queue(drm_sched_entity_to_scheduler(entity));
> > }
> >
> > /**
> > @@ -387,6 +415,8 @@ static void drm_sched_entity_wakeup(struct dma_fence *f,
> > void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> > enum drm_sched_priority priority)
> > {
> > + WARN_ON(!!entity->single_sched);
> > +
> > spin_lock(&entity->rq_lock);
> > entity->priority = priority;
> > spin_unlock(&entity->rq_lock);
> > @@ -399,7 +429,7 @@ EXPORT_SYMBOL(drm_sched_entity_set_priority);
> > */
> > static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
> > {
> > - struct drm_gpu_scheduler *sched = entity->rq->sched;
> > + struct drm_gpu_scheduler *sched = drm_sched_entity_to_scheduler(entity);
> > struct dma_fence *fence = entity->dependency;
> > struct drm_sched_fence *s_fence;
> >
> > @@ -501,7 +531,8 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
> > * Update the entity's location in the min heap according to
> > * the timestamp of the next job, if any.
> > */
> > - if (entity->rq->sched->sched_policy == DRM_SCHED_POLICY_FIFO) {
> > + if (drm_sched_entity_to_scheduler(entity)->sched_policy ==
> > + DRM_SCHED_POLICY_FIFO) {
> > struct drm_sched_job *next;
> >
> > next = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
> > @@ -524,6 +555,8 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> > struct drm_gpu_scheduler *sched;
> > struct drm_sched_rq *rq;
> >
> > + WARN_ON(!!entity->single_sched);
> > +
> > /* single possible engine and already selected */
> > if (!entity->sched_list)
> > return;
> > @@ -573,12 +606,13 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> > void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> > {
> > struct drm_sched_entity *entity = sched_job->entity;
> > - bool first, fifo = entity->rq->sched->sched_policy ==
> > - DRM_SCHED_POLICY_FIFO;
> > + bool single_entity = !!entity->single_sched;
> > + bool first;
> > ktime_t submit_ts;
> >
> > trace_drm_sched_job(sched_job, entity);
> > - atomic_inc(entity->rq->sched->score);
> > + if (!single_entity)
> > + atomic_inc(entity->rq->sched->score);
> > WRITE_ONCE(entity->last_user, current->group_leader);
> >
> > /*
> > @@ -591,6 +625,10 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> >
> > /* first job wakes up scheduler */
> > if (first) {
> > + struct drm_gpu_scheduler *sched =
> > + drm_sched_entity_to_scheduler(entity);
> > + bool fifo = sched->sched_policy == DRM_SCHED_POLICY_FIFO;
> > +
> > /* Add the entity to the run queue */
> > spin_lock(&entity->rq_lock);
> > if (entity->stopped) {
> > @@ -600,13 +638,14 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> > return;
> > }
> >
> > - drm_sched_rq_add_entity(entity->rq, entity);
> > + if (!single_entity)
> > + drm_sched_rq_add_entity(entity->rq, entity);
> > spin_unlock(&entity->rq_lock);
> >
> > if (fifo)
> > drm_sched_rq_update_fifo(entity, submit_ts);
> >
> > - drm_sched_wakeup_if_can_queue(entity->rq->sched);
> > + drm_sched_wakeup_if_can_queue(sched);
> > }
> > }
> > EXPORT_SYMBOL(drm_sched_entity_push_job);
> > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> > index 06cedfe4b486..f6b926f5e188 100644
> > --- a/drivers/gpu/drm/scheduler/sched_fence.c
> > +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> > @@ -225,7 +225,7 @@ void drm_sched_fence_init(struct drm_sched_fence *fence,
> > {
> > unsigned seq;
> >
> > - fence->sched = entity->rq->sched;
> > + fence->sched = drm_sched_entity_to_scheduler(entity);
> > seq = atomic_inc_return(&entity->fence_seq);
> > dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
> > &fence->lock, entity->fence_context, seq);
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index f645f32977ed..588c735f7498 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -32,7 +32,8 @@
> > * backend operations to the scheduler like submitting a job to hardware run queue,
> > * returning the dependencies of a job etc.
> > *
> > - * The organisation of the scheduler is the following:
> > + * The organisation of the scheduler is the following for scheduling policies
> > + * DRM_SCHED_POLICY_RR and DRM_SCHED_POLICY_FIFO:
>
> Yes, so this was badly written to begin with. If we're adding more information,
> I'd write:
>
> * For scheduling policies DRM_SCHED_POLICY_RR and DRM_SCHED_POLICY_FIFO,
> * the scheduler organization is,
Yep.
>
> > *
> > * 1. Each hw run queue has one scheduler
> > * 2. Each scheduler has multiple run queues with different priorities
> > @@ -43,6 +44,23 @@
> > *
> > * The jobs in a entity are always scheduled in the order that they were pushed.
> > *
> > + * The organisation of the scheduler is the following for scheduling policy
> > + * DRM_SCHED_POLICY_SINGLE_ENTITY:
>
> Remember, it's a list, on large enough scale, thus,
>
> * For DRM_SCHED_POLICY_SINGLE_ENTITY, the organization of the scheduler is,
>
> > + *
> > + * 1. One to one relationship between scheduler and entity
> > + * 2. No priorities implemented per scheduler (single job queue)
> > + * 3. No run queues in scheduler rather jobs are directly dequeued from entity
> > + * 4. The entity maintains a queue of jobs that will be scheduled on the
> > + * hardware
>
> Good! But please fix,
>
> 4. The entity maintains a queue of jobs that will be scheduled _to_ the hardware.
>
> > + *
> > + * The jobs in a entity are always scheduled in the order that they were pushed
> > + * regardless of scheduling policy.
>
> Please add here,
> Single-entity scheduling is essentially a FIFO for jobs.
>
> > + *
> > + * A policy of DRM_SCHED_POLICY_RR or DRM_SCHED_POLICY_FIFO is expected to used
>
> "... is expected to _be_ used ..."
>
> > + * when the KMD is scheduling directly on the hardware while a scheduling policy
>
> I'd spell out "kernel-mode driver" since it makes it terse when reading a processed
> DOC format, and having a three-letter abbreviation spelled out makes for an easier
> reading experience. (There are too many three-letter abbreviations as is...)
>
> "... directly _to_ the hardware ..." since, ultimately, the DRM scheduler just
> pushes jobs to be executed to the hardware by the hardware and doesn't support
> or control hardware preemption of jobs _on_ the hardware. (See what I did there? :-) )
>
> > + * of DRM_SCHED_POLICY_SINGLE_ENTITY is expected to be used when there is a
> > + * firmware scheduler.
> > + *
>
> Yeah, so that's a good explanation--thanks for writing this.
>
> > * Note that once a job was taken from the entities queue and pushed to the
>
> Please only use present tense in software documentation. No past, future, or
> perfect tenses please.
>
> * Note that once a job _is_ taken from the entities queue and pushed to the
>
> > * hardware, i.e. the pending queue, the entity must not be referenced anymore
> > * through the jobs entity pointer.
>
> Yeah, another good explanation--thanks for including this.
>
Yes to all the wording changes.
> > @@ -96,6 +114,8 @@ static inline void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *enti
> >
> > void drm_sched_rq_update_fifo(struct drm_sched_entity *entity, ktime_t ts)
> > {
> > + WARN_ON(!!entity->single_sched);
> > +
> > /*
> > * Both locks need to be grabbed, one to protect from entity->rq change
> > * for entity from within concurrent drm_sched_entity_select_rq and the
> > @@ -126,6 +146,8 @@ void drm_sched_rq_update_fifo(struct drm_sched_entity *entity, ktime_t ts)
> > static void drm_sched_rq_init(struct drm_gpu_scheduler *sched,
> > struct drm_sched_rq *rq)
> > {
> > + WARN_ON(sched->sched_policy == DRM_SCHED_POLICY_SINGLE_ENTITY);
> > +
> > spin_lock_init(&rq->lock);
> > INIT_LIST_HEAD(&rq->entities);
> > rq->rb_tree_root = RB_ROOT_CACHED;
> > @@ -144,6 +166,8 @@ static void drm_sched_rq_init(struct drm_gpu_scheduler *sched,
> > void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
> > struct drm_sched_entity *entity)
> > {
> > + WARN_ON(!!entity->single_sched);
> > +
> > if (!list_empty(&entity->list))
> > return;
> >
> > @@ -166,6 +190,8 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
> > void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
> > struct drm_sched_entity *entity)
> > {
> > + WARN_ON(!!entity->single_sched);
> > +
> > if (list_empty(&entity->list))
> > return;
> >
> > @@ -641,7 +667,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> > struct drm_sched_entity *entity,
> > void *owner)
> > {
> > - if (!entity->rq)
> > + if (!entity->rq && !entity->single_sched)
> > return -ENOENT;
> >
> > job->entity = entity;
> > @@ -674,13 +700,16 @@ void drm_sched_job_arm(struct drm_sched_job *job)
> > {
> > struct drm_gpu_scheduler *sched;
> > struct drm_sched_entity *entity = job->entity;
> > + bool single_entity = !!entity->single_sched;
> >
> > BUG_ON(!entity);
> > - drm_sched_entity_select_rq(entity);
> > - sched = entity->rq->sched;
> > + if (!single_entity)
> > + drm_sched_entity_select_rq(entity);
> > + sched = drm_sched_entity_to_scheduler(entity);
>
> So here, I wonder, and I've a tiny exploratory request:
> Could we "fake" an rq for the single-entity and thus remove (become unnecessary)
> all those "if (single-entity)" and "if (!single-entity)".
>
> If we keep adding code peppered with if () everywhere, over the years it'll become
> hard to read. However, if we use maps to achieve choice and selection, such as entity->rq,
> then you'd not need much of the "if (single-entity)" and "if (!single-entity)",
> and the code would naturally stay mostly the same and the sched selection would
> still be abstracted out via the entity->rq.
>
> What do you think?
>
I looked into this a little and can't really think of an easy way to do
this. Wouldn't we just end up with a similar number of checks in the rq
code? I'd say for now let's just live with the if / else in a few places. If
we start adding more scheduling modes then perhaps we explore another
approach.
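(For reference, a minimal userspace sketch of the "fake rq" idea being discussed — all of these types, fields, and function names are hypothetical stand-ins, not the actual DRM scheduler API. The point is that if a stub rq is bound to the single entity, selection can go through entity->rq uniformly:)

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-ins for the real DRM scheduler types. */
struct rq;

struct entity {
	struct rq *rq;		/* real rq, or the fake one below */
	bool ready;
};

struct rq {
	struct entity *current_entity;
};

struct scheduler {
	struct rq single_rq;	/* "fake" rq backing the single entity */
	struct entity *single_entity;
};

/*
 * With a fake rq installed, selection goes through entity->rq in both
 * modes, so the if (single_entity) branches at the call sites go away.
 */
static struct entity *select_entity(struct rq *rq)
{
	struct entity *e = rq->current_entity;

	return (e && e->ready) ? e : NULL;
}

static void bind_single_entity(struct scheduler *s, struct entity *e)
{
	s->single_entity = e;
	s->single_rq.current_entity = e;
	e->rq = &s->single_rq;
}
```

The trade-off Matthew raises still applies, though: the rq code itself would then have to special-case the fake rq (list bookkeeping, FIFO tree updates, score accounting), so the checks would likely move into the rq layer rather than disappear.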
> >
> > job->sched = sched;
> > - job->s_priority = entity->rq - sched->sched_rq;
> > + if (!single_entity)
> > + job->s_priority = entity->rq - sched->sched_rq;
> > job->id = atomic64_inc_return(&sched->job_id_count);
> >
> > drm_sched_fence_init(job->s_fence, job->entity);
> > @@ -896,6 +925,14 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
> > if (!drm_sched_can_queue(sched))
> > return NULL;
> >
> > + if (sched->single_entity) {
> > + if (!READ_ONCE(sched->single_entity->stopped) &&
> > + drm_sched_entity_is_ready(sched->single_entity))
> > + return sched->single_entity;
> > +
> > + return NULL;
> > + }
> > +
> > /* Kernel run queue has higher priority than normal run queue*/
> > for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
> > entity = sched->sched_policy == DRM_SCHED_POLICY_FIFO ?
> > @@ -1092,6 +1129,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> > return -EINVAL;
> >
> > sched->ops = ops;
> > + sched->single_entity = NULL;
> > sched->hw_submission_limit = hw_submission;
> > sched->name = name;
> > if (!submit_wq) {
> > @@ -1111,7 +1149,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> > sched->dev = dev;
> > sched->sched_policy = sched_policy == DRM_SCHED_POLICY_UNSET ?
> > drm_sched_policy_default : sched_policy;
> > - for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_COUNT; i++)
> > + for (i = DRM_SCHED_PRIORITY_MIN; sched_policy !=
> > + DRM_SCHED_POLICY_SINGLE_ENTITY && i < DRM_SCHED_PRIORITY_COUNT;
> > + i++)
>
> So, "sched_policy != DRM_SCHED_POLICY_SINGLE_ENTITY" doesn't seem to be
> a loop-invariant, since it doesn't cause the loop to exit over iterations.
> It's just a gate to executing the loop. I am used to seeing only loop
> invariants in the for-loop conditional.
>
> I wonder if it is clearer to just say what is meant:
>
> if (sched_policy != DRM_SCHED_POLICY_SINGLE_ENTITY) {
> for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_COUNT; i++)
> ...
> }
>
Sure, will add an if statement.
> On a larger scheme of things, I believe it is a bit presumptuous to say:
>
> struct drm_gpu_scheduler {
> ...
> struct drm_sched_rq sched_rq[DRM_SCHED_PRIORITY_COUNT];
> ...
> };
>
> I mean, why does a scheduler have to implement all those priorities? Maybe it
> wants to implement only one. :-)
>
> Perhaps we can have,
>
> struct drm_gpu_scheduler {
> ...
> u32 num_rqs;
> struct drm_sched_rq *sched_rq;
> ...
> };
>
> Which might make it easier to fake out an rq for single-entity and then leave
> the code mostly intact, while also implementing single-entity.
>
> It's not a gating issue, but perhaps it would create a cleaner code in the long
> run? Maybe we should explore this?
>
See above. I'd vote for leaving this as is for now.
> > drm_sched_rq_init(sched, &sched->sched_rq[i]);
> >
> > init_waitqueue_head(&sched->job_scheduled);
> > @@ -1143,7 +1183,15 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
> >
> > drm_sched_submit_stop(sched);
> >
> > - for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
> > + if (sched->single_entity) {
> > + spin_lock(&sched->single_entity->rq_lock);
> > + sched->single_entity->stopped = true;
> > + spin_unlock(&sched->single_entity->rq_lock);
> > + }
> > +
> > + for (i = DRM_SCHED_PRIORITY_COUNT - 1; sched->sched_policy !=
> > + DRM_SCHED_POLICY_SINGLE_ENTITY && i >= DRM_SCHED_PRIORITY_MIN;
> > + i--) {
> > struct drm_sched_rq *rq = &sched->sched_rq[i];
>
> Same sentiment here, as above.
Got it.
Matt
> --
> Regards,
> Luben
>
> >
> > spin_lock(&rq->lock);
> > @@ -1186,6 +1234,8 @@ void drm_sched_increase_karma(struct drm_sched_job *bad)
> > struct drm_sched_entity *entity;
> > struct drm_gpu_scheduler *sched = bad->sched;
> >
> > + WARN_ON(sched->sched_policy == DRM_SCHED_POLICY_SINGLE_ENTITY);
> > +
> > /* don't change @bad's karma if it's from KERNEL RQ,
> > * because sometimes GPU hang would cause kernel jobs (like VM updating jobs)
> > * corrupt but keep in mind that kernel jobs always considered good.
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 9f830ff84bad..655675f797ea 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -79,6 +79,7 @@ enum drm_sched_policy {
> > DRM_SCHED_POLICY_UNSET,
> > DRM_SCHED_POLICY_RR,
> > DRM_SCHED_POLICY_FIFO,
> > + DRM_SCHED_POLICY_SINGLE_ENTITY,
> > DRM_SCHED_POLICY_COUNT,
> > };
> >
> > @@ -112,6 +113,9 @@ struct drm_sched_entity {
> > */
> > struct drm_sched_rq *rq;
> >
> > + /** @single_sched: Single scheduler */
> > + struct drm_gpu_scheduler *single_sched;
> > +
> > /**
> > * @sched_list:
> > *
> > @@ -473,6 +477,7 @@ struct drm_sched_backend_ops {
> > * struct drm_gpu_scheduler - scheduler instance-specific data
> > *
> > * @ops: backend operations provided by the driver.
> > + * @single_entity: Single entity for the scheduler
> > * @hw_submission_limit: the max size of the hardware queue.
> > * @timeout: the time after which a job is removed from the scheduler.
> > * @name: name of the ring for which this scheduler is being used.
> > @@ -504,6 +509,7 @@ struct drm_sched_backend_ops {
> > */
> > struct drm_gpu_scheduler {
> > const struct drm_sched_backend_ops *ops;
> > + struct drm_sched_entity *single_entity;
> > uint32_t hw_submission_limit;
> > long timeout;
> > const char *name;
> > @@ -587,6 +593,8 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> > struct drm_gpu_scheduler **sched_list,
> > unsigned int num_sched_list,
> > atomic_t *guilty);
> > +struct drm_gpu_scheduler *
> > +drm_sched_entity_to_scheduler(struct drm_sched_entity *entity);
> > long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout);
> > void drm_sched_entity_fini(struct drm_sched_entity *entity);
> > void drm_sched_entity_destroy(struct drm_sched_entity *entity);
>
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [Intel-xe] [PATCH v4 05/10] drm/sched: Split free_job into own work item
2023-09-28 16:14 ` Luben Tuikov
@ 2023-10-05 4:06 ` Matthew Brost
2023-10-11 23:29 ` Luben Tuikov
0 siblings, 1 reply; 45+ messages in thread
From: Matthew Brost @ 2023-10-05 4:06 UTC (permalink / raw)
To: Luben Tuikov
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, dri-devel, christian.koenig, boris.brezillon, dakr,
donald.robson, daniel, lina, airlied, intel-xe, faith.ekstrand
On Thu, Sep 28, 2023 at 12:14:12PM -0400, Luben Tuikov wrote:
> On 2023-09-19 01:01, Matthew Brost wrote:
> > Rather than call free_job and run_job in same work item have a dedicated
> > work item for each. This aligns with the design and intended use of work
> > queues.
> >
> > v2:
> > - Test for DMA_FENCE_FLAG_TIMESTAMP_BIT before setting
> > timestamp in free_job() work item (Danilo)
> > v3:
> > - Drop forward dec of drm_sched_select_entity (Boris)
> > - Return in drm_sched_run_job_work if entity NULL (Boris)
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/scheduler/sched_main.c | 290 +++++++++++++++----------
> > include/drm/gpu_scheduler.h | 8 +-
> > 2 files changed, 182 insertions(+), 116 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 588c735f7498..1e21d234fb5c 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -213,11 +213,12 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
> > * drm_sched_rq_select_entity_rr - Select an entity which could provide a job to run
> > *
> > * @rq: scheduler run queue to check.
> > + * @dequeue: dequeue selected entity
>
> Change this to "peek" as indicated below.
>
> > *
> > * Try to find a ready entity, returns NULL if none found.
> > */
> > static struct drm_sched_entity *
> > -drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
> > +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq, bool dequeue)
> > {
> > struct drm_sched_entity *entity;
> >
> > @@ -227,8 +228,10 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
> > if (entity) {
> > list_for_each_entry_continue(entity, &rq->entities, list) {
> > if (drm_sched_entity_is_ready(entity)) {
> > - rq->current_entity = entity;
> > - reinit_completion(&entity->entity_idle);
> > + if (dequeue) {
> > + rq->current_entity = entity;
> > + reinit_completion(&entity->entity_idle);
> > + }
>
> Please rename "dequeue" or invert its logic, as from this patch it seems that
> it is hiding (gating out) current behaviour.
>
> Ideally, I'd prefer it be inverted, so that current behaviour, i.e. what people
> are used to the rq_select_entity_*() to do, is default--preserved.
>
> Perhaps use "peek" as the name of this new variable, to indicate that
> we're not setting it to be the current entity.
>
> I prefer "peek" to others, as the former tells me "Hey, I'm only
> peeking at the rq and not really doing the default behaviour I've been
> doing which you're used to." So, probably use "peek". ("Peek" also has historical
> significance...).
>
Peek it is. Will change.
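(To illustrate the peek semantics being agreed on here — this is a toy userspace model with made-up names, not the scheduler code itself — peek returns the candidate without committing it as current, while the non-peek path also updates rq state:)

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of an rq holding one entity; names are illustrative only. */
struct toy_entity {
	bool ready;
};

struct toy_rq {
	struct toy_entity *entity;
	struct toy_entity *current_entity;
};

static struct toy_entity *toy_select_entity(struct toy_rq *rq, bool peek)
{
	struct toy_entity *e = rq->entity;

	if (!e || !e->ready)
		return NULL;

	/* Only the real (non-peek) selection commits rq state. */
	if (!peek)
		rq->current_entity = e;

	return e;
}
```

In the patch's terms, drm_sched_run_job_queue_if_ready() would call with peek == true just to decide whether to queue work, and drm_sched_run_job_work() with peek == false to actually take the entity.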
> > spin_unlock(&rq->lock);
> > return entity;
> > }
> > @@ -238,8 +241,10 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
> > list_for_each_entry(entity, &rq->entities, list) {
> >
> > if (drm_sched_entity_is_ready(entity)) {
> > - rq->current_entity = entity;
> > - reinit_completion(&entity->entity_idle);
> > + if (dequeue) {
>
> if (!peek) {
>
+1
> > + rq->current_entity = entity;
> > + reinit_completion(&entity->entity_idle);
> > + }
> > spin_unlock(&rq->lock);
> > return entity;
> > }
> > @@ -257,11 +262,12 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
> > * drm_sched_rq_select_entity_fifo - Select an entity which provides a job to run
> > *
> > * @rq: scheduler run queue to check.
> > + * @dequeue: dequeue selected entity
>
> * @peek: Just find, don't set to current.
>
+1
> > *
> > * Find oldest waiting ready entity, returns NULL if none found.
> > */
> > static struct drm_sched_entity *
> > -drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
> > +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq, bool dequeue)
> > {
> > struct rb_node *rb;
> >
> > @@ -271,8 +277,10 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
> >
> > entity = rb_entry(rb, struct drm_sched_entity, rb_tree_node);
> > if (drm_sched_entity_is_ready(entity)) {
> > - rq->current_entity = entity;
> > - reinit_completion(&entity->entity_idle);
> > + if (dequeue) {
>
> if (!peek) {
>
> > + rq->current_entity = entity;
> > + reinit_completion(&entity->entity_idle);
> > + }
> > break;
> > }
> > }
> > @@ -282,13 +290,102 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
> > }
> >
> > /**
> > - * drm_sched_submit_queue - scheduler queue submission
> > + * drm_sched_run_job_queue - queue job submission
> > + * @sched: scheduler instance
> > + */
>
> Perhaps it would be clearer to a DOC reader if there were verbs
> in this function comment? I feel this was mentioned in the review
> to patch 2...
>
> > +static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
> > +{
> > + if (!READ_ONCE(sched->pause_submit))
> > + queue_work(sched->submit_wq, &sched->work_run_job);
> > +}
> > +
> > +/**
> > + * drm_sched_can_queue -- Can we queue more to the hardware?
> > + * @sched: scheduler instance
> > + *
> > + * Return true if we can push more jobs to the hw, otherwise false.
> > + */
> > +static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
> > +{
> > + return atomic_read(&sched->hw_rq_count) <
> > + sched->hw_submission_limit;
> > +}
> > +
> > +/**
> > + * drm_sched_select_entity - Select next entity to process
> > + *
> > + * @sched: scheduler instance
> > + * @dequeue: dequeue selected entity
>
> When I see "dequeue" I'm thinking "list_del()". Let's
> use "peek" here as mentioned above.
>
> > + *
> > + * Returns the entity to process or NULL if none are found.
> > + */
> > +static struct drm_sched_entity *
> > +drm_sched_select_entity(struct drm_gpu_scheduler *sched, bool dequeue)
>
> drm_sched_select_entity(struct drm_gpu_scheduler *sched, bool peek)
>
+1
> > +{
> > + struct drm_sched_entity *entity;
> > + int i;
> > +
> > + if (!drm_sched_can_queue(sched))
> > + return NULL;
> > +
> > + if (sched->single_entity) {
> > + if (!READ_ONCE(sched->single_entity->stopped) &&
> > + drm_sched_entity_is_ready(sched->single_entity))
> > + return sched->single_entity;
> > +
> > + return NULL;
> > + }
> > +
> > + /* Kernel run queue has higher priority than normal run queue*/
> > + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
> > + entity = sched->sched_policy == DRM_SCHED_POLICY_FIFO ?
> > + drm_sched_rq_select_entity_fifo(&sched->sched_rq[i],
> > + dequeue) :
> > + drm_sched_rq_select_entity_rr(&sched->sched_rq[i],
> > + dequeue);
> > + if (entity)
> > + break;
> > + }
> > +
> > + return entity;
> > +}
> > +
> > +/**
> > + * drm_sched_run_job_queue_if_ready - queue job submission if ready
> > * @sched: scheduler instance
> > */
> > -static void drm_sched_submit_queue(struct drm_gpu_scheduler *sched)
> > +static void drm_sched_run_job_queue_if_ready(struct drm_gpu_scheduler *sched)
> > +{
> > + if (drm_sched_select_entity(sched, false))
> > + drm_sched_run_job_queue(sched);
> > +}
> > +
> > +/**
> > + * drm_sched_free_job_queue - queue free job
>
> * drm_sched_free_job_queue - enqueue free-job work
>
> > + *
> > + * @sched: scheduler instance to queue free job
>
> * @sched: scheduler instance to queue free job work for
>
>
Will change both.
> > + */
> > +static void drm_sched_free_job_queue(struct drm_gpu_scheduler *sched)
> > {
> > if (!READ_ONCE(sched->pause_submit))
> > - queue_work(sched->submit_wq, &sched->work_submit);
> > + queue_work(sched->submit_wq, &sched->work_free_job);
> > +}
> > +
> > +/**
> > + * drm_sched_free_job_queue_if_ready - queue free job if ready
>
> * drm_sched_free_job_queue_if_ready - enqueue free-job work if ready
>
Will change this too.
> > + *
> > + * @sched: scheduler instance to queue free job
> > + */
> > +static void drm_sched_free_job_queue_if_ready(struct drm_gpu_scheduler *sched)
> > +{
> > + struct drm_sched_job *job;
> > +
> > + spin_lock(&sched->job_list_lock);
> > + job = list_first_entry_or_null(&sched->pending_list,
> > + struct drm_sched_job, list);
> > + if (job && dma_fence_is_signaled(&job->s_fence->finished))
> > + drm_sched_free_job_queue(sched);
> > + spin_unlock(&sched->job_list_lock);
> > }
> >
> > /**
> > @@ -310,7 +407,7 @@ static void drm_sched_job_done(struct drm_sched_job *s_job, int result)
> > dma_fence_get(&s_fence->finished);
> > drm_sched_fence_finished(s_fence, result);
> > dma_fence_put(&s_fence->finished);
> > - drm_sched_submit_queue(sched);
> > + drm_sched_free_job_queue(sched);
> > }
> >
> > /**
> > @@ -885,18 +982,6 @@ void drm_sched_job_cleanup(struct drm_sched_job *job)
> > }
> > EXPORT_SYMBOL(drm_sched_job_cleanup);
> >
> > -/**
> > - * drm_sched_can_queue -- Can we queue more to the hardware?
> > - * @sched: scheduler instance
> > - *
> > - * Return true if we can push more jobs to the hw, otherwise false.
> > - */
> > -static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
> > -{
> > - return atomic_read(&sched->hw_rq_count) <
> > - sched->hw_submission_limit;
> > -}
> > -
> > /**
> > * drm_sched_wakeup_if_can_queue - Wake up the scheduler
> > * @sched: scheduler instance
> > @@ -906,43 +991,7 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
> > void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched)
> > {
> > if (drm_sched_can_queue(sched))
> > - drm_sched_submit_queue(sched);
> > -}
> > -
> > -/**
> > - * drm_sched_select_entity - Select next entity to process
> > - *
> > - * @sched: scheduler instance
> > - *
> > - * Returns the entity to process or NULL if none are found.
> > - */
> > -static struct drm_sched_entity *
> > -drm_sched_select_entity(struct drm_gpu_scheduler *sched)
> > -{
> > - struct drm_sched_entity *entity;
> > - int i;
> > -
> > - if (!drm_sched_can_queue(sched))
> > - return NULL;
> > -
> > - if (sched->single_entity) {
> > - if (!READ_ONCE(sched->single_entity->stopped) &&
> > - drm_sched_entity_is_ready(sched->single_entity))
> > - return sched->single_entity;
> > -
> > - return NULL;
> > - }
> > -
> > - /* Kernel run queue has higher priority than normal run queue*/
> > - for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
> > - entity = sched->sched_policy == DRM_SCHED_POLICY_FIFO ?
> > - drm_sched_rq_select_entity_fifo(&sched->sched_rq[i]) :
> > - drm_sched_rq_select_entity_rr(&sched->sched_rq[i]);
> > - if (entity)
> > - break;
> > - }
> > -
> > - return entity;
> > + drm_sched_run_job_queue(sched);
> > }
> >
> > /**
> > @@ -974,8 +1023,10 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
> > typeof(*next), list);
> >
> > if (next) {
> > - next->s_fence->scheduled.timestamp =
> > - job->s_fence->finished.timestamp;
> > + if (test_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT,
> > + &next->s_fence->scheduled.flags))
> > + next->s_fence->scheduled.timestamp =
> > + job->s_fence->finished.timestamp;
> > /* start TO timer for next job */
> > drm_sched_start_timeout(sched);
> > }
> > @@ -1025,74 +1076,84 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
> > EXPORT_SYMBOL(drm_sched_pick_best);
> >
> > /**
> > - * drm_sched_main - main scheduler thread
> > + * drm_sched_free_job_work - worker to call free_job
> > *
> > - * @param: scheduler instance
> > + * @w: free job work
> > */
> > -static void drm_sched_main(struct work_struct *w)
> > +static void drm_sched_free_job_work(struct work_struct *w)
> > {
> > struct drm_gpu_scheduler *sched =
> > - container_of(w, struct drm_gpu_scheduler, work_submit);
> > - struct drm_sched_entity *entity;
> > + container_of(w, struct drm_gpu_scheduler, work_free_job);
> > struct drm_sched_job *cleanup_job;
> > - int r;
> >
> > if (READ_ONCE(sched->pause_submit))
> > return;
> >
> > cleanup_job = drm_sched_get_cleanup_job(sched);
> > - entity = drm_sched_select_entity(sched);
> > -
> > - if (!entity && !cleanup_job)
> > - return; /* No more work */
> > -
> > - if (cleanup_job)
> > + if (cleanup_job) {
> > sched->ops->free_job(cleanup_job);
> >
> > - if (entity) {
> > - struct dma_fence *fence;
> > - struct drm_sched_fence *s_fence;
> > - struct drm_sched_job *sched_job;
> > -
> > - sched_job = drm_sched_entity_pop_job(entity);
> > - if (!sched_job) {
> > - complete_all(&entity->entity_idle);
> > - if (!cleanup_job)
> > - return; /* No more work */
> > - goto again;
> > - }
> > + drm_sched_free_job_queue_if_ready(sched);
> > + drm_sched_run_job_queue_if_ready(sched);
> > + }
> > +}
> > +
> > +/**
> > + * drm_sched_run_job_work - worker to call run_job
> > + *
> > + * @w: run job work
> > + */
> > +static void drm_sched_run_job_work(struct work_struct *w)
> > +{
> > + struct drm_gpu_scheduler *sched =
> > + container_of(w, struct drm_gpu_scheduler, work_run_job);
> > + struct drm_sched_entity *entity;
> > + struct dma_fence *fence;
> > + struct drm_sched_fence *s_fence;
> > + struct drm_sched_job *sched_job;
> > + int r;
> >
> > - s_fence = sched_job->s_fence;
> > + if (READ_ONCE(sched->pause_submit))
> > + return;
> >
> > - atomic_inc(&sched->hw_rq_count);
> > - drm_sched_job_begin(sched_job);
> > + entity = drm_sched_select_entity(sched, true);
> > + if (!entity)
> > + return;
> >
> > - trace_drm_run_job(sched_job, entity);
> > - fence = sched->ops->run_job(sched_job);
> > + sched_job = drm_sched_entity_pop_job(entity);
> > + if (!sched_job) {
> > complete_all(&entity->entity_idle);
> > - drm_sched_fence_scheduled(s_fence, fence);
> > + return; /* No more work */
> > + }
> >
> > - if (!IS_ERR_OR_NULL(fence)) {
> > - /* Drop for original kref_init of the fence */
> > - dma_fence_put(fence);
> > + s_fence = sched_job->s_fence;
> >
> > - r = dma_fence_add_callback(fence, &sched_job->cb,
> > - drm_sched_job_done_cb);
> > - if (r == -ENOENT)
> > - drm_sched_job_done(sched_job, fence->error);
> > - else if (r)
> > - DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n",
> > - r);
> > - } else {
> > - drm_sched_job_done(sched_job, IS_ERR(fence) ?
> > - PTR_ERR(fence) : 0);
> > - }
> > + atomic_inc(&sched->hw_rq_count);
> > + drm_sched_job_begin(sched_job);
> > +
> > + trace_drm_run_job(sched_job, entity);
> > + fence = sched->ops->run_job(sched_job);
> > + complete_all(&entity->entity_idle);
> > + drm_sched_fence_scheduled(s_fence, fence);
> >
> > - wake_up(&sched->job_scheduled);
> > + if (!IS_ERR_OR_NULL(fence)) {
> > + /* Drop for original kref_init of the fence */
> > + dma_fence_put(fence);
> > +
> > + r = dma_fence_add_callback(fence, &sched_job->cb,
> > + drm_sched_job_done_cb);
> > + if (r == -ENOENT)
> > + drm_sched_job_done(sched_job, fence->error);
> > + else if (r)
> > + DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n",
> > + r);
>
> Please align "r);" to the open brace on the previous line. If you're using Emacs
> with sane Linux settings, press the "Tab" key anywhere on the line to indent it.
> (It should run c-indent-line-or-region, usually using leading-tabs-only mode. Pressing
> it again, over and over, on an already indented line, does nothing. Column indenting--say
> for 2D/3D/etc. array columns--usually happens using spaces, which is portable.
> Also please take an overview with "scripts/checkpatch.pl --strict".)
>
Will run checkpatch.
> Wrap-around was bumped to 100 in the Linux kernel so you can put the 'r' on
> the same line without style problems.
>
Using Vi with wrap around of 80 but know 100 is allowed. Will fix.
> > + } else {
> > + drm_sched_job_done(sched_job, IS_ERR(fence) ?
> > + PTR_ERR(fence) : 0);
> > }
> >
> > -again:
> > - drm_sched_submit_queue(sched);
> > + wake_up(&sched->job_scheduled);
> > + drm_sched_run_job_queue_if_ready(sched);
> > }
> >
> > /**
> > @@ -1159,7 +1220,8 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> > spin_lock_init(&sched->job_list_lock);
> > atomic_set(&sched->hw_rq_count, 0);
> > INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> > - INIT_WORK(&sched->work_submit, drm_sched_main);
> > + INIT_WORK(&sched->work_run_job, drm_sched_run_job_work);
> > + INIT_WORK(&sched->work_free_job, drm_sched_free_job_work);
> > atomic_set(&sched->_score, 0);
> > atomic64_set(&sched->job_id_count, 0);
> > sched->pause_submit = false;
> > @@ -1286,7 +1348,8 @@ EXPORT_SYMBOL(drm_sched_submit_ready);
> > void drm_sched_submit_stop(struct drm_gpu_scheduler *sched)
> > {
> > WRITE_ONCE(sched->pause_submit, true);
> > - cancel_work_sync(&sched->work_submit);
> > + cancel_work_sync(&sched->work_run_job);
> > + cancel_work_sync(&sched->work_free_job);
> > }
> > EXPORT_SYMBOL(drm_sched_submit_stop);
> >
> > @@ -1298,6 +1361,7 @@ EXPORT_SYMBOL(drm_sched_submit_stop);
> > void drm_sched_submit_start(struct drm_gpu_scheduler *sched)
> > {
> > WRITE_ONCE(sched->pause_submit, false);
> > - queue_work(sched->submit_wq, &sched->work_submit);
> > + queue_work(sched->submit_wq, &sched->work_run_job);
> > + queue_work(sched->submit_wq, &sched->work_free_job);
> > }
> > EXPORT_SYMBOL(drm_sched_submit_start);
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 655675f797ea..7e6c121003ca 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -487,9 +487,10 @@ struct drm_sched_backend_ops {
> > * finished.
> > * @hw_rq_count: the number of jobs currently in the hardware queue.
> > * @job_id_count: used to assign unique id to the each job.
> > - * @submit_wq: workqueue used to queue @work_submit
> > + * @submit_wq: workqueue used to queue @work_run_job and @work_free_job
> > * @timeout_wq: workqueue used to queue @work_tdr
> > - * @work_submit: schedules jobs and cleans up entities
> > + * @work_run_job: schedules jobs
> > + * @work_free_job: cleans up jobs
> > * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
> > * timeout interval is over.
> > * @pending_list: the list of jobs which are currently in the job queue.
> > @@ -519,7 +520,8 @@ struct drm_gpu_scheduler {
> > atomic64_t job_id_count;
> > struct workqueue_struct *submit_wq;
> > struct workqueue_struct *timeout_wq;
> > - struct work_struct work_submit;
> > + struct work_struct work_run_job;
> > + struct work_struct work_free_job;
> > struct delayed_work work_tdr;
> > struct list_head pending_list;
> > spinlock_t job_list_lock;
>
> Yeah, so this is a good patch. Thanks for doing this!
Thanks for the review.
Matt
> --
> Regards,
> Luben
>
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [Intel-xe] [PATCH v4 08/10] drm/sched: Submit job before starting TDR
2023-09-29 21:58 ` Luben Tuikov
@ 2023-10-05 4:11 ` Matthew Brost
0 siblings, 0 replies; 45+ messages in thread
From: Matthew Brost @ 2023-10-05 4:11 UTC (permalink / raw)
To: Luben Tuikov
Cc: robdclark, sarah.walker, ketil.johnsen, lina, mcanal, Liviu.Dudau,
dri-devel, christian.koenig, boris.brezillon, dakr, donald.robson,
intel-xe, faith.ekstrand
On Fri, Sep 29, 2023 at 05:58:46PM -0400, Luben Tuikov wrote:
> Hi,
>
> On 2023-09-19 01:01, Matthew Brost wrote:
> > If the TDR timeout is set to a small enough value, it can fire before a
> > job is submitted in drm_sched_main. The job should always be submitted
> > before the TDR fires; fix this ordering.
> >
> > v2:
> > - Add to pending list before run_job, start TDR after (Luben, Boris)
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/scheduler/sched_main.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index a5cc9b6c2faa..e8a3e6033f66 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -517,7 +517,6 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
> >
> > spin_lock(&sched->job_list_lock);
> > list_add_tail(&s_job->list, &sched->pending_list);
> > - drm_sched_start_timeout(sched);
> > spin_unlock(&sched->job_list_lock);
> > }
> >
> > @@ -1138,6 +1137,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
> > fence = sched->ops->run_job(sched_job);
> > complete_all(&entity->entity_idle);
> > drm_sched_fence_scheduled(s_fence, fence);
> > + drm_sched_start_timeout_unlocked(sched);
> >
> > if (!IS_ERR_OR_NULL(fence)) {
> > /* Drop for original kref_init of the fence */
>
> No.
>
> See Message-ID: <ed3aca10-8a9f-4698-92f4-21558fa6cfe3@amd.com>,
> and Message-ID: <8e5eab14-9e55-42c9-b6ea-02fcc591266d@amd.com>,
> and Message-ID: <24bc965f-61fb-4b92-9afa-360ca85a53af@amd.com>.
See reply to previous patch, will drop this.
Matt
> --
> Regards,
> Luben
>
* Re: [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread
2023-10-05 3:33 ` Matthew Brost
@ 2023-10-05 4:13 ` Luben Tuikov
2023-10-05 15:19 ` Matthew Brost
2023-10-06 7:59 ` Tvrtko Ursulin
0 siblings, 2 replies; 45+ messages in thread
From: Luben Tuikov @ 2023-10-05 4:13 UTC (permalink / raw)
To: Matthew Brost
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, dri-devel, christian.koenig, boris.brezillon, dakr,
donald.robson, daniel, lina, airlied, intel-xe, faith.ekstrand
On 2023-10-04 23:33, Matthew Brost wrote:
> On Tue, Sep 26, 2023 at 11:32:10PM -0400, Luben Tuikov wrote:
>> Hi,
>>
>> On 2023-09-19 01:01, Matthew Brost wrote:
>>> In XE, the new Intel GPU driver, a choice has been made to have a 1 to 1
>>> mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
>>> seems a bit odd but let us explain the reasoning below.
>>>
>>> 1. In XE the submission order from multiple drm_sched_entity is not
>>> guaranteed to match the completion order even if targeting the same
>>> hardware engine. This is because in XE we have a firmware scheduler, the
>>> GuC, which is allowed to reorder, timeslice, and preempt submissions. If
>>> a shared drm_gpu_scheduler is used across multiple drm_sched_entity, the
>>> TDR falls apart as the TDR expects submission order == completion order.
>>> Using a dedicated drm_gpu_scheduler per drm_sched_entity solves this
>>> problem.
>>>
>>> 2. In XE submissions are done via programming a ring buffer (circular
>>> buffer); a drm_gpu_scheduler provides a limit on the number of jobs, and
>>> if that limit is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow control
>>> on the ring for free.
>>>
>>> A problem with this design is that currently a drm_gpu_scheduler uses a
>>> kthread for submission / job cleanup. This doesn't scale if a large
>>> number of drm_gpu_schedulers are used. To work around the scaling issue,
>>> use a worker rather than a kthread for submission / job cleanup.
>>>
>>> v2:
>>> - (Rob Clark) Fix msm build
>>> - Pass in run work queue
>>> v3:
>>> - (Boris) don't have loop in worker
>>> v4:
>>> - (Tvrtko) break out submit ready, stop, start helpers into own patch
>>> v5:
>>> - (Boris) default to ordered work queue
>>>
>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>>> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +-
>>> drivers/gpu/drm/lima/lima_sched.c | 2 +-
>>> drivers/gpu/drm/msm/msm_ringbuffer.c | 2 +-
>>> drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +-
>>> drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
>>> drivers/gpu/drm/scheduler/sched_main.c | 118 ++++++++++-----------
>>> drivers/gpu/drm/v3d/v3d_sched.c | 10 +-
>>> include/drm/gpu_scheduler.h | 14 ++-
>>> 9 files changed, 79 insertions(+), 75 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index e366f61c3aed..16f3cfe1574a 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -2279,7 +2279,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
>>> break;
>>> }
>>>
>>> - r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
>>> + r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL,
>>> ring->num_hw_submission, 0,
>>> timeout, adev->reset_domain->wq,
>>> ring->sched_score, ring->name,
>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>> index 345fec6cb1a4..618a804ddc34 100644
>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>> @@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
>>> {
>>> int ret;
>>>
>>> - ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops,
>>> + ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
>>> etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
>>> msecs_to_jiffies(500), NULL, NULL,
>>> dev_name(gpu->dev), gpu->dev);
>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>> index ffd91a5ee299..8d858aed0e56 100644
>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>> @@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
>>>
>>> INIT_WORK(&pipe->recover_work, lima_sched_recover_work);
>>>
>>> - return drm_sched_init(&pipe->base, &lima_sched_ops, 1,
>>> + return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
>>> lima_job_hang_limit,
>>> msecs_to_jiffies(timeout), NULL,
>>> NULL, name, pipe->ldev->dev);
>>> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
>>> index 40c0bc35a44c..b8865e61b40f 100644
>>> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
>>> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
>>> @@ -94,7 +94,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
>>> /* currently managing hangcheck ourselves: */
>>> sched_timeout = MAX_SCHEDULE_TIMEOUT;
>>>
>>> - ret = drm_sched_init(&ring->sched, &msm_sched_ops,
>>> + ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
>>> num_hw_submissions, 0, sched_timeout,
>>> NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
>>
>> checkpatch.pl complains here about unmatched open parens.
>>
>
> Will fix and run checkpatch before posting next rev.
>
>>> if (ret) {
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
>>> index 88217185e0f3..d458c2227d4f 100644
>>> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
>>> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
>>> @@ -429,7 +429,7 @@ int nouveau_sched_init(struct nouveau_drm *drm)
>>> if (!drm->sched_wq)
>>> return -ENOMEM;
>>>
>>> - return drm_sched_init(sched, &nouveau_sched_ops,
>>> + return drm_sched_init(sched, &nouveau_sched_ops, NULL,
>>> NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
>>> NULL, NULL, "nouveau_sched", drm->dev->dev);
>>> }
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> index 033f5e684707..326ca1ddf1d7 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> @@ -831,7 +831,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>>> js->queue[j].fence_context = dma_fence_context_alloc(1);
>>>
>>> ret = drm_sched_init(&js->queue[j].sched,
>>> - &panfrost_sched_ops,
>>> + &panfrost_sched_ops, NULL,
>>> nentries, 0,
>>> msecs_to_jiffies(JOB_TIMEOUT_MS),
>>> pfdev->reset.wq,
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index e4fa62abca41..ee6281942e36 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -48,7 +48,6 @@
>>> * through the jobs entity pointer.
>>> */
>>>
>>> -#include <linux/kthread.h>
>>> #include <linux/wait.h>
>>> #include <linux/sched.h>
>>> #include <linux/completion.h>
>>> @@ -256,6 +255,16 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>>> return rb ? rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL;
>>> }
>>>
>>> +/**
>>> + * drm_sched_submit_queue - scheduler queue submission
>>
>> There is no verb in the description, and it is not clear what
>> this function does unless one reads the code. Given that this
>> is DOC, this should be clearer here. Something like "queue
>> scheduler work to be executed" or something to that effect.
>>
>
> Will fix.
>
>> Coming back to this from reading the patch below, it was somewhat
>> unclear what "drm_sched_submit_queue()" does, since when reading
>> below, "submit" was being read by my mind as an adjective, as opposed
>> to a verb. Perhaps something like:
>>
>> drm_sched_queue_submit(), or
>> drm_sched_queue_exec(), or
>> drm_sched_queue_push(), or something to that effect. You pick. :-)
>>
>
> I prefer the name as is. In this patch we have:
>
> drm_sched_submit_queue()
> drm_sched_submit_start()
> drm_sched_submit_stop()
> drm_sched_submit_ready()
>
> I like that all these functions start with 'drm_sched_submit', which allows
> for easy searching for the functions that touch the DRM scheduler
> submission state.
>
> With a little better doc, are you fine with the names as is?
Notice the following scheme in the naming,
drm_sched_submit_queue()
drm_sched_submit_start()
drm_sched_submit_stop()
drm_sched_submit_ready()
\---+---/ \--+-/ \-+-/
| | +---> a verb
| +---------> should be a noun (something in the component)
+------------------> the kernel/software component
And although "queue" can technically be used as a verb too, I'd rather it be "enqueue",
like this:
drm_sched_submit_enqueue()
And using "submit" as the noun of the component is a bit cringy,
since "submit" is really a verb, and it's cringy to make it a "state"
or an "object" we operate on in the DRM Scheduler. "Submission" is
a noun, but "submission enqueue/start/stop/ready" doesn't sound
very well thought out. "Submission" really is what the work-queue
does.
I'd rather it be a real object, like for instance,
drm_sched_wqueue_enqueue()
drm_sched_wqueue_start()
drm_sched_wqueue_stop()
drm_sched_wqueue_ready()
Which tells me that the component is the DRM Scheduler, the object is a/the work-queue,
and the last word, the verb, is the action we're performing on the object, i.e. the work-queue.
Plus, all these functions actually do operate on work-queues, directly or indirectly,
and are new, so it's a win-win naming scheme.
I think that that would be most likeable.
--
Regards,
Luben
* Re: [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread
2023-10-05 4:13 ` Luben Tuikov
@ 2023-10-05 15:19 ` Matthew Brost
2023-10-06 7:59 ` Tvrtko Ursulin
1 sibling, 0 replies; 45+ messages in thread
From: Matthew Brost @ 2023-10-05 15:19 UTC (permalink / raw)
To: Luben Tuikov
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, dri-devel, christian.koenig, boris.brezillon, dakr,
donald.robson, daniel, lina, airlied, intel-xe, faith.ekstrand
On Thu, Oct 05, 2023 at 12:13:01AM -0400, Luben Tuikov wrote:
> On 2023-10-04 23:33, Matthew Brost wrote:
> > On Tue, Sep 26, 2023 at 11:32:10PM -0400, Luben Tuikov wrote:
> >> Hi,
> >>
> >> On 2023-09-19 01:01, Matthew Brost wrote:
> >>> In XE, the new Intel GPU driver, a choice has been made to have a 1 to 1
> >>> mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
> >>> seems a bit odd but let us explain the reasoning below.
> >>>
> >>> 1. In XE the submission order from multiple drm_sched_entity is not
> >>> guaranteed to match the completion order even if targeting the same
> >>> hardware engine. This is because in XE we have a firmware scheduler, the
> >>> GuC, which is allowed to reorder, timeslice, and preempt submissions. If
> >>> a shared drm_gpu_scheduler is used across multiple drm_sched_entity, the
> >>> TDR falls apart as the TDR expects submission order == completion order.
> >>> Using a dedicated drm_gpu_scheduler per drm_sched_entity solves this
> >>> problem.
> >>>
> >>> 2. In XE submissions are done via programming a ring buffer (circular
> >>> buffer); a drm_gpu_scheduler provides a limit on the number of jobs, and
> >>> if that limit is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow control
> >>> on the ring for free.
> >>>
> >>> A problem with this design is that currently a drm_gpu_scheduler uses a
> >>> kthread for submission / job cleanup. This doesn't scale if a large
> >>> number of drm_gpu_schedulers are used. To work around the scaling issue,
> >>> use a worker rather than a kthread for submission / job cleanup.
> >>>
> >>> v2:
> >>> - (Rob Clark) Fix msm build
> >>> - Pass in run work queue
> >>> v3:
> >>> - (Boris) don't have loop in worker
> >>> v4:
> >>> - (Tvrtko) break out submit ready, stop, start helpers into own patch
> >>> v5:
> >>> - (Boris) default to ordered work queue
> >>>
> >>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> >>> ---
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
> >>> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +-
> >>> drivers/gpu/drm/lima/lima_sched.c | 2 +-
> >>> drivers/gpu/drm/msm/msm_ringbuffer.c | 2 +-
> >>> drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +-
> >>> drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
> >>> drivers/gpu/drm/scheduler/sched_main.c | 118 ++++++++++-----------
> >>> drivers/gpu/drm/v3d/v3d_sched.c | 10 +-
> >>> include/drm/gpu_scheduler.h | 14 ++-
> >>> 9 files changed, 79 insertions(+), 75 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> index e366f61c3aed..16f3cfe1574a 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> @@ -2279,7 +2279,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
> >>> break;
> >>> }
> >>>
> >>> - r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
> >>> + r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL,
> >>> ring->num_hw_submission, 0,
> >>> timeout, adev->reset_domain->wq,
> >>> ring->sched_score, ring->name,
> >>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>> index 345fec6cb1a4..618a804ddc34 100644
> >>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>> @@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
> >>> {
> >>> int ret;
> >>>
> >>> - ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops,
> >>> + ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
> >>> etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
> >>> msecs_to_jiffies(500), NULL, NULL,
> >>> dev_name(gpu->dev), gpu->dev);
> >>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> >>> index ffd91a5ee299..8d858aed0e56 100644
> >>> --- a/drivers/gpu/drm/lima/lima_sched.c
> >>> +++ b/drivers/gpu/drm/lima/lima_sched.c
> >>> @@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
> >>>
> >>> INIT_WORK(&pipe->recover_work, lima_sched_recover_work);
> >>>
> >>> - return drm_sched_init(&pipe->base, &lima_sched_ops, 1,
> >>> + return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
> >>> lima_job_hang_limit,
> >>> msecs_to_jiffies(timeout), NULL,
> >>> NULL, name, pipe->ldev->dev);
> >>> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
> >>> index 40c0bc35a44c..b8865e61b40f 100644
> >>> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
> >>> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
> >>> @@ -94,7 +94,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
> >>> /* currently managing hangcheck ourselves: */
> >>> sched_timeout = MAX_SCHEDULE_TIMEOUT;
> >>>
> >>> - ret = drm_sched_init(&ring->sched, &msm_sched_ops,
> >>> + ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
> >>> num_hw_submissions, 0, sched_timeout,
> >>> NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
> >>
> >> checkpatch.pl complains here about unmatched open parens.
> >>
> >
> > Will fix and run checkpatch before posting next rev.
> >
> >>> if (ret) {
> >>> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
> >>> index 88217185e0f3..d458c2227d4f 100644
> >>> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
> >>> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
> >>> @@ -429,7 +429,7 @@ int nouveau_sched_init(struct nouveau_drm *drm)
> >>> if (!drm->sched_wq)
> >>> return -ENOMEM;
> >>>
> >>> - return drm_sched_init(sched, &nouveau_sched_ops,
> >>> + return drm_sched_init(sched, &nouveau_sched_ops, NULL,
> >>> NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
> >>> NULL, NULL, "nouveau_sched", drm->dev->dev);
> >>> }
> >>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> >>> index 033f5e684707..326ca1ddf1d7 100644
> >>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> >>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> >>> @@ -831,7 +831,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
> >>> js->queue[j].fence_context = dma_fence_context_alloc(1);
> >>>
> >>> ret = drm_sched_init(&js->queue[j].sched,
> >>> - &panfrost_sched_ops,
> >>> + &panfrost_sched_ops, NULL,
> >>> nentries, 0,
> >>> msecs_to_jiffies(JOB_TIMEOUT_MS),
> >>> pfdev->reset.wq,
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>> index e4fa62abca41..ee6281942e36 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>> @@ -48,7 +48,6 @@
> >>> * through the jobs entity pointer.
> >>> */
> >>>
> >>> -#include <linux/kthread.h>
> >>> #include <linux/wait.h>
> >>> #include <linux/sched.h>
> >>> #include <linux/completion.h>
> >>> @@ -256,6 +255,16 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
> >>> return rb ? rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL;
> >>> }
> >>>
> >>> +/**
> >>> + * drm_sched_submit_queue - scheduler queue submission
> >>
> >> There is no verb in the description, and it is not clear what
> >> this function does unless one reads the code. Given that this
> >> is DOC, this should be clearer here. Something like "queue
> >> scheduler work to be executed" or something to that effect.
> >>
> >
> > Will fix.
> >
> >> Coming back to this from reading the patch below, it was somewhat
> >> unclear what "drm_sched_submit_queue()" does, since when reading
> >> below, "submit" was being read by my mind as an adjective, as opposed
> >> to a verb. Perhaps something like:
> >>
> >> drm_sched_queue_submit(), or
> >> drm_sched_queue_exec(), or
> >> drm_sched_queue_push(), or something to that effect. You pick. :-)
> >>
> >
> > I prefer the name as is. In this patch we have:
> >
> > drm_sched_submit_queue()
> > drm_sched_submit_start()
> > drm_sched_submit_stop()
> > drm_sched_submit_ready()
> >
> > I like that all these functions start with 'drm_sched_submit', which allows
> > for easy searching for the functions that touch the DRM scheduler
> > submission state.
> >
> > With a little better doc, are you fine with the names as is?
>
> Notice the following scheme in the naming,
>
> drm_sched_submit_queue()
> drm_sched_submit_start()
> drm_sched_submit_stop()
> drm_sched_submit_ready()
> \---+---/ \--+-/ \-+-/
> | | +---> a verb
> | +---------> should be a noun (something in the component)
> +------------------> the kernel/software component
>
> And although "queue" can technically be used as a verb too, I'd rather it be "enqueue",
> like this:
>
> drm_sched_submit_enqueue()
>
> And using "submit" as the noun of the component is a bit cringy,
> since "submit" is really a verb, and it's cringy to make it a "state"
> or an "object" we operate on in the DRM Scheduler. "Submission" is
> a noun, but "submission enqueue/start/stop/ready" doesn't sound
> very well thought out. "Submission" really is what the work-queue
> does.
>
> I'd rather it be a real object, like for instance,
>
> drm_sched_wqueue_enqueue()
> drm_sched_wqueue_start()
> drm_sched_wqueue_stop()
> drm_sched_wqueue_ready()
>
> Which tells me that the component is the DRM Scheduler, the object is a/the work-queue,
> and the last word, the verb, is the action we're performing on the object, i.e. the work-queue.
> Plus, all these functions actually do operate on work-queues, directly or indirectly,
> and are new, so it's a win-win naming scheme.
>
> I think that that would be most likeable.
Thanks for the detailed explanation. I can adjust the names in the next rev.
Matt
> --
> Regards,
> Luben
>
* Re: [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread
2023-10-05 4:13 ` Luben Tuikov
2023-10-05 15:19 ` Matthew Brost
@ 2023-10-06 7:59 ` Tvrtko Ursulin
2023-10-06 15:14 ` Matthew Brost
2023-10-11 23:10 ` Luben Tuikov
1 sibling, 2 replies; 45+ messages in thread
From: Tvrtko Ursulin @ 2023-10-06 7:59 UTC (permalink / raw)
To: Luben Tuikov, Matthew Brost
Cc: robdclark, sarah.walker, ketil.johnsen, lina, mcanal, Liviu.Dudau,
dri-devel, intel-xe, boris.brezillon, dakr, donald.robson,
christian.koenig, faith.ekstrand
On 05/10/2023 05:13, Luben Tuikov wrote:
> On 2023-10-04 23:33, Matthew Brost wrote:
>> On Tue, Sep 26, 2023 at 11:32:10PM -0400, Luben Tuikov wrote:
>>> Hi,
>>>
>>> On 2023-09-19 01:01, Matthew Brost wrote:
>>>> In XE, the new Intel GPU driver, a choice has been made to have a 1 to 1
>>>> mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
>>>> seems a bit odd but let us explain the reasoning below.
>>>>
>>>> 1. In XE the submission order from multiple drm_sched_entity is not
>>>> guaranteed to match the completion order even if targeting the same
>>>> hardware engine. This is because in XE we have a firmware scheduler, the
>>>> GuC, which is allowed to reorder, timeslice, and preempt submissions. If
>>>> a shared drm_gpu_scheduler is used across multiple drm_sched_entity, the
>>>> TDR falls apart as the TDR expects submission order == completion order.
>>>> Using a dedicated drm_gpu_scheduler per drm_sched_entity solves this
>>>> problem.
>>>>
>>>> 2. In XE submissions are done via programming a ring buffer (circular
>>>> buffer); a drm_gpu_scheduler provides a limit on the number of jobs, and
>>>> if that limit is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow control
>>>> on the ring for free.
>>>>
>>>> A problem with this design is that currently a drm_gpu_scheduler uses a
>>>> kthread for submission / job cleanup. This doesn't scale if a large
>>>> number of drm_gpu_schedulers are used. To work around the scaling issue,
>>>> use a worker rather than a kthread for submission / job cleanup.
>>>>
>>>> v2:
>>>> - (Rob Clark) Fix msm build
>>>> - Pass in run work queue
>>>> v3:
>>>> - (Boris) don't have loop in worker
>>>> v4:
>>>> - (Tvrtko) break out submit ready, stop, start helpers into own patch
>>>> v5:
>>>> - (Boris) default to ordered work queue
>>>>
>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>> ---
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>>>> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +-
>>>> drivers/gpu/drm/lima/lima_sched.c | 2 +-
>>>> drivers/gpu/drm/msm/msm_ringbuffer.c | 2 +-
>>>> drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +-
>>>> drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
>>>> drivers/gpu/drm/scheduler/sched_main.c | 118 ++++++++++-----------
>>>> drivers/gpu/drm/v3d/v3d_sched.c | 10 +-
>>>> include/drm/gpu_scheduler.h | 14 ++-
>>>> 9 files changed, 79 insertions(+), 75 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index e366f61c3aed..16f3cfe1574a 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -2279,7 +2279,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
>>>> break;
>>>> }
>>>>
>>>> - r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
>>>> + r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL,
>>>> ring->num_hw_submission, 0,
>>>> timeout, adev->reset_domain->wq,
>>>> ring->sched_score, ring->name,
>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>> index 345fec6cb1a4..618a804ddc34 100644
>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>> @@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
>>>> {
>>>> int ret;
>>>>
>>>> - ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops,
>>>> + ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
>>>> etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
>>>> msecs_to_jiffies(500), NULL, NULL,
>>>> dev_name(gpu->dev), gpu->dev);
>>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>>> index ffd91a5ee299..8d858aed0e56 100644
>>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>>> @@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
>>>>
>>>> INIT_WORK(&pipe->recover_work, lima_sched_recover_work);
>>>>
>>>> - return drm_sched_init(&pipe->base, &lima_sched_ops, 1,
>>>> + return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
>>>> lima_job_hang_limit,
>>>> msecs_to_jiffies(timeout), NULL,
>>>> NULL, name, pipe->ldev->dev);
>>>> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
>>>> index 40c0bc35a44c..b8865e61b40f 100644
>>>> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
>>>> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
>>>> @@ -94,7 +94,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
>>>> /* currently managing hangcheck ourselves: */
>>>> sched_timeout = MAX_SCHEDULE_TIMEOUT;
>>>>
>>>> - ret = drm_sched_init(&ring->sched, &msm_sched_ops,
>>>> + ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
>>>> num_hw_submissions, 0, sched_timeout,
>>>> NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
>>>
>>> checkpatch.pl complains here about unmatched open parens.
>>>
>>
>> Will fix and run checkpatch before posting next rev.
>>
>>>> if (ret) {
>>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
>>>> index 88217185e0f3..d458c2227d4f 100644
>>>> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
>>>> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
>>>> @@ -429,7 +429,7 @@ int nouveau_sched_init(struct nouveau_drm *drm)
>>>> if (!drm->sched_wq)
>>>> return -ENOMEM;
>>>>
>>>> - return drm_sched_init(sched, &nouveau_sched_ops,
>>>> + return drm_sched_init(sched, &nouveau_sched_ops, NULL,
>>>> NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
>>>> NULL, NULL, "nouveau_sched", drm->dev->dev);
>>>> }
>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>> index 033f5e684707..326ca1ddf1d7 100644
>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>> @@ -831,7 +831,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>>>> js->queue[j].fence_context = dma_fence_context_alloc(1);
>>>>
>>>> ret = drm_sched_init(&js->queue[j].sched,
>>>> - &panfrost_sched_ops,
>>>> + &panfrost_sched_ops, NULL,
>>>> nentries, 0,
>>>> msecs_to_jiffies(JOB_TIMEOUT_MS),
>>>> pfdev->reset.wq,
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index e4fa62abca41..ee6281942e36 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -48,7 +48,6 @@
>>>> * through the jobs entity pointer.
>>>> */
>>>>
>>>> -#include <linux/kthread.h>
>>>> #include <linux/wait.h>
>>>> #include <linux/sched.h>
>>>> #include <linux/completion.h>
>>>> @@ -256,6 +255,16 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>>>> return rb ? rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL;
>>>> }
>>>>
>>>> +/**
>>>> + * drm_sched_submit_queue - scheduler queue submission
>>>
>>> There is no verb in the description, and it is not clear what
>>> this function does unless one reads the code. Given that this
>>> is DOC, this should be clearer here. Something like "queue
>>> scheduler work to be executed" or something to that effect.
>>>
>>
>> Will fix.
>>
>>> Coming back to this from reading the patch below, it was somewhat
>>> unclear what "drm_sched_submit_queue()" does, since when reading
>>> below, "submit" was being read by my mind as an adjective, as opposed
>>> to a verb. Perhaps something like:
>>>
>>> drm_sched_queue_submit(), or
>>> drm_sched_queue_exec(), or
>>> drm_sched_queue_push(), or something to that effect. You pick. :-)
>>>
>>
>> I prefer the name as is. In this patch we have:
>>
>> drm_sched_submit_queue()
>> drm_sched_submit_start()
>> drm_sched_submit_stop()
>> drm_sched_submit_ready()
>>
>> I like that all these functions start with 'drm_sched_submit', which allows
>> for easy searching for the functions that touch the DRM scheduler
>> submission state.
>>
>> With a little better doc, are you fine with the names as is?
>
> Notice the following scheme in the naming,
>
> drm_sched_submit_queue()
> drm_sched_submit_start()
> drm_sched_submit_stop()
> drm_sched_submit_ready()
> \---+---/ \--+-/ \-+-/
> | | +---> a verb
> | +---------> should be a noun (something in the component)
> +------------------> the kernel/software component
>
> And although "queue" can technically be used as a verb too, I'd rather it be "enqueue",
> like this:
>
> drm_sched_submit_enqueue()
>
> And using "submit" as the noun of the component is a bit cringy,
> since "submit" is really a verb, and it's cringy to make it a "state"
> or an "object" we operate on in the DRM Scheduler. "Submission" is
> a noun, but "submission enqueue/start/stop/ready" doesn't sound
> very well thought out. "Submission" really is what the work-queue
> does.
>
> I'd rather it be a real object, like for instance,
>
> drm_sched_wqueue_enqueue()
> drm_sched_wqueue_start()
> drm_sched_wqueue_stop()
> drm_sched_wqueue_ready()
>
> Which tells me that the component is the DRM Scheduler, the object is a/the work-queue,
> and the last word as the verb, is the action we're performing on the object, i.e. the work-queue.
> Plus, all these functions actually do operate on work-queues, directly or indirectly,
> and are new, so it's a win-win naming scheme.
>
> I think that that would be most likeable.
FWIW I was suggesting not to encode the fact submit queue is implemented
with a workqueue in the API name. IMO it would be nicer and less
maintenance churn should something change if the external components
can be isolated from that detail.
drm_sched_submit_queue_$verb? If not viewed as too verbose...
Regards,
Tvrtko
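For context, the diffs quoted in this thread add a new third argument to drm_sched_init(): an optional caller-supplied submission workqueue, where NULL means the scheduler allocates its own (per the v5 changelog, "default to ordered work queue"). A minimal userspace model of that ownership rule follows; every name in it is hypothetical, not the kernel API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model of the drm_sched_init() change quoted above: the new
 * parameter lets a driver pass its own submission workqueue, and NULL
 * means "allocate a default ordered workqueue for me".  All names are
 * illustrative.
 */
struct wq { bool ordered; };

struct sched {
	struct wq *submit_wq;
	bool owns_wq;	/* scheduler allocated it, so it frees it on fini */
};

static struct wq default_ordered_wq = { .ordered = true };

static int sched_init(struct sched *s, struct wq *submit_wq)
{
	if (submit_wq) {
		s->submit_wq = submit_wq;	/* driver-owned queue */
		s->owns_wq = false;
	} else {
		/* default path taken by the NULL callers in the diffs */
		s->submit_wq = &default_ordered_wq;
		s->owns_wq = true;
	}
	return 0;
}
```

This matches the shape of the conversions above, where existing drivers simply pass NULL and keep their previous behavior.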
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread
2023-10-06 7:59 ` Tvrtko Ursulin
@ 2023-10-06 15:14 ` Matthew Brost
2023-10-06 23:43 ` Matthew Brost
2023-10-11 23:11 ` Luben Tuikov
2023-10-11 23:10 ` Luben Tuikov
1 sibling, 2 replies; 45+ messages in thread
From: Matthew Brost @ 2023-10-06 15:14 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: robdclark, sarah.walker, ketil.johnsen, lina, mcanal, Liviu.Dudau,
dri-devel, intel-xe, Luben Tuikov, dakr, donald.robson,
boris.brezillon, christian.koenig, faith.ekstrand
On Fri, Oct 06, 2023 at 08:59:15AM +0100, Tvrtko Ursulin wrote:
>
> On 05/10/2023 05:13, Luben Tuikov wrote:
> > On 2023-10-04 23:33, Matthew Brost wrote:
> > > On Tue, Sep 26, 2023 at 11:32:10PM -0400, Luben Tuikov wrote:
> > > > Hi,
> > > >
> > > > On 2023-09-19 01:01, Matthew Brost wrote:
> > > > > In XE, the new Intel GPU driver, a choice has been made to have a 1 to 1
> > > > > mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
> > > > > seems a bit odd but let us explain the reasoning below.
> > > > >
> > > > > 1. In XE the submission order from multiple drm_sched_entity is not
> > > > > guaranteed to match the completion order even when targeting the same
> > > > > hardware engine. This is because in XE we have a firmware scheduler, the
> > > > > GuC, which is allowed to reorder, timeslice, and preempt submissions. If
> > > > > a shared drm_gpu_scheduler is used across multiple drm_sched_entity, the
> > > > > TDR falls apart as the TDR expects submission order == completion order.
> > > > > Using a dedicated drm_gpu_scheduler per drm_sched_entity solves this
> > > > > problem.
> > > > >
> > > > > 2. In XE submissions are done via programming a ring buffer (circular
> > > > > buffer), and a drm_gpu_scheduler provides a limit on the number of jobs.
> > > > > If the job limit is set to RING_SIZE / MAX_SIZE_PER_JOB, we get flow
> > > > > control on the ring for free.
> > > > >
> > > > > A problem with this design is that currently a drm_gpu_scheduler uses a
> > > > > kthread for submission / job cleanup. This doesn't scale if a large
> > > > > number of drm_gpu_scheduler are used. To work around the scaling issue,
> > > > > use a worker rather than a kthread for submission / job cleanup.
> > > > >
> > > > > v2:
> > > > > - (Rob Clark) Fix msm build
> > > > > - Pass in run work queue
> > > > > v3:
> > > > > - (Boris) don't have loop in worker
> > > > > v4:
> > > > > - (Tvrtko) break out submit ready, stop, start helpers into own patch
> > > > > v5:
> > > > > - (Boris) default to ordered work queue
> > > > >
> > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > > > ---
> > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
> > > > > drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +-
> > > > > drivers/gpu/drm/lima/lima_sched.c | 2 +-
> > > > > drivers/gpu/drm/msm/msm_ringbuffer.c | 2 +-
> > > > > drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +-
> > > > > drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
> > > > > drivers/gpu/drm/scheduler/sched_main.c | 118 ++++++++++-----------
> > > > > drivers/gpu/drm/v3d/v3d_sched.c | 10 +-
> > > > > include/drm/gpu_scheduler.h | 14 ++-
> > > > > 9 files changed, 79 insertions(+), 75 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > index e366f61c3aed..16f3cfe1574a 100644
> > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > @@ -2279,7 +2279,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
> > > > > break;
> > > > > }
> > > > > - r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
> > > > > + r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL,
> > > > > ring->num_hw_submission, 0,
> > > > > timeout, adev->reset_domain->wq,
> > > > > ring->sched_score, ring->name,
> > > > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > > > > index 345fec6cb1a4..618a804ddc34 100644
> > > > > --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > > > > +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > > > > @@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
> > > > > {
> > > > > int ret;
> > > > > - ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops,
> > > > > + ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
> > > > > etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
> > > > > msecs_to_jiffies(500), NULL, NULL,
> > > > > dev_name(gpu->dev), gpu->dev);
> > > > > diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> > > > > index ffd91a5ee299..8d858aed0e56 100644
> > > > > --- a/drivers/gpu/drm/lima/lima_sched.c
> > > > > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > > > > @@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
> > > > > INIT_WORK(&pipe->recover_work, lima_sched_recover_work);
> > > > > - return drm_sched_init(&pipe->base, &lima_sched_ops, 1,
> > > > > + return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
> > > > > lima_job_hang_limit,
> > > > > msecs_to_jiffies(timeout), NULL,
> > > > > NULL, name, pipe->ldev->dev);
> > > > > diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
> > > > > index 40c0bc35a44c..b8865e61b40f 100644
> > > > > --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
> > > > > +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
> > > > > @@ -94,7 +94,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
> > > > > /* currently managing hangcheck ourselves: */
> > > > > sched_timeout = MAX_SCHEDULE_TIMEOUT;
> > > > > - ret = drm_sched_init(&ring->sched, &msm_sched_ops,
> > > > > + ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
> > > > > num_hw_submissions, 0, sched_timeout,
> > > > > NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
> > > >
> > > > checkpatch.pl complains here about unmatched open parens.
> > > >
> > >
> > > Will fix and run checkpatch before posting next rev.
> > >
> > > > > if (ret) {
> > > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
> > > > > index 88217185e0f3..d458c2227d4f 100644
> > > > > --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
> > > > > +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
> > > > > @@ -429,7 +429,7 @@ int nouveau_sched_init(struct nouveau_drm *drm)
> > > > > if (!drm->sched_wq)
> > > > > return -ENOMEM;
> > > > > - return drm_sched_init(sched, &nouveau_sched_ops,
> > > > > + return drm_sched_init(sched, &nouveau_sched_ops, NULL,
> > > > > NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
> > > > > NULL, NULL, "nouveau_sched", drm->dev->dev);
> > > > > }
> > > > > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > > > > index 033f5e684707..326ca1ddf1d7 100644
> > > > > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > > > > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > > > > @@ -831,7 +831,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
> > > > > js->queue[j].fence_context = dma_fence_context_alloc(1);
> > > > > ret = drm_sched_init(&js->queue[j].sched,
> > > > > - &panfrost_sched_ops,
> > > > > + &panfrost_sched_ops, NULL,
> > > > > nentries, 0,
> > > > > msecs_to_jiffies(JOB_TIMEOUT_MS),
> > > > > pfdev->reset.wq,
> > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > index e4fa62abca41..ee6281942e36 100644
> > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > @@ -48,7 +48,6 @@
> > > > > * through the jobs entity pointer.
> > > > > */
> > > > > -#include <linux/kthread.h>
> > > > > #include <linux/wait.h>
> > > > > #include <linux/sched.h>
> > > > > #include <linux/completion.h>
> > > > > @@ -256,6 +255,16 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
> > > > > return rb ? rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL;
> > > > > }
> > > > > +/**
> > > > > + * drm_sched_submit_queue - scheduler queue submission
> > > >
> > > > > There is no verb in the description, and it is not clear what
> > > > this function does unless one reads the code. Given that this
> > > > is DOC, this should be clearer here. Something like "queue
> > > > scheduler work to be executed" or something to that effect.
> > > >
> > >
> > > Will fix.
> > > > Coming back to this from reading the patch below, it was somewhat
> > > > unclear what "drm_sched_submit_queue()" does, since when reading
> > > > below, "submit" was being read by my mind as an adjective, as opposed
> > > > to a verb. Perhaps something like:
> > > >
> > > > drm_sched_queue_submit(), or
> > > > drm_sched_queue_exec(), or
> > > > drm_sched_queue_push(), or something to that effect. You pick. :-)
> > > >
> > >
> > > I prefer the name as is. In this patch we have:
> > >
> > > drm_sched_submit_queue()
> > > drm_sched_submit_start()
> > > drm_sched_submit_stop()
> > > drm_sched_submit_ready()
> > >
> > > I like that all these functions start with 'drm_sched_submit', which allows
> > > for easy searching for the functions that touch the DRM scheduler
> > > submission state.
> > >
> > > With a little better doc, are you fine with the names as is?
> >
> > Notice the following scheme in the naming,
> >
> > drm_sched_submit_queue()
> > drm_sched_submit_start()
> > drm_sched_submit_stop()
> > drm_sched_submit_ready()
> > \---+---/ \--+-/ \-+-/
> > | | +---> a verb
> > | +---------> should be a noun (something in the component)
> > +------------------> the kernel/software component
> >
> > And although "queue" can technically be used as a verb too, I'd rather it be "enqueue",
> > like this:
> >
> > drm_sched_submit_enqueue()
> >
> > And using "submit" as the noun of the component is a bit cringy,
> > since "submit" is really a verb, and it's cringy to make it a "state"
> > or an "object" we operate on in the DRM Scheduler. "Submission" is
> > a noun, but "submission enqueue/start/stop/ready" doesn't sound
> > very well thought out. "Submission" really is what the work-queue
> > does.
> >
> > I'd rather it be a real object, like for instance,
> >
> > drm_sched_wqueue_enqueue()
> > drm_sched_wqueue_start()
> > drm_sched_wqueue_stop()
> > drm_sched_wqueue_ready()
> >
How about:
drm_sched_submission_enqueue()
drm_sched_submission_start()
drm_sched_submission_stop()
drm_sched_submission_ready()
Matt
> > Which tells me that the component is the DRM Scheduler, the object is a/the work-queue,
> > and the last word as the verb, is the action we're performing on the object, i.e. the work-queue.
> > Plus, all these functions actually do operate on work-queues, directly or indirectly,
> > and are new, so it's a win-win naming scheme.
> >
> > I think that that would be most likeable.
>
> FWIW I was suggesting not to encode the fact submit queue is implemented
> with a workqueue in the API name. IMO it would be nicer and less maintenance
> churn should something change if the external components can be isolated
> from that detail.
>
> drm_sched_submit_queue_$verb? If not viewed as too verbose...
>
> Regards,
>
> Tvrtko
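The ring-buffer flow control described in point 2 of the quoted commit message can be sketched with illustrative numbers (RING_SIZE and MAX_SIZE_PER_JOB below are made up for the example, not Xe's real values):

```c
#include <assert.h>

/*
 * Sketch of point 2: if the scheduler's in-flight job limit is
 * RING_SIZE / MAX_SIZE_PER_JOB, the ring buffer can never overflow
 * even in the worst case, so no separate flow control is needed.
 * The sizes are illustrative only.
 */
#define RING_SIZE		(16 * 1024)	/* bytes in the circular ring */
#define MAX_SIZE_PER_JOB	256		/* worst-case bytes one job emits */
#define JOB_LIMIT		(RING_SIZE / MAX_SIZE_PER_JOB)

/* Worst-case ring usage with @jobs jobs in flight. */
static int ring_bytes_used(int jobs)
{
	return jobs * MAX_SIZE_PER_JOB;
}
```

With these numbers the limit works out to 64 jobs, and one more job than the limit would be able to overflow the ring, which is exactly why the cap gives flow control "for free".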
* Re: [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread
2023-10-06 15:14 ` Matthew Brost
@ 2023-10-06 23:43 ` Matthew Brost
2023-10-09 8:35 ` Tvrtko Ursulin
2023-10-11 23:19 ` Luben Tuikov
2023-10-11 23:11 ` Luben Tuikov
1 sibling, 2 replies; 45+ messages in thread
From: Matthew Brost @ 2023-10-06 23:43 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: robdclark, sarah.walker, ketil.johnsen, lina, mcanal, Liviu.Dudau,
dri-devel, christian.koenig, Luben Tuikov, dakr, donald.robson,
boris.brezillon, intel-xe, faith.ekstrand
On Fri, Oct 06, 2023 at 03:14:04PM +0000, Matthew Brost wrote:
> On Fri, Oct 06, 2023 at 08:59:15AM +0100, Tvrtko Ursulin wrote:
> >
> > On 05/10/2023 05:13, Luben Tuikov wrote:
> > > On 2023-10-04 23:33, Matthew Brost wrote:
> > > > On Tue, Sep 26, 2023 at 11:32:10PM -0400, Luben Tuikov wrote:
> > > > > Hi,
> > > > >
> > > > > On 2023-09-19 01:01, Matthew Brost wrote:
> > > > > > [snip]
> > >
> > > Notice the following scheme in the naming,
> > >
> > > drm_sched_submit_queue()
> > > drm_sched_submit_start()
> > > drm_sched_submit_stop()
> > > drm_sched_submit_ready()
> > > \---+---/ \--+-/ \-+-/
> > > | | +---> a verb
> > > | +---------> should be a noun (something in the component)
> > > +------------------> the kernel/software component
> > >
> > > And although "queue" can technically be used as a verb too, I'd rather it be "enqueue",
> > > like this:
> > >
> > > drm_sched_submit_enqueue()
> > >
> > > And using "submit" as the noun of the component is a bit cringy,
> > > since "submit" is really a verb, and it's cringy to make it a "state"
> > > or an "object" we operate on in the DRM Scheduler. "Submission" is
> > > a noun, but "submission enqueue/start/stop/ready" doesn't sound
> > > very well thought out. "Submission" really is what the work-queue
> > > does.
> > >
> > > I'd rather it be a real object, like for instance,
> > >
> > > drm_sched_wqueue_enqueue()
> > > drm_sched_wqueue_start()
> > > drm_sched_wqueue_stop()
> > > drm_sched_wqueue_ready()
> > >
>
> How about:
>
> drm_sched_submission_enqueue()
> drm_sched_submission_start()
> drm_sched_submission_stop()
> drm_sched_submission_ready()
>
> Matt
Ignore this; I read Tvrtko's comment but not Luben's fully.
I prefer drm_sched_wqueue over drm_sched_submit_queue, as "submit queue"
is a made-up thing. drm_sched_submission would be my top choice, but if
Luben is opposed I will go with drm_sched_wqueue in the next rev.
Matt
>
> > > Which tells me that the component is the DRM Scheduler, the object is a/the work-queue,
> > > and the last word as the verb, is the action we're performing on the object, i.e. the work-queue.
> > > Plus, all these functions actually do operate on work-queues, directly or indirectly,
> > > and are new, so it's a win-win naming scheme.
> > >
> > > I think that that would be most likeable.
> >
> > FWIW I was suggesting not to encode the fact submit queue is implemented
> > with a workqueue in the API name. IMO it would be nicer and less maintenance
> > churn should something change if the external components can be isolated
> > from that detail.
> >
> > drm_sched_submit_queue_$verb? If not viewed as too verbose...
> >
> > Regards,
> >
> > Tvrtko
* Re: [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread
2023-10-06 23:43 ` Matthew Brost
@ 2023-10-09 8:35 ` Tvrtko Ursulin
2023-10-11 23:19 ` Luben Tuikov
1 sibling, 0 replies; 45+ messages in thread
From: Tvrtko Ursulin @ 2023-10-09 8:35 UTC (permalink / raw)
To: Matthew Brost
Cc: robdclark, sarah.walker, ketil.johnsen, lina, mcanal, Liviu.Dudau,
dri-devel, christian.koenig, Luben Tuikov, dakr, donald.robson,
boris.brezillon, intel-xe, faith.ekstrand
On 07/10/2023 00:43, Matthew Brost wrote:
> On Fri, Oct 06, 2023 at 03:14:04PM +0000, Matthew Brost wrote:
>> On Fri, Oct 06, 2023 at 08:59:15AM +0100, Tvrtko Ursulin wrote:
>>>
>>> On 05/10/2023 05:13, Luben Tuikov wrote:
>>>> On 2023-10-04 23:33, Matthew Brost wrote:
>>>>> On Tue, Sep 26, 2023 at 11:32:10PM -0400, Luben Tuikov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 2023-09-19 01:01, Matthew Brost wrote:
>>>>>>> [snip]
>>>>>>> +/**
>>>>>>> + * drm_sched_submit_queue - scheduler queue submission
>>>>>>
>>>>>> There is no verb in the description, and it is not clear what
>>>>>> this function does unless one reads the code. Given that this
>>>>>> is DOC, this should be clearer here. Something like "queue
>>>>>> scheduler work to be executed" or something to that effect.
>>>>>>
>>>>>
>>>>> Will fix.
>>>>>> Coming back to this from reading the patch below, it was somewhat
>>>>>> unclear what "drm_sched_submit_queue()" does, since when reading
>>>>>> below, "submit" was being read by my mind as an adjective, as opposed
>>>>>> to a verb. Perhaps something like:
>>>>>>
>>>>>> drm_sched_queue_submit(), or
>>>>>> drm_sched_queue_exec(), or
>>>>>> drm_sched_queue_push(), or something to that effect. You pick. :-)
>>>>>>
>>>>>
>>>>> I prefer the name as is. In this patch we have:
>>>>>
>>>>> drm_sched_submit_queue()
>>>>> drm_sched_submit_start()
>>>>> drm_sched_submit_stop()
>>>>> drm_sched_submit_ready()
>>>>>
>>>>> I like that all these functions start with 'drm_sched_submit', which allows
>>>>> for easy searching for the functions that touch the DRM scheduler
>>>>> submission state.
>>>>>
>>>>> With a little better doc, are you fine with the names as is?
>>>>
>>>> Notice the following scheme in the naming,
>>>>
>>>> drm_sched_submit_queue()
>>>> drm_sched_submit_start()
>>>> drm_sched_submit_stop()
>>>> drm_sched_submit_ready()
>>>> \---+---/ \--+-/ \-+-/
>>>> | | +---> a verb
>>>> | +---------> should be a noun (something in the component)
>>>> +------------------> the kernel/software component
>>>>
>>>> And although "queue" can technically be used as a verb too, I'd rather it be "enqueue",
>>>> like this:
>>>>
>>>> drm_sched_submit_enqueue()
>>>>
>>>> And using "submit" as the noun of the component is a bit cringy,
>>>> since "submit" is really a verb, and it's cringy to make it a "state"
>>>> or an "object" we operate on in the DRM Scheduler. "Submission" is
>>>> a noun, but "submission enqueue/start/stop/ready" doesn't sound
>>>> very well thought out. "Submission" really is what the work-queue
>>>> does.
>>>>
>>>> I'd rather it be a real object, like for instance,
>>>>
>>>> drm_sched_wqueue_enqueue()
>>>> drm_sched_wqueue_start()
>>>> drm_sched_wqueue_stop()
>>>> drm_sched_wqueue_ready()
>>>>
>>
>> How about:
>>
>> drm_sched_submission_enqueue()
>> drm_sched_submission_start()
>> drm_sched_submission_stop()
>> drm_sched_submission_ready()
>>
>> Matt
>
> Ignore this, I read Tvrtko's comment and not Luben's fully.
>
> I prefer drm_sched_wqueue over drm_sched_submit_queue as submit queue is
> a made of thing. drm_sched_submission would be my top choice, but if Luben
> is opposed I will go with drm_sched_wqueue in the next rev.
I suppose you meant "made up"? All the verbs are also then made up so I
don't really see that as an argument for why an implementation detail should
be encoded into the API naming, but your call folks.
Regards,
Tvrtko
>>>> Which tells me that the component is the DRM Scheduler, the object is a/the work-queue,
>>>> and the last word as the verb, is the action we're performing on the object, i.e. the work-queue.
>>>> Plus, all these functions actually do operate on work-queues, directly or
>>>> indirectly, and are all new, so it's a win-win naming scheme.
>>>>
>>>> I think that that would be most likeable.
>>>
>>> FWIW I was suggesting not to encode the fact that the submit queue is implemented
>>> with a workqueue in the API name. IMO it would be nicer and less maintenance
>>> churn should something change if the external components can be isolated
>>> from that detail.
>>>
>>> drm_sched_submit_queue_$verb? If not viewed as too verbose...
>>>
>>> Regards,
>>>
>>> Tvrtko
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread
2023-10-06 7:59 ` Tvrtko Ursulin
2023-10-06 15:14 ` Matthew Brost
@ 2023-10-11 23:10 ` Luben Tuikov
1 sibling, 0 replies; 45+ messages in thread
From: Luben Tuikov @ 2023-10-11 23:10 UTC (permalink / raw)
To: Tvrtko Ursulin, Matthew Brost
Cc: robdclark, sarah.walker, ketil.johnsen, lina, mcanal, Liviu.Dudau,
dri-devel, christian.koenig, boris.brezillon, dakr, donald.robson,
intel-xe, faith.ekstrand
On 2023-10-06 03:59, Tvrtko Ursulin wrote:
>
> On 05/10/2023 05:13, Luben Tuikov wrote:
>> On 2023-10-04 23:33, Matthew Brost wrote:
>>> On Tue, Sep 26, 2023 at 11:32:10PM -0400, Luben Tuikov wrote:
>>>> Hi,
>>>>
>>>> On 2023-09-19 01:01, Matthew Brost wrote:
>>>>> In XE, the new Intel GPU driver, a choice has been made to have a 1 to 1
>>>>> mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
>>>>> seems a bit odd but let us explain the reasoning below.
>>>>>
>>>>> 1. In XE the submission order from multiple drm_sched_entity is not
>>>>> guaranteed to match completion order even when targeting the same hardware
>>>>> engine. This is because in XE we have a firmware scheduler, the GuC,
>>>>> which is allowed to reorder, timeslice, and preempt submissions. If using a
>>>>> shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
>>>>> apart as the TDR expects submission order == completion order. Using a
>>>>> dedicated drm_gpu_scheduler per drm_sched_entity solves this problem.
>>>>>
>>>>> 2. In XE submissions are done via programming a ring buffer (circular
>>>>> buffer), and a drm_gpu_scheduler provides a limit on the number of jobs; if
>>>>> that limit is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
>>>>> control on the ring for free.
>>>>>
>>>>> A problem with this design is that currently a drm_gpu_scheduler uses a
>>>>> kthread for submission / job cleanup. This doesn't scale if a large
>>>>> number of drm_gpu_schedulers are used. To work around the scaling issue,
>>>>> use a worker rather than a kthread for submission / job cleanup.
>>>>>
>>>>> v2:
>>>>> - (Rob Clark) Fix msm build
>>>>> - Pass in run work queue
>>>>> v3:
>>>>> - (Boris) don't have loop in worker
>>>>> v4:
>>>>> - (Tvrtko) break out submit ready, stop, start helpers into own patch
>>>>> v5:
>>>>> - (Boris) default to ordered work queue
>>>>>
>>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>>> ---
>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>>>>> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +-
>>>>> drivers/gpu/drm/lima/lima_sched.c | 2 +-
>>>>> drivers/gpu/drm/msm/msm_ringbuffer.c | 2 +-
>>>>> drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +-
>>>>> drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
>>>>> drivers/gpu/drm/scheduler/sched_main.c | 118 ++++++++++-----------
>>>>> drivers/gpu/drm/v3d/v3d_sched.c | 10 +-
>>>>> include/drm/gpu_scheduler.h | 14 ++-
>>>>> 9 files changed, 79 insertions(+), 75 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index e366f61c3aed..16f3cfe1574a 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -2279,7 +2279,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
>>>>> break;
>>>>> }
>>>>>
>>>>> - r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
>>>>> + r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL,
>>>>> ring->num_hw_submission, 0,
>>>>> timeout, adev->reset_domain->wq,
>>>>> ring->sched_score, ring->name,
>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>> index 345fec6cb1a4..618a804ddc34 100644
>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>> @@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
>>>>> {
>>>>> int ret;
>>>>>
>>>>> - ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops,
>>>>> + ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
>>>>> etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
>>>>> msecs_to_jiffies(500), NULL, NULL,
>>>>> dev_name(gpu->dev), gpu->dev);
>>>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>>>> index ffd91a5ee299..8d858aed0e56 100644
>>>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>>>> @@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
>>>>>
>>>>> INIT_WORK(&pipe->recover_work, lima_sched_recover_work);
>>>>>
>>>>> - return drm_sched_init(&pipe->base, &lima_sched_ops, 1,
>>>>> + return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
>>>>> lima_job_hang_limit,
>>>>> msecs_to_jiffies(timeout), NULL,
>>>>> NULL, name, pipe->ldev->dev);
>>>>> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
>>>>> index 40c0bc35a44c..b8865e61b40f 100644
>>>>> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
>>>>> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
>>>>> @@ -94,7 +94,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
>>>>> /* currently managing hangcheck ourselves: */
>>>>> sched_timeout = MAX_SCHEDULE_TIMEOUT;
>>>>>
>>>>> - ret = drm_sched_init(&ring->sched, &msm_sched_ops,
>>>>> + ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
>>>>> num_hw_submissions, 0, sched_timeout,
>>>>> NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
>>>>
>>>> checkpatch.pl complains here about unmatched open parens.
>>>>
>>>
>>> Will fix and run checkpatch before posting next rev.
>>>
>>>>> if (ret) {
>>>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
>>>>> index 88217185e0f3..d458c2227d4f 100644
>>>>> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
>>>>> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
>>>>> @@ -429,7 +429,7 @@ int nouveau_sched_init(struct nouveau_drm *drm)
>>>>> if (!drm->sched_wq)
>>>>> return -ENOMEM;
>>>>>
>>>>> - return drm_sched_init(sched, &nouveau_sched_ops,
>>>>> + return drm_sched_init(sched, &nouveau_sched_ops, NULL,
>>>>> NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
>>>>> NULL, NULL, "nouveau_sched", drm->dev->dev);
>>>>> }
>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>> index 033f5e684707..326ca1ddf1d7 100644
>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>> @@ -831,7 +831,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>>>>> js->queue[j].fence_context = dma_fence_context_alloc(1);
>>>>>
>>>>> ret = drm_sched_init(&js->queue[j].sched,
>>>>> - &panfrost_sched_ops,
>>>>> + &panfrost_sched_ops, NULL,
>>>>> nentries, 0,
>>>>> msecs_to_jiffies(JOB_TIMEOUT_MS),
>>>>> pfdev->reset.wq,
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index e4fa62abca41..ee6281942e36 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -48,7 +48,6 @@
>>>>> * through the jobs entity pointer.
>>>>> */
>>>>>
>>>>> -#include <linux/kthread.h>
>>>>> #include <linux/wait.h>
>>>>> #include <linux/sched.h>
>>>>> #include <linux/completion.h>
>>>>> @@ -256,6 +255,16 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>>>>> return rb ? rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL;
>>>>> }
>>>>>
>>>>> +/**
>>>>> + * drm_sched_submit_queue - scheduler queue submission
>>>>
>>>> There is no verb in the description, and it is not clear what
>>>> this function does unless one reads the code. Given that this
>>>> is DOC, this should be clearer here. Something like "queue
>>>> scheduler work to be executed" or something to that effect.
>>>>
>>>
>>> Will fix.
>>>
>>>> Coming back to this from reading the patch below, it was somewhat
>>>> unclear what "drm_sched_submit_queue()" does, since when reading
>>>> below, "submit" was being read by my mind as an adjective, as opposed
>>>> to a verb. Perhaps something like:
>>>>
>>>> drm_sched_queue_submit(), or
>>>> drm_sched_queue_exec(), or
>>>> drm_sched_queue_push(), or something to that effect. You pick. :-)
>>>>
>>>
>>> I prefer the name as is. In this patch we have:
>>>
>>> drm_sched_submit_queue()
>>> drm_sched_submit_start()
>>> drm_sched_submit_stop()
>>> drm_sched_submit_ready()
>>>
>>> I like that all these functions start with 'drm_sched_submit', which allows
>>> for easy searching for the functions that touch the DRM scheduler
>>> submission state.
>>>
>>> With a little better doc, are you fine with the names as is?
>>
>> Notice the following scheme in the naming,
>>
>> drm_sched_submit_queue()
>> drm_sched_submit_start()
>> drm_sched_submit_stop()
>> drm_sched_submit_ready()
>> \---+---/ \--+-/ \-+-/
>> | | +---> a verb
>> | +---------> should be a noun (something in the component)
>> +------------------> the kernel/software component
>>
>> And although "queue" can technically be used as a verb too, I'd rather it be "enqueue",
>> like this:
>>
>> drm_sched_submit_enqueue()
>>
>> And using "submit" as the noun of the component is a bit cringy,
>> since "submit" is really a verb, and it's cringy to make it a "state"
>> or an "object" we operate on in the DRM Scheduler. "Submission" is
>> a noun, but "submission enqueue/start/stop/ready" doesn't sound
>> very well thought out. "Submission" really is what the work-queue
>> does.
>>
>> I'd rather it be a real object, like for instance,
>>
>> drm_sched_wqueue_enqueue()
>> drm_sched_wqueue_start()
>> drm_sched_wqueue_stop()
>> drm_sched_wqueue_ready()
>>
>> Which tells me that the component is the DRM Scheduler, the object is a/the work-queue,
>> and the last word as the verb, is the action we're performing on the object, i.e. the work-queue.
>> Plus, all these functions actually do operate on work-queues, directly or
>> indirectly, and are all new, so it's a win-win naming scheme.
>>
>> I think that that would be most likeable.
>
> FWIW I was suggesting not to encode the fact that the submit queue is implemented
No. Overengineering.
> with a workqueue in the API name. IMO it would be nicer and less
> maintenance churn should something change if the external components
> can be isolated from that detail.
>
> drm_sched_submit_queue_$verb? If not viewed as too verbose...
No.
That sounds like unnecessary overengineering: "what if". No.
--
Regards,
Luben
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread
2023-10-06 15:14 ` Matthew Brost
2023-10-06 23:43 ` Matthew Brost
@ 2023-10-11 23:11 ` Luben Tuikov
1 sibling, 0 replies; 45+ messages in thread
From: Luben Tuikov @ 2023-10-11 23:11 UTC (permalink / raw)
To: Matthew Brost, Tvrtko Ursulin
Cc: robdclark, sarah.walker, ketil.johnsen, lina, mcanal, Liviu.Dudau,
dri-devel, intel-xe, boris.brezillon, dakr, donald.robson,
christian.koenig, faith.ekstrand
On 2023-10-06 11:14, Matthew Brost wrote:
> On Fri, Oct 06, 2023 at 08:59:15AM +0100, Tvrtko Ursulin wrote:
>>
>> On 05/10/2023 05:13, Luben Tuikov wrote:
>>> On 2023-10-04 23:33, Matthew Brost wrote:
>>>> On Tue, Sep 26, 2023 at 11:32:10PM -0400, Luben Tuikov wrote:
>>>>> Hi,
>>>>>
>>>>> On 2023-09-19 01:01, Matthew Brost wrote:
>>>>>> In XE, the new Intel GPU driver, a choice has been made to have a 1 to 1
>>>>>> mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
>>>>>> seems a bit odd but let us explain the reasoning below.
>>>>>>
>>>>>> 1. In XE the submission order from multiple drm_sched_entity is not
>>>>>> guaranteed to match completion order even when targeting the same hardware
>>>>>> engine. This is because in XE we have a firmware scheduler, the GuC,
>>>>>> which is allowed to reorder, timeslice, and preempt submissions. If using a
>>>>>> shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
>>>>>> apart as the TDR expects submission order == completion order. Using a
>>>>>> dedicated drm_gpu_scheduler per drm_sched_entity solves this problem.
>>>>>>
>>>>>> 2. In XE submissions are done via programming a ring buffer (circular
>>>>>> buffer), and a drm_gpu_scheduler provides a limit on the number of jobs; if
>>>>>> that limit is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
>>>>>> control on the ring for free.
>>>>>>
>>>>>> A problem with this design is that currently a drm_gpu_scheduler uses a
>>>>>> kthread for submission / job cleanup. This doesn't scale if a large
>>>>>> number of drm_gpu_schedulers are used. To work around the scaling issue,
>>>>>> use a worker rather than a kthread for submission / job cleanup.
>>>>>>
>>>>>> v2:
>>>>>> - (Rob Clark) Fix msm build
>>>>>> - Pass in run work queue
>>>>>> v3:
>>>>>> - (Boris) don't have loop in worker
>>>>>> v4:
>>>>>> - (Tvrtko) break out submit ready, stop, start helpers into own patch
>>>>>> v5:
>>>>>> - (Boris) default to ordered work queue
>>>>>>
>>>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>>>> ---
>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>>>>>> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +-
>>>>>> drivers/gpu/drm/lima/lima_sched.c | 2 +-
>>>>>> drivers/gpu/drm/msm/msm_ringbuffer.c | 2 +-
>>>>>> drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +-
>>>>>> drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
>>>>>> drivers/gpu/drm/scheduler/sched_main.c | 118 ++++++++++-----------
>>>>>> drivers/gpu/drm/v3d/v3d_sched.c | 10 +-
>>>>>> include/drm/gpu_scheduler.h | 14 ++-
>>>>>> 9 files changed, 79 insertions(+), 75 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> index e366f61c3aed..16f3cfe1574a 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> @@ -2279,7 +2279,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
>>>>>> break;
>>>>>> }
>>>>>> - r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
>>>>>> + r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL,
>>>>>> ring->num_hw_submission, 0,
>>>>>> timeout, adev->reset_domain->wq,
>>>>>> ring->sched_score, ring->name,
>>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>> index 345fec6cb1a4..618a804ddc34 100644
>>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>> @@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
>>>>>> {
>>>>>> int ret;
>>>>>> - ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops,
>>>>>> + ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
>>>>>> etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
>>>>>> msecs_to_jiffies(500), NULL, NULL,
>>>>>> dev_name(gpu->dev), gpu->dev);
>>>>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>>>>> index ffd91a5ee299..8d858aed0e56 100644
>>>>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>>>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>>>>> @@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
>>>>>> INIT_WORK(&pipe->recover_work, lima_sched_recover_work);
>>>>>> - return drm_sched_init(&pipe->base, &lima_sched_ops, 1,
>>>>>> + return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
>>>>>> lima_job_hang_limit,
>>>>>> msecs_to_jiffies(timeout), NULL,
>>>>>> NULL, name, pipe->ldev->dev);
>>>>>> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
>>>>>> index 40c0bc35a44c..b8865e61b40f 100644
>>>>>> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
>>>>>> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
>>>>>> @@ -94,7 +94,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
>>>>>> /* currently managing hangcheck ourselves: */
>>>>>> sched_timeout = MAX_SCHEDULE_TIMEOUT;
>>>>>> - ret = drm_sched_init(&ring->sched, &msm_sched_ops,
>>>>>> + ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
>>>>>> num_hw_submissions, 0, sched_timeout,
>>>>>> NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
>>>>>
>>>>> checkpatch.pl complains here about unmatched open parens.
>>>>>
>>>>
>>>> Will fix and run checkpatch before posting next rev.
>>>>
>>>>>> if (ret) {
>>>>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
>>>>>> index 88217185e0f3..d458c2227d4f 100644
>>>>>> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
>>>>>> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
>>>>>> @@ -429,7 +429,7 @@ int nouveau_sched_init(struct nouveau_drm *drm)
>>>>>> if (!drm->sched_wq)
>>>>>> return -ENOMEM;
>>>>>> - return drm_sched_init(sched, &nouveau_sched_ops,
>>>>>> + return drm_sched_init(sched, &nouveau_sched_ops, NULL,
>>>>>> NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
>>>>>> NULL, NULL, "nouveau_sched", drm->dev->dev);
>>>>>> }
>>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>> index 033f5e684707..326ca1ddf1d7 100644
>>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>> @@ -831,7 +831,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>>>>>> js->queue[j].fence_context = dma_fence_context_alloc(1);
>>>>>> ret = drm_sched_init(&js->queue[j].sched,
>>>>>> - &panfrost_sched_ops,
>>>>>> + &panfrost_sched_ops, NULL,
>>>>>> nentries, 0,
>>>>>> msecs_to_jiffies(JOB_TIMEOUT_MS),
>>>>>> pfdev->reset.wq,
>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> index e4fa62abca41..ee6281942e36 100644
>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> @@ -48,7 +48,6 @@
>>>>>> * through the jobs entity pointer.
>>>>>> */
>>>>>> -#include <linux/kthread.h>
>>>>>> #include <linux/wait.h>
>>>>>> #include <linux/sched.h>
>>>>>> #include <linux/completion.h>
>>>>>> @@ -256,6 +255,16 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>>>>>> return rb ? rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL;
>>>>>> }
>>>>>> +/**
>>>>>> + * drm_sched_submit_queue - scheduler queue submission
>>>>>
>>>>> There is no verb in the description, and it is not clear what
>>>>> this function does unless one reads the code. Given that this
>>>>> is DOC, this should be clearer here. Something like "queue
>>>>> scheduler work to be executed" or something to that effect.
>>>>>
>>>>
>>>> Will fix.
>>>>> Coming back to this from reading the patch below, it was somewhat
>>>>> unclear what "drm_sched_submit_queue()" does, since when reading
>>>>> below, "submit" was being read by my mind as an adjective, as opposed
>>>>> to a verb. Perhaps something like:
>>>>>
>>>>> drm_sched_queue_submit(), or
>>>>> drm_sched_queue_exec(), or
>>>>> drm_sched_queue_push(), or something to that effect. You pick. :-)
>>>>>
>>>>
>>>> I prefer the name as is. In this patch we have:
>>>>
>>>> drm_sched_submit_queue()
>>>> drm_sched_submit_start()
>>>> drm_sched_submit_stop()
>>>> drm_sched_submit_ready()
>>>>
>>>> I like that all these functions start with 'drm_sched_submit', which allows
>>>> for easy searching for the functions that touch the DRM scheduler
>>>> submission state.
>>>>
>>>> With a little better doc, are you fine with the names as is?
>>>
>>> Notice the following scheme in the naming,
>>>
>>> drm_sched_submit_queue()
>>> drm_sched_submit_start()
>>> drm_sched_submit_stop()
>>> drm_sched_submit_ready()
>>> \---+---/ \--+-/ \-+-/
>>> | | +---> a verb
>>> | +---------> should be a noun (something in the component)
>>> +------------------> the kernel/software component
>>>
>>> And although "queue" can technically be used as a verb too, I'd rather it be "enqueue",
>>> like this:
>>>
>>> drm_sched_submit_enqueue()
>>>
>>> And using "submit" as the noun of the component is a bit cringy,
>>> since "submit" is really a verb, and it's cringy to make it a "state"
>>> or an "object" we operate on in the DRM Scheduler. "Submission" is
>>> a noun, but "submission enqueue/start/stop/ready" doesn't sound
>>> very well thought out. "Submission" really is what the work-queue
>>> does.
>>>
>>> I'd rather it be a real object, like for instance,
>>>
>>> drm_sched_wqueue_enqueue()
>>> drm_sched_wqueue_start()
>>> drm_sched_wqueue_stop()
>>> drm_sched_wqueue_ready()
>>>
>
> How about:
>
> drm_sched_submission_enqueue()
> drm_sched_submission_start()
> drm_sched_submission_stop()
> drm_sched_submission_ready()
No.
--
Regards,
Luben
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread
2023-10-06 23:43 ` Matthew Brost
2023-10-09 8:35 ` Tvrtko Ursulin
@ 2023-10-11 23:19 ` Luben Tuikov
1 sibling, 0 replies; 45+ messages in thread
From: Luben Tuikov @ 2023-10-11 23:19 UTC (permalink / raw)
To: Matthew Brost, Tvrtko Ursulin
Cc: robdclark, sarah.walker, ketil.johnsen, lina, mcanal, Liviu.Dudau,
dri-devel, christian.koenig, boris.brezillon, dakr, donald.robson,
intel-xe, faith.ekstrand
On 2023-10-06 19:43, Matthew Brost wrote:
> On Fri, Oct 06, 2023 at 03:14:04PM +0000, Matthew Brost wrote:
>> On Fri, Oct 06, 2023 at 08:59:15AM +0100, Tvrtko Ursulin wrote:
>>>
>>> On 05/10/2023 05:13, Luben Tuikov wrote:
>>>> On 2023-10-04 23:33, Matthew Brost wrote:
>>>>> On Tue, Sep 26, 2023 at 11:32:10PM -0400, Luben Tuikov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 2023-09-19 01:01, Matthew Brost wrote:
>>>>>>> In XE, the new Intel GPU driver, a choice has been made to have a 1 to 1
>>>>>>> mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
>>>>>>> seems a bit odd but let us explain the reasoning below.
>>>>>>>
>>>>>>> 1. In XE the submission order from multiple drm_sched_entity is not
>>>>>>> guaranteed to match completion order even when targeting the same hardware
>>>>>>> engine. This is because in XE we have a firmware scheduler, the GuC,
>>>>>>> which is allowed to reorder, timeslice, and preempt submissions. If using a
>>>>>>> shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
>>>>>>> apart as the TDR expects submission order == completion order. Using a
>>>>>>> dedicated drm_gpu_scheduler per drm_sched_entity solves this problem.
>>>>>>>
>>>>>>> 2. In XE submissions are done via programming a ring buffer (circular
>>>>>>> buffer), and a drm_gpu_scheduler provides a limit on the number of jobs; if
>>>>>>> that limit is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
>>>>>>> control on the ring for free.
>>>>>>>
>>>>>>> A problem with this design is that currently a drm_gpu_scheduler uses a
>>>>>>> kthread for submission / job cleanup. This doesn't scale if a large
>>>>>>> number of drm_gpu_schedulers are used. To work around the scaling issue,
>>>>>>> use a worker rather than a kthread for submission / job cleanup.
>>>>>>>
>>>>>>> v2:
>>>>>>> - (Rob Clark) Fix msm build
>>>>>>> - Pass in run work queue
>>>>>>> v3:
>>>>>>> - (Boris) don't have loop in worker
>>>>>>> v4:
>>>>>>> - (Tvrtko) break out submit ready, stop, start helpers into own patch
>>>>>>> v5:
>>>>>>> - (Boris) default to ordered work queue
>>>>>>>
>>>>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>> ---
>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>>>>>>> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +-
>>>>>>> drivers/gpu/drm/lima/lima_sched.c | 2 +-
>>>>>>> drivers/gpu/drm/msm/msm_ringbuffer.c | 2 +-
>>>>>>> drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +-
>>>>>>> drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
>>>>>>> drivers/gpu/drm/scheduler/sched_main.c | 118 ++++++++++-----------
>>>>>>> drivers/gpu/drm/v3d/v3d_sched.c | 10 +-
>>>>>>> include/drm/gpu_scheduler.h | 14 ++-
>>>>>>> 9 files changed, 79 insertions(+), 75 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> index e366f61c3aed..16f3cfe1574a 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> @@ -2279,7 +2279,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
>>>>>>> break;
>>>>>>> }
>>>>>>> - r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
>>>>>>> + r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL,
>>>>>>> ring->num_hw_submission, 0,
>>>>>>> timeout, adev->reset_domain->wq,
>>>>>>> ring->sched_score, ring->name,
>>>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>> index 345fec6cb1a4..618a804ddc34 100644
>>>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>> @@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
>>>>>>> {
>>>>>>> int ret;
>>>>>>> - ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops,
>>>>>>> + ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
>>>>>>> etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
>>>>>>> msecs_to_jiffies(500), NULL, NULL,
>>>>>>> dev_name(gpu->dev), gpu->dev);
>>>>>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>>>>>> index ffd91a5ee299..8d858aed0e56 100644
>>>>>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>>>>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>>>>>> @@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
>>>>>>> INIT_WORK(&pipe->recover_work, lima_sched_recover_work);
>>>>>>> - return drm_sched_init(&pipe->base, &lima_sched_ops, 1,
>>>>>>> + return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
>>>>>>> lima_job_hang_limit,
>>>>>>> msecs_to_jiffies(timeout), NULL,
>>>>>>> NULL, name, pipe->ldev->dev);
>>>>>>> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
>>>>>>> index 40c0bc35a44c..b8865e61b40f 100644
>>>>>>> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
>>>>>>> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
>>>>>>> @@ -94,7 +94,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
>>>>>>> /* currently managing hangcheck ourselves: */
>>>>>>> sched_timeout = MAX_SCHEDULE_TIMEOUT;
>>>>>>> - ret = drm_sched_init(&ring->sched, &msm_sched_ops,
>>>>>>> + ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
>>>>>>> num_hw_submissions, 0, sched_timeout,
>>>>>>> NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
>>>>>>
>>>>>> checkpatch.pl complains here about unmatched open parens.
>>>>>>
>>>>>
>>>>> Will fix and run checkpatch before posting next rev.
>>>>>
>>>>>>> if (ret) {
>>>>>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
>>>>>>> index 88217185e0f3..d458c2227d4f 100644
>>>>>>> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
>>>>>>> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
>>>>>>> @@ -429,7 +429,7 @@ int nouveau_sched_init(struct nouveau_drm *drm)
>>>>>>> if (!drm->sched_wq)
>>>>>>> return -ENOMEM;
>>>>>>> - return drm_sched_init(sched, &nouveau_sched_ops,
>>>>>>> + return drm_sched_init(sched, &nouveau_sched_ops, NULL,
>>>>>>> NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
>>>>>>> NULL, NULL, "nouveau_sched", drm->dev->dev);
>>>>>>> }
>>>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>> index 033f5e684707..326ca1ddf1d7 100644
>>>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>> @@ -831,7 +831,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>>>>>>> js->queue[j].fence_context = dma_fence_context_alloc(1);
>>>>>>> ret = drm_sched_init(&js->queue[j].sched,
>>>>>>> - &panfrost_sched_ops,
>>>>>>> + &panfrost_sched_ops, NULL,
>>>>>>> nentries, 0,
>>>>>>> msecs_to_jiffies(JOB_TIMEOUT_MS),
>>>>>>> pfdev->reset.wq,
>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> index e4fa62abca41..ee6281942e36 100644
>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> @@ -48,7 +48,6 @@
>>>>>>> * through the jobs entity pointer.
>>>>>>> */
>>>>>>> -#include <linux/kthread.h>
>>>>>>> #include <linux/wait.h>
>>>>>>> #include <linux/sched.h>
>>>>>>> #include <linux/completion.h>
>>>>>>> @@ -256,6 +255,16 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>>>>>>> return rb ? rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL;
>>>>>>> }
>>>>>>> +/**
>>>>>>> + * drm_sched_submit_queue - scheduler queue submission
>>>>>>
>>>>>> There is no verb in the description, and it is not clear what
>>>>>> this function does unless one reads the code. Given that this
>>>>>> is DOC, this should be clearer here. Something like "queue
>>>>>> scheduler work to be executed" or something to that effect.
>>>>>>
>>>>>
>>>>> Will fix.
>>>>>> Coming back to this from reading the patch below, it was somewhat
>>>>>> unclear what "drm_sched_submit_queue()" does, since when reading
>>>>>> below, "submit" was being read by my mind as an adjective, as opposed
>>>>>> to a verb. Perhaps something like:
>>>>>>
>>>>>> drm_sched_queue_submit(), or
>>>>>> drm_sched_queue_exec(), or
>>>>>> drm_sched_queue_push(), or something to that effect. You pick. :-)
>>>>>>
>>>>>
>>>>> I prefer the name as is. In this patch we have:
>>>>>
>>>>> drm_sched_submit_queue()
>>>>> drm_sched_submit_start()
>>>>> drm_sched_submit_stop()
>>>>> drm_sched_submit_ready()
>>>>>
>>>>> I like that all these functions start with 'drm_sched_submit', which allows
>>>>> for easy searching for the functions that touch the DRM scheduler
>>>>> submission state.
>>>>>
>>>>> With a little better doc, are you fine with the names as is?
>>>>
>>>> Notice the following scheme in the naming,
>>>>
>>>> drm_sched_submit_queue()
>>>> drm_sched_submit_start()
>>>> drm_sched_submit_stop()
>>>> drm_sched_submit_ready()
>>>> \---+---/ \--+-/ \-+-/
>>>> | | +---> a verb
>>>> | +---------> should be a noun (something in the component)
>>>> +------------------> the kernel/software component
>>>>
>>>> And although "queue" can technically be used as a verb too, I'd rather it be "enqueue",
>>>> like this:
>>>>
>>>> drm_sched_submit_enqueue()
>>>>
>>>> And using "submit" as the noun of the component is a bit cringy,
>>>> since "submit" is really a verb, and it's cringy to make it a "state"
>>>> or an "object" we operate on in the DRM Scheduler. "Submission" is
>>>> a noun, but "submission enqueue/start/stop/ready" doesn't sound
>>>> very well thought out. "Submission" really is what the work-queue
>>>> does.
^^^^^^^^^^^^^^^^^^^ Here ^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>
>>>> I'd rather it be a real object, like for instance,
>>>>
>>>> drm_sched_wqueue_enqueue()
>>>> drm_sched_wqueue_start()
>>>> drm_sched_wqueue_stop()
>>>> drm_sched_wqueue_ready()
>>>>
>>
>> How about:
>>
>> drm_sched_submission_enqueue()
>> drm_sched_submission_start()
>> drm_sched_submission_stop()
>> drm_sched_submission_ready()
>>
>> Matt
>
> Ignore this; I read Tvrtko's comment and not Luben's fully.
>
> I prefer drm_sched_wqueue over drm_sched_submit_queue, as 'submit queue' is
> a made-up thing. drm_sched_submission would be my top choice, but if Luben
> is opposed I will go with drm_sched_wqueue in the next rev.
You had me at "opposed."
I think I've explained why up there.
drm_sched_wqueue_[verb]() is clear and clean. We don't need yet another
abstraction on top of the abstraction, on top of the naming.
If we ever implement anything different than work-queue in the future,
we may split the code and decide to keep both, maybe depending on what
a driver would like to use, etc., so it's cleanest to convey what it means.
"drm_sched_submission_[verb]()" is really cringy.
Plus, it's a good reminder to know that it's a work-queue. Keeps driver
writers informed. There is no reason to obfuscate the code.
--
Regards,
Luben
* Re: [Intel-xe] [PATCH v4 05/10] drm/sched: Split free_job into own work item
2023-10-05 4:06 ` Matthew Brost
@ 2023-10-11 23:29 ` Luben Tuikov
0 siblings, 0 replies; 45+ messages in thread
From: Luben Tuikov @ 2023-10-11 23:29 UTC (permalink / raw)
To: Matthew Brost
Cc: robdclark, sarah.walker, ketil.johnsen, Liviu.Dudau, mcanal,
frank.binns, dri-devel, christian.koenig, boris.brezillon, dakr,
donald.robson, daniel, lina, airlied, intel-xe, faith.ekstrand
On 2023-10-05 00:06, Matthew Brost wrote:
> On Thu, Sep 28, 2023 at 12:14:12PM -0400, Luben Tuikov wrote:
>> On 2023-09-19 01:01, Matthew Brost wrote:
>>> Rather than call free_job and run_job in same work item have a dedicated
>>> work item for each. This aligns with the design and intended use of work
>>> queues.
>>>
>>> v2:
>>> - Test for DMA_FENCE_FLAG_TIMESTAMP_BIT before setting
>>> timestamp in free_job() work item (Danilo)
>>> v3:
>>> - Drop forward dec of drm_sched_select_entity (Boris)
>>> - Return in drm_sched_run_job_work if entity NULL (Boris)
>>>
>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>> ---
>>> drivers/gpu/drm/scheduler/sched_main.c | 290 +++++++++++++++----------
>>> include/drm/gpu_scheduler.h | 8 +-
>>> 2 files changed, 182 insertions(+), 116 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 588c735f7498..1e21d234fb5c 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -213,11 +213,12 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>>> * drm_sched_rq_select_entity_rr - Select an entity which could provide a job to run
>>> *
>>> * @rq: scheduler run queue to check.
>>> + * @dequeue: dequeue selected entity
>>
>> Change this to "peek" as indicated below.
>>
>>> *
>>> * Try to find a ready entity, returns NULL if none found.
>>> */
>>> static struct drm_sched_entity *
>>> -drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
>>> +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq, bool dequeue)
>>> {
>>> struct drm_sched_entity *entity;
>>>
>>> @@ -227,8 +228,10 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
>>> if (entity) {
>>> list_for_each_entry_continue(entity, &rq->entities, list) {
>>> if (drm_sched_entity_is_ready(entity)) {
>>> - rq->current_entity = entity;
>>> - reinit_completion(&entity->entity_idle);
>>> + if (dequeue) {
>>> + rq->current_entity = entity;
>>> + reinit_completion(&entity->entity_idle);
>>> + }
>>
>> Please rename "dequeue" or invert its logic, as from this patch it seems that
>> it is hiding (gating out) current behaviour.
>>
>> Ideally, I'd prefer it be inverted, so that the current behaviour, i.e. what people
>> are used to rq_select_entity_*() doing, is preserved as the default.
>>
>> Perhaps use "peek" as the name of this new variable, to indicate that
>> we're not setting it to be the current entity.
>>
>> I prefer "peek" to others, as the former tells me "Hey, I'm only
>> peeking at the rq and not really doing the default behaviour I've been
>> doing which you're used to." So, probably use "peek". ("Peek" also has historical
>> significance...).
>>
>
> Peek it is. Will change.
>
>>> spin_unlock(&rq->lock);
>>> return entity;
>>> }
>>> @@ -238,8 +241,10 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
>>> list_for_each_entry(entity, &rq->entities, list) {
>>>
>>> if (drm_sched_entity_is_ready(entity)) {
>>> - rq->current_entity = entity;
>>> - reinit_completion(&entity->entity_idle);
>>> + if (dequeue) {
>>
>> if (!peek) {
>>
>
> +1
>
>>> + rq->current_entity = entity;
>>> + reinit_completion(&entity->entity_idle);
>>> + }
>>> spin_unlock(&rq->lock);
>>> return entity;
>>> }
>>> @@ -257,11 +262,12 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
>>> * drm_sched_rq_select_entity_fifo - Select an entity which provides a job to run
>>> *
>>> * @rq: scheduler run queue to check.
>>> + * @dequeue: dequeue selected entity
>>
>> * @peek: Just find, don't set to current.
>>
>
> +1
>
>>> *
>>> * Find oldest waiting ready entity, returns NULL if none found.
>>> */
>>> static struct drm_sched_entity *
>>> -drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>>> +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq, bool dequeue)
>>> {
>>> struct rb_node *rb;
>>>
>>> @@ -271,8 +277,10 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>>>
>>> entity = rb_entry(rb, struct drm_sched_entity, rb_tree_node);
>>> if (drm_sched_entity_is_ready(entity)) {
>>> - rq->current_entity = entity;
>>> - reinit_completion(&entity->entity_idle);
>>> + if (dequeue) {
>>
>> if (!peek) {
>>
>>> + rq->current_entity = entity;
>>> + reinit_completion(&entity->entity_idle);
>>> + }
>>> break;
>>> }
>>> }
>>> @@ -282,13 +290,102 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>>> }
>>>
>>> /**
>>> - * drm_sched_submit_queue - scheduler queue submission
>>> + * drm_sched_run_job_queue - queue job submission
>>> + * @sched: scheduler instance
>>> + */
>>
>> Perhaps it would be clearer to a DOC reader if there were verbs
>> in this function comment? I feel this was mentioned in the review
>> to patch 2...
>>
>>> +static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
>>> +{
>>> + if (!READ_ONCE(sched->pause_submit))
>>> + queue_work(sched->submit_wq, &sched->work_run_job);
>>> +}
>>> +
>>> +/**
>>> + * drm_sched_can_queue -- Can we queue more to the hardware?
>>> + * @sched: scheduler instance
>>> + *
>>> + * Return true if we can push more jobs to the hw, otherwise false.
>>> + */
>>> +static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
>>> +{
>>> + return atomic_read(&sched->hw_rq_count) <
>>> + sched->hw_submission_limit;
>>> +}
>>> +
>>> +/**
>>> + * drm_sched_select_entity - Select next entity to process
>>> + *
>>> + * @sched: scheduler instance
>>> + * @dequeue: dequeue selected entity
>>
>> When I see "dequeue" I'm thinking "list_del()". Let's
>> use "peek" here as mentioned above.
>>
>>> + *
>>> + * Returns the entity to process or NULL if none are found.
>>> + */
>>> +static struct drm_sched_entity *
>>> +drm_sched_select_entity(struct drm_gpu_scheduler *sched, bool dequeue)
>>
>> drm_sched_select_entity(struct drm_gpu_scheduler *sched, bool peek)
>>
>
> +1
>
>>> +{
>>> + struct drm_sched_entity *entity;
>>> + int i;
>>> +
>>> + if (!drm_sched_can_queue(sched))
>>> + return NULL;
>>> +
>>> + if (sched->single_entity) {
>>> + if (!READ_ONCE(sched->single_entity->stopped) &&
>>> + drm_sched_entity_is_ready(sched->single_entity))
>>> + return sched->single_entity;
>>> +
>>> + return NULL;
>>> + }
>>> +
>>> + /* Kernel run queue has higher priority than normal run queue*/
>>> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
>>> + entity = sched->sched_policy == DRM_SCHED_POLICY_FIFO ?
>>> + drm_sched_rq_select_entity_fifo(&sched->sched_rq[i],
>>> + dequeue) :
>>> + drm_sched_rq_select_entity_rr(&sched->sched_rq[i],
>>> + dequeue);
>>> + if (entity)
>>> + break;
>>> + }
>>> +
>>> + return entity;
>>> +}
>>> +
>>> +/**
>>> + * drm_sched_run_job_queue_if_ready - queue job submission if ready
>>> * @sched: scheduler instance
>>> */
>>> -static void drm_sched_submit_queue(struct drm_gpu_scheduler *sched)
>>> +static void drm_sched_run_job_queue_if_ready(struct drm_gpu_scheduler *sched)
>>> +{
>>> + if (drm_sched_select_entity(sched, false))
>>> + drm_sched_run_job_queue(sched);
>>> +}
>>> +
>>> +/**
>>> + * drm_sched_free_job_queue - queue free job
>>
>> * drm_sched_free_job_queue - enqueue free-job work
>>
>>> + *
>>> + * @sched: scheduler instance to queue free job
>>
>> * @sched: scheduler instance to queue free job work for
>>
>>
>
> Will change both.
>
>>> + */
>>> +static void drm_sched_free_job_queue(struct drm_gpu_scheduler *sched)
>>> {
>>> if (!READ_ONCE(sched->pause_submit))
>>> - queue_work(sched->submit_wq, &sched->work_submit);
>>> + queue_work(sched->submit_wq, &sched->work_free_job);
>>> +}
>>> +
>>> +/**
>>> + * drm_sched_free_job_queue_if_ready - queue free job if ready
>>
>> * drm_sched_free_job_queue_if_ready - enqueue free-job work if ready
>>
>
> Will change this too.
>
>>> + *
>>> + * @sched: scheduler instance to queue free job
>>> + */
>>> +static void drm_sched_free_job_queue_if_ready(struct drm_gpu_scheduler *sched)
>>> +{
>>> + struct drm_sched_job *job;
>>> +
>>> + spin_lock(&sched->job_list_lock);
>>> + job = list_first_entry_or_null(&sched->pending_list,
>>> + struct drm_sched_job, list);
>>> + if (job && dma_fence_is_signaled(&job->s_fence->finished))
>>> + drm_sched_free_job_queue(sched);
>>> + spin_unlock(&sched->job_list_lock);
>>> }
>>>
>>> /**
>>> @@ -310,7 +407,7 @@ static void drm_sched_job_done(struct drm_sched_job *s_job, int result)
>>> dma_fence_get(&s_fence->finished);
>>> drm_sched_fence_finished(s_fence, result);
>>> dma_fence_put(&s_fence->finished);
>>> - drm_sched_submit_queue(sched);
>>> + drm_sched_free_job_queue(sched);
>>> }
>>>
>>> /**
>>> @@ -885,18 +982,6 @@ void drm_sched_job_cleanup(struct drm_sched_job *job)
>>> }
>>> EXPORT_SYMBOL(drm_sched_job_cleanup);
>>>
>>> -/**
>>> - * drm_sched_can_queue -- Can we queue more to the hardware?
>>> - * @sched: scheduler instance
>>> - *
>>> - * Return true if we can push more jobs to the hw, otherwise false.
>>> - */
>>> -static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
>>> -{
>>> - return atomic_read(&sched->hw_rq_count) <
>>> - sched->hw_submission_limit;
>>> -}
>>> -
>>> /**
>>> * drm_sched_wakeup_if_can_queue - Wake up the scheduler
>>> * @sched: scheduler instance
>>> @@ -906,43 +991,7 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
>>> void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched)
>>> {
>>> if (drm_sched_can_queue(sched))
>>> - drm_sched_submit_queue(sched);
>>> -}
>>> -
>>> -/**
>>> - * drm_sched_select_entity - Select next entity to process
>>> - *
>>> - * @sched: scheduler instance
>>> - *
>>> - * Returns the entity to process or NULL if none are found.
>>> - */
>>> -static struct drm_sched_entity *
>>> -drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>>> -{
>>> - struct drm_sched_entity *entity;
>>> - int i;
>>> -
>>> - if (!drm_sched_can_queue(sched))
>>> - return NULL;
>>> -
>>> - if (sched->single_entity) {
>>> - if (!READ_ONCE(sched->single_entity->stopped) &&
>>> - drm_sched_entity_is_ready(sched->single_entity))
>>> - return sched->single_entity;
>>> -
>>> - return NULL;
>>> - }
>>> -
>>> - /* Kernel run queue has higher priority than normal run queue*/
>>> - for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
>>> - entity = sched->sched_policy == DRM_SCHED_POLICY_FIFO ?
>>> - drm_sched_rq_select_entity_fifo(&sched->sched_rq[i]) :
>>> - drm_sched_rq_select_entity_rr(&sched->sched_rq[i]);
>>> - if (entity)
>>> - break;
>>> - }
>>> -
>>> - return entity;
>>> + drm_sched_run_job_queue(sched);
>>> }
>>>
>>> /**
>>> @@ -974,8 +1023,10 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>>> typeof(*next), list);
>>>
>>> if (next) {
>>> - next->s_fence->scheduled.timestamp =
>>> - job->s_fence->finished.timestamp;
>>> + if (test_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT,
>>> + &next->s_fence->scheduled.flags))
>>> + next->s_fence->scheduled.timestamp =
>>> + job->s_fence->finished.timestamp;
>>> /* start TO timer for next job */
>>> drm_sched_start_timeout(sched);
>>> }
>>> @@ -1025,74 +1076,84 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
>>> EXPORT_SYMBOL(drm_sched_pick_best);
>>>
>>> /**
>>> - * drm_sched_main - main scheduler thread
>>> + * drm_sched_free_job_work - worker to call free_job
>>> *
>>> - * @param: scheduler instance
>>> + * @w: free job work
>>> */
>>> -static void drm_sched_main(struct work_struct *w)
>>> +static void drm_sched_free_job_work(struct work_struct *w)
>>> {
>>> struct drm_gpu_scheduler *sched =
>>> - container_of(w, struct drm_gpu_scheduler, work_submit);
>>> - struct drm_sched_entity *entity;
>>> + container_of(w, struct drm_gpu_scheduler, work_free_job);
>>> struct drm_sched_job *cleanup_job;
>>> - int r;
>>>
>>> if (READ_ONCE(sched->pause_submit))
>>> return;
>>>
>>> cleanup_job = drm_sched_get_cleanup_job(sched);
>>> - entity = drm_sched_select_entity(sched);
>>> -
>>> - if (!entity && !cleanup_job)
>>> - return; /* No more work */
>>> -
>>> - if (cleanup_job)
>>> + if (cleanup_job) {
>>> sched->ops->free_job(cleanup_job);
>>>
>>> - if (entity) {
>>> - struct dma_fence *fence;
>>> - struct drm_sched_fence *s_fence;
>>> - struct drm_sched_job *sched_job;
>>> -
>>> - sched_job = drm_sched_entity_pop_job(entity);
>>> - if (!sched_job) {
>>> - complete_all(&entity->entity_idle);
>>> - if (!cleanup_job)
>>> - return; /* No more work */
>>> - goto again;
>>> - }
>>> + drm_sched_free_job_queue_if_ready(sched);
>>> + drm_sched_run_job_queue_if_ready(sched);
>>> + }
>>> +}
>>> +
>>> +/**
>>> + * drm_sched_run_job_work - worker to call run_job
>>> + *
>>> + * @w: run job work
>>> + */
>>> +static void drm_sched_run_job_work(struct work_struct *w)
>>> +{
>>> + struct drm_gpu_scheduler *sched =
>>> + container_of(w, struct drm_gpu_scheduler, work_run_job);
>>> + struct drm_sched_entity *entity;
>>> + struct dma_fence *fence;
>>> + struct drm_sched_fence *s_fence;
>>> + struct drm_sched_job *sched_job;
>>> + int r;
>>>
>>> - s_fence = sched_job->s_fence;
>>> + if (READ_ONCE(sched->pause_submit))
>>> + return;
>>>
>>> - atomic_inc(&sched->hw_rq_count);
>>> - drm_sched_job_begin(sched_job);
>>> + entity = drm_sched_select_entity(sched, true);
>>> + if (!entity)
>>> + return;
>>>
>>> - trace_drm_run_job(sched_job, entity);
>>> - fence = sched->ops->run_job(sched_job);
>>> + sched_job = drm_sched_entity_pop_job(entity);
>>> + if (!sched_job) {
>>> complete_all(&entity->entity_idle);
>>> - drm_sched_fence_scheduled(s_fence, fence);
>>> + return; /* No more work */
>>> + }
>>>
>>> - if (!IS_ERR_OR_NULL(fence)) {
>>> - /* Drop for original kref_init of the fence */
>>> - dma_fence_put(fence);
>>> + s_fence = sched_job->s_fence;
>>>
>>> - r = dma_fence_add_callback(fence, &sched_job->cb,
>>> - drm_sched_job_done_cb);
>>> - if (r == -ENOENT)
>>> - drm_sched_job_done(sched_job, fence->error);
>>> - else if (r)
>>> - DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n",
>>> - r);
>>> - } else {
>>> - drm_sched_job_done(sched_job, IS_ERR(fence) ?
>>> - PTR_ERR(fence) : 0);
>>> - }
>>> + atomic_inc(&sched->hw_rq_count);
>>> + drm_sched_job_begin(sched_job);
>>> +
>>> + trace_drm_run_job(sched_job, entity);
>>> + fence = sched->ops->run_job(sched_job);
>>> + complete_all(&entity->entity_idle);
>>> + drm_sched_fence_scheduled(s_fence, fence);
>>>
>>> - wake_up(&sched->job_scheduled);
>>> + if (!IS_ERR_OR_NULL(fence)) {
>>> + /* Drop for original kref_init of the fence */
>>> + dma_fence_put(fence);
>>> +
>>> + r = dma_fence_add_callback(fence, &sched_job->cb,
>>> + drm_sched_job_done_cb);
>>> + if (r == -ENOENT)
>>> + drm_sched_job_done(sched_job, fence->error);
>>> + else if (r)
>>> + DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n",
>>> + r);
>>
>> Please align "r);" to the open brace on the previous line. If you're using Emacs
>> with sane Linux settings, press the "Tab" key anywhere on the line to indent it.
>> (It should run c-indent-line-or-region, usually using leading-tabs-only mode. Pressing
>> it again, over and over, on an already indented line, does nothing. Column indenting--say
>> for columns in 2D/3D/etc., array, usually happens using spaces, which is portable.
>> Also please take an overview with "scripts/checkpatch.pl --strict".)
>>
>
> Will run checkpatch.
>
>> Wrap-around was bumped to 100 in the Linux kernel so you can put the 'r' on
>> the same line without style problems.
>>
>
> Using Vi with a wrap-around of 80, but I know 100 is allowed. Will fix.
Ah, yes--I now see where the real problem is. :-D
--
Regards,
Luben
end of thread, other threads:[~2023-10-11 23:29 UTC | newest]
Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-19 5:01 [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Matthew Brost
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 01/10] drm/sched: Add drm_sched_submit_* helpers Matthew Brost
2023-09-19 5:58 ` Christian König
2023-09-21 3:41 ` Luben Tuikov
2023-09-27 1:07 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread Matthew Brost
2023-09-27 3:32 ` Luben Tuikov
2023-10-05 3:33 ` Matthew Brost
2023-10-05 4:13 ` Luben Tuikov
2023-10-05 15:19 ` Matthew Brost
2023-10-06 7:59 ` Tvrtko Ursulin
2023-10-06 15:14 ` Matthew Brost
2023-10-06 23:43 ` Matthew Brost
2023-10-09 8:35 ` Tvrtko Ursulin
2023-10-11 23:19 ` Luben Tuikov
2023-10-11 23:11 ` Luben Tuikov
2023-10-11 23:10 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 03/10] drm/sched: Move schedule policy to scheduler Matthew Brost
2023-09-24 1:18 ` kernel test robot
2023-09-27 12:13 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 04/10] drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy Matthew Brost
2023-09-27 14:36 ` Luben Tuikov
2023-10-05 4:02 ` Matthew Brost
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 05/10] drm/sched: Split free_job into own work item Matthew Brost
2023-09-28 16:14 ` Luben Tuikov
2023-10-05 4:06 ` Matthew Brost
2023-10-11 23:29 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 06/10] drm/sched: Add drm_sched_start_timeout_unlocked helper Matthew Brost
2023-09-29 21:23 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 07/10] drm/sched: Start submission before TDR in drm_sched_start Matthew Brost
2023-09-29 21:53 ` Luben Tuikov
2023-09-30 19:48 ` Luben Tuikov
2023-10-05 3:11 ` Matthew Brost
2023-10-05 3:18 ` Luben Tuikov
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 08/10] drm/sched: Submit job before starting TDR Matthew Brost
2023-09-29 21:58 ` Luben Tuikov
2023-10-05 4:11 ` Matthew Brost
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 09/10] drm/sched: Add helper to queue TDR immediately for current and future jobs Matthew Brost
2023-09-29 22:44 ` Luben Tuikov
2023-10-05 3:22 ` Matthew Brost
2023-09-19 5:01 ` [Intel-xe] [PATCH v4 10/10] drm/sched: Update maintainers of GPU scheduler Matthew Brost
2023-09-19 5:32 ` [Intel-xe] ✗ CI.Patch_applied: failure for DRM scheduler changes for Xe (rev6) Patchwork
2023-09-19 11:44 ` [Intel-xe] [PATCH v4 00/10] DRM scheduler changes for Xe Danilo Krummrich
2023-09-25 21:47 ` Danilo Krummrich
2023-09-27 7:33 ` Boris Brezillon