Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Brost <matthew.brost@intel.com>
To: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>, <christian.koenig@amd.com>,
	<pstanner@redhat.com>, <dakr@kernel.org>
Subject: Re: [PATCH v3 2/7] drm/sched: Add several job helpers to avoid drivers touching scheduler state
Date: Tue, 18 Nov 2025 09:45:12 -0800	[thread overview]
Message-ID: <aRywqNOh6HUhFnRd@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <aRt-OPJh01t8AhVG@nvishwa1-desk>

On Mon, Nov 17, 2025 at 11:57:44AM -0800, Niranjana Vishwanathapura wrote:
> On Thu, Oct 16, 2025 at 01:48:21PM -0700, Matthew Brost wrote:
> > Add helpers to see if scheduler is stopped and a jobs signaled state.
> > Expected to be used driver side on recovery and debug flows.
> > 
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/scheduler/sched_main.c |  4 ++--
> > include/drm/gpu_scheduler.h            | 32 ++++++++++++++++++++++++--
> > 2 files changed, 32 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 46119aacb809..69bd6e482268 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -344,7 +344,7 @@ drm_sched_rq_select_entity_fifo(struct drm_gpu_scheduler *sched,
> >  */
> > static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
> > {
> > -	if (!READ_ONCE(sched->pause_submit))
> > +	if (!drm_sched_is_stopped(sched))
> > 		queue_work(sched->submit_wq, &sched->work_run_job);
> > }
> > 
> > @@ -354,7 +354,7 @@ static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
> >  */
> > static void drm_sched_run_free_queue(struct drm_gpu_scheduler *sched)
> > {
> > -	if (!READ_ONCE(sched->pause_submit))
> > +	if (!drm_sched_is_stopped(sched))
> > 		queue_work(sched->submit_wq, &sched->work_free_job);
> > }
> > 
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 7f31eba3bd61..d1a2d7f61c1d 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -700,6 +700,17 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> > 
> > /* Inlines */
> > 
> > +/**
> > + * drm_sched_is_stopped() - DRM is stopped
> > + * @sched: DRM scheduler
> > + *
> > + * Return: True if sched is stopped, False otherwise
> > + */
> > +static inline bool drm_sched_is_stopped(struct drm_gpu_scheduler *sched)
> > +{
> > +	return READ_ONCE(sched->pause_submit);
> > +}
> > +
> > /**
> >  * struct drm_sched_pending_job_iter - DRM scheduler pending job iterator state
> >  * @sched: DRM scheduler associated with pending job iterator
> > @@ -716,7 +727,7 @@ __drm_sched_pending_job_iter_begin(struct drm_gpu_scheduler *sched)
> > 		.sched = sched,
> > 	};
> > 
> > -	WARN_ON(!READ_ONCE(sched->pause_submit));
> > +	WARN_ON(!drm_sched_is_stopped(sched));
> > 	return iter;
> > }
> 
> NIT...instead of modifying the functions added in previous patch, may be this
> patch should go in first and the previous patch can be added after that with
> drm_sched_is_stopped() usage?
> 

Yes, I think that would be better ordering. Will fix.

> > 
> > @@ -724,7 +735,7 @@ __drm_sched_pending_job_iter_begin(struct drm_gpu_scheduler *sched)
> > static inline void
> > __drm_sched_pending_job_iter_end(const struct drm_sched_pending_job_iter iter)
> > {
> > -	WARN_ON(!READ_ONCE(iter.sched->pause_submit));
> > +	WARN_ON(!drm_sched_is_stopped(iter.sched));
> > }
> > 
> > DEFINE_CLASS(drm_sched_pending_job_iter, struct drm_sched_pending_job_iter,
> > @@ -750,4 +761,21 @@ class_drm_sched_pending_job_iter_lock_ptr(class_drm_sched_pending_job_iter_t *_T
> > 		list_for_each_entry((__job), &(__sched)->pending_list, list)	\
> > 			for_each_if(!(__entity) || (__job)->entity == (__entity))
> > 
> > +/**
> > + * drm_sched_job_is_signaled() - DRM scheduler job is signaled
> > + * @job: DRM scheduler job
> > + *
> > + * Determine if DRM scheduler job is signaled. DRM scheduler should be stopped
> > + * to obtain a stable snapshot of state.
> > + *
> > + * Return: True if job is signaled, False otherwise
> > + */
> > +static inline bool drm_sched_job_is_signaled(struct drm_sched_job *job)
> > +{
> > +	struct drm_sched_fence *s_fence = job->s_fence;
> > +
> > +	WARN_ON(!drm_sched_is_stopped(job->sched));
> > +	return dma_fence_is_signaled(&s_fence->finished);
> > +}
> 
> NIT..In patch#4 where xe driver uses this function in couple places,
> I am seeing originally it checks if the s_fence->parent is signaled
> instead of &s_fence->finished as done here.
> I do see below message in the 's_fence->parent' kernel-doc,
> "We signal the &drm_sched_fence.finished fence once parent is signalled."
> So, probably it is fine, but just want to ensure.
> 

It more or less is the same check. Techincally the hardware fence
(parent) can signal before software fence (finished) but it is pretty
small race window which practice should never be hit but I could make
this function more robust can check on the parent fence too, that is
probably better I guess. Let me change this.

Matt

> Niranjana
> 
> > +
> > #endif
> > -- 
> > 2.34.1
> > 

  reply	other threads:[~2025-11-18 17:45 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-16 20:48 [PATCH v3 0/7] Fix DRM scheduler layering violations in Xe Matthew Brost
2025-10-16 20:48 ` [PATCH v3 1/7] drm/sched: Add pending job list iterator Matthew Brost
2025-11-15  1:25   ` Niranjana Vishwanathapura
2025-11-18 17:52     ` Matthew Brost
2025-11-18 21:12       ` Niranjana Vishwanathapura
2025-10-16 20:48 ` [PATCH v3 2/7] drm/sched: Add several job helpers to avoid drivers touching scheduler state Matthew Brost
2025-11-17 19:57   ` Niranjana Vishwanathapura
2025-11-18 17:45     ` Matthew Brost [this message]
2025-10-16 20:48 ` [PATCH v3 3/7] drm/xe: Add dedicated message lock Matthew Brost
2025-11-17 19:58   ` Niranjana Vishwanathapura
2025-11-18 17:53     ` Matthew Brost
2025-10-16 20:48 ` [PATCH v3 4/7] drm/xe: Stop abusing DRM scheduler internals Matthew Brost
2025-11-18  6:39   ` Niranjana Vishwanathapura
2025-11-18 17:59     ` Matthew Brost
2025-11-18 21:17       ` Niranjana Vishwanathapura
2025-11-18 22:54         ` Matthew Brost
2025-10-16 20:48 ` [PATCH v3 5/7] drm/xe: Do not deregister queues in TDR Matthew Brost
2025-11-18  6:41   ` Niranjana Vishwanathapura
2025-11-18 18:02     ` Matthew Brost
2025-11-18 21:19       ` Niranjana Vishwanathapura
2025-11-18 22:59         ` Matthew Brost
2025-10-16 20:48 ` [PATCH v3 6/7] drm/xe: Remove special casing for LR queues in submission Matthew Brost
2025-11-18  6:45   ` Niranjana Vishwanathapura
2025-11-18 18:03     ` Matthew Brost
2025-10-16 20:48 ` [PATCH v3 7/7] drm/xe: Only toggle scheduling in TDR if GuC is running Matthew Brost
2025-11-15  1:01   ` Niranjana Vishwanathapura
2025-11-18 18:06     ` Matthew Brost
2025-10-16 20:55 ` ✗ CI.checkpatch: warning for Fix DRM scheduler layering violations in Xe (rev3) Patchwork
2025-10-16 20:56 ` ✓ CI.KUnit: success " Patchwork
2025-10-16 21:36 ` ✓ Xe.CI.BAT: " Patchwork
2025-10-17 18:43 ` ✗ Xe.CI.Full: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aRywqNOh6HUhFnRd@lstrano-desk.jf.intel.com \
    --to=matthew.brost@intel.com \
    --cc=christian.koenig@amd.com \
    --cc=dakr@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=niranjana.vishwanathapura@intel.com \
    --cc=pstanner@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox