Re: [PATCH v2 2/8] drm/sched: Allow drivers to skip the reset and keep on running

From: Philipp Stanner <phasta@mailbox.org>
To: "Maíra Canal" <mcanal@igalia.com>,
	"Matthew Brost" <matthew.brost@intel.com>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"Philipp Stanner" <phasta@kernel.org>,
	"Christian König" <ckoenig.leichtzumerken@gmail.com>,
	"Tvrtko Ursulin" <tvrtko.ursulin@igalia.com>,
	"Simona Vetter" <simona@ffwll.ch>,
	"David Airlie" <airlied@gmail.com>,
	"Melissa Wen" <mwen@igalia.com>,
	"Lucas Stach" <l.stach@pengutronix.de>,
	"Russell King" <linux+etnaviv@armlinux.org.uk>,
	"Christian Gmeiner" <christian.gmeiner@gmail.com>,
	"Lucas De Marchi" <lucas.demarchi@intel.com>,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
	"Boris Brezillon" <boris.brezillon@collabora.com>,
	"Rob Herring" <robh@kernel.org>,
	"Steven Price" <steven.price@arm.com>,
	"Liviu Dudau" <liviu.dudau@arm.com>
Cc: kernel-dev@igalia.com, dri-devel@lists.freedesktop.org,
	 etnaviv@lists.freedesktop.org, intel-xe@lists.freedesktop.org
Subject: Re: [PATCH v2 2/8] drm/sched: Allow drivers to skip the reset and keep on running
Date: Mon, 02 Jun 2025 09:06:49 +0200
Message-ID: <1e0fb3c8bbbcc18b0fb771b6e2d4616a0a9a11a3.camel@mailbox.org>
In-Reply-To: <20250530-sched-skip-reset-v2-2-c40a8d2d8daa@igalia.com>

Hi,

thx for the update. Seems to be developing nicely. Some comments below.

On Fri, 2025-05-30 at 11:01 -0300, Maíra Canal wrote:
> When the DRM scheduler times out, it's possible that the GPU isn't hung;
> instead, a job may still be running, and there may be no valid reason to
> reset the hardware. This can occur in two situations:
>
>   1. The GPU exposes some mechanism that ensures the GPU is still making
>      progress. By checking this mechanism, the driver can safely skip the
>      reset, re-arm the timeout, and allow the job to continue running until
>      completion. This is the case for v3d, Etnaviv, and Xe.
>   2. Timeout has fired before the free-job worker. Consequently, the
>      scheduler calls `timedout_job()` for a job that isn't timed out.
>
> These two scenarios are problematic because the job was removed from the
> `sched->pending_list` before calling `sched->ops->timedout_job()`, which
> means that when the job finishes, it won't be freed by the scheduler
> though `sched->ops->free_job()`. As a result, the job and its resources
> won't be freed, leading to a memory leak.

nit: redundant "won't be freed"

> 
> To resolve those scenarios, create a new `drm_gpu_sched_stat`, called

nit:
s/resolve those scenarios/solve those problems

> DRM_GPU_SCHED_STAT_NO_HANG, that allows a driver to skip the reset. The
> new status will indicate that the job should be reinserted into the

nit:
s/should/must

> pending list, and the hardware / driver is still responsible to
> signal job completion.

The driver is *always* responsible for signaling, well, "the hardware
fence". We could have a discussion about whether a job is "completed"
if the driver signals its hardware fence through the timedout_job()
callback, but I think it's safer to just change this sentence to:

"and the hardware / driver will still complete that job."
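
To make the two situations from the commit message concrete, here is a
minimal userspace sketch in C. It is not kernel code. Only the enum is
copied from the quoted hunk; struct toy_gpu, struct toy_job and
toy_timedout_job() are hypothetical names, and the progress counter is an
assumption standing in for whatever mechanism a given GPU exposes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same enumerators, in the same order, as in the quoted gpu_scheduler.h hunk. */
enum drm_gpu_sched_stat {
	DRM_GPU_SCHED_STAT_NONE,
	DRM_GPU_SCHED_STAT_RESET,
	DRM_GPU_SCHED_STAT_ENODEV,
	DRM_GPU_SCHED_STAT_NO_HANG,
};

/* Hypothetical device model: a counter that advances while the GPU works. */
struct toy_gpu {
	uint64_t progress_counter;
	bool present;
	unsigned int resets;
};

/* Hypothetical job model. */
struct toy_job {
	uint64_t last_seen_progress;	/* counter value at the previous timeout */
	bool hw_fence_signaled;		/* job finished before the handler ran */
};

/* Sketch of what a driver's timeout handler could decide. */
static enum drm_gpu_sched_stat
toy_timedout_job(struct toy_gpu *gpu, struct toy_job *job)
{
	assert(gpu && job);

	if (!gpu->present)
		return DRM_GPU_SCHED_STAT_ENODEV;

	/* Situation 2: the timeout fired for a job that already completed. */
	if (job->hw_fence_signaled)
		return DRM_GPU_SCHED_STAT_NO_HANG;

	/* Situation 1: the hardware made progress since the last check. */
	if (gpu->progress_counter != job->last_seen_progress) {
		job->last_seen_progress = gpu->progress_counter;
		return DRM_GPU_SCHED_STAT_NO_HANG;
	}

	/* No progress at all: treat it as a real hang and reset. */
	gpu->resets++;
	return DRM_GPU_SCHED_STAT_RESET;
}
```

In both NO_HANG branches the handler touches nothing on the hardware, so
the job still has to be completed by the hardware / driver afterwards.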

> 
> Signed-off-by: Maíra Canal <mcanal@igalia.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 49 ++++++++++++++++++++++++++++++++--
>  include/drm/gpu_scheduler.h            |  3 +++
>  2 files changed, 50 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 3b0760dfa4fe2fc63e893cda733e78d08dd451d5..ddc53eadab7bb6a15109f43989afa1f7a95a3b41 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -379,11 +379,16 @@ static void drm_sched_run_free_queue(struct drm_gpu_scheduler *sched)
>  {
>  	struct drm_sched_job *job;
>  
> -	spin_lock(&sched->job_list_lock);
>  	job = list_first_entry_or_null(&sched->pending_list,
>  				       struct drm_sched_job, list);
>  	if (job && dma_fence_is_signaled(&job->s_fence->finished))
>  		__drm_sched_run_free_queue(sched);
> +}
> +
> +static void drm_sched_run_free_queue_unlocked(struct drm_gpu_scheduler *sched)
> +{
> +	spin_lock(&sched->job_list_lock);
> +	drm_sched_run_free_queue(sched);
>  	spin_unlock(&sched->job_list_lock);
>  }

nit:
Took me a few seconds to realize why that's necessary. A sentence in
the commit message might have been good. But no big thing, up to you
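
For readers who hit the same question, a small userspace sketch in C of
why the split is needed, as far as I understand the patch. All toy_*
names are hypothetical, and the toy lock is an assumption that replaces
the spinlock so that a second acquisition trips an assertion instead of
deadlocking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy non-recursive lock: taking it twice is a bug, as with a spinlock. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* a real spinlock would deadlock here */
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct toy_job {
	struct toy_job *next;
	bool finished;
};

struct toy_sched {
	struct toy_lock job_list_lock;
	struct toy_job *pending;	/* head of the pending list */
	unsigned int free_work_queued;	/* counts queued free-job work */
};

/* Caller must hold job_list_lock. */
static void toy_run_free_queue(struct toy_sched *s)
{
	assert(s->job_list_lock.held);
	if (s->pending && s->pending->finished)
		s->free_work_queued++;
}

/* For callers that do not hold job_list_lock: takes it itself. */
static void toy_run_free_queue_unlocked(struct toy_sched *s)
{
	toy_lock_acquire(&s->job_list_lock);
	toy_run_free_queue(s);
	toy_lock_release(&s->job_list_lock);
}

/*
 * Reinsertion and the free-queue check happen under one lock hold, so
 * this path needs the variant that does not take the lock again.
 */
static void toy_reinsert_on_false_timeout(struct toy_sched *s,
					  struct toy_job *job)
{
	toy_lock_acquire(&s->job_list_lock);
	job->next = s->pending;
	s->pending = job;
	toy_run_free_queue(s);
	toy_lock_release(&s->job_list_lock);
}
```

Calling toy_run_free_queue_unlocked() from inside
toy_reinsert_on_false_timeout() would hit the assertion in
toy_lock_acquire(), which is the toy equivalent of the self-deadlock the
split avoids.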

>  
> @@ -536,6 +541,32 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>  	spin_unlock(&sched->job_list_lock);
>  }
>  
> +/**
> + * drm_sched_job_reinsert_on_false_timeout - Reinsert the job on a false timeout
> + *

Please remove this empty line. Our docu style in those files is not
consistent, and I'd like to move towards a more unified style.

> + * @sched: scheduler instance
> + * @job: job to be reinserted on the pending list
> + *
> + * In the case of a "false timeout" - when a timeout occurs but the GPU isn't
> + * hung and the job is making progress, the scheduler must reinsert the job back
> + * into the pending list. Otherwise, the job and its resources won't be freed
> + * through the &drm_sched_backend_ops.free_job callback.
> + *
> + * Note that after reinserting the job, the scheduler enqueues the free-job
> + * work again if ready. Otherwise, a signaled job could be added to the pending
> + * list, but never freed.
> + *
> + * This function must be used in "false timeout" cases only.
> + */
> +static void drm_sched_job_reinsert_on_false_timeout(struct drm_gpu_scheduler *sched,
> +						    struct drm_sched_job *job)
> +{
> +	spin_lock(&sched->job_list_lock);
> +	list_add(&job->list, &sched->pending_list);
> +	drm_sched_run_free_queue(sched);
> +	spin_unlock(&sched->job_list_lock);
> +}
> +
>  static void drm_sched_job_timedout(struct work_struct *work)
>  {
>  	struct drm_gpu_scheduler *sched;
> @@ -569,6 +600,14 @@ static void drm_sched_job_timedout(struct work_struct *work)
>  			job->sched->ops->free_job(job);
>  			sched->free_guilty = false;
>  		}
> +
> +		/*
> +		 * If the driver indicated that the GPU is still running and wants
> +		 * to skip the reset, reinsert the job back into the pending list
> +		 * and re-arm the timeout.

Doesn't sound entirely correct to me – at this point, the driver itself
did already skip the reset. The scheduler has no control over that.

You might also just drop the comment, I think the function name and the
function's docstring make what's happening perfectly clear.

> +		 */
> +		if (status == DRM_GPU_SCHED_STAT_NO_HANG)
> +			drm_sched_job_reinsert_on_false_timeout(sched, job);
>  	} else {
>  		spin_unlock(&sched->job_list_lock);
>  	}
> @@ -591,6 +630,9 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   * This function is typically used for reset recovery (see the docu of
>   * drm_sched_backend_ops.timedout_job() for details). Do not call it for
>   * scheduler teardown, i.e., before calling drm_sched_fini().
> + *
> + * As it's used for reset recovery, drm_sched_stop() shouldn't be called
> + * if the driver skipped the timeout (DRM_GPU_SCHED_STAT_NO_HANG).

s/timeout/reset

>   */
>  void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>  {
> @@ -676,6 +718,9 @@ EXPORT_SYMBOL(drm_sched_stop);
>   * drm_sched_backend_ops.timedout_job() for details). Do not call it for
>   * scheduler startup. The scheduler itself is fully operational after
>   * drm_sched_init() succeeded.
> + *
> + * As it's used for reset recovery, drm_sched_start() shouldn't be called
> + * if the driver skipped the timeout (DRM_GPU_SCHED_STAT_NO_HANG).

same

>   */
>  void drm_sched_start(struct drm_gpu_scheduler *sched, int errno)
>  {
> @@ -1197,7 +1242,7 @@ static void drm_sched_free_job_work(struct work_struct *w)
>  	if (job)
>  		sched->ops->free_job(job);
>  
> -	drm_sched_run_free_queue(sched);
> +	drm_sched_run_free_queue_unlocked(sched);
>  	drm_sched_run_job_queue(sched);
>  }
>  
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 83e5c00d8dd9a83ab20547a93d6fc572de97616e..063c1915841aa54a0859bdccd3c1ef6028105bec 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -393,11 +393,14 @@ struct drm_sched_job {
>   * @DRM_GPU_SCHED_STAT_NONE: Reserved. Do not use.
>   * @DRM_GPU_SCHED_STAT_RESET: The GPU hung and successfully reset.
>   * @DRM_GPU_SCHED_STAT_ENODEV: Error: Device is not available anymore.
> + * @DRM_GPU_SCHED_STAT_NO_HANG: Contrary to scheduler's belief, the GPU
> + * did not hang and it's operational.

s/it's/is

>   */
>  enum drm_gpu_sched_stat {
>  	DRM_GPU_SCHED_STAT_NONE,
>  	DRM_GPU_SCHED_STAT_RESET,
>  	DRM_GPU_SCHED_STAT_ENODEV,
> +	DRM_GPU_SCHED_STAT_NO_HANG,
>  };
>  
>  /**
> 

Thx, I'll look through the other ones soonish, too
