Re: [PATCH 1/8] drm/sched: Allow drivers to skip the reset and keep on running

dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed

From: Philipp Stanner <phasta@mailbox.org>
To: "Maíra Canal" <mcanal@igalia.com>,
	"Matthew Brost" <matthew.brost@intel.com>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"Philipp Stanner" <phasta@kernel.org>,
	"Christian König" <ckoenig.leichtzumerken@gmail.com>,
	"Tvrtko Ursulin" <tvrtko.ursulin@igalia.com>,
	"Simona Vetter" <simona@ffwll.ch>,
	"Melissa Wen" <mwen@igalia.com>,
	"Lucas Stach" <l.stach@pengutronix.de>,
	"Russell King" <linux+etnaviv@armlinux.org.uk>,
	"Christian Gmeiner" <christian.gmeiner@gmail.com>,
	"Lucas De Marchi" <lucas.demarchi@intel.com>,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
	"Boris Brezillon" <boris.brezillon@collabora.com>,
	"Rob Herring" <robh@kernel.org>,
	"Steven Price" <steven.price@arm.com>
Cc: kernel-dev@igalia.com, dri-devel@lists.freedesktop.org,
	 etnaviv@lists.freedesktop.org, intel-xe@lists.freedesktop.org
Subject: Re: [PATCH 1/8] drm/sched: Allow drivers to skip the reset and keep on running
Date: Tue, 13 May 2025 09:26:26 +0200	[thread overview]
Message-ID: <4242fd242c7e16d0ecdf11c5d0ad795efda727a5.camel@mailbox.org> (raw)
In-Reply-To: <20250503-sched-skip-reset-v1-1-ed0d6701a3fe@igalia.com>

On Sat, 2025-05-03 at 17:59 -0300, Maíra Canal wrote:
> When the DRM scheduler times out, it's possible that the GPU isn't
> hung;
> instead, a job may still be running, and there may be no valid reason
> to
> reset the hardware. This can occur in two situations:
> 
>   1. The GPU exposes some mechanism that ensures the GPU is still
> making
>      progress. By checking this mechanism, we can safely skip the
> reset,
>      rearm the timeout, and allow the job to continue running until
>      completion. This is the case for v3d and Etnaviv.
>   2. TDR has fired before the IRQ that signals the fence.
> Consequently,
>      the job actually finishes, but it triggers a timeout before
> signaling
>      the completion fence.
> 
> These two scenarios are problematic because we remove the job from
> the
> `sched->pending_list` before calling `sched->ops->timedout_job()`.
> This
> means that when the job finally signals completion (e.g. in the IRQ
> handler), the scheduler won't call `sched->ops->free_job()`. As a
> result,
> the job and its resources won't be freed, leading to a memory leak.

We have discussed this and discovered another, related issue. See
below.

> 
> To resolve this issue, we create a new `drm_gpu_sched_stat` that
> allows a
> driver to skip the reset. This new status will indicate that the job
> should be reinserted into the pending list, and the driver will still
> signal its completion.
> 
> Signed-off-by: Maíra Canal <mcanal@igalia.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 14 ++++++++++++++
>  include/drm/gpu_scheduler.h            |  2 ++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> b/drivers/gpu/drm/scheduler/sched_main.c
> index
> 829579c41c6b5d8b2abce5ad373c7017469b7680..68ca827d77e32187a034309f881
> 135dbc639a9b4 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -568,6 +568,17 @@ static void drm_sched_job_timedout(struct
> work_struct *work)

So, the fundamental design problem we have is that the scheduler
assumes that when a timeout occurs, the GPU is completely hung. Your
patch addresses another aspect of that very problem.

But if the GPU is not hung, it can signal the hardware fence at any
moment. So that's racy.

It could, theoretically, lead to backend_ops.timedout_job() being
called with a signaled job, i.e., a job that is not really timed out.

Would you say this is *the same* issue you're describing, or a separate
one? It seems to me that it's a separate one.

Anyways. What I propose is that we wait until your series here has been
merged. Once that's done, we should document that drivers should expect
that backend_ops.timedout_job() can get called with a job that has not
actually timed out, and tell the scheduler about it through
DRM_GPU_SCHED_STAT_NOT_HANGING. Then the scheduler reverts the
timeout's actions, as you propose here.


>  			job->sched->ops->free_job(job);
>  			sched->free_guilty = false;
>  		}
> +
> +		/*
> +		 * If the driver indicated that the GPU is still
> running and wants to skip
> +		 * the reset, reinsert the job back into the pending
> list and realarm the
> +		 * timeout.
> +		 */
> +		if (status == DRM_GPU_SCHED_STAT_RUNNING) {
> +			spin_lock(&sched->job_list_lock);
> +			list_add(&job->list, &sched->pending_list);
> +			spin_unlock(&sched->job_list_lock);
> +		}

btw, if you go for Matt's requeue work item approach, it'll be better
to write a helper function with a clear name for all that.

drm_sched_job_reinsert_on_false_timout() maybe.


P.


>  	} else {
>  		spin_unlock(&sched->job_list_lock);
>  	}
> @@ -590,6 +601,9 @@ static void drm_sched_job_timedout(struct
> work_struct *work)
>   * This function is typically used for reset recovery (see the docu
> of
>   * drm_sched_backend_ops.timedout_job() for details). Do not call it
> for
>   * scheduler teardown, i.e., before calling drm_sched_fini().
> + *
> + * As it's used for reset recovery, drm_sched_stop() shouldn't be
> called
> + * if the scheduler skipped the timeout (DRM_SCHED_STAT_RUNNING).
>   */
>  void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
> drm_sched_job *bad)
>  {
> diff --git a/include/drm/gpu_scheduler.h
> b/include/drm/gpu_scheduler.h
> index
> 1a7e377d4cbb4fc12ed93c548b236970217945e8..fe9043b6d43141bee831b5fc16b
> 927202a507d51 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -389,11 +389,13 @@ struct drm_sched_job {
>   * @DRM_GPU_SCHED_STAT_NONE: Reserved. Do not use.
>   * @DRM_GPU_SCHED_STAT_NOMINAL: Operation succeeded.
>   * @DRM_GPU_SCHED_STAT_ENODEV: Error: Device is not available
> anymore.
> + * @DRM_GPU_SCHED_STAT_RUNNING: GPU is still running, so skip the
> reset.
>   */
>  enum drm_gpu_sched_stat {
>  	DRM_GPU_SCHED_STAT_NONE,
>  	DRM_GPU_SCHED_STAT_NOMINAL,
>  	DRM_GPU_SCHED_STAT_ENODEV,
> +	DRM_GPU_SCHED_STAT_RUNNING,
>  };
>  
>  /**
>

next prev parent reply	other threads:[~2025-05-13  7:26 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-03 20:59 [PATCH 0/8] drm/sched: Allow drivers to skip the reset with DRM_GPU_SCHED_STAT_RUNNING Maíra Canal
2025-05-03 20:59 ` [PATCH 1/8] drm/sched: Allow drivers to skip the reset and keep on running Maíra Canal
2025-05-06  2:41   ` Matthew Brost
2025-05-06 14:32     ` Matthew Brost
2025-05-12 11:13       ` Philipp Stanner
2025-05-12 14:04         ` Maíra Canal
2025-05-12 14:09           ` Philipp Stanner
2025-05-12 14:58             ` Philipp Stanner
2025-05-06 11:32   ` Tvrtko Ursulin
2025-05-07 12:33     ` Maíra Canal
2025-05-07 12:50       ` Tvrtko Ursulin
2025-05-12  9:24         ` Philipp Stanner
2025-05-12 11:04   ` Philipp Stanner
2025-05-12 13:59     ` Maíra Canal
2025-05-13  7:26   ` Philipp Stanner [this message]
2025-05-24 13:33     ` Maíra Canal
2025-05-03 20:59 ` [PATCH 2/8] drm/sched: Always free the job after the timeout Maíra Canal
2025-05-06 11:49   ` Tvrtko Ursulin
2025-05-06 12:46     ` Maíra Canal
2025-05-06 13:28       ` Tvrtko Ursulin
2025-05-06 13:38         ` Maíra Canal
2025-05-06 14:18           ` Tvrtko Ursulin
2025-05-03 20:59 ` [PATCH 3/8] drm/sched: Reduce scheduler's timeout for timeout tests Maíra Canal
2025-05-06 12:03   ` Tvrtko Ursulin
2025-05-06 12:56     ` Maíra Canal
2025-05-06 13:20       ` Tvrtko Ursulin
2025-05-03 20:59 ` [PATCH 4/8] drm/sched: Add new test for DRM_GPU_SCHED_STAT_RUNNING Maíra Canal
2025-05-06 13:58   ` Tvrtko Ursulin
2025-05-03 20:59 ` [PATCH 5/8] drm/v3d: Use DRM_GPU_SCHED_STAT_RUNNING to skip the reset Maíra Canal
2025-05-03 20:59 ` [PATCH 6/8] drm/etnaviv: " Maíra Canal
2025-05-03 20:59 ` [PATCH 7/8] drm/xe: " Maíra Canal
2025-05-06  4:25   ` Matthew Brost
2025-05-03 20:59 ` [PATCH 8/8] drm/panfrost: " Maíra Canal
2025-05-08 15:04   ` Steven Price

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4242fd242c7e16d0ecdf11c5d0ad795efda727a5.camel@mailbox.org \
    --to=phasta@mailbox.org \
    --cc=boris.brezillon@collabora.com \
    --cc=christian.gmeiner@gmail.com \
    --cc=ckoenig.leichtzumerken@gmail.com \
    --cc=dakr@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=etnaviv@lists.freedesktop.org \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=kernel-dev@igalia.com \
    --cc=l.stach@pengutronix.de \
    --cc=linux+etnaviv@armlinux.org.uk \
    --cc=lucas.demarchi@intel.com \
    --cc=matthew.brost@intel.com \
    --cc=mcanal@igalia.com \
    --cc=mwen@igalia.com \
    --cc=phasta@kernel.org \
    --cc=robh@kernel.org \
    --cc=rodrigo.vivi@intel.com \
    --cc=simona@ffwll.ch \
    --cc=steven.price@arm.com \
    --cc=thomas.hellstrom@linux.intel.com \
    --cc=tvrtko.ursulin@igalia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).