Re: [PATCH 1/1] drm/scheduler: Job timeout handler returns status (v2)

From: Steven Price <steven.price@arm.com>
To: Luben Tuikov <luben.tuikov@amd.com>,
	dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org
Cc: "Andrey Grodzovsky" <Andrey.Grodzovsky@amd.com>,
	"kernel test robot" <lkp@intel.com>,
	"Tomeu Vizoso" <tomeu.vizoso@collabora.com>,
	"Rob Herring" <robh@kernel.org>,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Alyssa Rosenzweig" <alyssa.rosenzweig@collabora.com>,
	"Eric Anholt" <eric@anholt.net>,
	"Christian Gmeiner" <christian.gmeiner@gmail.com>,
	"Qiang Yu" <yuq825@gmail.com>,
	"Russell King" <linux+etnaviv@armlinux.org.uk>,
	"Alexander Deucher" <Alexander.Deucher@amd.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Lucas Stach" <l.stach@pengutronix.de>
Subject: Re: [PATCH 1/1] drm/scheduler: Job timeout handler returns status (v2)
Date: Thu, 10 Dec 2020 09:46:22 +0000
Message-ID: <5adf573f-8b56-9f85-96c4-727cc6fcadfc@arm.com>
In-Reply-To: <20201210021438.9190-2-luben.tuikov@amd.com>

On 10/12/2020 02:14, Luben Tuikov wrote:
> This patch does not change current behaviour.
> 
> The driver's job timeout handler now returns
> status indicating back to the DRM layer whether
> the task (job) was successfully aborted or whether
> more time should be given to the task to complete.

I find the definitions given a little confusing; see below.

> Default behaviour as of this patch, is preserved,
> except in obvious-by-comment case in the Panfrost
> driver, as documented below.
> 
> All drivers which make use of the
> drm_sched_backend_ops' .timedout_job() callback
> have been accordingly renamed and return the
> would've-been default value of
> DRM_TASK_STATUS_ALIVE to restart the task's
> timeout timer--this is the old behaviour, and
> is preserved by this patch.
> 
> In the case of the Panfrost driver, its timedout
> callback correctly first checks if the job had
> completed in due time and if so, it now returns
> DRM_TASK_STATUS_COMPLETE to notify the DRM layer
> that the task can be moved to the done list, to be
> freed later. In the other two subsequent checks,
> the value of DRM_TASK_STATUS_ALIVE is returned, as
> per the default behaviour.
> 
> A more involved driver's solutions can be had
> in subequent patches.

NIT: ^^^^^^^^^ subsequent

> 
> v2: Use enum as the status of a driver's job
>      timeout callback method.
> 
> Cc: Alexander Deucher <Alexander.Deucher@amd.com>
> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> Cc: Qiang Yu <yuq825@gmail.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: Eric Anholt <eric@anholt.net>
> Reported-by: kernel test robot <lkp@intel.com>

This reported-by seems a little odd for this patch.

> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 +++--
>   drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++-
>   drivers/gpu/drm/lima/lima_sched.c       |  4 +++-
>   drivers/gpu/drm/panfrost/panfrost_job.c |  9 ++++---
>   drivers/gpu/drm/scheduler/sched_main.c  |  4 +---
>   drivers/gpu/drm/v3d/v3d_sched.c         | 32 +++++++++++++------------
>   include/drm/gpu_scheduler.h             | 20 +++++++++++++---
>   7 files changed, 57 insertions(+), 28 deletions(-)
> 

[....]

> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 2e0c368e19f6..cedfc5394e52 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -206,6 +206,11 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>   	return s_job && atomic_inc_return(&s_job->karma) > threshold;
>   }
>   
> +enum drm_task_status {
> +	DRM_TASK_STATUS_COMPLETE,
> +	DRM_TASK_STATUS_ALIVE
> +};
> +
>   /**
>    * struct drm_sched_backend_ops
>    *
> @@ -230,10 +235,19 @@ struct drm_sched_backend_ops {
>   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>   
>   	/**
> -         * @timedout_job: Called when a job has taken too long to execute,
> -         * to trigger GPU recovery.
> +	 * @timedout_job: Called when a job has taken too long to execute,
> +	 * to trigger GPU recovery.
> +	 *
> +	 * Return DRM_TASK_STATUS_ALIVE, if the task (job) is healthy
> +	 * and executing in the hardware, i.e. it needs more time.

So 'alive' means the job (was) alive, and GPU recovery is happening.
I.e. the job is just taking too long. Panfrost will trigger a GPU reset
(killing the job) in this case while returning DRM_TASK_STATUS_ALIVE.

> +	 *
> +	 * Return DRM_TASK_STATUS_COMPLETE, if the task (job) has
> +	 * been aborted or is unknown to the hardware, i.e. if
> +	 * the task is out of the hardware, and maybe it is now
> +	 * in the done list, or it was completed long ago, or
> +	 * if it is unknown to the hardware.

Where 'complete' seems to mean a variety of things:

  * The job completed successfully (i.e. the timeout raced), this is the 
situation that Panfrost detects. In this case (and only this case) the 
GPU reset will *not* happen.

  * The job failed (aborted) and is no longer on the hardware. Panfrost 
currently handles a job failure by triggering drm_sched_fault() to 
trigger the timeout handler. But the timeout handler doesn't handle this
differently, so it will return DRM_TASK_STATUS_ALIVE.

  * The job is "unknown to hardware". There are some corner cases in 
Panfrost (specifically two early returns from panfrost_job_hw_submit()) 
where the job never actually lands on the hardware, but the scheduler 
isn't informed. We currently rely on the timeout handling to recover 
from that. However, again, the timeout handler doesn't know about this,
so it will return DRM_TASK_STATUS_ALIVE.

So of the four cases listed in these comments, Panfrost is only getting 
2 'correct' after this change.
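
To make the mismatch concrete, here is a rough user-space sketch (not
kernel code) that tabulates the four cases against what the new comment
asks for and what Panfrost returns as described above. Only
enum drm_task_status comes from the patch; enum job_situation and the
function names are made up purely for illustration.

```c
#include <assert.h>

/* Mirrors the enum added to gpu_scheduler.h by this patch. */
enum drm_task_status {
	DRM_TASK_STATUS_COMPLETE,
	DRM_TASK_STATUS_ALIVE
};

/* Hypothetical labels for the four situations discussed above. */
enum job_situation {
	JOB_STILL_RUNNING,	/* on the hardware, just taking too long */
	JOB_RACED_COMPLETE,	/* completed as the timeout fired */
	JOB_FAILED,		/* aborted, no longer on the hardware */
	JOB_NEVER_SUBMITTED,	/* "unknown to hardware" */
	JOB_SITUATION_COUNT
};

/* Status the new kerneldoc comment asks for in each situation. */
static enum drm_task_status documented_status(enum job_situation s)
{
	assert(s < JOB_SITUATION_COUNT);
	return s == JOB_STILL_RUNNING ? DRM_TASK_STATUS_ALIVE
				      : DRM_TASK_STATUS_COMPLETE;
}

/* Status Panfrost returns with this patch, per the cases above. */
static enum drm_task_status panfrost_status(enum job_situation s)
{
	assert(s < JOB_SITUATION_COUNT);
	return s == JOB_RACED_COMPLETE ? DRM_TASK_STATUS_COMPLETE
				       : DRM_TASK_STATUS_ALIVE;
}

/* Number of situations where Panfrost agrees with the comment. */
static int count_matches(void)
{
	int s, matches = 0;

	for (s = 0; s < JOB_SITUATION_COUNT; s++)
		if (documented_status(s) == panfrost_status(s))
			matches++;
	return matches;
}
```

Only the still-running and raced-completion cases agree; the failed and
never-submitted cases come back as ALIVE where the comment says
COMPLETE.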

But what I really want to know is what the scheduler is planning to do 
in these situations? The Panfrost return value in this patch is really a 
"did we trigger a GPU reset" - and doesn't seem to match the 
descriptions above.

Steve

>   	 */
> -	void (*timedout_job)(struct drm_sched_job *sched_job);
> +	enum drm_task_status (*timedout_job)(struct drm_sched_job *sched_job);
>   
>   	/**
>            * @free_job: Called once the job's finished fence has been signaled
> 
