Re: [PATCH 3/3] drm/panthor: Rreset device and load FW after failed PM suspend

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Boris Brezillon <boris.brezillon@collabora.com>
To: "Adrián Larumbe" <adrian.larumbe@collabora.com>
Cc: Steven Price <steven.price@arm.com>,
	Liviu Dudau <liviu.dudau@arm.com>,
	Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
	Maxime Ripard <mripard@kernel.org>,
	Thomas Zimmermann <tzimmermann@suse.de>,
	David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
	kernel@collabora.com, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] drm/panthor: Rreset device and load FW after failed PM suspend
Date: Mon, 14 Oct 2024 09:27:04 +0200	[thread overview]
Message-ID: <20241014092704.50a21276@collabora.com> (raw)
In-Reply-To: <20241011225906.3789965-3-adrian.larumbe@collabora.com>

On Fri, 11 Oct 2024 23:57:01 +0100
Adrián Larumbe <adrian.larumbe@collabora.com> wrote:

> On rk3588 SoCs, during a runtime PM suspend, the transition to the
> lowest voltage/frequency pair might sometimes fail for reasons not yet
> understood. In that case, even a slow FW reset will fail, leaving the
> device's PM runtime status as unusuable.
> 
> When that happens, successive attempts to resume the device upon running
> a job will always fail.
> 
> Fix it by forcing a synchronous device reset, which will lead to a
> successful FW reload, and also reset the device's PM runtime error
> status before resuming it.
> 
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> ---
>  drivers/gpu/drm/panthor/panthor_device.c | 10 ++++++++++
>  drivers/gpu/drm/panthor/panthor_device.h |  2 ++
>  drivers/gpu/drm/panthor/panthor_sched.c  |  7 +++++++
>  3 files changed, 19 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index 5430557bd0b8..ec6fed5e996b 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -105,6 +105,16 @@ static void panthor_device_reset_cleanup(struct drm_device *ddev, void *data)
>  	destroy_workqueue(ptdev->reset.wq);
>  }
>  
> +int panthor_device_reset_sync(struct panthor_device *ptdev)
> +{
> +	panthor_fw_pre_reset(ptdev, false);
> +	panthor_mmu_pre_reset(ptdev);
> +	panthor_gpu_soft_reset(ptdev);
> +	panthor_gpu_l2_power_on(ptdev);
> +	panthor_mmu_post_reset(ptdev);
> +	return panthor_fw_post_reset(ptdev);
> +}
> +
>  static void panthor_device_reset_work(struct work_struct *work)
>  {
>  	struct panthor_device *ptdev = container_of(work, struct panthor_device, reset.work);
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 0e68f5a70d20..05a5a7233378 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -217,6 +217,8 @@ struct panthor_file {
>  int panthor_device_init(struct panthor_device *ptdev);
>  void panthor_device_unplug(struct panthor_device *ptdev);
>  
> +int panthor_device_reset_sync(struct panthor_device *ptdev);
> +
>  /**
>   * panthor_device_schedule_reset() - Schedules a reset operation
>   */
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index c7b350fc3eba..9a854c8c5718 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -3101,6 +3101,13 @@ queue_run_job(struct drm_sched_job *sched_job)
>  		return dma_fence_get(job->done_fence);
>  	}
>  
> +	if (ptdev->base.dev->power.runtime_error) {
> +		ret = panthor_device_reset_sync(ptdev);
> +		if (drm_WARN_ON(&ptdev->base, ret))
> +			return ERR_PTR(ret);
> +		drm_WARN_ON(&ptdev->base, pm_runtime_set_active(ptdev->base.dev));
> +	}

I'd rather pretend the suspend/resume worked (even if it didn't) and
deal with the consequences (force a slow reset on the next resume), than
spread the 'if-PM-op-failed-force-sync-reset' thing everywhere we do a
pm_runtime_resume_and_get(). Also not sure how resetting the GPU will
help fixing the OPP transition failure.

> +
>  	ret = pm_runtime_resume_and_get(ptdev->base.dev);
>  	if (drm_WARN_ON(&ptdev->base, ret))
>  		return ERR_PTR(ret);

next prev parent reply	other threads:[~2024-10-14  7:27 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-11 22:56 [PATCH 1/3] drm/panthor: Fix runtime suspend sequence after OPP transition error Adrián Larumbe
2024-10-11 22:57 ` [PATCH 2/3] drm/panthor: Retry OPP transition to suspension state a few times Adrián Larumbe
2024-10-14  7:28   ` Boris Brezillon
2024-10-11 22:57 ` [PATCH 3/3] drm/panthor: Rreset device and load FW after failed PM suspend Adrián Larumbe
2024-10-14  7:27   ` Boris Brezillon [this message]
2024-10-16  9:14   ` kernel test robot
2024-10-11 23:22 ` [PATCH 1/3] drm/panthor: Fix runtime suspend sequence after OPP transition error Liviu Dudau
2024-10-14  7:12 ` Boris Brezillon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241014092704.50a21276@collabora.com \
    --to=boris.brezillon@collabora.com \
    --cc=adrian.larumbe@collabora.com \
    --cc=airlied@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=kernel@collabora.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liviu.dudau@arm.com \
    --cc=maarten.lankhorst@linux.intel.com \
    --cc=mripard@kernel.org \
    --cc=simona@ffwll.ch \
    --cc=steven.price@arm.com \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox