public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] drm/panthor: flush FW AS caches in slow reset path
@ 2024-09-02 13:02 Adrián Larumbe
  2024-09-02 15:11 ` Steven Price
  0 siblings, 1 reply; 4+ messages in thread
From: Adrián Larumbe @ 2024-09-02 13:02 UTC (permalink / raw)
  To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Daniel Vetter,
	Heiko Stuebner, Grant Likely
  Cc: kernel, Adrián Larumbe, stable, dri-devel, linux-kernel

In the off-chance that waiting for the firmware to signal its booted status
timed out in the fast reset path, one must flush the cache lines for the
entire FW VM address space before reloading the regions, otherwise stale
values eventually lead to a scheduler job timeout.

Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block")
Cc: stable@vger.kernel.org
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
---
 drivers/gpu/drm/panthor/panthor_fw.c  |  8 +++++++-
 drivers/gpu/drm/panthor/panthor_mmu.c | 21 ++++++++++++++++++---
 drivers/gpu/drm/panthor/panthor_mmu.h |  1 +
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 857f3f11258a..ef232c0c2049 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1089,6 +1089,12 @@ int panthor_fw_post_reset(struct panthor_device *ptdev)
 		panthor_fw_stop(ptdev);
 		ptdev->fw->fast_reset = false;
 		drm_err(&ptdev->base, "FW fast reset failed, trying a slow reset");
+
+		ret = panthor_vm_flush_all(ptdev->fw->vm);
+		if (ret) {
+			drm_err(&ptdev->base, "FW slow reset failed (couldn't flush FW's AS l2cache)");
+			return ret;
+		}
 	}
 
 	/* Reload all sections, including RO ones. We're not supposed
@@ -1099,7 +1105,7 @@ int panthor_fw_post_reset(struct panthor_device *ptdev)
 
 	ret = panthor_fw_start(ptdev);
 	if (ret) {
-		drm_err(&ptdev->base, "FW slow reset failed");
+		drm_err(&ptdev->base, "FW slow reset failed (couldn't start the FW )");
 		return ret;
 	}
 
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index d47972806d50..bbc12728437f 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -576,6 +576,12 @@ static int mmu_hw_do_operation_locked(struct panthor_device *ptdev, int as_nr,
 	if (as_nr < 0)
 		return 0;
 
+	/*
+	 * If the AS number is greater than zero, then we can be sure
+	 * the device is up and running, so we don't need to explicitly
+	 * power it up
+	 */
+
 	if (op != AS_COMMAND_UNLOCK)
 		lock_region(ptdev, as_nr, iova, size);
 
@@ -874,14 +880,23 @@ static int panthor_vm_flush_range(struct panthor_vm *vm, u64 iova, u64 size)
 	if (!drm_dev_enter(&ptdev->base, &cookie))
 		return 0;
 
-	/* Flush the PTs only if we're already awake */
-	if (pm_runtime_active(ptdev->base.dev))
-		ret = mmu_hw_do_operation(vm, iova, size, AS_COMMAND_FLUSH_PT);
+	ret = mmu_hw_do_operation(vm, iova, size, AS_COMMAND_FLUSH_PT);
 
 	drm_dev_exit(cookie);
 	return ret;
 }
 
+/**
+ * panthor_vm_flush_all() - Flush L2 caches for the entirety of a VM's AS
+ * @vm: VM whose cache to flush
+ *
+ * Return: 0 on success, a negative error code if flush failed.
+ */
+int panthor_vm_flush_all(struct panthor_vm *vm)
+{
+	return panthor_vm_flush_range(vm, vm->base.mm_start, vm->base.mm_range);
+}
+
 static int panthor_vm_unmap_pages(struct panthor_vm *vm, u64 iova, u64 size)
 {
 	struct panthor_device *ptdev = vm->ptdev;
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.h b/drivers/gpu/drm/panthor/panthor_mmu.h
index f3c1ed19f973..6788771071e3 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.h
+++ b/drivers/gpu/drm/panthor/panthor_mmu.h
@@ -31,6 +31,7 @@ panthor_vm_get_bo_for_va(struct panthor_vm *vm, u64 va, u64 *bo_offset);
 int panthor_vm_active(struct panthor_vm *vm);
 void panthor_vm_idle(struct panthor_vm *vm);
 int panthor_vm_as(struct panthor_vm *vm);
+int panthor_vm_flush_all(struct panthor_vm *vm);
 
 struct panthor_heap_pool *
 panthor_vm_get_heap_pool(struct panthor_vm *vm, bool create);
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] drm/panthor: flush FW AS caches in slow reset path
  2024-09-02 13:02 [PATCH v2] drm/panthor: flush FW AS caches in slow reset path Adrián Larumbe
@ 2024-09-02 15:11 ` Steven Price
  2024-09-02 15:20   ` Boris Brezillon
  2024-09-05  7:46   ` Boris Brezillon
  0 siblings, 2 replies; 4+ messages in thread
From: Steven Price @ 2024-09-02 15:11 UTC (permalink / raw)
  To: Adrián Larumbe, Boris Brezillon, Liviu Dudau,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Daniel Vetter, Heiko Stuebner, Grant Likely
  Cc: kernel, stable, dri-devel, linux-kernel

On 02/09/2024 14:02, Adrián Larumbe wrote:
> In the off-chance that waiting for the firmware to signal its booted status
> timed out in the fast reset path, one must flush the cache lines for the
> entire FW VM address space before reloading the regions, otherwise stale
> values eventually lead to a scheduler job timeout.
> 
> Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block")
> Cc: stable@vger.kernel.org
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> Acked-by: Liviu Dudau <liviu.dudau@arm.com>

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>  drivers/gpu/drm/panthor/panthor_fw.c  |  8 +++++++-
>  drivers/gpu/drm/panthor/panthor_mmu.c | 21 ++++++++++++++++++---
>  drivers/gpu/drm/panthor/panthor_mmu.h |  1 +
>  3 files changed, 26 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index 857f3f11258a..ef232c0c2049 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -1089,6 +1089,12 @@ int panthor_fw_post_reset(struct panthor_device *ptdev)
>  		panthor_fw_stop(ptdev);
>  		ptdev->fw->fast_reset = false;
>  		drm_err(&ptdev->base, "FW fast reset failed, trying a slow reset");
> +
> +		ret = panthor_vm_flush_all(ptdev->fw->vm);
> +		if (ret) {
> +			drm_err(&ptdev->base, "FW slow reset failed (couldn't flush FW's AS l2cache)");
> +			return ret;
> +		}
>  	}
>  
>  	/* Reload all sections, including RO ones. We're not supposed
> @@ -1099,7 +1105,7 @@ int panthor_fw_post_reset(struct panthor_device *ptdev)
>  
>  	ret = panthor_fw_start(ptdev);
>  	if (ret) {
> -		drm_err(&ptdev->base, "FW slow reset failed");
> +		drm_err(&ptdev->base, "FW slow reset failed (couldn't start the FW )");
>  		return ret;
>  	}
>  
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> index d47972806d50..bbc12728437f 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -576,6 +576,12 @@ static int mmu_hw_do_operation_locked(struct panthor_device *ptdev, int as_nr,
>  	if (as_nr < 0)
>  		return 0;
>  
> +	/*
> +	 * If the AS number is greater than zero, then we can be sure
> +	 * the device is up and running, so we don't need to explicitly
> +	 * power it up
> +	 */
> +
>  	if (op != AS_COMMAND_UNLOCK)
>  		lock_region(ptdev, as_nr, iova, size);
>  
> @@ -874,14 +880,23 @@ static int panthor_vm_flush_range(struct panthor_vm *vm, u64 iova, u64 size)
>  	if (!drm_dev_enter(&ptdev->base, &cookie))
>  		return 0;
>  
> -	/* Flush the PTs only if we're already awake */
> -	if (pm_runtime_active(ptdev->base.dev))
> -		ret = mmu_hw_do_operation(vm, iova, size, AS_COMMAND_FLUSH_PT);
> +	ret = mmu_hw_do_operation(vm, iova, size, AS_COMMAND_FLUSH_PT);
>  
>  	drm_dev_exit(cookie);
>  	return ret;
>  }
>  
> +/**
> + * panthor_vm_flush_all() - Flush L2 caches for the entirety of a VM's AS
> + * @vm: VM whose cache to flush
> + *
> + * Return: 0 on success, a negative error code if flush failed.
> + */
> +int panthor_vm_flush_all(struct panthor_vm *vm)
> +{
> +	return panthor_vm_flush_range(vm, vm->base.mm_start, vm->base.mm_range);
> +}
> +
>  static int panthor_vm_unmap_pages(struct panthor_vm *vm, u64 iova, u64 size)
>  {
>  	struct panthor_device *ptdev = vm->ptdev;
> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.h b/drivers/gpu/drm/panthor/panthor_mmu.h
> index f3c1ed19f973..6788771071e3 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.h
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.h
> @@ -31,6 +31,7 @@ panthor_vm_get_bo_for_va(struct panthor_vm *vm, u64 va, u64 *bo_offset);
>  int panthor_vm_active(struct panthor_vm *vm);
>  void panthor_vm_idle(struct panthor_vm *vm);
>  int panthor_vm_as(struct panthor_vm *vm);
> +int panthor_vm_flush_all(struct panthor_vm *vm);
>  
>  struct panthor_heap_pool *
>  panthor_vm_get_heap_pool(struct panthor_vm *vm, bool create);


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] drm/panthor: flush FW AS caches in slow reset path
  2024-09-02 15:11 ` Steven Price
@ 2024-09-02 15:20   ` Boris Brezillon
  2024-09-05  7:46   ` Boris Brezillon
  1 sibling, 0 replies; 4+ messages in thread
From: Boris Brezillon @ 2024-09-02 15:20 UTC (permalink / raw)
  To: Steven Price
  Cc: Adrián Larumbe, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Daniel Vetter,
	Heiko Stuebner, Grant Likely, kernel, stable, dri-devel,
	linux-kernel

On Mon, 2 Sep 2024 16:11:51 +0100
Steven Price <steven.price@arm.com> wrote:

> On 02/09/2024 14:02, Adrián Larumbe wrote:
> > In the off-chance that waiting for the firmware to signal its booted status
> > timed out in the fast reset path, one must flush the cache lines for the
> > entire FW VM address space before reloading the regions, otherwise stale
> > values eventually lead to a scheduler job timeout.
> > 
> > Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> > Acked-by: Liviu Dudau <liviu.dudau@arm.com>  
> 
> Reviewed-by: Steven Price <steven.price@arm.com>

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

> 
> > ---
> >  drivers/gpu/drm/panthor/panthor_fw.c  |  8 +++++++-
> >  drivers/gpu/drm/panthor/panthor_mmu.c | 21 ++++++++++++++++++---
> >  drivers/gpu/drm/panthor/panthor_mmu.h |  1 +
> >  3 files changed, 26 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> > index 857f3f11258a..ef232c0c2049 100644
> > --- a/drivers/gpu/drm/panthor/panthor_fw.c
> > +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> > @@ -1089,6 +1089,12 @@ int panthor_fw_post_reset(struct panthor_device *ptdev)
> >  		panthor_fw_stop(ptdev);
> >  		ptdev->fw->fast_reset = false;
> >  		drm_err(&ptdev->base, "FW fast reset failed, trying a slow reset");
> > +
> > +		ret = panthor_vm_flush_all(ptdev->fw->vm);
> > +		if (ret) {
> > +			drm_err(&ptdev->base, "FW slow reset failed (couldn't flush FW's AS l2cache)");
> > +			return ret;
> > +		}
> >  	}
> >  
> >  	/* Reload all sections, including RO ones. We're not supposed
> > @@ -1099,7 +1105,7 @@ int panthor_fw_post_reset(struct panthor_device *ptdev)
> >  
> >  	ret = panthor_fw_start(ptdev);
> >  	if (ret) {
> > -		drm_err(&ptdev->base, "FW slow reset failed");
> > +		drm_err(&ptdev->base, "FW slow reset failed (couldn't start the FW )");
> >  		return ret;
> >  	}
> >  
> > diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> > index d47972806d50..bbc12728437f 100644
> > --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> > +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> > @@ -576,6 +576,12 @@ static int mmu_hw_do_operation_locked(struct panthor_device *ptdev, int as_nr,
> >  	if (as_nr < 0)
> >  		return 0;
> >  
> > +	/*
> > +	 * If the AS number is greater than zero, then we can be sure
> > +	 * the device is up and running, so we don't need to explicitly
> > +	 * power it up
> > +	 */
> > +
> >  	if (op != AS_COMMAND_UNLOCK)
> >  		lock_region(ptdev, as_nr, iova, size);
> >  
> > @@ -874,14 +880,23 @@ static int panthor_vm_flush_range(struct panthor_vm *vm, u64 iova, u64 size)
> >  	if (!drm_dev_enter(&ptdev->base, &cookie))
> >  		return 0;
> >  
> > -	/* Flush the PTs only if we're already awake */
> > -	if (pm_runtime_active(ptdev->base.dev))
> > -		ret = mmu_hw_do_operation(vm, iova, size, AS_COMMAND_FLUSH_PT);
> > +	ret = mmu_hw_do_operation(vm, iova, size, AS_COMMAND_FLUSH_PT);
> >  
> >  	drm_dev_exit(cookie);
> >  	return ret;
> >  }
> >  
> > +/**
> > + * panthor_vm_flush_all() - Flush L2 caches for the entirety of a VM's AS
> > + * @vm: VM whose cache to flush
> > + *
> > + * Return: 0 on success, a negative error code if flush failed.
> > + */
> > +int panthor_vm_flush_all(struct panthor_vm *vm)
> > +{
> > +	return panthor_vm_flush_range(vm, vm->base.mm_start, vm->base.mm_range);
> > +}
> > +
> >  static int panthor_vm_unmap_pages(struct panthor_vm *vm, u64 iova, u64 size)
> >  {
> >  	struct panthor_device *ptdev = vm->ptdev;
> > diff --git a/drivers/gpu/drm/panthor/panthor_mmu.h b/drivers/gpu/drm/panthor/panthor_mmu.h
> > index f3c1ed19f973..6788771071e3 100644
> > --- a/drivers/gpu/drm/panthor/panthor_mmu.h
> > +++ b/drivers/gpu/drm/panthor/panthor_mmu.h
> > @@ -31,6 +31,7 @@ panthor_vm_get_bo_for_va(struct panthor_vm *vm, u64 va, u64 *bo_offset);
> >  int panthor_vm_active(struct panthor_vm *vm);
> >  void panthor_vm_idle(struct panthor_vm *vm);
> >  int panthor_vm_as(struct panthor_vm *vm);
> > +int panthor_vm_flush_all(struct panthor_vm *vm);
> >  
> >  struct panthor_heap_pool *
> >  panthor_vm_get_heap_pool(struct panthor_vm *vm, bool create);  
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] drm/panthor: flush FW AS caches in slow reset path
  2024-09-02 15:11 ` Steven Price
  2024-09-02 15:20   ` Boris Brezillon
@ 2024-09-05  7:46   ` Boris Brezillon
  1 sibling, 0 replies; 4+ messages in thread
From: Boris Brezillon @ 2024-09-05  7:46 UTC (permalink / raw)
  To: Steven Price
  Cc: Adrián Larumbe, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Daniel Vetter,
	Heiko Stuebner, Grant Likely, kernel, stable, dri-devel,
	linux-kernel

On Mon, 2 Sep 2024 16:11:51 +0100
Steven Price <steven.price@arm.com> wrote:

> On 02/09/2024 14:02, Adrián Larumbe wrote:
> > In the off-chance that waiting for the firmware to signal its booted status
> > timed out in the fast reset path, one must flush the cache lines for the
> > entire FW VM address space before reloading the regions, otherwise stale
> > values eventually lead to a scheduler job timeout.
> > 
> > Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> > Acked-by: Liviu Dudau <liviu.dudau@arm.com>  
> 
> Reviewed-by: Steven Price <steven.price@arm.com>

Pushed to drm-misc-fixes.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-09-05  7:46 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-02 13:02 [PATCH v2] drm/panthor: flush FW AS caches in slow reset path Adrián Larumbe
2024-09-02 15:11 ` Steven Price
2024-09-02 15:20   ` Boris Brezillon
2024-09-05  7:46   ` Boris Brezillon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox