* Re: [PATCH] drm/panthor: Fix race with suspend during unplug
2025-10-22 10:32 [PATCH] drm/panthor: Fix race with suspend during unplug Ketil Johnsen
@ 2025-10-22 10:59 ` Steven Price
2025-10-24 13:41 ` Liviu Dudau
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Steven Price @ 2025-10-22 10:59 UTC
To: Ketil Johnsen, Boris Brezillon, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Heiko Stuebner
Cc: Grant Likely, dri-devel, linux-kernel
On 22/10/2025 11:32, Ketil Johnsen wrote:
> There is a race between panthor_device_unplug() and
> panthor_device_suspend() which can lead to IRQ handlers running on a
> powered down GPU. This is how it can happen:
> - unplug routine calls drm_dev_unplug()
> - panthor_device_suspend() can now execute, and will skip a lot of
> important work because the device is currently marked as unplugged.
> - IRQs will remain active in this case and IRQ handlers can therefore
> try to access a powered down GPU.
>
> The fix is simply to take the PM ref in panthor_device_unplug() a
> little bit earlier, before drm_dev_unplug().
>
> Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com>
> Fixes: 5fe909cae118a ("drm/panthor: Add the device logical block")
Reviewed-by: Steven Price <steven.price@arm.com>
> ---
> drivers/gpu/drm/panthor/panthor_device.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index 81df49880bd87..962a10e00848e 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -83,6 +83,8 @@ void panthor_device_unplug(struct panthor_device *ptdev)
> return;
> }
>
> + drm_WARN_ON(&ptdev->base, pm_runtime_get_sync(ptdev->base.dev) < 0);
> +
> /* Call drm_dev_unplug() so any access to HW blocks happening after
> * that point get rejected.
> */
> @@ -93,8 +95,6 @@ void panthor_device_unplug(struct panthor_device *ptdev)
> */
> mutex_unlock(&ptdev->unplug.lock);
>
> - drm_WARN_ON(&ptdev->base, pm_runtime_get_sync(ptdev->base.dev) < 0);
> -
> /* Now, try to cleanly shutdown the GPU before the device resources
> * get reclaimed.
> */
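For readers following the thread, the ordering problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver code: struct model_dev, model_try_suspend() and model_unplug_hits_hazard() are hypothetical names, the usage counter stands in for the runtime-PM reference, and the suspend attempt is forced to land at the worst possible moment (right after the device is marked unplugged).

```c
#include <stdbool.h>

/* Hypothetical userspace model of the state involved in the race. */
struct model_dev {
	int pm_usage;      /* stand-in for the runtime-PM usage count */
	bool unplugged;    /* stand-in for the flag set by drm_dev_unplug() */
	bool powered;      /* GPU power state */
	bool irqs_enabled; /* whether IRQ handlers may still run */
};

/* Model of a runtime-suspend attempt: it only proceeds when nobody holds
 * a PM reference, and it skips the IRQ teardown when the device is
 * already marked as unplugged (the "important work" from the commit
 * message).
 */
static void model_try_suspend(struct model_dev *d)
{
	if (d->pm_usage > 0 || !d->powered)
		return;

	if (!d->unplugged)
		d->irqs_enabled = false;

	d->powered = false;
}

/* Run the unplug sequence with a suspend attempt injected right after
 * the unplugged flag is set. Returns true if, at that point, IRQs are
 * still enabled while the GPU is powered down.
 */
static bool model_unplug_hits_hazard(bool take_pm_ref_first)
{
	struct model_dev d = {
		.pm_usage = 0,
		.unplugged = false,
		.powered = true,
		.irqs_enabled = true,
	};
	bool hazard;

	if (take_pm_ref_first)
		d.pm_usage++;       /* patched order: PM ref before unplug */

	d.unplugged = true;         /* models drm_dev_unplug() */
	model_try_suspend(&d);      /* suspend lands inside the window */
	hazard = d.irqs_enabled && !d.powered;

	if (!take_pm_ref_first)
		d.pm_usage++;       /* original order: PM ref taken too late */

	return hazard;
}
```

In this model, calling model_unplug_hits_hazard(false) follows the pre-patch ordering and reports the hazard, while model_unplug_hits_hazard(true) follows the patched ordering and does not, because the held reference keeps the suspend attempt from proceeding.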
* Re: [PATCH] drm/panthor: Fix race with suspend during unplug
From: Liviu Dudau @ 2025-10-24 13:41 UTC
To: Ketil Johnsen
Cc: Boris Brezillon, Steven Price, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Heiko Stuebner,
Grant Likely, dri-devel, linux-kernel
On Wed, Oct 22, 2025 at 12:32:41PM +0200, Ketil Johnsen wrote:
> There is a race between panthor_device_unplug() and
> panthor_device_suspend() which can lead to IRQ handlers running on a
> powered down GPU. This is how it can happen:
> - unplug routine calls drm_dev_unplug()
> - panthor_device_suspend() can now execute, and will skip a lot of
> important work because the device is currently marked as unplugged.
> - IRQs will remain active in this case and IRQ handlers can therefore
> try to access a powered down GPU.
>
> The fix is simply to take the PM ref in panthor_device_unplug() a
> little bit earlier, before drm_dev_unplug().
>
> Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com>
> Fixes: 5fe909cae118a ("drm/panthor: Add the device logical block")
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Best regards,
Liviu
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
* Re: [PATCH] drm/panthor: Fix race with suspend during unplug
From: Boris Brezillon @ 2025-10-24 14:34 UTC
To: Ketil Johnsen
Cc: Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Heiko Stuebner,
Grant Likely, dri-devel, linux-kernel
On Wed, 22 Oct 2025 12:32:41 +0200
Ketil Johnsen <ketil.johnsen@arm.com> wrote:
> There is a race between panthor_device_unplug() and
> panthor_device_suspend() which can lead to IRQ handlers running on a
> powered down GPU. This is how it can happen:
> - unplug routine calls drm_dev_unplug()
> - panthor_device_suspend() can now execute, and will skip a lot of
> important work because the device is currently marked as unplugged.
> - IRQs will remain active in this case and IRQ handlers can therefore
> try to access a powered down GPU.
>
> The fix is simply to take the PM ref in panthor_device_unplug() a
> little bit earlier, before drm_dev_unplug().
>
> Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com>
> Fixes: 5fe909cae118a ("drm/panthor: Add the device logical block")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
* Re: [PATCH] drm/panthor: Fix race with suspend during unplug
From: Liviu Dudau @ 2025-11-03 14:39 UTC
To: Ketil Johnsen
Cc: Boris Brezillon, Steven Price, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Heiko Stuebner,
Grant Likely, dri-devel, linux-kernel
On Wed, Oct 22, 2025 at 12:32:41PM +0200, Ketil Johnsen wrote:
> There is a race between panthor_device_unplug() and
> panthor_device_suspend() which can lead to IRQ handlers running on a
> powered down GPU. This is how it can happen:
> - unplug routine calls drm_dev_unplug()
> - panthor_device_suspend() can now execute, and will skip a lot of
> important work because the device is currently marked as unplugged.
> - IRQs will remain active in this case and IRQ handlers can therefore
> try to access a powered down GPU.
>
> The fix is simply to take the PM ref in panthor_device_unplug() a
> little bit earlier, before drm_dev_unplug().
>
> Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com>
> Fixes: 5fe909cae118a ("drm/panthor: Add the device logical block")
Pushed to drm-misc-next.
Best regards,
Liviu