* [PATCH 1/3] drm/panthor: Fix runtime suspend sequence after OPP transition error
From: Adrián Larumbe @ 2024-10-11 22:56 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter
Cc: kernel, Adrián Larumbe, dri-devel, linux-kernel
In case an OPP transition to a suspension state fails during the runtime
PM suspend call, if the driver's subsystems were successfully resumed,
we should return -EAGAIN so that the device's runtime PM status remains
'active'.

If FW reload failed, then we should fall through, so that the PM core
can flag the device as having suffered a runtime error.

Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
---
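Note: the return-value handling described in the commit message can be
modelled in userspace as below. This is only a sketch, not kernel code.
Every toy_* name is hypothetical; the classification of -EAGAIN/-EBUSY
versus other errors follows the runtime PM core's documented handling of
->runtime_suspend() return values.

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for the runtime PM bookkeeping of one device. */
enum toy_rpm_status { TOY_RPM_ACTIVE, TOY_RPM_SUSPENDED };

struct toy_dev {
	enum toy_rpm_status status;
	int runtime_error;	/* non-zero blocks further runtime PM requests */
};

/*
 * Apply the return value of a suspend callback: 0 suspends the device,
 * -EAGAIN/-EBUSY leave it active with no error recorded, and any other
 * error leaves it active but latches the error.
 */
static int toy_rpm_suspend_done(struct toy_dev *dev, int cb_ret)
{
	assert(dev);

	if (!cb_ret) {
		dev->status = TOY_RPM_SUSPENDED;
		dev->runtime_error = 0;
	} else if (cb_ret == -EAGAIN || cb_ret == -EBUSY) {
		dev->runtime_error = 0;		/* may be retried later */
	} else {
		dev->runtime_error = cb_ret;	/* flagged as a runtime error */
	}
	return cb_ret;
}

/* Error path as patched here: the FW resume result picks the code. */
static int toy_suspend_error_path(int fw_resume_ret)
{
	return fw_resume_ret ? fw_resume_ret : -EAGAIN;
}
```

With a successful FW resume the callback reports -EAGAIN and the toy
device stays active with no error latched; with a failed FW resume the
original error code is passed through and latched.
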
drivers/gpu/drm/panthor/panthor_device.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
index 4082c8f2951d..cedd3cbcb47d 100644
--- a/drivers/gpu/drm/panthor/panthor_device.c
+++ b/drivers/gpu/drm/panthor/panthor_device.c
@@ -528,8 +528,13 @@ int panthor_device_suspend(struct device *dev)
 		    drm_dev_enter(&ptdev->base, &cookie)) {
 			panthor_gpu_resume(ptdev);
 			panthor_mmu_resume(ptdev);
-			drm_WARN_ON(&ptdev->base, panthor_fw_resume(ptdev));
-			panthor_sched_resume(ptdev);
+			ret = panthor_fw_resume(ptdev);
+			if (!ret) {
+				panthor_sched_resume(ptdev);
+				ret = -EAGAIN;
+			} else {
+				drm_err(&ptdev->base, "FW resume failed at runtime suspend: %d\n", ret);
+			}
 			drm_dev_exit(cookie);
 		}
--
2.46.2
* [PATCH 2/3] drm/panthor: Retry OPP transition to suspension state a few times
From: Adrián Larumbe @ 2024-10-11 22:57 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter
Cc: kernel, Adrián Larumbe, dri-devel, linux-kernel
When the device's runtime PM suspend callback is invoked, the switch to
a suspension OPP might sometimes fail. Although this is beyond the
control of the Panthor driver, we can attempt suspending it more than
once as a defensive strategy.
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
---
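Note: the bounded retry added by this patch can be illustrated in
userspace as below. This is a sketch under assumptions, not the driver
code: toy_devfreq_suspend() is a hypothetical stand-in for
panthor_devfreq_suspend() that fails a configurable number of times, and
the -EIO error code is arbitrary.

```c
#include <assert.h>
#include <errno.h>

#define TOY_SUSPEND_MAX_TRIES 5	/* same bound as the loop in this patch */

/* Hypothetical devfreq stand-in with a scripted number of failures. */
struct toy_devfreq {
	unsigned int failures_left;
	unsigned int calls;
};

static int toy_devfreq_suspend(struct toy_devfreq *df)
{
	assert(df);

	df->calls++;
	if (df->failures_left) {
		df->failures_left--;
		return -EIO;
	}
	return 0;
}

/* Attempt the transition up to the bound; return the last result. */
static int toy_devfreq_suspend_retry(struct toy_devfreq *df)
{
	int ret = -EINVAL;
	unsigned int i;

	for (i = 0; i < TOY_SUSPEND_MAX_TRIES; i++) {
		ret = toy_devfreq_suspend(df);
		if (!ret)
			break;
	}
	return ret;
}
```

A transition that fails twice succeeds on the third call; one that keeps
failing is abandoned after five calls and the last error is returned to
the existing error path.
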
drivers/gpu/drm/panthor/panthor_device.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
index cedd3cbcb47d..5430557bd0b8 100644
--- a/drivers/gpu/drm/panthor/panthor_device.c
+++ b/drivers/gpu/drm/panthor/panthor_device.c
@@ -490,6 +490,7 @@ int panthor_device_resume(struct device *dev)
 int panthor_device_suspend(struct device *dev)
 {
 	struct panthor_device *ptdev = dev_get_drvdata(dev);
+	unsigned int susp_retries;
 	int ret, cookie;
 
 	if (atomic_read(&ptdev->pm.state) != PANTHOR_DEVICE_PM_STATE_ACTIVE)
@@ -522,7 +523,12 @@ int panthor_device_suspend(struct device *dev)
 		drm_dev_exit(cookie);
 	}
 
-	ret = panthor_devfreq_suspend(ptdev);
+	for (susp_retries = 0; susp_retries < 5; susp_retries++) {
+		ret = panthor_devfreq_suspend(ptdev);
+		if (!ret)
+			break;
+	}
+
 	if (ret) {
 		if (panthor_device_is_initialized(ptdev) &&
 		    drm_dev_enter(&ptdev->base, &cookie)) {
--
2.46.2
* [PATCH 3/3] drm/panthor: Reset device and load FW after failed PM suspend
From: Adrián Larumbe @ 2024-10-11 22:57 UTC (permalink / raw)
To: Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter
Cc: kernel, Adrián Larumbe, dri-devel, linux-kernel
On rk3588 SoCs, during a runtime PM suspend, the transition to the
lowest voltage/frequency pair might sometimes fail for reasons not yet
understood. In that case, even a slow FW reset will fail, leaving the
device's PM runtime status unusable.

When that happens, successive attempts to resume the device upon running
a job will always fail.

Fix it by forcing a synchronous device reset, which will lead to a
successful FW reload, and also reset the device's PM runtime error
status before resuming it.

Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
---
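Note: the recovery order used in queue_run_job() can be modelled in
userspace as below. This is a sketch, not kernel code, and all toy_*
names are hypothetical. It assumes the documented runtime PM behaviour
that a latched runtime error makes resume requests fail with -EINVAL
until pm_runtime_set_active() (or a similar status reset) clears it.

```c
#include <assert.h>
#include <errno.h>

struct toy_pm {
	int runtime_error;	/* latched by a failed suspend callback */
	int usage_count;
};

/* Models pm_runtime_resume_and_get(): refuses while an error is latched. */
static int toy_resume_and_get(struct toy_pm *pm)
{
	assert(pm);

	if (pm->runtime_error)
		return -EINVAL;
	pm->usage_count++;
	return 0;
}

/* Models pm_runtime_set_active(): clears the latched error. */
static int toy_set_active(struct toy_pm *pm)
{
	assert(pm);

	pm->runtime_error = 0;
	return 0;
}

/* Job submission as in this patch: recover first, then resume. */
static int toy_run_job(struct toy_pm *pm, int (*reset_sync)(void))
{
	if (pm->runtime_error) {
		int ret = reset_sync();

		if (ret)
			return ret;	/* error stays latched */
		toy_set_active(pm);
	}
	return toy_resume_and_get(pm);
}

/* Scripted reset outcomes for the example. */
static int toy_reset_ok(void) { return 0; }
static int toy_reset_fail(void) { return -ETIMEDOUT; }
```

In this model a plain resume keeps failing while the error is latched, a
failed reset leaves the error in place, and a successful reset followed
by the status update lets the resume go through.
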
drivers/gpu/drm/panthor/panthor_device.c | 10 ++++++++++
drivers/gpu/drm/panthor/panthor_device.h | 2 ++
drivers/gpu/drm/panthor/panthor_sched.c | 7 +++++++
3 files changed, 19 insertions(+)
diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
index 5430557bd0b8..ec6fed5e996b 100644
--- a/drivers/gpu/drm/panthor/panthor_device.c
+++ b/drivers/gpu/drm/panthor/panthor_device.c
@@ -105,6 +105,16 @@ static void panthor_device_reset_cleanup(struct drm_device *ddev, void *data)
 	destroy_workqueue(ptdev->reset.wq);
 }
 
+int panthor_device_reset_sync(struct panthor_device *ptdev)
+{
+	panthor_fw_pre_reset(ptdev, false);
+	panthor_mmu_pre_reset(ptdev);
+	panthor_gpu_soft_reset(ptdev);
+	panthor_gpu_l2_power_on(ptdev);
+	panthor_mmu_post_reset(ptdev);
+	return panthor_fw_post_reset(ptdev);
+}
+
 static void panthor_device_reset_work(struct work_struct *work)
 {
 	struct panthor_device *ptdev = container_of(work, struct panthor_device, reset.work);
diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index 0e68f5a70d20..05a5a7233378 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -217,6 +217,8 @@ struct panthor_file {
 int panthor_device_init(struct panthor_device *ptdev);
 void panthor_device_unplug(struct panthor_device *ptdev);
 
+int panthor_device_reset_sync(struct panthor_device *ptdev);
+
 /**
  * panthor_device_schedule_reset() - Schedules a reset operation
  */
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index c7b350fc3eba..9a854c8c5718 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -3101,6 +3101,13 @@ queue_run_job(struct drm_sched_job *sched_job)
 		return dma_fence_get(job->done_fence);
 	}
 
+	if (ptdev->base.dev->power.runtime_error) {
+		ret = panthor_device_reset_sync(ptdev);
+		if (drm_WARN_ON(&ptdev->base, ret))
+			return ERR_PTR(ret);
+		drm_WARN_ON(&ptdev->base, pm_runtime_set_active(ptdev->base.dev));
+	}
+
 	ret = pm_runtime_resume_and_get(ptdev->base.dev);
 	if (drm_WARN_ON(&ptdev->base, ret))
 		return ERR_PTR(ret);
--
2.46.2
* Re: [PATCH 1/3] drm/panthor: Fix runtime suspend sequence after OPP transition error
From: Liviu Dudau @ 2024-10-11 23:22 UTC (permalink / raw)
To: Adrián Larumbe
Cc: Boris Brezillon, Steven Price, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, kernel, dri-devel,
linux-kernel
On Fri, Oct 11, 2024 at 11:56:59PM +0100, Adrián Larumbe wrote:
> In case an OPP transition to a suspension state fails during the runtime
> PM suspend call, if the driver's subsystems were successfully resumed,
> we should return -EAGAIN so that the device's runtime PM status remains
> 'active'.
>
> If FW reload failed, then we should fall through, so that the PM core
> can flag the device as having suffered a runtime error.
>
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com> for this patch. For the other two,
I would first like us to understand why the suspend does not happen
quickly enough (or at all).
Best regards,
Liviu
> ---
> drivers/gpu/drm/panthor/panthor_device.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index 4082c8f2951d..cedd3cbcb47d 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -528,8 +528,13 @@ int panthor_device_suspend(struct device *dev)
> drm_dev_enter(&ptdev->base, &cookie)) {
> panthor_gpu_resume(ptdev);
> panthor_mmu_resume(ptdev);
> - drm_WARN_ON(&ptdev->base, panthor_fw_resume(ptdev));
> - panthor_sched_resume(ptdev);
> + ret = panthor_fw_resume(ptdev);
> + if (!ret) {
> + panthor_sched_resume(ptdev);
> + ret = -EAGAIN;
> + } else {
> + drm_err(&ptdev->base, "FW resume failed at runtime suspend: %d\n", ret);
> + }
> drm_dev_exit(cookie);
> }
>
> --
> 2.46.2
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
* Re: [PATCH 1/3] drm/panthor: Fix runtime suspend sequence after OPP transition error
From: Boris Brezillon @ 2024-10-14 7:12 UTC (permalink / raw)
To: Adrián Larumbe
Cc: Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, kernel, dri-devel,
linux-kernel
On Fri, 11 Oct 2024 23:56:59 +0100
Adrián Larumbe <adrian.larumbe@collabora.com> wrote:
> In case an OPP transition to a suspension state fails during the runtime
> PM suspend call, if the driver's subsystems were successfully resumed,
> we should return -EAGAIN so that the device's runtime PM status remains
> 'active'.
>
> If FW reload failed, then we should fall through, so that the PM core
> can flag the device as having suffered a runtime error.
>
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> ---
> drivers/gpu/drm/panthor/panthor_device.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index 4082c8f2951d..cedd3cbcb47d 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -528,8 +528,13 @@ int panthor_device_suspend(struct device *dev)
> drm_dev_enter(&ptdev->base, &cookie)) {
> panthor_gpu_resume(ptdev);
> panthor_mmu_resume(ptdev);
> - drm_WARN_ON(&ptdev->base, panthor_fw_resume(ptdev));
> - panthor_sched_resume(ptdev);
> + ret = panthor_fw_resume(ptdev);
> + if (!ret) {
> + panthor_sched_resume(ptdev);
> + ret = -EAGAIN;
> + } else {
> + drm_err(&ptdev->base, "FW resume failed at runtime suspend: %d\n", ret);
> + }
Hm, I'm not convinced resuming when devfreq_suspend() fails was the
right thing to do anyway. Can't we just assume the suspend succeeded in
that case, and force the devfreq OPP transition in the resume path, or
ignore it?
> drm_dev_exit(cookie);
> }
>
* Re: [PATCH 3/3] drm/panthor: Reset device and load FW after failed PM suspend
From: Boris Brezillon @ 2024-10-14 7:27 UTC (permalink / raw)
To: Adrián Larumbe
Cc: Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, kernel, dri-devel,
linux-kernel
On Fri, 11 Oct 2024 23:57:01 +0100
Adrián Larumbe <adrian.larumbe@collabora.com> wrote:
> On rk3588 SoCs, during a runtime PM suspend, the transition to the
> lowest voltage/frequency pair might sometimes fail for reasons not yet
> understood. In that case, even a slow FW reset will fail, leaving the
> device's PM runtime status unusable.
>
> When that happens, successive attempts to resume the device upon running
> a job will always fail.
>
> Fix it by forcing a synchronous device reset, which will lead to a
> successful FW reload, and also reset the device's PM runtime error
> status before resuming it.
>
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> ---
> drivers/gpu/drm/panthor/panthor_device.c | 10 ++++++++++
> drivers/gpu/drm/panthor/panthor_device.h | 2 ++
> drivers/gpu/drm/panthor/panthor_sched.c | 7 +++++++
> 3 files changed, 19 insertions(+)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index 5430557bd0b8..ec6fed5e996b 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -105,6 +105,16 @@ static void panthor_device_reset_cleanup(struct drm_device *ddev, void *data)
> destroy_workqueue(ptdev->reset.wq);
> }
>
> +int panthor_device_reset_sync(struct panthor_device *ptdev)
> +{
> + panthor_fw_pre_reset(ptdev, false);
> + panthor_mmu_pre_reset(ptdev);
> + panthor_gpu_soft_reset(ptdev);
> + panthor_gpu_l2_power_on(ptdev);
> + panthor_mmu_post_reset(ptdev);
> + return panthor_fw_post_reset(ptdev);
> +}
> +
> static void panthor_device_reset_work(struct work_struct *work)
> {
> struct panthor_device *ptdev = container_of(work, struct panthor_device, reset.work);
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 0e68f5a70d20..05a5a7233378 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -217,6 +217,8 @@ struct panthor_file {
> int panthor_device_init(struct panthor_device *ptdev);
> void panthor_device_unplug(struct panthor_device *ptdev);
>
> +int panthor_device_reset_sync(struct panthor_device *ptdev);
> +
> /**
> * panthor_device_schedule_reset() - Schedules a reset operation
> */
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index c7b350fc3eba..9a854c8c5718 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -3101,6 +3101,13 @@ queue_run_job(struct drm_sched_job *sched_job)
> return dma_fence_get(job->done_fence);
> }
>
> + if (ptdev->base.dev->power.runtime_error) {
> + ret = panthor_device_reset_sync(ptdev);
> + if (drm_WARN_ON(&ptdev->base, ret))
> + return ERR_PTR(ret);
> + drm_WARN_ON(&ptdev->base, pm_runtime_set_active(ptdev->base.dev));
> + }
I'd rather pretend the suspend/resume worked (even if it didn't) and
deal with the consequences (force a slow reset on the next resume), than
spread the 'if-PM-op-failed-force-sync-reset' thing everywhere we do a
pm_runtime_resume_and_get(). I'm also not sure how resetting the GPU
will help fix the OPP transition failure.
> +
> ret = pm_runtime_resume_and_get(ptdev->base.dev);
> if (drm_WARN_ON(&ptdev->base, ret))
> return ERR_PTR(ret);
* Re: [PATCH 2/3] drm/panthor: Retry OPP transition to suspension state a few times
From: Boris Brezillon @ 2024-10-14 7:28 UTC (permalink / raw)
To: Adrián Larumbe
Cc: Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, kernel, dri-devel,
linux-kernel
On Fri, 11 Oct 2024 23:57:00 +0100
Adrián Larumbe <adrian.larumbe@collabora.com> wrote:
> When the device's runtime PM suspend callback is invoked, the switch to
> a suspension OPP might sometimes fail. Although this is beyond the
> control of the Panthor driver, we can attempt suspending it more than
> once as a defensive strategy.
>
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> ---
> drivers/gpu/drm/panthor/panthor_device.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index cedd3cbcb47d..5430557bd0b8 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -490,6 +490,7 @@ int panthor_device_resume(struct device *dev)
> int panthor_device_suspend(struct device *dev)
> {
> struct panthor_device *ptdev = dev_get_drvdata(dev);
> + unsigned int susp_retries;
> int ret, cookie;
>
> if (atomic_read(&ptdev->pm.state) != PANTHOR_DEVICE_PM_STATE_ACTIVE)
> @@ -522,7 +523,12 @@ int panthor_device_suspend(struct device *dev)
> drm_dev_exit(cookie);
> }
>
> - ret = panthor_devfreq_suspend(ptdev);
> + for (susp_retries = 0; susp_retries < 5; susp_retries++) {
> + ret = panthor_devfreq_suspend(ptdev);
> + if (!ret)
> + break;
> + }
This retry logic should probably be moved to panthor_devfreq_suspend(),
but as Liviu said, I think we need to better understand why it takes
several attempts for an OPP transition to succeed.
> +
> if (ret) {
> if (panthor_device_is_initialized(ptdev) &&
> drm_dev_enter(&ptdev->base, &cookie)) {
* Re: [PATCH 3/3] drm/panthor: Reset device and load FW after failed PM suspend
From: kernel test robot @ 2024-10-16 9:14 UTC (permalink / raw)
To: Adrián Larumbe, Boris Brezillon, Steven Price, Liviu Dudau,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: llvm, oe-kbuild-all, kernel, Adrián Larumbe, dri-devel,
linux-kernel
Hi Adrián,
kernel test robot noticed the following build errors:
[auto build test ERROR on drm-misc/drm-misc-next]
[also build test ERROR on linus/master v6.12-rc3 next-20241016]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Adri-n-Larumbe/drm-panthor-Retry-OPP-transition-to-suspension-state-a-few-times/20241012-070112
base: git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link: https://lore.kernel.org/r/20241011225906.3789965-3-adrian.larumbe%40collabora.com
patch subject: [PATCH 3/3] drm/panthor: Rreset device and load FW after failed PM suspend
config: i386-buildonly-randconfig-001-20241016 (https://download.01.org/0day-ci/archive/20241016/202410161634.8YjhTQM2-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241016/202410161634.8YjhTQM2-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410161634.8YjhTQM2-lkp@intel.com/
All errors (new ones prefixed by >>):
>> drivers/gpu/drm/panthor/panthor_sched.c:3104:29: error: no member named 'runtime_error' in 'struct dev_pm_info'
3104 | if (ptdev->base.dev->power.runtime_error) {
| ~~~~~~~~~~~~~~~~~~~~~~ ^
include/linux/compiler.h:55:47: note: expanded from macro 'if'
55 | #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
| ^~~~
include/linux/compiler.h:57:52: note: expanded from macro '__trace_if_var'
57 | #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
| ^~~~
>> drivers/gpu/drm/panthor/panthor_sched.c:3104:29: error: no member named 'runtime_error' in 'struct dev_pm_info'
3104 | if (ptdev->base.dev->power.runtime_error) {
| ~~~~~~~~~~~~~~~~~~~~~~ ^
include/linux/compiler.h:55:47: note: expanded from macro 'if'
55 | #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
| ^~~~
include/linux/compiler.h:57:61: note: expanded from macro '__trace_if_var'
57 | #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
| ^~~~
>> drivers/gpu/drm/panthor/panthor_sched.c:3104:29: error: no member named 'runtime_error' in 'struct dev_pm_info'
3104 | if (ptdev->base.dev->power.runtime_error) {
| ~~~~~~~~~~~~~~~~~~~~~~ ^
include/linux/compiler.h:55:47: note: expanded from macro 'if'
55 | #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
| ^~~~
include/linux/compiler.h:57:86: note: expanded from macro '__trace_if_var'
57 | #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
| ^~~~
include/linux/compiler.h:68:3: note: expanded from macro '__trace_if_value'
68 | (cond) ? \
| ^~~~
3 errors generated.
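
Note: the three diagnostics above all refer to the same access on line
3104. The report does not name the cause. A plausible one, stated here
as an assumption, is that this randconfig builds without CONFIG_PM: the
runtime PM fields of struct dev_pm_info, runtime_error among them, are
only declared under #ifdef CONFIG_PM. The userspace sketch below
reproduces that situation with hypothetical toy_* names and shows one
way to read the field so that the code compiles in both configurations.

```c
#include <assert.h>

/* Define TOY_CONFIG_PM to model a build with runtime PM support. */
struct toy_dev_pm_info {
	int can_wakeup;
#ifdef TOY_CONFIG_PM
	int runtime_error;	/* only present when PM support is built in */
#endif
};

/* Hypothetical accessor that compiles with or without TOY_CONFIG_PM. */
static int toy_pm_runtime_error(const struct toy_dev_pm_info *pm)
{
	assert(pm);
#ifdef TOY_CONFIG_PM
	return pm->runtime_error;
#else
	return 0;	/* no runtime PM, so no runtime PM error to report */
#endif
}
```

Reading pm->runtime_error directly in a build without TOY_CONFIG_PM
fails with the same kind of "no member named" error as above, while the
accessor simply reports that no error is latched.
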
vim +3104 drivers/gpu/drm/panthor/panthor_sched.c
3081
3082 static struct dma_fence *
3083 queue_run_job(struct drm_sched_job *sched_job)
3084 {
3085 struct panthor_job *job = container_of(sched_job, struct panthor_job, base);
3086 struct panthor_group *group = job->group;
3087 struct panthor_queue *queue = group->queues[job->queue_idx];
3088 struct panthor_device *ptdev = group->ptdev;
3089 struct panthor_scheduler *sched = ptdev->scheduler;
3090 struct panthor_job_ringbuf_instrs instrs;
3091 struct panthor_job_cs_params cs_params;
3092 struct dma_fence *done_fence;
3093 int ret;
3094
3095 /* Stream size is zero, nothing to do except making sure all previously
3096 * submitted jobs are done before we signal the
3097 * drm_sched_job::s_fence::finished fence.
3098 */
3099 if (!job->call_info.size) {
3100 job->done_fence = dma_fence_get(queue->fence_ctx.last_fence);
3101 return dma_fence_get(job->done_fence);
3102 }
3103
> 3104 if (ptdev->base.dev->power.runtime_error) {
3105 ret = panthor_device_reset_sync(ptdev);
3106 if (drm_WARN_ON(&ptdev->base, ret))
3107 return ERR_PTR(ret);
3108 drm_WARN_ON(&ptdev->base, pm_runtime_set_active(ptdev->base.dev));
3109 }
3110
3111 ret = pm_runtime_resume_and_get(ptdev->base.dev);
3112 if (drm_WARN_ON(&ptdev->base, ret))
3113 return ERR_PTR(ret);
3114
3115 mutex_lock(&sched->lock);
3116 if (!group_can_run(group)) {
3117 done_fence = ERR_PTR(-ECANCELED);
3118 goto out_unlock;
3119 }
3120
3121 dma_fence_init(job->done_fence,
3122 &panthor_queue_fence_ops,
3123 &queue->fence_ctx.lock,
3124 queue->fence_ctx.id,
3125 atomic64_inc_return(&queue->fence_ctx.seqno));
3126
3127 job->profiling.slot = queue->profiling.seqno++;
3128 if (queue->profiling.seqno == queue->profiling.slot_count)
3129 queue->profiling.seqno = 0;
3130
3131 job->ringbuf.start = queue->iface.input->insert;
3132
3133 get_job_cs_params(job, &cs_params);
3134 prepare_job_instrs(&cs_params, &instrs);
3135 copy_instrs_to_ringbuf(queue, job, &instrs);
3136
3137 job->ringbuf.end = job->ringbuf.start + (instrs.count * sizeof(u64));
3138
3139 panthor_job_get(&job->base);
3140 spin_lock(&queue->fence_ctx.lock);
3141 list_add_tail(&job->node, &queue->fence_ctx.in_flight_jobs);
3142 spin_unlock(&queue->fence_ctx.lock);
3143
3144 /* Make sure the ring buffer is updated before the INSERT
3145 * register.
3146 */
3147 wmb();
3148
3149 queue->iface.input->extract = queue->iface.output->extract;
3150 queue->iface.input->insert = job->ringbuf.end;
3151
3152 if (group->csg_id < 0) {
3153 /* If the queue is blocked, we want to keep the timeout running, so we
3154 * can detect unbounded waits and kill the group when that happens.
3155 * Otherwise, we suspend the timeout so the time we spend waiting for
3156 * a CSG slot is not counted.
3157 */
3158 if (!(group->blocked_queues & BIT(job->queue_idx)) &&
3159 !queue->timeout_suspended) {
3160 queue->remaining_time = drm_sched_suspend_timeout(&queue->scheduler);
3161 queue->timeout_suspended = true;
3162 }
3163
3164 group_schedule_locked(group, BIT(job->queue_idx));
3165 } else {
3166 gpu_write(ptdev, CSF_DOORBELL(queue->doorbell_id), 1);
3167 if (!sched->pm.has_ref &&
3168 !(group->blocked_queues & BIT(job->queue_idx))) {
3169 pm_runtime_get(ptdev->base.dev);
3170 sched->pm.has_ref = true;
3171 }
3172 panthor_devfreq_record_busy(sched->ptdev);
3173 }
3174
3175 /* Update the last fence. */
3176 dma_fence_put(queue->fence_ctx.last_fence);
3177 queue->fence_ctx.last_fence = dma_fence_get(job->done_fence);
3178
3179 done_fence = dma_fence_get(job->done_fence);
3180
3181 out_unlock:
3182 mutex_unlock(&sched->lock);
3183 pm_runtime_mark_last_busy(ptdev->base.dev);
3184 pm_runtime_put_autosuspend(ptdev->base.dev);
3185
3186 return done_fence;
3187 }
3188
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki