The Linux Kernel Mailing List
From: "Adrián Larumbe" <adrian.larumbe@collabora.com>
To: Steven Price <steven.price@arm.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>,
	 Rob Herring <robh@kernel.org>,
	Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
	 Maxime Ripard <mripard@kernel.org>,
	Thomas Zimmermann <tzimmermann@suse.de>,
	 David Airlie <airlied@gmail.com>,
	Simona Vetter <simona@ffwll.ch>,
	 Faith Ekstrand <faith.ekstrand@collabora.com>,
	"Marty E. Plummer" <hanetzer@startmail.com>,
	 Tomeu Vizoso <tomeu@tomeuvizoso.net>,
	Eric Anholt <eric@anholt.net>,
	 Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>,
	Robin Murphy <robin.murphy@arm.com>,
	 dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	 Collabora Kernel Team <kernel@collabora.com>,
	Neil Armstrong <neil.armstrong@linaro.org>
Subject: Re: [PATCH v2 6/7] drm/panfrost: Fix PM usage_count mishandling
Date: Tue, 16 Jun 2026 20:48:19 +0100
Message-ID: <ajGicH-AkZlDAtEd@sobremesa>
In-Reply-To: <57571cae-dac7-4550-a634-c2889a961c7b@arm.com>

On 05.06.2026 11:48, Steven Price wrote:
> On 04/06/2026 18:35, Adrián Larumbe wrote:
> > During device probe(), failure to do a PM get() will leave the usage_count
> > set to 0, which is the value assigned at device creation time. That means
> > when the autosuspend delay expires, runtime suspend callback won't be
> > invoked, so the device will remain powered on forever.
> > 
> > On top of that, failure to call PM put() during device unplug means
> > Panfrost device's PM usage_count increases monotonically for every new
> > module reload.
> > 
> > The combined outcome of both of the above was that devfreq OPP transition
> > notifications would be printed all the time, even when no jobs are being
> > submitted. This quickly fills the kernel ring buffer with junk.
> > 
> > Even direr than that was the fact MMU interrupts are only enabled when
> > the device is reset, so after device probe() the very first job targeting
> > the tiler heap BO would always time out, because the driver's PM runtime
> > resume callback would not be invoked.
> > 
> > Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> > Fixes: 635430797d3f ("drm/panfrost: Rework runtime PM initialization")
> > Fixes: 876b15d2c88d ("drm/panfrost: Fix module unload")
> > ---
> >  drivers/gpu/drm/panfrost/panfrost_drv.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> > index 2d4b6aa95c66..545fbf2c8d0c 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> > @@ -989,6 +989,7 @@ static int panfrost_probe(struct platform_device *pdev)
> >  	pm_runtime_set_active(pfdev->base.dev);
> >  	pm_runtime_mark_last_busy(pfdev->base.dev);
> >  	pm_runtime_enable(pfdev->base.dev);
> > +	pm_runtime_get_noresume(pfdev->base.dev);
> >  	pm_runtime_set_autosuspend_delay(pfdev->base.dev, 50); /* ~3 frames */
> >  	pm_runtime_use_autosuspend(pfdev->base.dev);
> >  
> > @@ -1000,10 +1001,12 @@ static int panfrost_probe(struct platform_device *pdev)
> >  	if (err < 0)
> >  		goto err_out1;
> >  
> > +	pm_runtime_put_autosuspend(pfdev->base.dev);
> >  
> >  	return 0;
> >  
> >  err_out1:
> > +	pm_runtime_put_noidle(pfdev->base.dev);
> >  	pm_runtime_disable(pfdev->base.dev);
> >  	panfrost_device_fini(pfdev);
> 
> Sashiko is concerned that dropping the usage count before
> pm_runtime_disable() could cause things to turn off too early, and I
> have to agree it sounds like it could be a problem:
> 
> Sashiko wrote:
> > Does dropping the usage count before pm_runtime_disable() create a race
> > condition where the suspend callback can run and disable clocks before
> > hardware shutdown?
> > Because the usage count is dropped early, a concurrent PM event could trigger
> > the suspend callback, disabling clocks. Then, panfrost_device_fini() calls
> > panfrost_gpu_fini() which writes to MMIO registers. Could writing to
> > unclocked registers on ARM SoCs cause fatal bus errors or panics?

I think this could be an issue if the device were already registered and someone
could drive a PM resume followed by a suspend through an ioctl, but because this
is an error path and the device was never made available, I can't see how this
could happen.

It might be possible if the Panfrost device had any child devices: when one of them
did a put autosuspend, the refcount change would be propagated back to Panfrost and
could then trigger the scenario described by Sashiko.

However, I've just realised it's fine to call pm_runtime_put_noidle() even after
pm_runtime_disable(). It seems the latter only prevents further suspends or
resumes of the device, while we remain in control of the refcount, so moving
pm_runtime_put_noidle() to right after panfrost_device_fini() should be fine.
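
To make the ordering argument concrete, here is a toy userspace model of the two
error-path orderings. It is a sketch only, not kernel code: every toy_* name is
hypothetical, the struct merely mirrors the usage count and disable depth
discussed above, and the suspend attempts are injected by hand to stand in for
the concurrent PM event Sashiko describes (which, as said, I doubt can really
occur on this path).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the runtime PM state discussed above. Not kernel API. */
struct toy_pm {
	int usage_count;	/* stands in for the PM usage_count */
	int disable_depth;	/* > 0 means runtime PM is disabled */
	bool powered;		/* true while the device is clocked */
};

/* Models pm_runtime_put_noidle(): drop a reference, never suspend. */
static void toy_put_noidle(struct toy_pm *pm)
{
	if (pm->usage_count > 0)
		pm->usage_count--;
}

/* Models pm_runtime_disable(): block further suspends and resumes. */
static void toy_disable(struct toy_pm *pm)
{
	pm->disable_depth++;
}

/* Models a concurrent PM event that tries to suspend the device. */
static bool toy_try_suspend(struct toy_pm *pm)
{
	if (pm->disable_depth > 0 || pm->usage_count > 0 || !pm->powered)
		return false;
	pm->powered = false;
	return true;
}

/* Models panfrost_device_fini() touching MMIO: true if still clocked. */
static bool toy_device_fini(const struct toy_pm *pm)
{
	return pm->powered;
}

/* Proposed ordering: disable, fini, then put. Suspend tried at each step. */
static bool toy_error_path_put_after_fini(struct toy_pm *pm)
{
	bool fini_ok;

	toy_try_suspend(pm);
	toy_disable(pm);
	toy_try_suspend(pm);
	fini_ok = toy_device_fini(pm);
	toy_put_noidle(pm);
	toy_try_suspend(pm);
	return fini_ok;
}

/* Ordering in the patch as posted: put, disable, then fini. */
static bool toy_error_path_put_first(struct toy_pm *pm)
{
	toy_put_noidle(pm);
	toy_try_suspend(pm);	/* the window Sashiko is worried about */
	toy_disable(pm);
	return toy_device_fini(pm);
}
```

Starting from usage_count = 1 and powered = true, the first helper reports that
fini ran with the device still clocked and leaves the count at 0, while the
second only fails because a suspend attempt was forced into the gap between the
put and the disable.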

> Sashiko also suggests we might have some other (partially pre-existing)
> issues here.
> 
> https://sashiko.dev/#/patchset/20260604-claude-fixes-v2-0-57c6bd4c1655%40collabora.com

I'll look into the pre-existing issues and write fixes for the next revision of the series.

> Thanks,
> Steve
> 
> >  	pm_runtime_set_suspended(pfdev->base.dev);
> > @@ -1018,8 +1021,9 @@ static void panfrost_remove(struct platform_device *pdev)
> >  	drm_dev_unregister(&pfdev->base);
> >  
> >  	pm_runtime_get_sync(pfdev->base.dev);
> > -	pm_runtime_disable(pfdev->base.dev);
> >  	panfrost_device_fini(pfdev);
> > +	pm_runtime_put_noidle(pfdev->base.dev);
> > +	pm_runtime_disable(pfdev->base.dev);
> >  	pm_runtime_set_suspended(pfdev->base.dev);
> >  }
> >  
> > 


Adrian Larumbe
