From: Boris Brezillon <boris.brezillon@collabora.com>
To: Ketil Johnsen <ketil.johnsen@arm.com>
Cc: Steven Price <steven.price@arm.com>,
Liviu Dudau <liviu.dudau@arm.com>,
Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
Maxime Ripard <mripard@kernel.org>,
Thomas Zimmermann <tzimmermann@suse.de>,
David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
Heiko Stuebner <heiko@sntech.de>,
dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] drm/panthor: Fix UAF race between device unplug and FW event processing
Date: Wed, 22 Oct 2025 14:37:51 +0200 [thread overview]
Message-ID: <20251022143751.769c1f23@fedora> (raw)
In-Reply-To: <20251022103014.1082629-1-ketil.johnsen@arm.com>
On Wed, 22 Oct 2025 12:30:13 +0200
Ketil Johnsen <ketil.johnsen@arm.com> wrote:
> The function panthor_fw_unplug() will free the FW memory sections.
> The problem is that there could still be pending FW events which are yet
> not handled at this point. process_fw_events_work() can in this case try
> to access said freed memory.
>
> This fix introduces a destroyed state for the panthor_scheduler object,
> and we check for this before processing FW events.
>
> Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com>
> Fixes: de85488138247 ("drm/panthor: Add the scheduler logical block")
> ---
> v2:
> - Followed Boris's advice and handle the race purely within the
> scheduler block (by adding a destroyed state)
> ---
> drivers/gpu/drm/panthor/panthor_sched.c | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index 0cc9055f4ee52..4996f987b8183 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -315,6 +315,13 @@ struct panthor_scheduler {
> */
> struct list_head stopped_groups;
> } reset;
> +
> + /**
> + * @destroyed: Scheduler object is (being) destroyed
> + *
> + * Normal scheduler operations should no longer take place.
> + */
> + bool destroyed;
Do we really need a new field for that? Can't we just reset
panthor_device::scheduler to NULL early enough in the unplug path?
I guess it's not that simple if we have works going back to ptdev
and then dereferencing ptdev->scheduler, but I think it's also
fundamentally broken to have scheduler works active after the
scheduler teardown has started, so we might want to add some more
checks in the work callbacks too.
> };
>
> /**
> @@ -1765,7 +1772,10 @@ static void process_fw_events_work(struct work_struct *work)
> u32 events = atomic_xchg(&sched->fw_events, 0);
> struct panthor_device *ptdev = sched->ptdev;
>
> - mutex_lock(&sched->lock);
> + guard(mutex)(&sched->lock);
> +
> + if (sched->destroyed)
> + return;
>
> if (events & JOB_INT_GLOBAL_IF) {
> sched_process_global_irq_locked(ptdev);
> @@ -1778,8 +1788,6 @@ static void process_fw_events_work(struct work_struct *work)
> sched_process_csg_irq_locked(ptdev, csg_id);
> events &= ~BIT(csg_id);
> }
> -
> - mutex_unlock(&sched->lock);
> }
>
> /**
> @@ -3882,6 +3890,7 @@ void panthor_sched_unplug(struct panthor_device *ptdev)
> cancel_delayed_work_sync(&sched->tick_work);
>
> mutex_lock(&sched->lock);
> + sched->destroyed = true;
> if (sched->pm.has_ref) {
> pm_runtime_put(ptdev->base.dev);
> sched->pm.has_ref = false;
Hm, I'd really like to see a cancel_work_sync(&sched->fw_events_work)
rather than letting the work execute after we've started tearing down
the scheduler object.
If you follow my suggestion to reset the ptdev->scheduler field, I
guess something like that would do:
void panthor_sched_unplug(struct panthor_device *ptdev)
{
struct panthor_scheduler *sched = ptdev->scheduler;
/* We want the schedu */
WRITE_ONCE(*ptdev->scheduler, NULL);
cancel_work_sync(&sched->fw_events_work);
cancel_delayed_work_sync(&sched->tick_work);
mutex_lock(&sched->lock);
if (sched->pm.has_ref) {
pm_runtime_put(ptdev->base.dev);
sched->pm.has_ref = false;
}
mutex_unlock(&sched->lock);
}
and
void panthor_sched_report_fw_events(struct panthor_device *ptdev, u32 events) {
struct panthor_scheduler *sched = READ_ONCE(*ptdev->scheduler);
/* Scheduler is not initialized, or it's gone. */
if (!sched)
return;
atomic_or(events, &sched->fw_events);
sched_queue_work(sched, fw_events);
}
sched_queue_[delayed_]work() could also be automated to issue a drm_WARN_ON()
when it's called and ptdev->scheduler = NULL.
next prev parent reply other threads:[~2025-10-22 12:38 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-22 10:30 [PATCH v2] drm/panthor: Fix UAF race between device unplug and FW event processing Ketil Johnsen
2025-10-22 10:53 ` Steven Price
2025-10-22 11:02 ` Liviu Dudau
2025-10-22 12:37 ` Boris Brezillon [this message]
2025-10-22 13:36 ` Steven Price
2025-10-22 14:00 ` Boris Brezillon
2025-10-22 14:28 ` Steven Price
2025-10-22 15:32 ` Boris Brezillon
2025-10-22 15:36 ` Steven Price
2025-10-23 14:22 ` Ketil Johnsen
2025-10-24 6:33 ` Boris Brezillon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251022143751.769c1f23@fedora \
--to=boris.brezillon@collabora.com \
--cc=airlied@gmail.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=heiko@sntech.de \
--cc=ketil.johnsen@arm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=liviu.dudau@arm.com \
--cc=maarten.lankhorst@linux.intel.com \
--cc=mripard@kernel.org \
--cc=simona@ffwll.ch \
--cc=steven.price@arm.com \
--cc=tzimmermann@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.