From: "Onur Özkan" <work@onurozkan.dev>
To: rust-for-linux@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, dakr@kernel.org,
aliceryhl@google.com, daniel.almeida@collabora.com,
airlied@gmail.com, simona@ffwll.ch,
dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API
Date: Fri, 3 Apr 2026 15:36:53 +0300
Message-ID: <20260403123654.155249-1-work@onurozkan.dev>
In-Reply-To: <20260313091646.16938-1-work@onurozkan.dev>
> This series adds GPU reset handling support for Tyr in a new module
> drivers/gpu/drm/tyr/reset.rs which encapsulates the low-level reset
> controller internals and exposes a ResetHandle API to the driver.
>
> The reset module owns reset state, queueing and execution ordering
> through OrderedQueue and handles duplicate/concurrent reset requests
> with a pending flag.
>
> Apart from the reset module, the first 3 patches:
>
> - Fixes a potential reset-complete stale state bug by clearing completed
> state before doing soft reset.
> - Adds Work::disable_sync() (wrapper of bindings::disable_work_sync).
> - Adds OrderedQueue support.
>
> Runtime tested on hardware by Deborah Brouwer (see [1]) and myself.
>
> [1]: https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/63#note_3364131
>
> Link: https://gitlab.freedesktop.org/panfrost/linux/-/issues/28
> ---
>
> Onur Özkan (4):
> drm/tyr: clear reset IRQ before soft reset
> rust: add Work::disable_sync
> rust: add ordered workqueue wrapper
> drm/tyr: add GPU reset handling
>
> drivers/gpu/drm/tyr/driver.rs | 38 +++----
> drivers/gpu/drm/tyr/reset.rs | 180 ++++++++++++++++++++++++++++++++++
> drivers/gpu/drm/tyr/tyr.rs | 1 +
> rust/helpers/workqueue.c | 6 ++
> rust/kernel/workqueue.rs | 62 ++++++++++++
> 5 files changed, 260 insertions(+), 27 deletions(-)
> create mode 100644 drivers/gpu/drm/tyr/reset.rs
>
>
> base-commit: 0ccc0dac94bf2f5c6eb3e9e7f1014cd9dddf009f
> --
> 2.51.2
>
Hi all,
Here is the current status of this work. I have two blockers that prevent me from moving forward.
1- GPU unplug API
On the existing C side, reset failure handling eventually needs to unplug the
device, and that path is part of the broader reset flow in:
- srctree/drivers/gpu/drm/panthor/panthor_device.c
The unplug API is tracked in [1] and, as far as I understand, it is still a work in progress. For Tyr,
I currently keep this as a placeholder (todo!("unplug the GPU")) in the reset path,
because I do not want to introduce temporary or partial unplug handling in this series
before the unplug design is settled.
[1]: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/29
2- Design decisions for reset handling
The second blocker is the design of how a Resettable implementer (Resettable being a generic trait
with pre_reset/post_reset hooks) should stop admitting new work, drain in-flight operations and
recover after reset.
My current understanding is that the cleanest approach is to keep reset.rs responsible only for
reset orchestration:
- schedule reset work
- call pre_reset() hooks
- perform the hardware reset
- call post_reset() hooks
- propagate failure.
Then, each Resettable implementer should own its local recovery logic.
This is also how the existing C implementation is structured. The reset worker is centralized, but
recovery is implemented by the participating subsystems:
- srctree/drivers/gpu/drm/panthor/panthor_sched.c
- srctree/drivers/gpu/drm/panthor/panthor_fw.c
- srctree/drivers/gpu/drm/panthor/panthor_mmu.c
More specifically, the existing C side has hooks such as:
- panthor_sched_pre_reset() / panthor_sched_post_reset()
- panthor_fw_pre_reset() / panthor_fw_post_reset()
- panthor_mmu_pre_reset() / panthor_mmu_post_reset()
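To make the orchestration steps listed earlier more concrete, here is a minimal sketch in plain std Rust. It is not the code from this series and does not use the kernel crate: the trait shape, ResetError, run_reset() and the Recorder helper are all assumptions made up for illustration. It only shows the ordering (pre hooks, hardware reset, post hooks) and that the first failure is returned to the caller, which is where the unplug decision would eventually live.

```rust
use std::cell::RefCell;

// Hypothetical error type; the real series may report failures differently.
#[derive(Debug, PartialEq)]
enum ResetError {
    Hardware,
    Recovery(&'static str),
}

// Hypothetical shape of the hook trait (assumption, not the Tyr definition).
trait Resettable {
    fn pre_reset(&mut self);
    fn post_reset(&mut self) -> Result<(), ResetError>;
}

// One reset cycle: pre hooks, then the hardware reset, then post hooks.
// The first failure is propagated; what to do about it is left to the caller.
fn run_reset(
    parts: &mut [&mut dyn Resettable],
    hw_reset: impl FnOnce() -> Result<(), ResetError>,
) -> Result<(), ResetError> {
    for part in parts.iter_mut() {
        part.pre_reset();
    }
    hw_reset()?;
    for part in parts.iter_mut() {
        part.post_reset()?;
    }
    Ok(())
}

// Illustration-only participant that records the order in which it is called.
struct Recorder<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
    fail_post: bool,
}

impl Resettable for Recorder<'_> {
    fn pre_reset(&mut self) {
        self.log.borrow_mut().push(format!("{}: pre", self.name));
    }

    fn post_reset(&mut self) -> Result<(), ResetError> {
        self.log.borrow_mut().push(format!("{}: post", self.name));
        if self.fail_post {
            Err(ResetError::Recovery(self.name))
        } else {
            Ok(())
        }
    }
}
```

The orchestrator here knows nothing about what each participant does in its hooks, which is the point of the split. The post hooks run in registration order only to keep the sketch short; the ordering a real driver needs is a separate question.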
The reason I am leaning in the same direction for Tyr is that "stop new work", "drain" and "resume"
are not generic operations. They depend on the implementer.
Because of that, I think reset.rs should not have a global guard/checking API for all of this.
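As a rough illustration of implementer-owned recovery, the sketch below shows one hypothetical participant that gates and drains its own work inside its hooks. It is plain std Rust and single-threaded; JobQueue, SubmitError and the simplified trait are invented for this example and are not part of Tyr or the kernel crate. A real implementer would need proper synchronization and would decide for itself what "drain" means.

```rust
// Simplified stand-in for the hook trait discussed above (assumption).
trait Resettable {
    fn pre_reset(&mut self);
    fn post_reset(&mut self);
}

#[derive(Debug, PartialEq)]
enum SubmitError {
    ResetInProgress,
}

// Hypothetical job queue that owns its own admission and draining policy.
struct JobQueue {
    accepting: bool,
    in_flight: Vec<u32>,
    cancelled: Vec<u32>,
}

impl JobQueue {
    fn new() -> Self {
        JobQueue {
            accepting: true,
            in_flight: Vec::new(),
            cancelled: Vec::new(),
        }
    }

    // New work is refused while a reset is in progress.
    fn submit(&mut self, job: u32) -> Result<(), SubmitError> {
        if !self.accepting {
            return Err(SubmitError::ResetInProgress);
        }
        self.in_flight.push(job);
        Ok(())
    }
}

impl Resettable for JobQueue {
    fn pre_reset(&mut self) {
        // Stop admitting new work first, then drain what is already queued.
        // Here "drain" means moving jobs to a cancelled list; another
        // implementer could wait for completion or requeue instead.
        self.accepting = false;
        self.cancelled.append(&mut self.in_flight);
    }

    fn post_reset(&mut self) {
        // Resume admitting work once the hardware is usable again.
        self.accepting = true;
    }
}
```

The gating lives entirely in the implementer, so reset.rs would not need a global guard to express it. Another participant could implement the same two hooks with a completely different policy.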
Comments and suggestions are very welcome.
Regards,
Onur
Thread overview: 17+ messages
2026-03-13 9:16 [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Onur Özkan
2026-03-13 9:16 ` [PATCH v1 RESEND 1/4] drm/tyr: clear reset IRQ before soft reset Onur Özkan
2026-03-19 10:47 ` Boris Brezillon
2026-03-13 9:16 ` [PATCH v1 RESEND 2/4] rust: add Work::disable_sync Onur Özkan
2026-03-13 12:00 ` Alice Ryhl
2026-03-15 10:45 ` Onur Özkan
2026-03-13 9:16 ` [PATCH v1 RESEND 3/4] rust: add ordered workqueue wrapper Onur Özkan
2026-03-13 9:16 ` [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling Onur Özkan
2026-03-13 14:56 ` Daniel Almeida
2026-03-15 10:44 ` Onur Özkan
2026-03-19 11:08 ` Boris Brezillon
2026-03-19 12:51 ` Onur Özkan
2026-04-03 15:01 ` Daniel Almeida
2026-03-13 9:52 ` [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Alice Ryhl
2026-03-13 11:12 ` Onur Özkan
2026-03-13 11:26 ` Alice Ryhl
2026-04-03 12:36 ` Onur Özkan [this message]