public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API
@ 2026-03-13  9:16 Onur Özkan
  2026-03-13  9:16 ` [PATCH v1 RESEND 1/4] drm/tyr: clear reset IRQ before soft reset Onur Özkan
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Onur Özkan @ 2026-03-13  9:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: dakr, aliceryhl, daniel.almeida, airlied, simona, dri-devel,
	rust-for-linux, Onur Özkan

This series adds GPU reset handling support for Tyr in a new module
drivers/gpu/drm/tyr/reset.rs which encapsulates the low-level reset
controller internals and exposes a ResetHandle API to the driver.

The reset module owns reset state, queueing, and execution ordering
(through OrderedQueue), and it handles duplicate or concurrent reset
requests with a pending flag.

Apart from the reset module, the first 3 patches:

- Fix a potential stale reset-complete state bug by clearing the
  completed status before doing a soft reset.
- Add Work::disable_sync() (a wrapper around bindings::disable_work_sync()).
- Add OrderedQueue support.

Runtime tested on hardware by Deborah Brouwer (see [1]) and myself.

[1]: https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/63#note_3364131

Link: https://gitlab.freedesktop.org/panfrost/linux/-/issues/28
---

Onur Özkan (4):
  drm/tyr: clear reset IRQ before soft reset
  rust: add Work::disable_sync
  rust: add ordered workqueue wrapper
  drm/tyr: add GPU reset handling

 drivers/gpu/drm/tyr/driver.rs |  38 +++----
 drivers/gpu/drm/tyr/reset.rs  | 180 ++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/tyr/tyr.rs    |   1 +
 rust/helpers/workqueue.c      |   6 ++
 rust/kernel/workqueue.rs      |  62 ++++++++++++
 5 files changed, 260 insertions(+), 27 deletions(-)
 create mode 100644 drivers/gpu/drm/tyr/reset.rs


base-commit: 0ccc0dac94bf2f5c6eb3e9e7f1014cd9dddf009f
-- 
2.51.2


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v1 RESEND 1/4] drm/tyr: clear reset IRQ before soft reset
  2026-03-13  9:16 [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Onur Özkan
@ 2026-03-13  9:16 ` Onur Özkan
  2026-03-19 10:47   ` Boris Brezillon
  2026-03-13  9:16 ` [PATCH v1 RESEND 2/4] rust: add Work::disable_sync Onur Özkan
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Onur Özkan @ 2026-03-13  9:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: dakr, aliceryhl, daniel.almeida, airlied, simona, dri-devel,
	rust-for-linux, Onur Özkan, Deborah Brouwer

Clear RESET_COMPLETED before writing GPU_CMD_SOFT_RESET.

The same sequence is used by panfrost_gpu_soft_reset() in
drivers/gpu/drm/panfrost/panfrost_gpu.c; it avoids seeing a stale
reset-complete status left over from a previous reset.

Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com>
Signed-off-by: Onur Özkan <work@onurozkan.dev>
---
 drivers/gpu/drm/tyr/driver.rs | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/tyr/driver.rs b/drivers/gpu/drm/tyr/driver.rs
index 69eff2a9e116..f7951804e4e0 100644
--- a/drivers/gpu/drm/tyr/driver.rs
+++ b/drivers/gpu/drm/tyr/driver.rs
@@ -91,6 +91,8 @@ unsafe impl Send for TyrDrmDeviceData {}
 unsafe impl Sync for TyrDrmDeviceData {}
 
 fn issue_soft_reset(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result {
+    // Clear any stale reset-complete IRQ state before issuing a new soft reset.
+    regs::GPU_IRQ_CLEAR.write(dev, iomem, regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED)?;
     regs::GPU_CMD.write(dev, iomem, regs::GPU_CMD_SOFT_RESET)?;
 
     poll::read_poll_timeout(
-- 
2.51.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v1 RESEND 2/4] rust: add Work::disable_sync
  2026-03-13  9:16 [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Onur Özkan
  2026-03-13  9:16 ` [PATCH v1 RESEND 1/4] drm/tyr: clear reset IRQ before soft reset Onur Özkan
@ 2026-03-13  9:16 ` Onur Özkan
  2026-03-13 12:00   ` Alice Ryhl
  2026-03-13  9:16 ` [PATCH v1 RESEND 3/4] rust: add ordered workqueue wrapper Onur Özkan
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Onur Özkan @ 2026-03-13  9:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: dakr, aliceryhl, daniel.almeida, airlied, simona, dri-devel,
	rust-for-linux, Onur Özkan, Deborah Brouwer

Add Work::disable_sync() as a wrapper around disable_work_sync().

Drivers can use this during teardown to stop new queueing and wait for
queued or running work to finish before dropping related resources.
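The shutdown ordering this enables (refuse new submissions, then block
until in-flight work finishes) can be modeled in plain userspace Rust.
`Worker` and its channel below are illustrative stand-ins, not kernel API:

```rust
use std::sync::mpsc;
use std::thread;

/// Userspace model of the teardown pattern: once shut down, no new work
/// can be submitted and we wait for everything already queued to finish.
struct Worker {
    tx: Option<mpsc::Sender<u32>>,
    handle: Option<thread::JoinHandle<u32>>,
}

impl Worker {
    fn new() -> Self {
        let (tx, rx) = mpsc::channel::<u32>();
        // The worker drains the queue in submission order, like a work
        // item being executed by a workqueue.
        let handle = thread::spawn(move || rx.iter().sum::<u32>());
        Self {
            tx: Some(tx),
            handle: Some(handle),
        }
    }

    fn submit(&self, v: u32) {
        self.tx.as_ref().unwrap().send(v).unwrap();
    }

    /// Analogue of `disable_sync()`: dropping the sender stops new
    /// submissions, and `join()` blocks until queued work is done.
    fn shutdown(&mut self) -> u32 {
        drop(self.tx.take());
        self.handle.take().unwrap().join().unwrap()
    }
}
```

Only after shutdown() returns is it safe to tear down resources the work
relied on, which is exactly when Tyr drops clocks and regulators.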

Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com>
Signed-off-by: Onur Özkan <work@onurozkan.dev>
---
 rust/kernel/workqueue.rs | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/rust/kernel/workqueue.rs b/rust/kernel/workqueue.rs
index 706e833e9702..6acc7b5ba31c 100644
--- a/rust/kernel/workqueue.rs
+++ b/rust/kernel/workqueue.rs
@@ -530,6 +530,21 @@ pub unsafe fn raw_get(ptr: *const Self) -> *mut bindings::work_struct {
         // the compiler does not complain that the `work` field is unused.
         unsafe { Opaque::cast_into(core::ptr::addr_of!((*ptr).work)) }
     }
+
+    /// Disables this work item and waits for queued/running executions to finish.
+    ///
+    /// # Safety
+    ///
+    /// Must be called from a sleepable context if the work was last queued on a non-BH
+    /// workqueue.
+    #[inline]
+    pub unsafe fn disable_sync(&self) {
+        let ptr: *const Self = self;
+        // SAFETY: `self` points to a valid initialized work.
+        let raw_work = unsafe { Self::raw_get(ptr) };
+        // SAFETY: `raw_work` is a valid embedded `work_struct`.
+        unsafe { bindings::disable_work_sync(raw_work) };
+    }
 }
 
 /// Declares that a type contains a [`Work<T, ID>`].
-- 
2.51.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v1 RESEND 3/4] rust: add ordered workqueue wrapper
  2026-03-13  9:16 [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Onur Özkan
  2026-03-13  9:16 ` [PATCH v1 RESEND 1/4] drm/tyr: clear reset IRQ before soft reset Onur Özkan
  2026-03-13  9:16 ` [PATCH v1 RESEND 2/4] rust: add Work::disable_sync Onur Özkan
@ 2026-03-13  9:16 ` Onur Özkan
  2026-03-13  9:16 ` [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling Onur Özkan
  2026-03-13  9:52 ` [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Alice Ryhl
  4 siblings, 0 replies; 15+ messages in thread
From: Onur Özkan @ 2026-03-13  9:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: dakr, aliceryhl, daniel.almeida, airlied, simona, dri-devel,
	rust-for-linux, Onur Özkan, Deborah Brouwer

Add an owned OrderedQueue wrapper for alloc_ordered_workqueue() and
destroy_workqueue().

This gives Rust drivers a simple way to create and own an ordered
workqueue with automatic cleanup in Drop.
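The ownership pattern (allocate in new(), release exactly once in Drop)
can be sketched in userspace Rust. `OwnedQueue` and the `destroyed` flag
below are illustrative stand-ins for the real wrapper, which pairs
alloc_ordered_workqueue() with destroy_workqueue():

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Stand-in for the owned-resource pattern: the wrapper acquires the
/// underlying resource on construction and releases it exactly once
/// when dropped.
struct OwnedQueue {
    // Stand-in for the `NonNull<workqueue_struct>` the real type holds.
    destroyed: Arc<AtomicBool>,
}

impl OwnedQueue {
    fn new(destroyed: Arc<AtomicBool>) -> Self {
        Self { destroyed }
    }
}

impl Drop for OwnedQueue {
    fn drop(&mut self) {
        // Stand-in for `destroy_workqueue()`, which drains and frees
        // the queue.
        self.destroyed.store(true, Ordering::Release);
    }
}
```

Because cleanup lives in Drop, a driver cannot leak the workqueue by
forgetting an explicit destroy call on an error path.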

Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com>
Signed-off-by: Onur Özkan <work@onurozkan.dev>
---
 rust/helpers/workqueue.c |  6 +++++
 rust/kernel/workqueue.rs | 47 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 53 insertions(+)

diff --git a/rust/helpers/workqueue.c b/rust/helpers/workqueue.c
index ce1c3a5b2150..7cd3b000a5b6 100644
--- a/rust/helpers/workqueue.c
+++ b/rust/helpers/workqueue.c
@@ -14,3 +14,9 @@ __rust_helper void rust_helper_init_work_with_key(struct work_struct *work,
 	INIT_LIST_HEAD(&work->entry);
 	work->func = func;
 }
+
+__rust_helper struct workqueue_struct *
+rust_helper_alloc_ordered_workqueue(const char *name, unsigned int flags)
+{
+	return alloc_ordered_workqueue("%s", flags, name);
+}
diff --git a/rust/kernel/workqueue.rs b/rust/kernel/workqueue.rs
index 6acc7b5ba31c..d5aa61a5ef93 100644
--- a/rust/kernel/workqueue.rs
+++ b/rust/kernel/workqueue.rs
@@ -195,6 +195,7 @@
     types::Opaque,
 };
 use core::marker::PhantomData;
+use core::ptr::NonNull;
 
 /// Creates a [`Work`] initialiser with the given name and a newly-created lock class.
 #[macro_export]
@@ -346,6 +347,52 @@ pub fn try_spawn<T: 'static + Send + FnOnce()>(
     }
 }
 
+/// A kernel work queue that allocates and owns an ordered `workqueue_struct`.
+///
+/// Unlike [`Queue`], [`OrderedQueue`] takes ownership of the underlying C
+/// workqueue and automatically destroys it when dropped.
+pub struct OrderedQueue(NonNull<bindings::workqueue_struct>);
+
+// SAFETY: Workqueue objects are thread-safe to share and use concurrently.
+unsafe impl Send for OrderedQueue {}
+// SAFETY: Workqueue objects are thread-safe to share and use concurrently.
+unsafe impl Sync for OrderedQueue {}
+
+impl OrderedQueue {
+    /// Allocates an ordered workqueue.
+    ///
+    /// It is equivalent to C's `alloc_ordered_workqueue()`.
+    pub fn new(name: &'static CStr, flags: u32) -> Result<Self> {
+        // SAFETY: `name` is a `&'static CStr`, guaranteeing a valid, null-terminated C
+        // string pointer for the duration of this call.
+        let ptr = unsafe { bindings::alloc_ordered_workqueue(name.as_char_ptr(), flags) };
+        let ptr = NonNull::new(ptr).ok_or(ENOMEM)?;
+        Ok(Self(ptr))
+    }
+
+    /// Enqueues a work item.
+    ///
+    /// This may fail if the work item is already enqueued in a workqueue.
+    ///
+    /// The work item will be submitted using `WORK_CPU_UNBOUND`.
+    pub fn enqueue<W, const ID: u64>(&self, w: W) -> W::EnqueueOutput
+    where
+        W: RawWorkItem<ID> + Send + 'static,
+    {
+        // SAFETY: `self.0` is valid while `self` is alive.
+        unsafe { Queue::from_raw(self.0.as_ptr()) }.enqueue(w)
+    }
+}
+
+impl Drop for OrderedQueue {
+    fn drop(&mut self) {
+        // SAFETY:
+        // - Pointer comes from `alloc_ordered_workqueue()` and is owned by `self`.
+        // - `OrderedQueue` does not expose delayed scheduling API.
+        unsafe { bindings::destroy_workqueue(self.0.as_ptr()) };
+    }
+}
+
 /// A helper type used in [`try_spawn`].
 ///
 /// [`try_spawn`]: Queue::try_spawn
-- 
2.51.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling
  2026-03-13  9:16 [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Onur Özkan
                   ` (2 preceding siblings ...)
  2026-03-13  9:16 ` [PATCH v1 RESEND 3/4] rust: add ordered workqueue wrapper Onur Özkan
@ 2026-03-13  9:16 ` Onur Özkan
  2026-03-13 14:56   ` Daniel Almeida
  2026-03-19 11:08   ` Boris Brezillon
  2026-03-13  9:52 ` [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Alice Ryhl
  4 siblings, 2 replies; 15+ messages in thread
From: Onur Özkan @ 2026-03-13  9:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: dakr, aliceryhl, daniel.almeida, airlied, simona, dri-devel,
	rust-for-linux, Onur Özkan, Deborah Brouwer

Move Tyr reset logic into a new reset module and add async reset work.

This adds:
- ResetHandle with internal controller state
- a dedicated ordered reset workqueue
- a pending flag to avoid duplicate queued resets
- run_reset() as the shared synchronous reset helper

Probe now calls reset::run_reset() before normal init. Driver data now
holds a ResetHandle so that reset work is drained before clocks and
regulators are dropped.
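The pending-flag handshake can be sketched in plain userspace Rust.
`PendingFlag` below is an illustrative stand-in for the atomic used by
`ResetHandle::schedule()`, not the driver's actual type:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Only the caller that flips the flag from `false` to `true` gets to
/// enqueue the reset work; everyone else sees a reset already pending.
struct PendingFlag(AtomicBool);

impl PendingFlag {
    fn new() -> Self {
        Self(AtomicBool::new(false))
    }

    /// Mirrors the `cmpxchg(false, true, Relaxed)` in `schedule()`.
    fn try_acquire(&self) -> bool {
        self.0
            .compare_exchange(false, true, Ordering::Relaxed, Ordering::Relaxed)
            .is_ok()
    }

    /// Mirrors the `store(false, Release)` at the end of the reset work.
    fn release(&self) {
        self.0.store(false, Ordering::Release);
    }
}
```

The compare-exchange makes duplicate schedule() calls cheap no-ops while
a reset is queued or running, which is why at most one reset work item
can be outstanding at a time.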

Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com>
Signed-off-by: Onur Özkan <work@onurozkan.dev>
---
 drivers/gpu/drm/tyr/driver.rs |  40 +++-----
 drivers/gpu/drm/tyr/reset.rs  | 180 ++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/tyr/tyr.rs    |   1 +
 3 files changed, 192 insertions(+), 29 deletions(-)
 create mode 100644 drivers/gpu/drm/tyr/reset.rs

diff --git a/drivers/gpu/drm/tyr/driver.rs b/drivers/gpu/drm/tyr/driver.rs
index f7951804e4e0..c80238a21ff2 100644
--- a/drivers/gpu/drm/tyr/driver.rs
+++ b/drivers/gpu/drm/tyr/driver.rs
@@ -6,11 +6,8 @@
         OptionalClk, //
     },
     device::{
-        Bound,
-        Core,
-        Device, //
+        Core, //
     },
-    devres::Devres,
     dma::{
         Device as DmaDevice,
         DmaMask, //
@@ -22,10 +19,7 @@
         Registered,
         UnregisteredDevice, //
     },
-    io::poll,
-    new_mutex,
-    of,
-    platform,
+    new_mutex, of, platform,
     prelude::*,
     regulator,
     regulator::Regulator,
@@ -35,17 +29,15 @@
         Arc,
         Mutex, //
     },
-    time, //
 };
 
 use crate::{
     file::TyrDrmFileData,
     fw::Firmware,
     gem::BoData,
-    gpu,
     gpu::GpuInfo,
     mmu::Mmu,
-    regs, //
+    reset, //
 };
 
 pub(crate) type IoMem = kernel::io::mem::IoMem<SZ_2M>;
@@ -62,6 +54,11 @@ pub(crate) struct TyrPlatformDriverData {
 
 #[pin_data]
 pub(crate) struct TyrDrmDeviceData {
+    // `ResetHandle::drop()` drains queued/running work and this must happen
+    // before clocks/regulators are dropped. So keep this field before them to
+    // ensure the correct drop order.
+    pub(crate) reset: reset::ResetHandle,
+
     pub(crate) pdev: ARef<platform::Device>,
 
     pub(crate) fw: Arc<Firmware>,
@@ -90,22 +87,6 @@ unsafe impl Send for TyrDrmDeviceData {}
 // SAFETY: This will be removed in a future patch.
 unsafe impl Sync for TyrDrmDeviceData {}
 
-fn issue_soft_reset(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result {
-    // Clear any stale reset-complete IRQ state before issuing a new soft reset.
-    regs::GPU_IRQ_CLEAR.write(dev, iomem, regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED)?;
-    regs::GPU_CMD.write(dev, iomem, regs::GPU_CMD_SOFT_RESET)?;
-
-    poll::read_poll_timeout(
-        || regs::GPU_IRQ_RAWSTAT.read(dev, iomem),
-        |status| *status & regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED != 0,
-        time::Delta::from_millis(1),
-        time::Delta::from_millis(100),
-    )
-    .inspect_err(|_| dev_err!(dev, "GPU reset failed."))?;
-
-    Ok(())
-}
-
 kernel::of_device_table!(
     OF_TABLE,
     MODULE_OF_TABLE,
@@ -138,8 +119,7 @@ fn probe(
         let request = pdev.io_request_by_index(0).ok_or(ENODEV)?;
         let iomem = Arc::pin_init(request.iomap_sized::<SZ_2M>(), GFP_KERNEL)?;
 
-        issue_soft_reset(pdev.as_ref(), &iomem)?;
-        gpu::l2_power_on(pdev.as_ref(), &iomem)?;
+        reset::run_reset(pdev.as_ref(), &iomem)?;
 
         let gpu_info = GpuInfo::new(pdev.as_ref(), &iomem)?;
         gpu_info.log(pdev);
@@ -153,6 +133,7 @@ fn probe(
 
         let uninit_ddev = UnregisteredDevice::<TyrDrmDriver>::new(pdev.as_ref())?;
         let platform: ARef<platform::Device> = pdev.into();
+        let reset = reset::ResetHandle::new(platform.clone(), iomem.clone())?;
 
         let mmu = Mmu::new(pdev, iomem.as_arc_borrow(), &gpu_info)?;
 
@@ -178,6 +159,7 @@ fn probe(
                     _mali: mali_regulator,
                     _sram: sram_regulator,
                 }),
+                reset,
                 gpu_info,
         });
         let ddev = Registration::new_foreign_owned(uninit_ddev, pdev.as_ref(), data, 0)?;
diff --git a/drivers/gpu/drm/tyr/reset.rs b/drivers/gpu/drm/tyr/reset.rs
new file mode 100644
index 000000000000..29dfae98b0dd
--- /dev/null
+++ b/drivers/gpu/drm/tyr/reset.rs
@@ -0,0 +1,180 @@
+// SPDX-License-Identifier: GPL-2.0 or MIT
+
+//! Provides asynchronous reset handling for the Tyr DRM driver via
+//! [`ResetHandle`], which runs reset work on a dedicated ordered
+//! workqueue and avoids duplicate pending resets.
+
+use kernel::{
+    device::{
+        Bound,
+        Device, //
+    },
+    devres::Devres,
+    io::poll,
+    platform,
+    prelude::*,
+    sync::{
+        aref::ARef,
+        atomic::{
+            Acquire,
+            Atomic,
+            Relaxed,
+            Release, //
+        },
+        Arc,
+    },
+    time,
+    workqueue::{
+        self,
+        Work, //
+    },
+};
+
+use crate::{
+    driver::IoMem,
+    gpu,
+    regs, //
+};
+
+/// Manages asynchronous GPU reset handling and ensures only a single reset
+/// work is pending at a time.
+#[pin_data]
+struct Controller {
+    /// Platform device reference needed for reset operations and logging.
+    pdev: ARef<platform::Device>,
+    /// Mapped register space needed for reset operations.
+    iomem: Arc<Devres<IoMem>>,
+    /// Atomic flag for controlling the scheduling pending state.
+    pending: Atomic<bool>,
+    /// Dedicated ordered workqueue for reset operations.
+    wq: workqueue::OrderedQueue,
+    /// Work item backing async reset processing.
+    #[pin]
+    work: Work<Controller>,
+}
+
+kernel::impl_has_work! {
+    impl HasWork<Controller> for Controller { self.work }
+}
+
+impl workqueue::WorkItem for Controller {
+    type Pointer = Arc<Self>;
+
+    fn run(this: Arc<Self>) {
+        this.reset_work();
+    }
+}
+
+impl Controller {
+    /// Creates a [`Controller`] instance.
+    fn new(pdev: ARef<platform::Device>, iomem: Arc<Devres<IoMem>>) -> Result<Arc<Self>> {
+        let wq = workqueue::OrderedQueue::new(c"tyr-reset-wq", 0)?;
+
+        Arc::pin_init(
+            try_pin_init!(Self {
+                pdev,
+                iomem,
+                pending: Atomic::new(false),
+                wq,
+                work <- kernel::new_work!("tyr::reset"),
+            }),
+            GFP_KERNEL,
+        )
+    }
+
+    /// Processes one scheduled reset request.
+    ///
+    /// Panthor reference:
+    /// - drivers/gpu/drm/panthor/panthor_device.c::panthor_device_reset_work()
+    fn reset_work(self: &Arc<Self>) {
+        dev_info!(self.pdev.as_ref(), "GPU reset work is started.\n");
+
+        // SAFETY: `Controller` is part of driver-private data and only exists
+        // while the platform device is bound.
+        let pdev = unsafe { self.pdev.as_ref().as_bound() };
+        if let Err(e) = run_reset(pdev, &self.iomem) {
+            dev_err!(self.pdev.as_ref(), "GPU reset failed: {:?}\n", e);
+        } else {
+            dev_info!(self.pdev.as_ref(), "GPU reset work is done.\n");
+        }
+
+        self.pending.store(false, Release);
+    }
+}
+
+/// Reset handle that shuts down pending work gracefully on drop.
+pub(crate) struct ResetHandle(Arc<Controller>);
+
+impl ResetHandle {
+    /// Creates a [`ResetHandle`] instance.
+    pub(crate) fn new(pdev: ARef<platform::Device>, iomem: Arc<Devres<IoMem>>) -> Result<Self> {
+        Ok(Self(Controller::new(pdev, iomem)?))
+    }
+
+    /// Schedules reset work.
+    #[expect(dead_code)]
+    pub(crate) fn schedule(&self) {
+        // TODO: Similar to `panthor_device_schedule_reset()` in Panthor, add a
+        // power management check once Tyr supports it.
+
+        // Keep only one reset request running or queued. If one is already pending,
+        // we ignore new schedule requests.
+        if self.0.pending.cmpxchg(false, true, Relaxed).is_ok()
+            && self.0.wq.enqueue(self.0.clone()).is_err()
+        {
+            self.0.pending.store(false, Release);
+        }
+    }
+
+    /// Returns true if a reset is queued or in progress.
+    ///
+    /// Note that the state can change immediately after the return.
+    #[inline]
+    #[expect(dead_code)]
+    pub(crate) fn is_pending(&self) -> bool {
+        self.0.pending.load(Acquire)
+    }
+}
+
+impl Drop for ResetHandle {
+    fn drop(&mut self) {
+        // Drain queued/running work and block future queueing attempts for this
+        // work item before clocks/regulators are torn down.
+        // SAFETY: drop executes in a sleepable context.
+        unsafe { self.0.work.disable_sync() };
+    }
+}
+
+/// Issues a soft reset command and waits for reset-complete IRQ status.
+fn issue_soft_reset(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result {
+    // Clear any stale reset-complete IRQ state before issuing a new soft reset.
+    regs::GPU_IRQ_CLEAR.write(dev, iomem, regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED)?;
+    regs::GPU_CMD.write(dev, iomem, regs::GPU_CMD_SOFT_RESET)?;
+
+    poll::read_poll_timeout(
+        || regs::GPU_IRQ_RAWSTAT.read(dev, iomem),
+        |status| *status & regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED != 0,
+        time::Delta::from_millis(1),
+        time::Delta::from_millis(100),
+    )
+    .inspect_err(|_| dev_err!(dev, "GPU reset failed."))?;
+
+    Ok(())
+}
+
+/// Runs one synchronous GPU reset pass.
+///
+/// Its visibility is `pub(super)` only so the probe path can run an
+/// initial reset; it is not part of this module's public API.
+///
+/// On success, the GPU is left in a state suitable for reinitialization.
+///
+/// The reset sequence is as follows:
+/// 1. Trigger a GPU soft reset.
+/// 2. Wait for the reset-complete IRQ status.
+/// 3. Power L2 back on.
+pub(super) fn run_reset(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result {
+    issue_soft_reset(dev, iomem)?;
+    gpu::l2_power_on(dev, iomem)?;
+    Ok(())
+}
diff --git a/drivers/gpu/drm/tyr/tyr.rs b/drivers/gpu/drm/tyr/tyr.rs
index 18b0668bb217..d0349bc49f27 100644
--- a/drivers/gpu/drm/tyr/tyr.rs
+++ b/drivers/gpu/drm/tyr/tyr.rs
@@ -14,6 +14,7 @@
 mod gpu;
 mod mmu;
 mod regs;
+mod reset;
 mod slot;
 mod vm;
 
-- 
2.51.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API
  2026-03-13  9:16 [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Onur Özkan
                   ` (3 preceding siblings ...)
  2026-03-13  9:16 ` [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling Onur Özkan
@ 2026-03-13  9:52 ` Alice Ryhl
  2026-03-13 11:12   ` Onur Özkan
  4 siblings, 1 reply; 15+ messages in thread
From: Alice Ryhl @ 2026-03-13  9:52 UTC (permalink / raw)
  To: Onur Özkan
  Cc: linux-kernel, dakr, daniel.almeida, airlied, simona, dri-devel,
	rust-for-linux

On Fri, Mar 13, 2026 at 12:16:40PM +0300, Onur Özkan wrote:
> This series adds GPU reset handling support for Tyr in a new module
> drivers/gpu/drm/tyr/reset.rs which encapsulates the low-level reset
> controller internals and exposes a ResetHandle API to the driver.
> 
> The reset module owns reset state, queueing and execution ordering
> through OrderedQueue and handles duplicate/concurrent reset requests
> with a pending flag.
> 
> Apart from the reset module, the first 3 patches:
> 
> - Fixes a potential reset-complete stale state bug by clearing completed
>   state before doing soft reset.
> - Adds Work::disable_sync() (wrapper of bindings::disable_work_sync).
> - Adds OrderedQueue support.
> 
> Runtime tested on hardware by Deborah Brouwer (see [1]) and myself.
> 
> [1]: https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/63#note_3364131
> 
> Link: https://gitlab.freedesktop.org/panfrost/linux/-/issues/28
> ---
> 
> Onur Özkan (4):
>   drm/tyr: clear reset IRQ before soft reset
>   rust: add Work::disable_sync
>   rust: add ordered workqueue wrapper

I actually added ordered workqueue support here:
https://lore.kernel.org/all/20260312-create-workqueue-v4-0-ea39c351c38f@google.com/

Alice

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API
  2026-03-13  9:52 ` [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Alice Ryhl
@ 2026-03-13 11:12   ` Onur Özkan
  2026-03-13 11:26     ` Alice Ryhl
  0 siblings, 1 reply; 15+ messages in thread
From: Onur Özkan @ 2026-03-13 11:12 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: linux-kernel, dakr, daniel.almeida, airlied, simona, dri-devel,
	rust-for-linux

On Fri, 13 Mar 2026 09:52:16 +0000
Alice Ryhl <aliceryhl@google.com> wrote:

> On Fri, Mar 13, 2026 at 12:16:40PM +0300, Onur Özkan wrote:
> > This series adds GPU reset handling support for Tyr in a new module
> > drivers/gpu/drm/tyr/reset.rs which encapsulates the low-level reset
> > controller internals and exposes a ResetHandle API to the driver.
> > 
> > The reset module owns reset state, queueing and execution ordering
> > through OrderedQueue and handles duplicate/concurrent reset requests
> > with a pending flag.
> > 
> > Apart from the reset module, the first 3 patches:
> > 
> > - Fixes a potential reset-complete stale state bug by clearing
> > completed state before doing soft reset.
> > - Adds Work::disable_sync() (wrapper of
> > bindings::disable_work_sync).
> > - Adds OrderedQueue support.
> > 
> > Runtime tested on hardware by Deborah Brouwer (see [1]) and myself.
> > 
> > [1]:
> > https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/63#note_3364131
> > 
> > Link: https://gitlab.freedesktop.org/panfrost/linux/-/issues/28
> > ---
> > 
> > Onur Özkan (4):
> >   drm/tyr: clear reset IRQ before soft reset
> >   rust: add Work::disable_sync
> >   rust: add ordered workqueue wrapper
> 
> I actually added ordered workqueue support here:
> https://lore.kernel.org/all/20260312-create-workqueue-v4-0-ea39c351c38f@google.com/
> 
> Alice

That's cool. I guess this will wait until your patch lands unless we
want to combine them into a single series.

- Onur


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API
  2026-03-13 11:12   ` Onur Özkan
@ 2026-03-13 11:26     ` Alice Ryhl
  0 siblings, 0 replies; 15+ messages in thread
From: Alice Ryhl @ 2026-03-13 11:26 UTC (permalink / raw)
  To: Onur Özkan
  Cc: linux-kernel, dakr, daniel.almeida, airlied, simona, dri-devel,
	rust-for-linux

On Fri, Mar 13, 2026 at 12:12 PM Onur Özkan <work@onurozkan.dev> wrote:
>
> On Fri, 13 Mar 2026 09:52:16 +0000
> Alice Ryhl <aliceryhl@google.com> wrote:
>
> > On Fri, Mar 13, 2026 at 12:16:40PM +0300, Onur Özkan wrote:
> > > This series adds GPU reset handling support for Tyr in a new module
> > > drivers/gpu/drm/tyr/reset.rs which encapsulates the low-level reset
> > > controller internals and exposes a ResetHandle API to the driver.
> > >
> > > The reset module owns reset state, queueing and execution ordering
> > > through OrderedQueue and handles duplicate/concurrent reset requests
> > > with a pending flag.
> > >
> > > Apart from the reset module, the first 3 patches:
> > >
> > > - Fixes a potential reset-complete stale state bug by clearing
> > > completed state before doing soft reset.
> > > - Adds Work::disable_sync() (wrapper of
> > > bindings::disable_work_sync).
> > > - Adds OrderedQueue support.
> > >
> > > Runtime tested on hardware by Deborah Brouwer (see [1]) and myself.
> > >
> > > [1]:
> > > https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/63#note_3364131
> > >
> > > Link: https://gitlab.freedesktop.org/panfrost/linux/-/issues/28
> > > ---
> > >
> > > Onur Özkan (4):
> > >   drm/tyr: clear reset IRQ before soft reset
> > >   rust: add Work::disable_sync
> > >   rust: add ordered workqueue wrapper
> >
> > I actually added ordered workqueue support here:
> > https://lore.kernel.org/all/20260312-create-workqueue-v4-0-ea39c351c38f@google.com/
> >
> > Alice
>
> That's cool. I guess this will wait until your patch lands unless we
> want to combine them into a single series.

You can just say in your cover letter that your series depends on mine.

Alice

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 RESEND 2/4] rust: add Work::disable_sync
  2026-03-13  9:16 ` [PATCH v1 RESEND 2/4] rust: add Work::disable_sync Onur Özkan
@ 2026-03-13 12:00   ` Alice Ryhl
  2026-03-15 10:45     ` Onur Özkan
  0 siblings, 1 reply; 15+ messages in thread
From: Alice Ryhl @ 2026-03-13 12:00 UTC (permalink / raw)
  To: Onur Özkan
  Cc: linux-kernel, dakr, daniel.almeida, airlied, simona, dri-devel,
	rust-for-linux, Deborah Brouwer

On Fri, Mar 13, 2026 at 10:17 AM Onur Özkan <work@onurozkan.dev> wrote:
>
> Add Work::disable_sync() as a safe wrapper for disable_work_sync().
>
> Drivers can use this during teardown to stop new queueing and wait for
> queued or running work to finish before dropping related resources.
>
> Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com>
> Signed-off-by: Onur Özkan <work@onurozkan.dev>
> ---
>  rust/kernel/workqueue.rs | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/rust/kernel/workqueue.rs b/rust/kernel/workqueue.rs
> index 706e833e9702..6acc7b5ba31c 100644
> --- a/rust/kernel/workqueue.rs
> +++ b/rust/kernel/workqueue.rs
> @@ -530,6 +530,21 @@ pub unsafe fn raw_get(ptr: *const Self) -> *mut bindings::work_struct {
>          // the compiler does not complain that the `work` field is unused.
>          unsafe { Opaque::cast_into(core::ptr::addr_of!((*ptr).work)) }
>      }
> +
> +    /// Disables this work item and waits for queued/running executions to finish.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Must be called from a sleepable context if the work was last queued on a non-BH
> +    /// workqueue.

We generally do not make functions unsafe just because they might sleep.

Alice

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling
  2026-03-13  9:16 ` [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling Onur Özkan
@ 2026-03-13 14:56   ` Daniel Almeida
  2026-03-15 10:44     ` Onur Özkan
  2026-03-19 11:08   ` Boris Brezillon
  1 sibling, 1 reply; 15+ messages in thread
From: Daniel Almeida @ 2026-03-13 14:56 UTC (permalink / raw)
  To: Onur Özkan
  Cc: linux-kernel, dakr, aliceryhl, airlied, simona, dri-devel,
	rust-for-linux, Deborah Brouwer



> On 13 Mar 2026, at 06:16, Onur Özkan <work@onurozkan.dev> wrote:
> 
> Move Tyr reset logic into a new reset module and add async reset work.
> 
> This adds:
> - ResetHandle with internal controller state
> - a dedicated ordered reset workqueue
> - a pending flag to avoid duplicate queued resets
> - run_reset() as the shared synchronous reset helper
> 
> Probe now calls reset::run_reset() before normal init. Driver data now
> keeps ResetHandle so reset work is drained before clocks and regulators
> are dropped.
> 
> Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com>
> Signed-off-by: Onur Özkan <work@onurozkan.dev>
> ---
> drivers/gpu/drm/tyr/driver.rs |  40 +++-----
> drivers/gpu/drm/tyr/reset.rs  | 180 ++++++++++++++++++++++++++++++++++
> drivers/gpu/drm/tyr/tyr.rs    |   1 +
> 3 files changed, 192 insertions(+), 29 deletions(-)
> create mode 100644 drivers/gpu/drm/tyr/reset.rs
> 
> diff --git a/drivers/gpu/drm/tyr/driver.rs b/drivers/gpu/drm/tyr/driver.rs
> index f7951804e4e0..c80238a21ff2 100644
> --- a/drivers/gpu/drm/tyr/driver.rs
> +++ b/drivers/gpu/drm/tyr/driver.rs
> @@ -6,11 +6,8 @@
>         OptionalClk, //
>     },
>     device::{
> -        Bound,
> -        Core,
> -        Device, //
> +        Core, //
>     },
> -    devres::Devres,
>     dma::{
>         Device as DmaDevice,
>         DmaMask, //
> @@ -22,10 +19,7 @@
>         Registered,
>         UnregisteredDevice, //
>     },
> -    io::poll,
> -    new_mutex,
> -    of,
> -    platform,
> +    new_mutex, of, platform,
>     prelude::*,
>     regulator,
>     regulator::Regulator,
> @@ -35,17 +29,15 @@
>         Arc,
>         Mutex, //
>     },
> -    time, //
> };
> 
> use crate::{
>     file::TyrDrmFileData,
>     fw::Firmware,
>     gem::BoData,
> -    gpu,
>     gpu::GpuInfo,
>     mmu::Mmu,
> -    regs, //
> +    reset, //
> };
> 
> pub(crate) type IoMem = kernel::io::mem::IoMem<SZ_2M>;
> @@ -62,6 +54,11 @@ pub(crate) struct TyrPlatformDriverData {
> 
> #[pin_data]
> pub(crate) struct TyrDrmDeviceData {
> +    // `ResetHandle::drop()` drains queued/running work and this must happen
> +    // before clocks/regulators are dropped. So keep this field before them to
> +    // ensure the correct drop order.
> +    pub(crate) reset: reset::ResetHandle,
> +
>     pub(crate) pdev: ARef<platform::Device>,
> 
>     pub(crate) fw: Arc<Firmware>,
> @@ -90,22 +87,6 @@ unsafe impl Send for TyrDrmDeviceData {}
> // SAFETY: This will be removed in a future patch.
> unsafe impl Sync for TyrDrmDeviceData {}
> 
> -fn issue_soft_reset(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result {
> -    // Clear any stale reset-complete IRQ state before issuing a new soft reset.
> -    regs::GPU_IRQ_CLEAR.write(dev, iomem, regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED)?;
> -    regs::GPU_CMD.write(dev, iomem, regs::GPU_CMD_SOFT_RESET)?;
> -
> -    poll::read_poll_timeout(
> -        || regs::GPU_IRQ_RAWSTAT.read(dev, iomem),
> -        |status| *status & regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED != 0,
> -        time::Delta::from_millis(1),
> -        time::Delta::from_millis(100),
> -    )
> -    .inspect_err(|_| dev_err!(dev, "GPU reset failed."))?;
> -
> -    Ok(())
> -}
> -
> kernel::of_device_table!(
>     OF_TABLE,
>     MODULE_OF_TABLE,
> @@ -138,8 +119,7 @@ fn probe(
>         let request = pdev.io_request_by_index(0).ok_or(ENODEV)?;
>         let iomem = Arc::pin_init(request.iomap_sized::<SZ_2M>(), GFP_KERNEL)?;
> 
> -        issue_soft_reset(pdev.as_ref(), &iomem)?;
> -        gpu::l2_power_on(pdev.as_ref(), &iomem)?;
> +        reset::run_reset(pdev.as_ref(), &iomem)?;
> 
>         let gpu_info = GpuInfo::new(pdev.as_ref(), &iomem)?;
>         gpu_info.log(pdev);
> @@ -153,6 +133,7 @@ fn probe(
> 
>         let uninit_ddev = UnregisteredDevice::<TyrDrmDriver>::new(pdev.as_ref())?;
>         let platform: ARef<platform::Device> = pdev.into();
> +        let reset = reset::ResetHandle::new(platform.clone(), iomem.clone())?;
> 
>         let mmu = Mmu::new(pdev, iomem.as_arc_borrow(), &gpu_info)?;
> 
> @@ -178,6 +159,7 @@ fn probe(
>                     _mali: mali_regulator,
>                     _sram: sram_regulator,
>                 }),
> +                reset,
>                 gpu_info,
>         });
>         let ddev = Registration::new_foreign_owned(uninit_ddev, pdev.as_ref(), data, 0)?;
> diff --git a/drivers/gpu/drm/tyr/reset.rs b/drivers/gpu/drm/tyr/reset.rs
> new file mode 100644
> index 000000000000..29dfae98b0dd
> --- /dev/null
> +++ b/drivers/gpu/drm/tyr/reset.rs
> @@ -0,0 +1,180 @@
> +// SPDX-License-Identifier: GPL-2.0 or MIT
> +
> +//! Provides asynchronous reset handling for the Tyr DRM driver via
> +//! [`ResetHandle`], which runs reset work on a dedicated ordered
> +//! workqueue and avoids duplicate pending resets.
> +
> +use kernel::{
> +    device::{
> +        Bound,
> +        Device, //
> +    },
> +    devres::Devres,
> +    io::poll,
> +    platform,
> +    prelude::*,
> +    sync::{
> +        aref::ARef,
> +        atomic::{
> +            Acquire,
> +            Atomic,
> +            Relaxed,
> +            Release, //
> +        },
> +        Arc,
> +    },
> +    time,
> +    workqueue::{
> +        self,
> +        Work, //
> +    },
> +};
> +
> +use crate::{
> +    driver::IoMem,
> +    gpu,
> +    regs, //
> +};
> +
> +/// Manages asynchronous GPU reset handling and ensures only a single reset
> +/// work is pending at a time.
> +#[pin_data]
> +struct Controller {
> +    /// Platform device reference needed for reset operations and logging.
> +    pdev: ARef<platform::Device>,
> +    /// Mapped register space needed for reset operations.
> +    iomem: Arc<Devres<IoMem>>,
> +    /// Atomic flag for controlling the scheduling pending state.
> +    pending: Atomic<bool>,
> +    /// Dedicated ordered workqueue for reset operations.
> +    wq: workqueue::OrderedQueue,
> +    /// Work item backing async reset processing.
> +    #[pin]
> +    work: Work<Controller>,
> +}
> +
> +kernel::impl_has_work! {
> +    impl HasWork<Controller> for Controller { self.work }
> +}
> +
> +impl workqueue::WorkItem for Controller {
> +    type Pointer = Arc<Self>;
> +
> +    fn run(this: Arc<Self>) {
> +        this.reset_work();
> +    }
> +}
> +
> +impl Controller {
> +    /// Creates a [`Controller`] instance.
> +    fn new(pdev: ARef<platform::Device>, iomem: Arc<Devres<IoMem>>) -> Result<Arc<Self>> {
> +        let wq = workqueue::OrderedQueue::new(c"tyr-reset-wq", 0)?;
> +
> +        Arc::pin_init(
> +            try_pin_init!(Self {
> +                pdev,
> +                iomem,
> +                pending: Atomic::new(false),
> +                wq,
> +                work <- kernel::new_work!("tyr::reset"),
> +            }),
> +            GFP_KERNEL,
> +        )
> +    }
> +
> +    /// Processes one scheduled reset request.
> +    ///
> +    /// Panthor reference:
> +    /// - drivers/gpu/drm/panthor/panthor_device.c::panthor_device_reset_work()
> +    fn reset_work(self: &Arc<Self>) {
> +        dev_info!(self.pdev.as_ref(), "GPU reset work is started.\n");
> +
> +        // SAFETY: `Controller` is part of driver-private data and only exists
> +        // while the platform device is bound.
> +        let pdev = unsafe { self.pdev.as_ref().as_bound() };
> +        if let Err(e) = run_reset(pdev, &self.iomem) {
> +            dev_err!(self.pdev.as_ref(), "GPU reset failed: {:?}\n", e);
> +        } else {
> +            dev_info!(self.pdev.as_ref(), "GPU reset work is done.\n");
> +        }

Can we have more descriptive strings here? A user cares little for
implementation details such as “reset work”; what they care about is
that the hardware is undergoing a reset.

> +
> +        self.pending.store(false, Release);
> +    }
> +}
> +
> +/// Reset handle that shuts down pending work gracefully on drop.
> +pub(crate) struct ResetHandle(Arc<Controller>);
> +

Why is this an Arc? There seems to be a single owner?

> +impl ResetHandle {
> +    /// Creates a [`ResetHandle`] instance.
> +    pub(crate) fn new(pdev: ARef<platform::Device>, iomem: Arc<Devres<IoMem>>) -> Result<Self> {
> +        Ok(Self(Controller::new(pdev, iomem)?))
> +    }
> +
> +    /// Schedules reset work.
> +    #[expect(dead_code)]
> +    pub(crate) fn schedule(&self) {
> +        // TODO: Similar to `panthor_device_schedule_reset()` in Panthor, add a
> +        // power management check once Tyr supports it.
> +
> +        // Keep only one reset request running or queued. If one is already pending,
> +        // we ignore new schedule requests.
> +        if self.0.pending.cmpxchg(false, true, Relaxed).is_ok()
> +            && self.0.wq.enqueue(self.0.clone()).is_err()
> +        {
> +            self.0.pending.store(false, Release);
> +        }
> +    }
> +
> +    /// Returns true if a reset is queued or in progress.
> +    ///
> +    /// Note that the state can change immediately after the return.
> +    #[inline]
> +    #[expect(dead_code)]
> +    pub(crate) fn is_pending(&self) -> bool {
> +        self.0.pending.load(Acquire)
> +    }

> +}
> +
> +impl Drop for ResetHandle {
> +    fn drop(&mut self) {
> +        // Drain queued/running work and block future queueing attempts for this
> +        // work item before clocks/regulators are torn down.
> +        // SAFETY: drop executes in a sleepable context.
> +        unsafe { self.0.work.disable_sync() };
> +    }
> +}
> +
> +/// Issues a soft reset command and waits for reset-complete IRQ status.
> +fn issue_soft_reset(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result {
> +    // Clear any stale reset-complete IRQ state before issuing a new soft reset.
> +    regs::GPU_IRQ_CLEAR.write(dev, iomem, regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED)?;
> +    regs::GPU_CMD.write(dev, iomem, regs::GPU_CMD_SOFT_RESET)?;
> +
> +    poll::read_poll_timeout(
> +        || regs::GPU_IRQ_RAWSTAT.read(dev, iomem),
> +        |status| *status & regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED != 0,
> +        time::Delta::from_millis(1),
> +        time::Delta::from_millis(100),
> +    )
> +    .inspect_err(|_| dev_err!(dev, "GPU reset failed."))?;
> +
> +    Ok(())
> +}
> +
> +/// Runs one synchronous GPU reset pass.
> +///
> +/// Its visibility is `pub(super)` only so the probe path can run an
> +/// initial reset; it is not part of this module's public API.
> +///
> +/// On success, the GPU is left in a state suitable for reinitialization.
> +///
> +/// The reset sequence is as follows:
> +/// 1. Trigger a GPU soft reset.
> +/// 2. Wait for the reset-complete IRQ status.
> +/// 3. Power L2 back on.
> +pub(super) fn run_reset(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result {
> +    issue_soft_reset(dev, iomem)?;
> +    gpu::l2_power_on(dev, iomem)?;
> +    Ok(())
> +}
> diff --git a/drivers/gpu/drm/tyr/tyr.rs b/drivers/gpu/drm/tyr/tyr.rs
> index 18b0668bb217..d0349bc49f27 100644
> --- a/drivers/gpu/drm/tyr/tyr.rs
> +++ b/drivers/gpu/drm/tyr/tyr.rs
> @@ -14,6 +14,7 @@
> mod gpu;
> mod mmu;
> mod regs;
> +mod reset;
> mod slot;
> mod vm;
> 
> -- 
> 2.51.2
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling
  2026-03-13 14:56   ` Daniel Almeida
@ 2026-03-15 10:44     ` Onur Özkan
  0 siblings, 0 replies; 15+ messages in thread
From: Onur Özkan @ 2026-03-15 10:44 UTC (permalink / raw)
  To: Daniel Almeida
  Cc: linux-kernel, dakr, aliceryhl, airlied, simona, dri-devel,
	rust-for-linux, Deborah Brouwer

On Fri, 13 Mar 2026 11:56:58 -0300
Daniel Almeida <daniel.almeida@collabora.com> wrote:

> 
> 
> > On 13 Mar 2026, at 06:16, Onur Özkan <work@onurozkan.dev> wrote:
> > 
> > Move Tyr reset logic into a new reset module and add async reset
> > work.
> > 
> > This adds:
> > - ResetHandle with internal controller state
> > - a dedicated ordered reset workqueue
> > - a pending flag to avoid duplicate queued resets
> > - run_reset() as the shared synchronous reset helper
> > 
> > Probe now calls reset::run_reset() before normal init. Driver data
> > now keeps ResetHandle so reset work is drained before clocks and
> > regulators are dropped.
> > 
> > Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com>
> > Signed-off-by: Onur Özkan <work@onurozkan.dev>
> > ---
> > drivers/gpu/drm/tyr/driver.rs |  40 +++-----
> > drivers/gpu/drm/tyr/reset.rs  | 180
> > ++++++++++++++++++++++++++++++++++ drivers/gpu/drm/tyr/tyr.rs    |
> >  1 + 3 files changed, 192 insertions(+), 29 deletions(-)
> > create mode 100644 drivers/gpu/drm/tyr/reset.rs
> > 
> > diff --git a/drivers/gpu/drm/tyr/driver.rs
> > b/drivers/gpu/drm/tyr/driver.rs index f7951804e4e0..c80238a21ff2
> > 100644 --- a/drivers/gpu/drm/tyr/driver.rs
> > +++ b/drivers/gpu/drm/tyr/driver.rs
> > @@ -6,11 +6,8 @@
> >         OptionalClk, //
> >     },
> >     device::{
> > -        Bound,
> > -        Core,
> > -        Device, //
> > +        Core, //
> >     },
> > -    devres::Devres,
> >     dma::{
> >         Device as DmaDevice,
> >         DmaMask, //
> > @@ -22,10 +19,7 @@
> >         Registered,
> >         UnregisteredDevice, //
> >     },
> > -    io::poll,
> > -    new_mutex,
> > -    of,
> > -    platform,
> > +    new_mutex, of, platform,
> >     prelude::*,
> >     regulator,
> >     regulator::Regulator,
> > @@ -35,17 +29,15 @@
> >         Arc,
> >         Mutex, //
> >     },
> > -    time, //
> > };
> > 
> > use crate::{
> >     file::TyrDrmFileData,
> >     fw::Firmware,
> >     gem::BoData,
> > -    gpu,
> >     gpu::GpuInfo,
> >     mmu::Mmu,
> > -    regs, //
> > +    reset, //
> > };
> > 
> > pub(crate) type IoMem = kernel::io::mem::IoMem<SZ_2M>;
> > @@ -62,6 +54,11 @@ pub(crate) struct TyrPlatformDriverData {
> > 
> > #[pin_data]
> > pub(crate) struct TyrDrmDeviceData {
> > +    // `ResetHandle::drop()` drains queued/running works and this
> > must happen
> > +    // before clocks/regulators are dropped. So keep this field
> > before them to
> > +    // ensure the correct drop order.
> > +    pub(crate) reset: reset::ResetHandle,
> > +
> >     pub(crate) pdev: ARef<platform::Device>,
> > 
> >     pub(crate) fw: Arc<Firmware>,
> > @@ -90,22 +87,6 @@ unsafe impl Send for TyrDrmDeviceData {}
> > // SAFETY: This will be removed in a future patch.
> > unsafe impl Sync for TyrDrmDeviceData {}
> > 
> > -fn issue_soft_reset(dev: &Device<Bound>, iomem: &Devres<IoMem>) ->
> > Result {
> > -    // Clear any stale reset-complete IRQ state before issuing a
> > new soft reset.
> > -    regs::GPU_IRQ_CLEAR.write(dev, iomem,
> > regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED)?;
> > -    regs::GPU_CMD.write(dev, iomem, regs::GPU_CMD_SOFT_RESET)?;
> > -
> > -    poll::read_poll_timeout(
> > -        || regs::GPU_IRQ_RAWSTAT.read(dev, iomem),
> > -        |status| *status & regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED
> > != 0,
> > -        time::Delta::from_millis(1),
> > -        time::Delta::from_millis(100),
> > -    )
> > -    .inspect_err(|_| dev_err!(dev, "GPU reset failed."))?;
> > -
> > -    Ok(())
> > -}
> > -
> > kernel::of_device_table!(
> >     OF_TABLE,
> >     MODULE_OF_TABLE,
> > @@ -138,8 +119,7 @@ fn probe(
> >         let request = pdev.io_request_by_index(0).ok_or(ENODEV)?;
> >         let iomem = Arc::pin_init(request.iomap_sized::<SZ_2M>(),
> > GFP_KERNEL)?;
> > 
> > -        issue_soft_reset(pdev.as_ref(), &iomem)?;
> > -        gpu::l2_power_on(pdev.as_ref(), &iomem)?;
> > +        reset::run_reset(pdev.as_ref(), &iomem)?;
> > 
> >         let gpu_info = GpuInfo::new(pdev.as_ref(), &iomem)?;
> >         gpu_info.log(pdev);
> > @@ -153,6 +133,7 @@ fn probe(
> > 
> >         let uninit_ddev =
> > UnregisteredDevice::<TyrDrmDriver>::new(pdev.as_ref())?; let
> > platform: ARef<platform::Device> = pdev.into();
> > +        let reset = reset::ResetHandle::new(platform.clone(),
> > iomem.clone())?;
> > 
> >         let mmu = Mmu::new(pdev, iomem.as_arc_borrow(), &gpu_info)?;
> > 
> > @@ -178,6 +159,7 @@ fn probe(
> >                     _mali: mali_regulator,
> >                     _sram: sram_regulator,
> >                 }),
> > +                reset,
> >                 gpu_info,
> >         });
> >         let ddev = Registration::new_foreign_owned(uninit_ddev,
> > pdev.as_ref(), data, 0)?; diff --git a/drivers/gpu/drm/tyr/reset.rs
> > b/drivers/gpu/drm/tyr/reset.rs new file mode 100644
> > index 000000000000..29dfae98b0dd
> > --- /dev/null
> > +++ b/drivers/gpu/drm/tyr/reset.rs
> > @@ -0,0 +1,180 @@
> > +// SPDX-License-Identifier: GPL-2.0 or MIT
> > +
> > +//! Provides asynchronous reset handling for the Tyr DRM driver via
> > +//! [`ResetHandle`], which runs reset work on a dedicated ordered
> > +//! workqueue and avoids duplicate pending resets.
> > +
> > +use kernel::{
> > +    device::{
> > +        Bound,
> > +        Device, //
> > +    },
> > +    devres::Devres,
> > +    io::poll,
> > +    platform,
> > +    prelude::*,
> > +    sync::{
> > +        aref::ARef,
> > +        atomic::{
> > +            Acquire,
> > +            Atomic,
> > +            Relaxed,
> > +            Release, //
> > +        },
> > +        Arc,
> > +    },
> > +    time,
> > +    workqueue::{
> > +        self,
> > +        Work, //
> > +    },
> > +};
> > +
> > +use crate::{
> > +    driver::IoMem,
> > +    gpu,
> > +    regs, //
> > +};
> > +
> > +/// Manages asynchronous GPU reset handling and ensures only a
> > single reset +/// work is pending at a time.
> > +#[pin_data]
> > +struct Controller {
> > +    /// Platform device reference needed for reset operations and
> > logging.
> > +    pdev: ARef<platform::Device>,
> > +    /// Mapped register space needed for reset operations.
> > +    iomem: Arc<Devres<IoMem>>,
> > +    /// Atomic flag for controlling the scheduling pending state.
> > +    pending: Atomic<bool>,
> > +    /// Dedicated ordered workqueue for reset operations.
> > +    wq: workqueue::OrderedQueue,
> > +    /// Work item backing async reset processing.
> > +    #[pin]
> > +    work: Work<Controller>,
> > +}
> > +
> > +kernel::impl_has_work! {
> > +    impl HasWork<Controller> for Controller { self.work }
> > +}
> > +
> > +impl workqueue::WorkItem for Controller {
> > +    type Pointer = Arc<Self>;
> > +
> > +    fn run(this: Arc<Self>) {
> > +        this.reset_work();
> > +    }
> > +}
> > +
> > +impl Controller {
> > +    /// Creates a [`Controller`] instance.
> > +    fn new(pdev: ARef<platform::Device>, iomem:
> > Arc<Devres<IoMem>>) -> Result<Arc<Self>> {
> > +        let wq = workqueue::OrderedQueue::new(c"tyr-reset-wq", 0)?;
> > +
> > +        Arc::pin_init(
> > +            try_pin_init!(Self {
> > +                pdev,
> > +                iomem,
> > +                pending: Atomic::new(false),
> > +                wq,
> > +                work <- kernel::new_work!("tyr::reset"),
> > +            }),
> > +            GFP_KERNEL,
> > +        )
> > +    }
> > +
> > +    /// Processes one scheduled reset request.
> > +    ///
> > +    /// Panthor reference:
> > +    /// -
> > drivers/gpu/drm/panthor/panthor_device.c::panthor_device_reset_work()
> > +    fn reset_work(self: &Arc<Self>) {
> > +        dev_info!(self.pdev.as_ref(), "GPU reset work is
> > started.\n"); +
> > +        // SAFETY: `Controller` is part of driver-private data and
> > only exists
> > +        // while the platform device is bound.
> > +        let pdev = unsafe { self.pdev.as_ref().as_bound() };
> > +        if let Err(e) = run_reset(pdev, &self.iomem) {
> > +            dev_err!(self.pdev.as_ref(), "GPU reset failed:
> > {:?}\n", e);
> > +        } else {
> > +            dev_info!(self.pdev.as_ref(), "GPU reset work is
> > done.\n");
> > +        }
> 
> Can we have more descriptive strings here? A user cares little for
> implementation details such as “reset work”; what they care about is
> that the hardware is undergoing a reset.
> 

Sure, I will update it.

> > +
> > +        self.pending.store(false, Release);
> > +    }
> > +}
> > +
> > +/// Reset handle that shuts down pending work gracefully on drop.
> > +pub(crate) struct ResetHandle(Arc<Controller>);
> > +
> 
> Why is this an Arc? There seems to be a single owner?
> 

Once we queue the reset work, the workqueue needs its own reference so
the Controller stays alive until the worker runs and returns:
ResetHandle keeps one reference and the queued work holds the other.
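
A std-only sketch of that refcounting argument (names here are
illustrative; the real code uses the kernel's Arc and workqueue types,
with the work item embedded in Controller):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

pub struct Controller {
    // stands in for the driver's `pending` flag
    pub pending: AtomicBool,
}

// Enqueueing hands the "queue" (a thread here) its own clone of the
// Arc, so the Controller stays alive until the work body returns even
// if the ResetHandle side drops its reference first.
pub fn schedule(ctl: &Arc<Controller>) -> thread::JoinHandle<()> {
    let queued = Arc::clone(ctl); // the queue's reference
    thread::spawn(move || {
        // work body: clear the pending flag, as reset_work() does
        queued.pending.store(false, Ordering::Release);
    })
}
```

That second reference is why ResetHandle wraps an Arc<Controller>
rather than owning the controller directly.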

Regards,
Onur

> > +impl ResetHandle {
> > +    /// Creates a [`ResetHandle`] instance.
> > +    pub(crate) fn new(pdev: ARef<platform::Device>, iomem:
> > Arc<Devres<IoMem>>) -> Result<Self> {
> > +        Ok(Self(Controller::new(pdev, iomem)?))
> > +    }
> > +
> > +    /// Schedules reset work.
> > +    #[expect(dead_code)]
> > +    pub(crate) fn schedule(&self) {
> > +        // TODO: Similar to `panthor_device_schedule_reset()` in
> > Panthor, add a
> > +        // power management check once Tyr supports it.
> > +
> > +        // Keep only one reset request running or queued. If one
> > is already pending,
> > +        // we ignore new schedule requests.
> > +        if self.0.pending.cmpxchg(false, true, Relaxed).is_ok()
> > +            && self.0.wq.enqueue(self.0.clone()).is_err()
> > +        {
> > +            self.0.pending.store(false, Release);
> > +        }
> > +    }
> > +
> > +    /// Returns true if a reset is queued or in progress.
> > +    ///
> > +    /// Note that the state can change immediately after the
> > return.
> > +    #[inline]
> > +    #[expect(dead_code)]
> > +    pub(crate) fn is_pending(&self) -> bool {
> > +        self.0.pending.load(Acquire)
> > +    }
> 
> > +}
> > +
> > +impl Drop for ResetHandle {
> > +    fn drop(&mut self) {
> > +        // Drain queued/running work and block future queueing
> > attempts for this
> > +        // work item before clocks/regulators are torn down.
> > +        // SAFETY: drop executes in a sleepable context.
> > +        unsafe { self.0.work.disable_sync() };
> > +    }
> > +}
> > +
> > +/// Issues a soft reset command and waits for reset-complete IRQ
> > status. +fn issue_soft_reset(dev: &Device<Bound>, iomem:
> > &Devres<IoMem>) -> Result {
> > +    // Clear any stale reset-complete IRQ state before issuing a
> > new soft reset.
> > +    regs::GPU_IRQ_CLEAR.write(dev, iomem,
> > regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED)?;
> > +    regs::GPU_CMD.write(dev, iomem, regs::GPU_CMD_SOFT_RESET)?;
> > +
> > +    poll::read_poll_timeout(
> > +        || regs::GPU_IRQ_RAWSTAT.read(dev, iomem),
> > +        |status| *status & regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED
> > != 0,
> > +        time::Delta::from_millis(1),
> > +        time::Delta::from_millis(100),
> > +    )
> > +    .inspect_err(|_| dev_err!(dev, "GPU reset failed."))?;
> > +
> > +    Ok(())
> > +}
> > +
> > +/// Runs one synchronous GPU reset pass.
> > +///
> > +/// Its visibility is `pub(super)` only so the probe path can run
> > an +/// initial reset; it is not part of this module's public API.
> > +///
> > +/// On success, the GPU is left in a state suitable for
> > reinitialization. +///
> > +/// The reset sequence is as follows:
> > +/// 1. Trigger a GPU soft reset.
> > +/// 2. Wait for the reset-complete IRQ status.
> > +/// 3. Power L2 back on.
> > +pub(super) fn run_reset(dev: &Device<Bound>, iomem:
> > &Devres<IoMem>) -> Result {
> > +    issue_soft_reset(dev, iomem)?;
> > +    gpu::l2_power_on(dev, iomem)?;
> > +    Ok(())
> > +}
> > diff --git a/drivers/gpu/drm/tyr/tyr.rs b/drivers/gpu/drm/tyr/tyr.rs
> > index 18b0668bb217..d0349bc49f27 100644
> > --- a/drivers/gpu/drm/tyr/tyr.rs
> > +++ b/drivers/gpu/drm/tyr/tyr.rs
> > @@ -14,6 +14,7 @@
> > mod gpu;
> > mod mmu;
> > mod regs;
> > +mod reset;
> > mod slot;
> > mod vm;
> > 
> > -- 
> > 2.51.2
> > 
> 
> 



* Re: [PATCH v1 RESEND 2/4] rust: add Work::disable_sync
  2026-03-13 12:00   ` Alice Ryhl
@ 2026-03-15 10:45     ` Onur Özkan
  0 siblings, 0 replies; 15+ messages in thread
From: Onur Özkan @ 2026-03-15 10:45 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: linux-kernel, dakr, daniel.almeida, airlied, simona, dri-devel,
	rust-for-linux, Deborah Brouwer

On Fri, 13 Mar 2026 13:00:13 +0100
Alice Ryhl <aliceryhl@google.com> wrote:

> On Fri, Mar 13, 2026 at 10:17 AM Onur Özkan <work@onurozkan.dev>
> wrote:
> >
> > Add Work::disable_sync() as a safe wrapper for disable_work_sync().
> >
> > Drivers can use this during teardown to stop new queueing and wait
> > for queued or running work to finish before dropping related
> > resources.
> >
> > Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com>
> > Signed-off-by: Onur Özkan <work@onurozkan.dev>
> > ---
> >  rust/kernel/workqueue.rs | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> >
> > diff --git a/rust/kernel/workqueue.rs b/rust/kernel/workqueue.rs
> > index 706e833e9702..6acc7b5ba31c 100644
> > --- a/rust/kernel/workqueue.rs
> > +++ b/rust/kernel/workqueue.rs
> > @@ -530,6 +530,21 @@ pub unsafe fn raw_get(ptr: *const Self) ->
> > *mut bindings::work_struct { // the compiler does not complain that
> > the `work` field is unused. unsafe {
> > Opaque::cast_into(core::ptr::addr_of!((*ptr).work)) } }
> > +
> > +    /// Disables this work item and waits for queued/running
> > executions to finish.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// Must be called from a sleepable context if the work was
> > last queued on a non-BH
> > +    /// workqueue.
> 
> We generally do not make functions unsafe just because they might
> sleep.
> 
> Alice

I will convert that into a "# Note" in the next version.
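
For reference, the v2 shape could look something like this (plain-Rust
sketch of the doc-comment change only; the real method wraps
bindings::disable_work_sync(), and I'm assuming it keeps the C helper's
bool "was pending" return):

```rust
/// Stand-in for the kernel `Work<T>` type, just to carry the doc comment.
pub struct Work;

impl Work {
    /// Disables this work item and waits for queued/running executions
    /// to finish.
    ///
    /// Returns `true` if the work item was pending.
    ///
    /// # Note
    ///
    /// Sleeps if the work was last queued on a non-BH workqueue, so
    /// this must not be called from atomic context.
    pub fn disable_sync(&self) -> bool {
        // the real implementation calls bindings::disable_work_sync()
        false
    }
}
```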

Regards,
Onur


* Re: [PATCH v1 RESEND 1/4] drm/tyr: clear reset IRQ before soft reset
  2026-03-13  9:16 ` [PATCH v1 RESEND 1/4] drm/tyr: clear reset IRQ before soft reset Onur Özkan
@ 2026-03-19 10:47   ` Boris Brezillon
  0 siblings, 0 replies; 15+ messages in thread
From: Boris Brezillon @ 2026-03-19 10:47 UTC (permalink / raw)
  To: Onur Özkan
  Cc: linux-kernel, dakr, aliceryhl, daniel.almeida, airlied, simona,
	dri-devel, rust-for-linux, Deborah Brouwer

On Fri, 13 Mar 2026 12:16:41 +0300
Onur Özkan <work@onurozkan.dev> wrote:

> Clear RESET_COMPLETED before writing GPU_CMD_SOFT_RESET.
> 
> This is also used in
> drivers/gpu/drm/panfrost/panfrost_gpu.c::panfrost_gpu_soft_reset
> and avoids seeing old reset-complete status from a previous reset.
> 
> Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com>
> Signed-off-by: Onur Özkan <work@onurozkan.dev>

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

> ---
>  drivers/gpu/drm/tyr/driver.rs | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/tyr/driver.rs b/drivers/gpu/drm/tyr/driver.rs
> index 69eff2a9e116..f7951804e4e0 100644
> --- a/drivers/gpu/drm/tyr/driver.rs
> +++ b/drivers/gpu/drm/tyr/driver.rs
> @@ -91,6 +91,8 @@ unsafe impl Send for TyrDrmDeviceData {}
>  unsafe impl Sync for TyrDrmDeviceData {}
>  
>  fn issue_soft_reset(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result {
> +    // Clear any stale reset-complete IRQ state before issuing a new soft reset.
> +    regs::GPU_IRQ_CLEAR.write(dev, iomem, regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED)?;
>      regs::GPU_CMD.write(dev, iomem, regs::GPU_CMD_SOFT_RESET)?;
>  
>      poll::read_poll_timeout(



* Re: [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling
  2026-03-13  9:16 ` [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling Onur Özkan
  2026-03-13 14:56   ` Daniel Almeida
@ 2026-03-19 11:08   ` Boris Brezillon
  2026-03-19 12:51     ` Onur Özkan
  1 sibling, 1 reply; 15+ messages in thread
From: Boris Brezillon @ 2026-03-19 11:08 UTC (permalink / raw)
  To: Onur Özkan
  Cc: linux-kernel, dakr, aliceryhl, daniel.almeida, airlied, simona,
	dri-devel, rust-for-linux, Deborah Brouwer

On Fri, 13 Mar 2026 12:16:44 +0300
Onur Özkan <work@onurozkan.dev> wrote:

> +impl Controller {
> +    /// Creates a [`Controller`] instance.
> +    fn new(pdev: ARef<platform::Device>, iomem: Arc<Devres<IoMem>>) -> Result<Arc<Self>> {
> +        let wq = workqueue::OrderedQueue::new(c"tyr-reset-wq", 0)?;
> +
> +        Arc::pin_init(
> +            try_pin_init!(Self {
> +                pdev,
> +                iomem,
> +                pending: Atomic::new(false),
> +                wq,
> +                work <- kernel::new_work!("tyr::reset"),
> +            }),
> +            GFP_KERNEL,
> +        )
> +    }
> +
> +    /// Processes one scheduled reset request.
> +    ///
> +    /// Panthor reference:
> +    /// - drivers/gpu/drm/panthor/panthor_device.c::panthor_device_reset_work()
> +    fn reset_work(self: &Arc<Self>) {
> +        dev_info!(self.pdev.as_ref(), "GPU reset work is started.\n");
> +
> +        // SAFETY: `Controller` is part of driver-private data and only exists
> +        // while the platform device is bound.
> +        let pdev = unsafe { self.pdev.as_ref().as_bound() };
> +        if let Err(e) = run_reset(pdev, &self.iomem) {
> +            dev_err!(self.pdev.as_ref(), "GPU reset failed: {:?}\n", e);
> +        } else {
> +            dev_info!(self.pdev.as_ref(), "GPU reset work is done.\n");
> +        }

Unfortunately, the reset operation is not as simple as instructing the
GPU to reset; it's a complex synchronization process where you need to
try to gracefully put various components on hold before you reset, and
then resume those after the reset is effective. Of course, with what we
currently have in-tree, there's not much to suspend/resume, but I think
I'd prefer to design the thing so we can progressively add more
components without changing the reset logic too much.

I would probably start with a Resettable trait that has the
{pre,post}_reset() methods that exist in Panthor.

The other thing we need is a way for those components to know when a
reset is about to happen so they can postpone some actions they were
planning in order to not further delay the reset, or end up with
actions that fail because the HW is already unusable. Not too sure how
we want to handle that though. Panthor is currently sprinkled with
panthor_device_reset_is_pending() calls in key places, but that's still
very manual; maybe we can automate that a bit more in Tyr, dunno.
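
Something along these lines, perhaps (plain-Rust sketch; the trait and
helper names are invented here, not taken from Panthor or Tyr):

```rust
/// Hypothetical hook pair mirroring Panthor's {pre,post}_reset() methods.
pub trait Resettable {
    /// Quiesce the component so the HW reset can proceed.
    fn pre_reset(&self);
    /// Bring the component back up once the reset is effective.
    fn post_reset(&self);
}

/// Suspends every component, performs the reset, then resumes the
/// components in reverse order, mirroring the suspend sequence.
pub fn run_reset_with(components: &[&dyn Resettable], do_reset: impl FnOnce()) {
    for c in components {
        c.pre_reset();
    }
    do_reset();
    for c in components.iter().rev() {
        c.post_reset();
    }
}
```

New components (FW, scheduler, MMU, ...) would then only need to
implement the trait, without touching the reset logic itself.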

> +
> +        self.pending.store(false, Release);
> +    }
> +}


* Re: [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling
  2026-03-19 11:08   ` Boris Brezillon
@ 2026-03-19 12:51     ` Onur Özkan
  0 siblings, 0 replies; 15+ messages in thread
From: Onur Özkan @ 2026-03-19 12:51 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: linux-kernel, dakr, aliceryhl, daniel.almeida, airlied, simona,
	dri-devel, rust-for-linux, Deborah Brouwer

Hi Boris,

On Thu, 19 Mar 2026 12:08:28 +0100
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> On Fri, 13 Mar 2026 12:16:44 +0300
> Onur Özkan <work@onurozkan.dev> wrote:
> 
> > +impl Controller {
> > +    /// Creates a [`Controller`] instance.
> > +    fn new(pdev: ARef<platform::Device>, iomem:
> > Arc<Devres<IoMem>>) -> Result<Arc<Self>> {
> > +        let wq = workqueue::OrderedQueue::new(c"tyr-reset-wq", 0)?;
> > +
> > +        Arc::pin_init(
> > +            try_pin_init!(Self {
> > +                pdev,
> > +                iomem,
> > +                pending: Atomic::new(false),
> > +                wq,
> > +                work <- kernel::new_work!("tyr::reset"),
> > +            }),
> > +            GFP_KERNEL,
> > +        )
> > +    }
> > +
> > +    /// Processes one scheduled reset request.
> > +    ///
> > +    /// Panthor reference:
> > +    /// -
> > drivers/gpu/drm/panthor/panthor_device.c::panthor_device_reset_work()
> > +    fn reset_work(self: &Arc<Self>) {
> > +        dev_info!(self.pdev.as_ref(), "GPU reset work is
> > started.\n"); +
> > +        // SAFETY: `Controller` is part of driver-private data and
> > only exists
> > +        // while the platform device is bound.
> > +        let pdev = unsafe { self.pdev.as_ref().as_bound() };
> > +        if let Err(e) = run_reset(pdev, &self.iomem) {
> > +            dev_err!(self.pdev.as_ref(), "GPU reset failed:
> > {:?}\n", e);
> > +        } else {
> > +            dev_info!(self.pdev.as_ref(), "GPU reset work is
> > done.\n");
> > +        }
> 
> Unfortunately, the reset operation is not as simple as instructing the
> GPU to reset, it's a complex synchronization process where you need to
> try to gracefully put various components on hold before you reset, and
> then resume those after the reset is effective. Of course, with what
> we currently have in-tree, there's not much to suspend/resume, but I
> think I'd prefer to design the thing so we can progressively add more
> components without changing the reset logic too much.
> 
> I would probably start with a Resettable trait that has the
> {pre,post}_reset() methods that exist in Panthor.

Yeah, I checked Panthor and it has these functions for the reset logic.
I will implement that in v2 and will dig further to see if there is
anything else to cover with regard to proper synchronization.

> 
> The other thing we need is a way for those components to know when a
> reset is about to happen so they can postpone some actions they were
> planning in order to not further delay the reset, or end up with
> actions that fail because the HW is already unusable. Not too sure how
> we want to handle that though. Panthor is currently sprinkled with
> panthor_device_reset_is_pending() calls in key places, but that's
> still very manual, maybe we can automate that a bit more in Tyr,
> dunno.
>

Hmm, sounds like a perfect guard use-case. Is it possible to require
users to access the hw behind a guard (e.g., try_access_hw())? We would
then check whether a reset is in progress and either block the caller
or return an error until the reset is complete.
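
Something like this, roughly (std-only sketch; try_access_hw() and all
type names are hypothetical):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

pub struct GpuDevice {
    // stands in for the reset controller's `pending` flag
    reset_pending: AtomicBool,
}

/// Guard proving that no reset was pending when access was granted.
pub struct HwAccess<'a> {
    dev: &'a GpuDevice,
}

impl GpuDevice {
    pub fn new() -> Self {
        GpuDevice { reset_pending: AtomicBool::new(false) }
    }

    pub fn begin_reset(&self) {
        self.reset_pending.store(true, Ordering::Release);
    }

    /// Register access is only reachable through this guard, so the
    /// pending-reset check lives in exactly one place.
    pub fn try_access_hw(&self) -> Result<HwAccess<'_>, &'static str> {
        if self.reset_pending.load(Ordering::Acquire) {
            return Err("reset in progress");
        }
        Ok(HwAccess { dev: self })
    }
}

impl HwAccess<'_> {
    pub fn write_reg(&self, _offset: usize, _value: u32) {
        let _ = self.dev; // would touch the device's iomem for real
    }
}
```

A blocking variant could wait on a completion instead of returning an
error; either way, funnelling all register access through the guard
keeps the check in one place.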

Thanks,
Onur

> > +
> > +        self.pending.store(false, Release);
> > +    }
> > +}



end of thread, other threads:[~2026-03-19 12:51 UTC | newest]

Thread overview: 15+ messages
2026-03-13  9:16 [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Onur Özkan
2026-03-13  9:16 ` [PATCH v1 RESEND 1/4] drm/tyr: clear reset IRQ before soft reset Onur Özkan
2026-03-19 10:47   ` Boris Brezillon
2026-03-13  9:16 ` [PATCH v1 RESEND 2/4] rust: add Work::disable_sync Onur Özkan
2026-03-13 12:00   ` Alice Ryhl
2026-03-15 10:45     ` Onur Özkan
2026-03-13  9:16 ` [PATCH v1 RESEND 3/4] rust: add ordered workqueue wrapper Onur Özkan
2026-03-13  9:16 ` [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling Onur Özkan
2026-03-13 14:56   ` Daniel Almeida
2026-03-15 10:44     ` Onur Özkan
2026-03-19 11:08   ` Boris Brezillon
2026-03-19 12:51     ` Onur Özkan
2026-03-13  9:52 ` [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Alice Ryhl
2026-03-13 11:12   ` Onur Özkan
2026-03-13 11:26     ` Alice Ryhl
