public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Onur Özkan" <work@onurozkan.dev>
To: Boris Brezillon <boris.brezillon@collabora.com>
Cc: linux-kernel@vger.kernel.org, dakr@kernel.org,
	aliceryhl@google.com, daniel.almeida@collabora.com,
	airlied@gmail.com, simona@ffwll.ch,
	dri-devel@lists.freedesktop.org, rust-for-linux@vger.kernel.org,
	Deborah Brouwer <deborah.brouwer@collabora.com>
Subject: Re: [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling
Date: Thu, 19 Mar 2026 15:51:35 +0300	[thread overview]
Message-ID: <20260319155135.18f87a1b@nimda> (raw)
In-Reply-To: <20260319120828.52736966@fedora>

Hi Boris,

On Thu, 19 Mar 2026 12:08:28 +0100
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> On Fri, 13 Mar 2026 12:16:44 +0300
> Onur Özkan <work@onurozkan.dev> wrote:
> 
> > +impl Controller {
> > +    /// Creates a [`Controller`] instance.
> > +    fn new(pdev: ARef<platform::Device>, iomem:
> > Arc<Devres<IoMem>>) -> Result<Arc<Self>> {
> > +        let wq = workqueue::OrderedQueue::new(c"tyr-reset-wq", 0)?;
> > +
> > +        Arc::pin_init(
> > +            try_pin_init!(Self {
> > +                pdev,
> > +                iomem,
> > +                pending: Atomic::new(false),
> > +                wq,
> > +                work <- kernel::new_work!("tyr::reset"),
> > +            }),
> > +            GFP_KERNEL,
> > +        )
> > +    }
> > +
> > +    /// Processes one scheduled reset request.
> > +    ///
> > +    /// Panthor reference:
> > +    /// -
> > drivers/gpu/drm/panthor/panthor_device.c::panthor_device_reset_work()
> > +    fn reset_work(self: &Arc<Self>) {
> > +        dev_info!(self.pdev.as_ref(), "GPU reset work is
> > started.\n"); +
> > +        // SAFETY: `Controller` is part of driver-private data and
> > only exists
> > +        // while the platform device is bound.
> > +        let pdev = unsafe { self.pdev.as_ref().as_bound() };
> > +        if let Err(e) = run_reset(pdev, &self.iomem) {
> > +            dev_err!(self.pdev.as_ref(), "GPU reset failed:
> > {:?}\n", e);
> > +        } else {
> > +            dev_info!(self.pdev.as_ref(), "GPU reset work is
> > done.\n");
> > +        }
> 
> Unfortunately, the reset operation is not as simple as instructing the
> GPU to reset, it's a complex synchronization process where you need to
> try to gracefully put various components on hold before you reset, and
> then resume those after the reset is effective. Of course, with what
> we currently have in-tree, there's not much to suspend/resume, but I
> think I'd prefer to design the thing so we can progressively add more
> components without changing the reset logic too much.
> 
> I would probably start with a Resettable trait that has the
> {pre,post}_reset() methods that exist in Panthor.

Yeah, I checked Panthor and it has these functions for the reset logic.
I will implement that in v2 and will dig further to see if there is
anything else to cover in regards to proper syncing.

> 
> The other thing we need is a way for those components to know when a
> reset is about to happen so they can postpone some actions they were
> planning in order to not further delay the reset, or end up with
> actions that fail because the HW is already unusable. Not too sure how
> we want to handle that though. Panthor is currently sprinkled with
> panthor_device_reset_is_pending() calls in key places, but that's
> still very manual, maybe we can automate that a bit more in Tyr,
> dunno.
>

Hmm, sounds like a perfect guard use-case. Is it possible to require
users to access the hw behind a guard (e.g., try_access_hw())? We would
then check if a reset is in progress and hold the user or return an
error until the reset is complete.

Thanks,
Onur

> > +
> > +        self.pending.store(false, Release);
> > +    }
> > +}


  reply	other threads:[~2026-03-19 12:51 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-13  9:16 [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Onur Özkan
2026-03-13  9:16 ` [PATCH v1 RESEND 1/4] drm/tyr: clear reset IRQ before soft reset Onur Özkan
2026-03-19 10:47   ` Boris Brezillon
2026-03-13  9:16 ` [PATCH v1 RESEND 2/4] rust: add Work::disable_sync Onur Özkan
2026-03-13 12:00   ` Alice Ryhl
2026-03-15 10:45     ` Onur Özkan
2026-03-13  9:16 ` [PATCH v1 RESEND 3/4] rust: add ordered workqueue wrapper Onur Özkan
2026-03-13  9:16 ` [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling Onur Özkan
2026-03-13 14:56   ` Daniel Almeida
2026-03-15 10:44     ` Onur Özkan
2026-03-19 11:08   ` Boris Brezillon
2026-03-19 12:51     ` Onur Özkan [this message]
2026-03-13  9:52 ` [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Alice Ryhl
2026-03-13 11:12   ` Onur Özkan
2026-03-13 11:26     ` Alice Ryhl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260319155135.18f87a1b@nimda \
    --to=work@onurozkan.dev \
    --cc=airlied@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=boris.brezillon@collabora.com \
    --cc=dakr@kernel.org \
    --cc=daniel.almeida@collabora.com \
    --cc=deborah.brouwer@collabora.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=simona@ffwll.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox