From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-43170.protonmail.ch (mail-43170.protonmail.ch [185.70.43.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C1A6F1B3925; Sat, 11 Apr 2026 06:03:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.70.43.170 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775887386; cv=none; b=hUOWFiRJdGYT1Svv+yYzXTHege3Qe927AAj2K8OJ+17Cb/aq1D54C2PfcRirjmBEVytMj9NmgtcuKPuUFefXfwD5OZSZqTq5tdECyCDkDmUF0i6rOmhSWrajZOtnWnRQn6nB66tpJ8ywW4zY6Eo/B5ohlDhfw6HzoUyYO3XqN40= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775887386; c=relaxed/simple; bh=rUSRzWzifnCxIcUYaLs3JQUcjFG6U7TEGxxMQiMeIYI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=LQe9SmF7QjXZVP2HyCBrl9S5PW0jdsetscd3o31T8kVzawvy3jd9v6AI9wurdlXWk/Sa7XGTx28Wz6PTZUpsDkHg9sy/q78DHZMYdg13ny+7a797BFJn0KZrmW40k6PuGEjNBxqB5bGLCfK/nWI8IXddyMauy5xs9Zm5oiPNdp4= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=onurozkan.dev; spf=pass smtp.mailfrom=onurozkan.dev; dkim=pass (2048-bit key) header.d=onurozkan.dev header.i=@onurozkan.dev header.b=Zt711YtG; arc=none smtp.client-ip=185.70.43.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=onurozkan.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=onurozkan.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=onurozkan.dev header.i=@onurozkan.dev header.b="Zt711YtG" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=onurozkan.dev; s=protonmail; t=1775887375; x=1776146575; bh=dbjtQIy9yXiga+86q0/zkZ1w+yqlhdtNrWMutoBZihM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:From:To: Cc:Date:Subject:Reply-To:Feedback-ID:Message-ID:BIMI-Selector; b=Zt711YtGbji+I45u/cru8OOhCR+qQf5CGKE84zTxU5JGi+pydj4NlfWhXGi+fO6XL NZ2wdk5H68qDdAGz+f3QjOxQlXVJdfX9KKtIX72AgixIcoVfMTal7EX6JaETHIdeRq SPwWv73Ksef1sn01XyB/tFxnTcQ2JuWEE8TKu7FP5qkmq4wrjhvAdyWSJMlVWB1Xq2 S+7pLtIUaXD23qZl4fMtX9y2GSSpFnqvSPIsesFU8OlkzNXNa/rm29fm76D2gbE9Fm y+APkJ29XhUQyjjQRns51yTOBaVkbxAnPRztgyscZAzwrZ7k7/xVklmfoSeST+Whf4 Ru80P2No5dPkg== X-Pm-Submission-Id: 4ft34D5yqhz1DDrl From: =?UTF-8?q?Onur=20=C3=96zkan?= To: =?UTF-8?q?Onur=20=C3=96zkan?= Cc: Alice Ryhl , Daniel Almeida , Boris Brezillon , linux-kernel@vger.kernel.org, dakr@kernel.org, airlied@gmail.com, simona@ffwll.ch, dri-devel@lists.freedesktop.org, rust-for-linux@vger.kernel.org, Deborah Brouwer Subject: Re: [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling Date: Sat, 11 Apr 2026 09:02:51 +0300 Message-ID: <20260411060251.11863-1-work@onurozkan.dev> X-Mailer: git-send-email 2.51.2 In-Reply-To: <20260411055215.120719-1-work@onurozkan.dev> References: <20260313091646.16938-1-work@onurozkan.dev> <20260313091646.16938-5-work@onurozkan.dev> <20260319120828.52736966@fedora> <9876893D-F3B4-4CA4-8858-473B6FB8E7EB@collabora.com> <20260409114133.43134-1-work@onurozkan.dev> <395ED15F-3BC1-48C0-BE36-AF3A951E464D@collabora.com> <20260411055215.120719-1-work@onurozkan.dev> Precedence: bulk X-Mailing-List: rust-for-linux@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On Sat, 11 Apr 2026 08:52:15 +0300=0D Onur =C3=96zkan wrote:=0D =0D > On Fri, 10 Apr 2026 07:56:38 +0000=0D > Alice Ryhl wrote:=0D > =0D > > On Thu, Apr 09, 2026 at 10:44:24AM -0300, Daniel Almeida wrote:=0D > > > Hi Onur,=0D > > > =0D > > > > On 9 Apr 2026, at 08:41, Onur =C3=96zkan wrote= :=0D > > > > =0D > > > > On Fri, 03 Apr 2026 12:01:09 -0300=0D > > > > Daniel Almeida wrote:=0D > > > > =0D > > > >> =0D > > > >> =0D > > > >>> On 19 Mar 2026, at 08:08, Boris Brezillon wrote:=0D > > > >>> =0D > > > >>> On Fri, 13 Mar 2026 12:16:44 +0300=0D > > > >>> Onur =C3=96zkan wrote:=0D > > > >>> =0D > > > >>>> +impl Controller {=0D > > > >>>> + /// Creates a [`Controller`] instance.=0D > > > >>>> + fn new(pdev: ARef, iomem: Arc>) -> Result> {=0D > > > >>>> + let wq =3D workqueue::OrderedQueue::new(c"tyr-reset-wq"= , 0)?;=0D > > > >>>> +=0D > > > >>>> + Arc::pin_init(=0D > > > >>>> + try_pin_init!(Self {=0D > > > >>>> + pdev,=0D > > > >>>> + iomem,=0D > > > >>>> + pending: Atomic::new(false),=0D > > > >>>> + wq,=0D > > > >>>> + work <- kernel::new_work!("tyr::reset"),=0D > > > >>>> + }),=0D > > > >>>> + GFP_KERNEL,=0D > > > >>>> + )=0D > > > >>>> + }=0D > > > >>>> +=0D > > > >>>> + /// Processes one scheduled reset request.=0D > > > >>>> + ///=0D > > > >>>> + /// Panthor reference:=0D > > > >>>> + /// - drivers/gpu/drm/panthor/panthor_device.c::panthor_dev= ice_reset_work()=0D > > > >>>> + fn reset_work(self: &Arc) {=0D > > > >>>> + dev_info!(self.pdev.as_ref(), "GPU reset work is starte= d.\n");=0D > > > >>>> +=0D > > > >>>> + // SAFETY: `Controller` is part of driver-private data = and only exists=0D > > > >>>> + // while the platform device is bound.=0D > > > >>>> + let pdev =3D unsafe { self.pdev.as_ref().as_bound() };= =0D > > > >>>> + if let Err(e) =3D run_reset(pdev, &self.iomem) {=0D > > > >>>> + dev_err!(self.pdev.as_ref(), "GPU reset failed: {:?= }\n", e);=0D > > > >>>> + } else {=0D > > > >>>> + dev_info!(self.pdev.as_ref(), "GPU reset work is do= ne.\n");=0D > > > >>>> + }=0D > > > >>> =0D > > > >>> Unfortunately, the reset operation is not as simple as instructin= g the=0D > > > >>> GPU to reset, it's a complex synchronization process where you ne= ed to=0D > > > >>> try to gracefully put various components on hold before you reset= , and=0D > > > >>> then resume those after the reset is effective. Of course, with w= hat we=0D > > > >>> currently have in-tree, there's not much to suspend/resume, but I= think=0D > > > >>> I'd prefer to design the thing so we can progressively add more=0D > > > >>> components without changing the reset logic too much.=0D > > > >>> =0D > > > >>> I would probably start with a Resettable trait that has the=0D > > > >>> {pre,post}_reset() methods that exist in Panthor.=0D > > > >>> =0D > > > >>> The other thing we need is a way for those components to know whe= n a=0D > > > >>> reset is about to happen so they can postpone some actions they w= ere=0D > > > >>> planning in order to not further delay the reset, or end up with= =0D > > > >>> actions that fail because the HW is already unusable. Not too sur= e how=0D > > > >>> we want to handle that though. Panthor is currently sprinkled wit= h=0D > > > >>> panthor_device_reset_is_pending() calls in key places, but that's= still=0D > > > >>> very manual, maybe we can automate that a bit more in Tyr, dunno.= =0D > > > >>> =0D > > > >> =0D > > > >> =0D > > > >> We could have an enum where one of the variants is Resetting, and = the other one=0D > > > >> gives access to whatever state is not accessible while resets are = in progress.=0D > > > >> =0D > > > >> Something like=0D > > > >> =0D > > > >> pub enum TyrData {=0D > > > >> Active(ActiveTyrData),=0D > > > >> ResetInProgress(ActiveTyrData)=0D > > > >> }=0D > > > >> =0D > > > >> fn access() -> Option<&ActiveTyrData> {=0D > > > >> =E2=80=A6 // if the =E2=80=9CResetInProgress=E2=80=9D variant is = active, return None=0D > > > >> }=0D > > > >> =0D > > > > =0D > > > > That's an interesting approach, but if it's all about `fn access` f= unction, we=0D > > > > can already do that with a simple atomic state e.g.,:=0D > > > > =0D > > > > // The state flag in reset::Controller=0D > > > > state: Atomic=0D > > > > =0D > > > > fn access(&self) -> Option<&Arc>> {=0D > > > > match self.state.load(Relaxed) {=0D > > > > ResetState::Idle =3D> Some(&self.iomem),=0D > > > > _ =3D> None,=0D > > > > }=0D > > > > }=0D > > > > =0D > > > > What do you think? Would this be sufficient?=0D > > > > =0D > > > > Btw, a sample code snippet from the caller side would be very helpf= ul for=0D > > > > designing this further. That part is kind a blurry for me.=0D > > > > =0D > > > > Thanks,=0D > > > > Onur=0D > > > > =0D > > > >> =0D > > > >>>> +=0D > > > >>>> + self.pending.store(false, Release);=0D > > > >>>> + }=0D > > > >>>> +}=0D > > > =0D > > > I think that there are two things we're trying to implement correctly= :=0D > > > =0D > > > 1) Deny access to a subset of the state while a reset is in progress= =0D > > > 2) Wait for anyone accessing 1) to finish before starting a reset=0D > > > =0D > > > IIUC, using Atomic can solve 1 by bailing if the "reset in progres= s"=0D > > > flag/variant is set, but I don't think it implements 2? One would hav= e to=0D > > > implement more logic to block until the state is not being actively u= sed.=0D > > > =0D > > > Now, there are probably easier ways to solve this, but I propose that= we do the=0D > > > extra legwork to make this explicit and enforceable by the type syste= m.=0D > > > =0D > > > How about introducing a r/w semaphore abstraction? It seems to correc= tly encode=0D > > > the logic we want:=0D > > > =0D > > > a) multiple users can access the state if no reset is pending ("read"= side)=0D > > > b) the reset code can block until the state is no longer being access= ed (the "write" side)=0D > > > =0D > > > In Tyr, this would roughly map to something like:=0D > > > =0D > > > struct TyrData {=0D > > > reset_gate: RwSem=0D > > > // other, always accessible members=0D > > > }=0D > > > =0D > > > impl TyrData {=0D > > > fn try_access(&self) -> Option> {...}= =0D > > > }=0D > > > =0D > > > Where ActiveHwState contains the fw/mmu/sched blocks (these are not u= pstream=0D > > > yet, Deborah has a series that will introduce the fw block that shoul= d land=0D > > > soon) and perhaps more.=0D > > > =0D > > > Now, the reset logic would be roughly:=0D > > > =0D > > > fn reset_work(tdev: Arc) {=0D > > > // Block until nobody else is accessing the hw, prevent others from= =0D > > > // initiating new accesses too..=0D > > > let _guard =3D tdev.reset_gate.write();=0D > > > =0D > > > // pre_reset() all Resettable implementors=0D > > > =0D > > > ... reset=0D > > > =0D > > > // post_Reset all Resettable implementors=0D > > > }=0D > > > =0D > > > Now, for every block that might touch a resource that would be unavai= lable=0D > > > during a reset, we enforce a try_access() via the type system, and en= sure that=0D > > > the reset cannot start while the guard is alive. In particular, ioctl= s would=0D > > > look like:=0D > > > =0D > > > fn ioctl_foo(...) {=0D > > > let hw =3D tdev.reset_gate.try_access()?;=0D > > > // resets are blocked while the guard is alive, no other way to acc= ess that state otherwise=0D > > > }=0D > > > =0D > > > The code will not compile otherwise, so long as we keep the state in = ActiveHwState, i.e.: =0D > > > protected by the r/w sem.=0D > > > =0D > > > This looks like an improvement over Panthor, since Panthor relies on = manually=0D > > > canceling work that access hw state via cancel_work_sync(), and gatin= g new work=0D > > > submissions on the "reset_in_progress" flag, i.e.:=0D > > > =0D > > > /**=0D > > > * sched_queue_work() - Queue a scheduler work.=0D > > > * @sched: Scheduler object.=0D > > > * @wname: Work name.=0D > > > *=0D > > > * Conditionally queues a scheduler work if no reset is pending/in-pr= ogress.=0D > > > */=0D > > > #define sched_queue_work(sched, wname) \=0D > > > do { \=0D > > > if (!atomic_read(&(sched)->reset.in_progress) && \=0D > > > !panthor_device_reset_is_pending((sched)->ptdev)) \=0D > > > queue_work((sched)->wq, &(sched)->wname ## _work); \=0D > > > } while (0)=0D > > > =0D > > > =0D > > > Thoughts?=0D > > =0D > > I don't think a semaphore is the right answer. I think SRCU plus a=0D > > counter makes more sense. (With a sentinal value reserved for "currentl= y=0D > > resetting".)=0D > > =0D > > When you begin using the hardware, you start an srcu critical region an= d=0D > > read the counter. If the counter has the sentinel value, you know the=0D > > hardware is resetting and you fail. Otherwise you record the couter and= =0D > > proceed.=0D > > =0D > > If at any point you release the srcu critical region and want to=0D > > re-acquire it to continue the same ongoing work, then you must ensure=0D > > that the counter still has the same value. This ensures that if the GPU= =0D > > is reset, then even if the reset has finished by the time you come back= ,=0D > > you still fail because the counter has changed.=0D > > =0D > > =0D > > Another option could be to rely on the existing Device unbind logic.=0D > > Entirely remove the class device and run the full unbind procedure,=0D > > cleaning up all devm work etc. Then once that has finished, probe the=0D > > device again and start from scratch. After all, GPU reset does not have= =0D > > to be a cheap operation.=0D > > =0D > > Alice=0D > =0D > Anybody working on the SRCU support or do we need to write one?=0D > =0D > Ignoring the sync primitive for a moment (since there is no clear consens= us=0D > yet), I was trying to figure out the HW access API around a closure.=0D > =0D > I wonder if something like this (roughly, haven't implemented or compiled= it=0D > yet) would make sense:=0D > =0D > struct HwGate {=0D > // srcu, I guess?=0D > }=0D > =0D > struct HwGuard<'a> {=0D > gate: &'a HwGate,=0D > hw: &'a ActiveHw,=0D > }=0D > =0D > struct ActiveHw {=0D > // mmu, fw, ...=0D > }=0D > =0D > struct Controller {=0D > ...=0D > hw: ActiveHw,=0D > }=0D > =0D > impl Controller {=0D > fn with_hw(&self, f: impl FnOnce(&HwGuard<'_>) -> Result) -> Resu= lt {=0D > let guard =3D self.gate.lock(&self.hw)?;=0D > f(&guard)=0D > }=0D > }=0D > =0D > and TyrDrmDeviceData would hava a helper wrapper:=0D > =0D > impl TyrDrmDeviceData {=0D > fn with_hw(&self, f: impl FnOnce(&ActiveHwState) -> Result) -> Re= sult {=0D > self.reset.with_hw(f) // maybe find a better name for "reset"=0D > }=0D > }=0D > =0D > Then the user can use it like this:=0D > =0D > let driver: TyrDrmDeviceData =3D ...;=0D > =0D > driver.with_hw(|hw| {=0D > let mmu =3D hw.mmu;=0D > ...=0D > })=0D > =0D > Any thoughts?=0D > =0D > - Onur=0D =0D I forgot to mention that with the closure approach, we can easily inject=0D additional operations into hardware access like preventing the access if a= =0D reset work has been queued but not started yet, which was one of the key=0D considerations bringed up by Boris during the Thursday call.=0D =0D - Onur=0D