From: Jason Gunthorpe <jgg@nvidia.com>
To: John Hubbard <jhubbard@nvidia.com>,
Greg KH <gregkh@linuxfoundation.org>,
Danilo Krummrich <dakr@kernel.org>,
Joel Fernandes <joelagnelf@nvidia.com>,
Alexandre Courbot <acourbot@nvidia.com>,
Dave Airlie <airlied@gmail.com>, Gary Guo <gary@garyguo.net>,
Joel Fernandes <joel@joelfernandes.org>,
Boqun Feng <boqun.feng@gmail.com>,
Ben Skeggs <bskeggs@nvidia.com>,
linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org,
nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
paulmck@kernel.org
Subject: Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation
Date: Fri, 7 Mar 2025 10:55:57 -0400
Message-ID: <20250307145557.GO354511@nvidia.com>
In-Reply-To: <Z8rv-DQuGdxye28N@phenom.ffwll.local>
On Fri, Mar 07, 2025 at 02:09:12PM +0100, Simona Vetter wrote:
> > A driver can do a health check immediately in remove() and make a
> > decision if the device is alive or not to speed up removal in the
> > hostile hot unplug case.
>
> Hm ... I guess when you get an all -1 read you check with a specific
> register to make sure it's not a false positive? Since for some registers
> that's a valid value.
Yes. mlx5 has HW designed to support this, but I imagine on most
devices you could find an ID register or something that won't be -1.
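As a rough illustration of that kind of check, here is a userspace
sketch. The register offsets, the mock_read32() helper (a stand-in for
an MMIO read such as readl()) and the device_is_gone() name are all
hypothetical and not taken from mlx5 or any real driver:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register indices (in 32-bit words), for illustration only. */
#define REG_STATUS    0u
#define REG_DEVICE_ID 1u

/* Stand-in for an MMIO read; a real driver would use readl() on a mapped BAR. */
static uint32_t mock_read32(const volatile uint32_t *bar, unsigned int reg)
{
	return bar[reg];
}

/*
 * Returns true if the device looks unplugged.
 *
 * An all-ones read is what a surprise-removed PCI device typically
 * returns, but all-ones can also be a valid value for some registers.
 * So confirm with a register that is assumed never to read as all-ones
 * on a live device (here, a hypothetical ID register).
 */
static bool device_is_gone(const volatile uint32_t *bar)
{
	if (mock_read32(bar, REG_STATUS) != UINT32_MAX)
		return false;

	return mock_read32(bar, REG_DEVICE_ID) == UINT32_MAX;
}
```

A remove() callback could call something like this once up front and
skip slow, timeout-bound teardown steps when it returns true.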
> - The "at least we don't blow up with memory safety issues" bare minimum
> that the rust abstractions should guarantee. So revocable and friends.
I still really dislike revocable because it imposes a cost that is
unnecessary.
> And I think the latter safety fallback does not prevent you from doing the
> full fancy design, e.g. for revocable resources that only happens after
> your explicitly-coded ->remove() callback has finished. Which means you
> still have full access to the hw like anywhere else.
Yes, if you use Rust bindings with something like RDMA then I would
expect that by the time remove is done everything is cleaned up and
all the revocable stuff was useless and never used.
This is why I dislike revoke so much. It is adding a bunch of garbage
all over the place that is *never used* if the driver is working
correctly.
I believe it is much better to runtime check that the driver is
correct and not burden the API design with this.
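One possible shape for such a runtime check, as a userspace sketch
only: count the resources handed out and complain at the end of
remove() if any are still held. The struct and function names are
hypothetical, and kernel code would more likely use WARN_ON() than
fprintf():

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-device bookkeeping, for illustration only. */
struct drv_state {
	int outstanding; /* resources handed out and not yet released */
};

static void res_get(struct drv_state *s)
{
	s->outstanding++;
}

static void res_put(struct drv_state *s)
{
	s->outstanding--;
}

/*
 * Meant to run as the last step of remove(). Returns true if the
 * driver released everything it handed out, false (with a diagnostic)
 * if something is still held, which would indicate a driver bug.
 */
static bool remove_check(const struct drv_state *s)
{
	if (s->outstanding != 0) {
		fprintf(stderr,
			"driver bug: %d resource(s) still held after remove()\n",
			s->outstanding);
		return false;
	}
	return true;
}
```

The point of the sketch is that the check costs one counter and fires
only for an incorrect driver, rather than putting a guard on every
access.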
Giving people these features will only encourage them to write wrong
drivers.
This is not even a new idea, devm introduces automatic lifetime into
the kernel and I've sat in presentations about how devm has all sorts
of bug classes because of misuse. :\
> Does this sounds like a possible conclusion of this thread, or do we need
> to keep digging?
IDK, I think this should be socialized more. It is important as it
affects all drivers from here on out, and it is radically different to
how the kernel works today.
> Also now that I look at this problem as a two-level issue, I think drm is
> actually a lot better than what I explained. If you clean up driver state
> properly in ->remove (or as stack automatic cleanup functions that run
> before all the mmio/irq/whatever stuff disappears), then we are largely
> there already with being able to fully quiescent driver state enough to
> make sure no new requests can sneak in.
That is the typical subsystem design!
Thanks,
Jason