From: Danilo Krummrich <dakr@kernel.org>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Joel Fernandes <joelagnelf@nvidia.com>,
Alexandre Courbot <acourbot@nvidia.com>,
Dave Airlie <airlied@gmail.com>, Gary Guo <gary@garyguo.net>,
Joel Fernandes <joel@joelfernandes.org>,
Boqun Feng <boqun.feng@gmail.com>,
John Hubbard <jhubbard@nvidia.com>,
Ben Skeggs <bskeggs@nvidia.com>,
linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org,
nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
paulmck@kernel.org
Subject: Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation
Date: Thu, 27 Feb 2025 17:51:11 +0100
Message-ID: <Z8CX__mIlFUFEkIh@cassiopeiae>
In-Reply-To: <20250227150709.GF39591@nvidia.com>
On Thu, Feb 27, 2025 at 11:07:09AM -0400, Jason Gunthorpe wrote:
> On Thu, Feb 27, 2025 at 12:32:45PM +0100, Danilo Krummrich wrote:
> > On Wed, Feb 26, 2025 at 07:47:30PM -0400, Jason Gunthorpe wrote:
> > > On Wed, Feb 26, 2025 at 10:31:10PM +0100, Danilo Krummrich wrote:
> > > > Let's take a step back and look again why we have Devres (and Revocable) for
> > > > e.g. pci::Bar.
> > > >
> > > > The device / driver model requires that device resources are only held by a
> > > > driver, as long as the driver is bound to the device.
> > > >
> > > > For instance, in C we achieve this by calling
> > > >
> > > > pci_iounmap()
> > > > pci_release_region()
> > > >
> > > > from remove().
> > > >
> > > > We rely on this, we trust drivers to actually do this.
> > >
> > > Right, exactly
> > >
> > > But it is not just PCI bar. There are a *huge* number of kernel APIs
> > > that have built in to them the same sort of requirement - teardown
> > > MUST run with remove, and once done the resource cannot be used by
> > > another thread.
> > >
> > > Basically most things involving function pointers have this sort of
> > > lifecycle requirement because it is a common process that prevents a
> > > UAF of module unload.
> >
> > You're still mixing topics, the whole Devres<pci::Bar> thing is about limiting
> > object lifetime to the point where the driver is unbound.
> >
> > Shutting down asynchronous execution of things, i.e. workqueues, timers, IOCTLs
> > to prevent unexpected access to the module .text section is a whole different
> > topic.
>
> Again, the standard kernel design pattern is to put these things
> together so that shutdown isolates concurrency which permits free
> without UAF.
>
> > In other words, assuming that we properly enforce that there are no async
> > execution paths after remove() or module_exit() (not necessarily the same),
> > we still need to ensure that a pci::Bar object does not outlive remove().
>
> Yes, you just have to somehow use rust to ensure a call pci_iounmap()
> happens during remove, after the isolation.
>
> You are already doing it with devm. It seems to me the only problem
> you have is nobody has invented a way in rust to contract that the devm
> won't run until the threads are isolated.

You can do that: pci::Driver::probe() returns a Pin<KBox<Self>>. This object is
dropped when the device is unbound, and its drop runs before the devres
callbacks.

Using miscdevice as an example, your MiscDeviceRegistration would be a member of
this object and hence dropped on remove(), before the devres callbacks revoke
device resources.
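
To illustrate the ordering described above, here is a minimal userspace sketch.
It is not the kernel API: `Device`, `DriverData`, `MiscRegistration` and
`unbind()` are hypothetical stand-ins, and the only thing it models is that the
object returned by probe() is dropped first, then the devres callbacks run.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared log so the order of teardown steps can be observed.
type Log = Rc<RefCell<Vec<&'static str>>>;

// Hypothetical stand-in for a registration owned by the driver's private data.
struct MiscRegistration {
    log: Log,
}

impl Drop for MiscRegistration {
    fn drop(&mut self) {
        // Deregistering is what stops new users from reaching the driver.
        self.log.borrow_mut().push("misc deregistered");
    }
}

// Hypothetical stand-in for the object returned by probe().
struct DriverData {
    _misc: MiscRegistration,
}

// Hypothetical model of a device: driver data plus a list of devres callbacks.
struct Device {
    driver_data: Option<Box<DriverData>>,
    devres: Vec<Box<dyn FnOnce()>>,
}

impl Device {
    fn unbind(&mut self) {
        // Step 1: drop the object returned by probe().
        self.driver_data.take();
        // Step 2: run devres callbacks, most recently added first.
        while let Some(callback) = self.devres.pop() {
            callback();
        }
    }
}

fn unbind_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let mut dev = Device { driver_data: None, devres: Vec::new() };

    // Mapping the BAR registers a devres callback that revokes it on unbind.
    let l = log.clone();
    dev.devres.push(Box::new(move || {
        l.borrow_mut().push("bar revoked");
    }));

    // probe() returns the driver data, which owns the registration.
    dev.driver_data = Some(Box::new(DriverData {
        _misc: MiscRegistration { log: log.clone() },
    }));

    dev.unbind();
    let order = log.borrow().clone();
    order
}
```

In this model, calling `unbind_order()` records the deregistration before the
BAR revocation, which is the ordering the paragraph above relies on.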
>
> I don't see this as insolvable, you can have some input argument to
> any API that creates concurrency that also pushes an ordered
> destructor to the struct device lifecycle that ensures it cancels that
> concurrency.
>
> > Device resources are a bit special, since their lifetime must be cap'd at device
> > unbind, *independent* of the object lifetime they reside in. Hence the Devres
> > container.
>
> I'd argue many resources should be limited to device unbind. Memory is
> perhaps the only exception.

There is a difference between should and must. A driver is fully free to bind
the lifetime of a miscdevice either to the driver lifetime (object returned
by probe) or to the module lifetime; both can be valid. That's a question of
semantics.

A device resource though is only allowed to be held by a driver as long as the
corresponding device is bound to the driver. Hence an API that does not ensure
that the pci::Bar is actually, forcefully dropped on device unbind is unsound.

So, let me ask you again, how do you ensure that a pci::Bar is dropped on device
unbind if we hand it out without the Devres container?
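
As a rough userspace sketch of why the container matters: the model below uses
an RwLock where the kernel's Revocable uses RCU, and `ModelRevocable`,
`FakeBar` and `revoke_demo()` are hypothetical names for illustration only. It
shows that even if some long-lived object keeps a handle past unbind, the
resource itself is dropped at revocation and later accesses fail.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for a mapped BAR; dropping it models pci_iounmap().
struct FakeBar {
    unmapped: Arc<AtomicBool>,
    reg0: u32,
}

impl Drop for FakeBar {
    fn drop(&mut self) {
        self.unmapped.store(true, Ordering::SeqCst);
    }
}

// Simplified model of a revocable container (assumption: RwLock, not RCU).
struct ModelRevocable<T> {
    inner: RwLock<Option<T>>,
}

impl<T> ModelRevocable<T> {
    fn new(value: T) -> Self {
        Self { inner: RwLock::new(Some(value)) }
    }

    // Runs `f` on the resource if it has not been revoked yet.
    fn try_access_with<R>(&self, f: impl FnOnce(&T) -> R) -> Option<R> {
        self.inner.read().unwrap().as_ref().map(f)
    }

    // Drops the resource now, regardless of how many handles still exist.
    fn revoke(&self) {
        self.inner.write().unwrap().take();
    }
}

fn revoke_demo() -> (Option<u32>, Option<u32>, bool) {
    let unmapped = Arc::new(AtomicBool::new(false));
    let bar = Arc::new(ModelRevocable::new(FakeBar {
        unmapped: unmapped.clone(),
        reg0: 0x1234,
    }));

    // A handle stashed in some object that outlives the driver binding.
    let long_lived = bar.clone();

    let before = long_lived.try_access_with(|b| b.reg0);
    // Device unbind: the devres callback revokes the resource.
    bar.revoke();
    let after = long_lived.try_access_with(|b| b.reg0);

    (before, after, unmapped.load(Ordering::SeqCst))
}
```

Here the handle in `long_lived` still exists after the revocation, but the
`FakeBar` behind it is already gone; handing out the bare resource instead
would tie the unmapping to the lifetime of whatever object holds it.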
>
> > > My fear, that is intensifying as we go through this discussion, is
> > > that rust binding authors have not fully comprehended what the kernel
> > > life cycle model and common design pattern actually is, and have not
> > > fully thought through issues like module unload creating a lifetime
> > > cycle for *function pointers*.
> >
> > I do *not* see where you take the evidence from to make such a generic
> > statement.
>
> Well, I take the basic insistence that it is OK to leak stuff from driver
> scope to module scope is not well designed.
>
Who insists that leaking stuff from driver scope to module scope is OK?