rust-for-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: Danilo Krummrich <dakr@kernel.org>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	Alexandre Courbot <acourbot@nvidia.com>,
	Dave Airlie <airlied@gmail.com>, Gary Guo <gary@garyguo.net>,
	Joel Fernandes <joel@joelfernandes.org>,
	Boqun Feng <boqun.feng@gmail.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Ben Skeggs <bskeggs@nvidia.com>,
	linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org,
	nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	paulmck@kernel.org
Subject: Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation
Date: Thu, 27 Feb 2025 10:23:49 -0400	[thread overview]
Message-ID: <20250227142349.GD39591@nvidia.com> (raw)
In-Reply-To: <2025022644-fleshed-petite-a944@gregkh>

On Wed, Feb 26, 2025 at 05:02:23PM -0800, Greg KH wrote:
> On Wed, Feb 26, 2025 at 07:47:30PM -0400, Jason Gunthorpe wrote:
> > The way misc device works you can't unload the module until all the
> > FDs are closed and the misc code directly handles races with opening
> > new FDs while modules are unloading. It is quite a different scheme
> > than discussed in this thread.
> 
> And I would argue that is it the _right_ scheme to be following overall
> here.  Removing modules with in-flight devices/drivers is to me is odd,
> and only good for developers doing work, not for real systems, right?

There are two issues and I've found these discussions get confused
about two interrelated things:

1) Module lifetime and when modules are refcounted
2) How does device_driver remove() work, especially with hot unplug
   and /sys/../unbind while the module is *still loaded*.

Noting, very explicitly, that you can unbind a device_driver without
unloading the module.

#1 should be strictly based around the needs of function pointers in
the system. Ie stuff like ".owner = THIS_MODULE".

#2 is challenging when the driver has a file descriptor.

AFAIK there are only two broad choices:
 a) wait for all FDs to close in remove() (boo!)
 b) leave the FDs open but disable them and complete remove(). eg
    return -ENODEV to all system calls

I think the kernel community has a strong preference for (b), but rdma
had started with (a) long ago. So we fixed it to (b), netdev does (b),
so do alot of places because (a) is, frankly, awful.

Now.. how does that relate to module unbinding? The drivers are
unbound now because we properly support hotunplug via (b). So when is
it OK to allow a module with no bound drivers to remove while a zombie
FD is still open?

That largely revolves around who owns the struct file_operations. For
misc_dev the driver module would own it, so it is impossible to unload
the driver module even if the device driver was hot unplugged/unbound.

For a subsystem, like rdma, the subsystem can own the
file_operations. Now to allow the driver module to be unloaded we
"simply" require the subsystem to fence all driver callbacks during
device driver remove and subsystem unregister. ie if the subsystem
knows it no longer can call the driver then it no longer needs a
refcount on the driver module.

This fence was necessary anyhow for RDMA because drivers had the
pre-existing assumption that unregister was fencing all driver
callbacks by waiting for the FDs to close. Drivers did not handle UAF
races with something like pci_iounmp() and their concurrent driver
callback threads.

Once the fence was built it was straightforward to also allow driver
module unload since the core code has NULL'd its copy of all the
driver function pointers during unregister.

Further, I'd argue this is the best model for subsystems to
follow. Allowing driver code to continue to run after subsystem
unregister forces the driver to deal with UAF removal races. This is
too hard for drivers to implement correctly, and prevents unloading
the driver module after the drivers have been unbound.

Why do people care? Aside from obvious hot-unplug cases, like physical
PCI hot plug on high-avaibility servers hated (a), there was a strong
desire from folks running software HA schemes to be able to upgrade
the driver module with minimal hits. They want to leave the
application running and it is able to fast-recover when the FD becomes
-ENODEV by opening a new one and keeping most of their internal state
alive.

> What is the requirement that means that you have to do this for function
> pointers? 

I'm just pointing out that function pointers are not guaranteed to be
valid forever in the linux model. Every function pointer is somehow
being protected by a lifecycle that links back to the module
lifecycle.

Most of the time a driver author can ignore function pointer lifecycle
analysis, but not always..

Jason

  parent reply	other threads:[~2025-02-27 14:23 UTC|newest]

Thread overview: 104+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-17 14:04 [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation Alexandre Courbot
2025-02-17 14:04 ` [PATCH RFC 1/3] rust: add useful ops for u64 Alexandre Courbot
2025-02-17 20:47   ` Sergio González Collado
2025-02-17 21:10   ` Daniel Almeida
2025-02-18 13:16     ` Alexandre Courbot
2025-02-18 20:51       ` Timur Tabi
2025-02-19  1:21         ` Alexandre Courbot
2025-02-19  3:24           ` John Hubbard
2025-02-19 12:51             ` Alexandre Courbot
2025-02-19 20:22               ` John Hubbard
2025-02-19 20:23                 ` Dave Airlie
2025-02-19 23:13                   ` Daniel Almeida
2025-02-20  0:14                     ` John Hubbard
2025-02-21 11:35                       ` Alexandre Courbot
2025-02-21 12:31                         ` Danilo Krummrich
2025-02-19 20:11           ` Sergio González Collado
2025-02-18 10:07   ` Dirk Behme
2025-02-18 13:07     ` Alexandre Courbot
2025-02-20  6:23       ` Dirk Behme
2025-02-17 14:04 ` [PATCH RFC 2/3] rust: make ETIMEDOUT error available Alexandre Courbot
2025-02-17 21:15   ` Daniel Almeida
2025-02-17 14:04 ` [PATCH RFC 3/3] gpu: nova-core: add basic timer device Alexandre Courbot
2025-02-17 15:48 ` [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation Simona Vetter
2025-02-18  8:07   ` Greg KH
2025-02-18 13:23     ` Alexandre Courbot
2025-02-17 21:33 ` Danilo Krummrich
2025-02-18  1:46   ` Dave Airlie
2025-02-18 10:26     ` Danilo Krummrich
2025-02-19 12:58       ` Simona Vetter
2025-02-24  1:40     ` Alexandre Courbot
2025-02-24 12:07       ` Danilo Krummrich
2025-02-24 12:11         ` Danilo Krummrich
2025-02-24 18:45           ` Joel Fernandes
2025-02-24 23:44             ` Danilo Krummrich
2025-02-25 15:52               ` Joel Fernandes
2025-02-25 16:09                 ` Danilo Krummrich
2025-02-25 21:02                   ` Joel Fernandes
2025-02-25 22:02                     ` Danilo Krummrich
2025-02-25 22:42                       ` Dave Airlie
2025-02-25 22:57                     ` Jason Gunthorpe
2025-02-25 23:26                       ` Danilo Krummrich
2025-02-25 23:45                       ` Danilo Krummrich
2025-02-26  0:49                         ` Jason Gunthorpe
2025-02-26  1:16                           ` Danilo Krummrich
2025-02-26 17:21                             ` Jason Gunthorpe
2025-02-26 21:31                               ` Danilo Krummrich
2025-02-26 23:47                                 ` Jason Gunthorpe
2025-02-27  0:41                                   ` Boqun Feng
2025-02-27 14:46                                     ` Jason Gunthorpe
2025-02-27 15:18                                       ` Boqun Feng
2025-02-27 16:17                                         ` Jason Gunthorpe
2025-02-27 16:55                                           ` Boqun Feng
2025-02-27 17:32                                             ` Danilo Krummrich
2025-02-27 19:23                                               ` Jason Gunthorpe
2025-02-27 21:25                                                 ` Boqun Feng
2025-02-27 22:00                                                   ` Jason Gunthorpe
2025-02-27 22:40                                                     ` Danilo Krummrich
2025-02-28 18:55                                                       ` Jason Gunthorpe
2025-03-03 19:36                                                         ` Danilo Krummrich
2025-03-03 21:50                                                           ` Jason Gunthorpe
2025-03-04  9:57                                                             ` Danilo Krummrich
2025-02-27  1:02                                   ` Greg KH
2025-02-27  1:34                                     ` John Hubbard
2025-02-27 21:42                                       ` Dave Airlie
2025-02-27 23:06                                         ` John Hubbard
2025-02-28  4:10                                           ` Dave Airlie
2025-02-28 18:50                                             ` Jason Gunthorpe
2025-02-28 10:52                                       ` Simona Vetter
2025-02-28 18:40                                         ` Jason Gunthorpe
2025-03-04 16:10                                           ` Simona Vetter
2025-03-04 16:42                                             ` Jason Gunthorpe
2025-03-05  7:30                                               ` Simona Vetter
2025-03-05 15:10                                                 ` Jason Gunthorpe
2025-03-06 10:42                                                   ` Simona Vetter
2025-03-06 15:32                                                     ` Jason Gunthorpe
2025-03-07 10:28                                                       ` Simona Vetter
2025-03-07 12:32                                                         ` Jason Gunthorpe
2025-03-07 13:09                                                           ` Simona Vetter
2025-03-07 14:55                                                             ` Jason Gunthorpe
2025-03-13 14:32                                                               ` Simona Vetter
2025-03-19 17:21                                                                 ` Jason Gunthorpe
2025-03-21 10:35                                                                   ` Simona Vetter
2025-03-21 12:04                                                                     ` Jason Gunthorpe
2025-03-21 12:12                                                                       ` Danilo Krummrich
2025-03-21 17:49                                                                         ` Jason Gunthorpe
2025-03-21 18:54                                                                           ` Danilo Krummrich
2025-03-07 14:00                                                           ` Greg KH
2025-03-07 14:46                                                             ` Jason Gunthorpe
2025-03-07 15:19                                                               ` Greg KH
2025-03-07 15:25                                                                 ` Jason Gunthorpe
2025-02-27 14:23                                     ` Jason Gunthorpe [this message]
2025-02-27 11:32                                   ` Danilo Krummrich
2025-02-27 15:07                                     ` Jason Gunthorpe
2025-02-27 16:51                                       ` Danilo Krummrich
2025-02-25 14:11         ` Alexandre Courbot
2025-02-25 15:06           ` Danilo Krummrich
2025-02-25 15:23             ` Alexandre Courbot
2025-02-25 15:53               ` Danilo Krummrich
2025-02-27 21:37           ` Dave Airlie
2025-02-28  1:49             ` Timur Tabi
2025-02-28  2:24               ` Dave Airlie
2025-02-18 13:35   ` Alexandre Courbot
2025-02-18  1:42 ` Dave Airlie
2025-02-18 13:47   ` Alexandre Courbot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250227142349.GD39591@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=acourbot@nvidia.com \
    --cc=airlied@gmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=bskeggs@nvidia.com \
    --cc=dakr@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gary@garyguo.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=jhubbard@nvidia.com \
    --cc=joel@joelfernandes.org \
    --cc=joelagnelf@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nouveau@lists.freedesktop.org \
    --cc=paulmck@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).