public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: "Eric W. Biederman" <ebiederm@xmission.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
	 Parav Pandit <parav@nvidia.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"linux-security-module@vger.kernel.org"
	<linux-security-module@vger.kernel.org>,
	 Leon Romanovsky <leonro@nvidia.com>
Subject: Re: [PATCH] RDMA/uverbs: Consider capability of the process that opens the file
Date: Mon, 28 Apr 2025 12:03:47 -0500	[thread overview]
Message-ID: <87tt68cj64.fsf@email.froward.int.ebiederm.org> (raw)
In-Reply-To: <20250425183529.GB2012301@nvidia.com> (Jason Gunthorpe's message of "Fri, 25 Apr 2025 15:35:29 -0300")

Jason Gunthorpe <jgg@nvidia.com> writes:

> On Fri, Apr 25, 2025 at 12:34:21PM -0500, Eric W. Biederman wrote:
>> > What about something like CAP_SYS_RAWIO? I don't think we would ever
>> > make that a per-userns thing, but as a thought experiment, do we check
>> > current->XXX->user_ns or still check ibdev->netns->XX->user_ns?
>> >
>> 
>> Oh.  CAP_SYS_RAWIO is totally is something you can have.  In fact
>> the first process in a user namespace starts out with CAP_SYS_RAWIO.
>> That said it is CAP_SYS_RAWIO with respect to the user namespace.
>> 
>> What would be almost certainly be a bug is for any permission check
>> to be relaxed to ns_capable(resource->user_ns, CAP_SYS_RAWIO).
>
> So a process "has" it but the kernel never accepts it?

Exactly.  Most capabilities possessed by root in a user namespace are
never allowed to do anything or at very rarely allowed to do anything.

Semantically the only things root in a user namespace is allowed to
do are those things that the we only disallow because it would confuse
setuid programs, but are otherwise perfectly fine for an unprivileged
process to perform.

>> I don't know what an infiniband character device refers to.  Is it an
>> attachment of a physical cable to the box like a netdevice?  Is it an
>> infiniband queue-pair?
>
> It refers to a single struct ib_device in the kernel. It is kind of a
> like a namespace in that all the commands executed and uobjects
> created on the FD are relative to the struct ib_device.

Unfortunately I am not familiar with the infiniband kernel abstractions.

>> The names (device major and minor) not living in a network namespace
>> mean that there can be problems for CRIU to migrate a infiniband device,
>> as it's device major and minor number are not guaranteed to be
>> available.  Perhaps that doesn't matter, as the name you open is on a
>> filesystem.  *Shrug*
>
> I don't see a path for CRIU and rdma, there is too much hardware
> state.. Presumably if anyone ever did it they'd have to ignore that
> the major/minor changes.

At this point I would be surprised if there was sufficient resources
being put at the problem.  In principle the hardware state is a
simplified version of a TCP/IP connection and some DMA transactions,
so it isn't an unreasonable thing to imagine.

Especially as it is usually the folks who have long running jobs on
clusters (because no single machine is big enough) that are both use
infiniband and are interested in checkpoint-restart.

The CRIU question is always relevant to ask in a namespace context
as that is very much part of the goal of namespaces.  To abstract
out the names so that migration of resources from one machine
can be implemented in the future.

>> > static int ib_uverbs_open(struct inode *inode, struct file *filp)
>> > {
>> > 	if (!rdma_dev_access_netns(ib_dev, current->nsproxy->net_ns)) {
>> > 		ret = -EPERM;
>> >
>> 
>> > bool rdma_dev_access_netns(const struct ib_device *dev, const struct net *net)
>> > {
>> > 	return (ib_devices_shared_netns ||
>> > 		net_eq(read_pnet(&dev->coredev.rdma_net), net));
>> >
>> > So you can say we 'captured' the net_ns into the FD as there is some
>> > struct file->....->ib_dev->..->net_ns that does not change
>> >
>> > Thus ib_dev->...->user_ns is going to always be the user_ns of the
>> > netns of the process that opened the FD.
>> 
>> Nope.
>> 
>> There is no check against current->cred->user_ns.  So the check has
>> nothing to do with the credentials of the process that opened the
>> character device.
>
> I said "user_ns of the netns"?  Credentials of the process is something
> else?

Exactly the credentials of the a process are not:
	current->nsproxy->net_ns->user_ns;  /* Not this */

The credentials of a process are:
	current->cred;  /* This */

With current->cred->user_ns the current processes user namespace.

> It sounds like we just totally ignore current->cred->user_ns from the
> rdma subsystem perspective?

Since you don't allow anything currently to happen in a user namespace
that is completely reasonable.

Once ns_capable checks start being added that changes.

>> > So.. hopefully final question.. When we are in a system call context
>> > and want to check CAP_NET_XX should we also require that the current
>> > process has the same net ns as the ib_dev?
>> 
>> I want to say in general only for opening the ib_device.

The information I don't have to give a better answer is what
an ib_device actually is from a hardware perspective.  That
is why I was asking about queue-pairs above.

>> The network stack in general uses netlink to talk to network devices
>> (sockets are another matter), so this whole using character devices
>> to talk to devices is very weird to me.
>
> It isn't that different. In netlink you get the FD through socket, in
> char dev you get it through open.

It is just enough different that there are no good examples I can point
you at to say what needs to happen here.

For netlink it is totally clear that all of the permission checks need
to happen at open time.  Because the sender of the netlink message
may be on another machine.

For a FD opened through a character device everything is local and
you are still in the context of the requesting process so permission
checks can still happen, and aren't completely silly.

Eric


  parent reply	other threads:[~2025-04-28 17:35 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-13  5:08 [PATCH] RDMA/uverbs: Consider capability of the process that opens the file Parav Pandit
2025-03-17 19:31 ` Jason Gunthorpe
2025-03-18  3:43   ` Parav Pandit
2025-03-18 11:20     ` Jason Gunthorpe
2025-03-18 12:30       ` Parav Pandit
2025-03-18 12:44         ` Jason Gunthorpe
2025-03-18 20:00       ` Eric W. Biederman
2025-03-18 22:57         ` Jason Gunthorpe
2025-04-04 14:53           ` Parav Pandit
2025-04-04 15:13             ` Jason Gunthorpe
2025-04-06 14:15               ` Serge E. Hallyn
2025-04-07 11:16                 ` Parav Pandit
2025-04-07 14:46                   ` sergeh
2025-04-20 12:30                     ` Parav Pandit
2025-04-20 13:41                       ` Serge E. Hallyn
2025-04-20 17:31                         ` Parav Pandit
2025-04-07 16:12                   ` Jason Gunthorpe
2025-04-08 14:44                     ` Eric W. Biederman
2025-04-21  3:13             ` Serge E. Hallyn
2025-04-21 11:04               ` Parav Pandit
2025-04-21 13:00                 ` Serge E. Hallyn
2025-04-21 13:33                   ` Parav Pandit
2025-04-21 17:22                     ` Serge E. Hallyn
2025-04-22 12:46                       ` Jason Gunthorpe
2025-04-22 13:14                         ` Serge E. Hallyn
2025-04-22 16:11                           ` Jason Gunthorpe
2025-04-22 16:29                             ` Serge E. Hallyn
2025-04-23 12:41                               ` Parav Pandit
2025-04-23 14:46                                 ` Jason Gunthorpe
2025-04-23 15:43                                   ` Eric W. Biederman
2025-04-23 15:56                                     ` Parav Pandit
2025-04-23 16:45                                       ` Jason Gunthorpe
2025-04-24  9:08                                         ` Parav Pandit
2025-04-24 14:13                                           ` Jason Gunthorpe
2025-04-25 13:14                                             ` Parav Pandit
2025-04-25 13:29                                               ` Jason Gunthorpe
2025-04-25 13:54                                                 ` Parav Pandit
2025-04-25 14:06                                                   ` Serge E. Hallyn
2025-04-25 15:05                                                     ` Parav Pandit
2025-04-25 15:29                                                       ` Serge E. Hallyn
2025-04-25 13:59                                                 ` Serge E. Hallyn
2025-04-25 14:01                                                 ` Serge E. Hallyn
2025-04-25 14:24                                                   ` Jason Gunthorpe
2025-04-25 15:06                                                     ` Serge E. Hallyn
2025-04-25 15:27                                                       ` Parav Pandit
2025-04-25 15:46                                                         ` Eric W. Biederman
2025-04-25 16:16                                                           ` Parav Pandit
2025-04-25 15:32                                                     ` Eric W. Biederman
2025-04-25 16:21                                                       ` Jason Gunthorpe
2025-04-25 17:34                                                         ` Eric W. Biederman
2025-04-25 18:20                                                           ` Parav Pandit
2025-04-25 18:35                                                           ` Jason Gunthorpe
2025-04-27 14:30                                                             ` Serge E. Hallyn
2025-04-28 17:03                                                             ` Eric W. Biederman [this message]
2025-04-29  3:56                                                               ` Eric W. Biederman
2025-04-29 10:39                                                               ` Parav Pandit
2025-04-30  3:34                                                                 ` Eric W. Biederman
2025-04-30 12:14                                                                   ` Parav Pandit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tt68cj64.fsf@email.froward.int.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=jgg@nvidia.com \
    --cc=leonro@nvidia.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=parav@nvidia.com \
    --cc=serge@hallyn.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox