From: "Eric W. Biederman" <ebiederm@xmission.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Parav Pandit <parav@nvidia.com>,
"Serge E. Hallyn" <serge@hallyn.com>,
"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
"linux-security-module@vger.kernel.org"
<linux-security-module@vger.kernel.org>,
Leon Romanovsky <leonro@nvidia.com>
Subject: Re: [PATCH] RDMA/uverbs: Consider capability of the process that opens the file
Date: Tue, 08 Apr 2025 09:44:13 -0500
Message-ID: <87ikned876.fsf@email.froward.int.ebiederm.org>
In-Reply-To: <20250407161225.GF1557073@nvidia.com> (Jason Gunthorpe's message of "Mon, 7 Apr 2025 13:12:25 -0300")
Jason Gunthorpe <jgg@nvidia.com> writes:
> On Mon, Apr 07, 2025 at 11:16:35AM +0000, Parav Pandit wrote:
>> > > This all makes my head hurt. The right user namespace is the one that
>> > > is currently active for the invoking process, I couldn't understand
>> > > why we have net namespaces refer to user namespaces :\
>> >
>> > A user at any time can create a new user namespace, without creating a new
>> > network namespace, and have privilege in that user namespace, over
>> > resources owned by the user namespace.
>>
>> > So if a user can create a new user namespace, then say "hey I have
>> > CAP_NET_ADMIN over current_user_ns, so give me access to the RDMA
>> > resources belonging to my current_net_ns", that's a problem.
>
> But why is that possible? If the current user name space does not have
> CAP_NET_ADMIN then why can it create a new user name space that does?
Because it isn't CAP_NET_ADMIN. The capabilities are per user
namespace.
AKA the pair (&init_user_ns, CAP_NET_ADMIN) is what you think of when
you think of CAP_NET_ADMIN.
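To make the pairing concrete, here is a minimal userspace sketch in C.
It is a simplified model and not the kernel's code: ns_capable_model(),
enter_new_user_ns() and demo() are hypothetical names, and the structs
carry only the fields needed for the illustration. It shows that a
capability held in a freshly created user namespace says nothing about
the same capability number relative to init_user_ns.

```c
/* Simplified userspace model of per-user-namespace capabilities.
 * Not kernel code; all helper names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CAP_NET_ADMIN 12 /* value from <linux/capability.h> */

struct user_namespace {
    struct user_namespace *parent; /* NULL for the initial user namespace */
    unsigned int owner_uid;        /* uid that created this namespace */
};

struct cred {
    struct user_namespace *user_ns; /* namespace the capabilities apply to */
    unsigned int euid;
    uint64_t cap_effective;         /* capabilities held relative to user_ns */
};

struct user_namespace init_user_ns = { NULL, 0 };

/* Does cred hold cap with respect to the target user namespace? */
bool ns_capable_model(const struct cred *cred,
                      const struct user_namespace *target, int cap)
{
    for (const struct user_namespace *ns = target; ns; ns = ns->parent) {
        if (ns == cred->user_ns)
            return (cred->cap_effective >> cap) & 1;
        /* The creator of a user namespace has every capability in it. */
        if (ns->parent == cred->user_ns && ns->owner_uid == cred->euid)
            return true;
    }
    return false; /* target is not cred's namespace or a descendant of it */
}

/* Model of unshare(CLONE_NEWUSER): a full capability set, but only
 * relative to the new namespace. */
struct cred enter_new_user_ns(const struct cred *old,
                              struct user_namespace *child)
{
    child->parent = old->user_ns;
    child->owner_uid = old->euid;
    return (struct cred){ child, old->euid, ~(uint64_t)0 };
}

void demo(void)
{
    /* An unprivileged user in the initial user namespace. */
    struct cred user = { &init_user_ns, 1000, 0 };
    struct user_namespace child;
    struct cred inside = enter_new_user_ns(&user, &child);

    /* (child, CAP_NET_ADMIN) is held ... */
    assert(ns_capable_model(&inside, &child, CAP_NET_ADMIN));
    /* ... but (&init_user_ns, CAP_NET_ADMIN) is not. */
    assert(!ns_capable_model(&inside, &init_user_ns, CAP_NET_ADMIN));
}
```

Calling demo() from a main() exercises both cases; the asserts pass
because the lookup walks from the target namespace up through its
parents and never reaches the caller's child namespace when the target
is init_user_ns.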
The reason for this is a lot of things that capabilities guard are only
semantically a problem because it would confuse preexisting suid root
binaries. Binding to the low ports (for example) is no more special
than binding to any other port, except that assumptions can be made
about who has bound to the low ports.
So if you can restrict binding to the low ports to network namespaces
that you and your children control, there is no chance of confusing a
suid root application, and it is a legitimate operation to perform.
In networking terms the user namespace and the subordinate namespaces
created with user namespace permissions are a bit like a tunnel. The
users of a tunnel can do anything inside their tunnel (assign IP
addresses, etc.), and no one will care as long as it all stays inside
the tunnel.
So in essence the question is: do you have capabilities within the
tunnel, or do you have capabilities outside of the tunnel?
Due to historical silliness there is a practical concern about code that
only root could run. People tend not to worry if there are bugs that
allow such code to do unintended things. So even if semantically it is
safe to allow such code, generally the code needs a bit of an audit to
make certain there are no bugs or implementation assumptions that will
be violated when allowing additional functionality in a user namespace.
> And if userspace does have CAP_NET_ADMIN what is the issue with
> creating more user namespaces that also have it?
>
>> > So that's why the check should be ns_capable(device->net->user_ns,
>> > CAP_NET_ADMIN) and not ns_capable(current_user_ns, CAP_NET_ADMIN).
>> >
>> Given the check is of the process (and hence user and net ns) and not of the rdma device itself,
>> Shouldn't we just check,
>>
>> ns_capable(current->nsproxy->user_ns, ...)
>>
>> This ensures current network namespace's owning user ns is consulted.
>
> It sounds like the design does not store the capabilities inside the
> current user_ns, but it logically stores them in other NSs. Ie all the
> net related capabilities are in the netns.
>
> Presumably then we have a mapping of every capability to the proper
> namespace to store it?
Store is the wrong concept. Namespaces remember which user namespace
they were created from. This allows the capability checks to require
that you have the capability in the user namespace that created them,
or in a parent user namespace.
There exists a full set of capabilities that can be present in
a user namespace. The initial process in a user namespace is given
all of those capabilities in its struct cred, just like the init
process is given all capabilities at system start. The difference
is that when all you have are capabilities that are limited to
a user namespace, they don't allow anything to be done (other than
creating namespaces) unless some namespaces are created from that
user namespace.
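The ownership relation can be sketched in userspace C as follows. This
is a simplified model under stated assumptions and not the kernel's
implementation: struct task, ns_capable_model(), may_admin_net(),
may_admin_current_user_ns(), unshare_user(), unshare_net() and demo()
are hypothetical names chosen for the illustration. The only field
taken from the kernel's design is the pointer from a network namespace
to the user namespace it was created from.

```c
/* Simplified userspace model of namespace ownership; not kernel code.
 * All helper names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CAP_NET_ADMIN 12 /* value from <linux/capability.h> */

struct user_namespace {
    struct user_namespace *parent; /* NULL for the initial user namespace */
    unsigned int owner_uid;        /* uid that created this namespace */
};

/* A network namespace remembers the user namespace it was created from. */
struct net {
    struct user_namespace *user_ns;
};

struct task {
    struct user_namespace *user_ns; /* namespace the capabilities apply to */
    unsigned int euid;
    uint64_t cap_effective;
    struct net *net;                /* current network namespace */
};

struct user_namespace init_user_ns = { NULL, 0 };
struct net init_net = { &init_user_ns };

bool ns_capable_model(const struct task *t,
                      const struct user_namespace *target, int cap)
{
    for (const struct user_namespace *ns = target; ns; ns = ns->parent) {
        if (ns == t->user_ns)
            return (t->cap_effective >> cap) & 1;
        if (ns->parent == t->user_ns && ns->owner_uid == t->euid)
            return true; /* creator of the namespace */
    }
    return false;
}

/* Check against the user namespace that owns the network namespace. */
bool may_admin_net(const struct task *t, const struct net *net)
{
    return ns_capable_model(t, net->user_ns, CAP_NET_ADMIN);
}

/* Check against the caller's own user namespace: too permissive. */
bool may_admin_current_user_ns(const struct task *t)
{
    return ns_capable_model(t, t->user_ns, CAP_NET_ADMIN);
}

/* Model of unshare(CLONE_NEWUSER): full capabilities in the new namespace. */
void unshare_user(struct task *t, struct user_namespace *child)
{
    child->parent = t->user_ns;
    child->owner_uid = t->euid;
    t->user_ns = child;
    t->cap_effective = ~(uint64_t)0;
}

/* Model of unshare(CLONE_NEWNET): the new netns is owned by the
 * caller's current user namespace. */
void unshare_net(struct task *t, struct net *net)
{
    net->user_ns = t->user_ns;
    t->net = net;
}

void demo(void)
{
    struct task t = { &init_user_ns, 1000, 0, &init_net };
    struct user_namespace child;
    struct net child_net;

    unshare_user(&t, &child);
    assert(may_admin_current_user_ns(&t)); /* passes, yet t.net is init_net */
    assert(!may_admin_net(&t, t.net));     /* init_net belongs to init_user_ns */

    unshare_net(&t, &child_net);
    assert(may_admin_net(&t, t.net));      /* new netns belongs to child */
    assert(!may_admin_net(&t, &init_net)); /* still nothing outside of it */
}
```

In this model the capabilities gained by creating a user namespace only
become useful once a network namespace has been created from it, and
they never reach back to init_net.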
> If the container has a user namespace and the net ns uses the same
> user namespace then you get the appearance of user namespace
> controlled capabilities...
Essentially yes.
That network namespace requires CAP_NET_ADMIN in the user namespace
it was created within (or a parent user namespace), for its capability
checks.
Eric