From: Leon Romanovsky <leon@kernel.org>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Parav Pandit <parav@nvidia.com>,
"Eric W . Biederman" <ebiederm@xmission.com>,
linux-rdma@vger.kernel.org, Mark Bloch <mbloch@nvidia.com>
Subject: [PATCH rdma-next v1 1/7] RDMA/uverbs: Check CAP_NET_RAW in user namespace for flow create
Date: Thu, 26 Jun 2025 15:05:52 +0300
Message-ID: <9951debcf3e3abedcb3b5f4e477124fec1c336af.1750938869.git.leon@kernel.org>
In-Reply-To: <cover.1750938869.git.leon@kernel.org>
From: Parav Pandit <parav@nvidia.com>
Currently, the CAP_NET_RAW check is done against the default user
namespace, init_user_ns. A process running in a non-default user
namespace therefore fails the check even if it holds CAP_NET_RAW in
its own user namespace. As a result, a process running under podman
fails to create the flow resource.
Since the RDMA device is a resource within a network namespace,
use the network namespace associated with the RDMA device to
determine its owning user namespace, and check the capability there.
Fixes: 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through uverbs")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
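For readers less familiar with the difference between capable() and
ns_capable(), the following is a minimal userspace sketch of the behaviour
change described above. It is not kernel code: every model_* type and
function is a hypothetical stand-in invented for illustration, and the
capability rule is deliberately simplified (it only walks the parent chain
of the target user namespace and ignores the namespace-owner rule that the
real kernel also applies).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are NOT the kernel's types. */
struct model_user_ns {
	const struct model_user_ns *parent;	/* NULL for the initial namespace */
};

struct model_net {
	const struct model_user_ns *user_ns;	/* user namespace owning this netns */
};

struct model_task {
	const struct model_user_ns *user_ns;	/* namespace of the task's credentials */
	bool has_net_raw;			/* CAP_NET_RAW in the effective set */
};

struct model_dev {
	const struct model_net *net;		/* netns the device is assigned to */
};

static const struct model_user_ns model_init_user_ns = { NULL };
static const struct model_net model_init_net = { &model_init_user_ns };

/*
 * Simplified ns_capable(): the task is capable in @target if @target is the
 * task's own user namespace or a descendant of it, and the capability is set.
 */
static bool model_ns_capable(const struct model_task *t,
			     const struct model_user_ns *target)
{
	const struct model_user_ns *ns;

	for (ns = target; ns; ns = ns->parent)
		if (ns == t->user_ns)
			return t->has_net_raw;
	return false;
}

/* Old behaviour: the check is always made against the initial user namespace. */
static bool model_capable(const struct model_task *t)
{
	return model_ns_capable(t, &model_init_user_ns);
}

/* New behaviour: check against the user namespace owning the device's netns. */
static bool model_dev_has_raw_cap(const struct model_task *t,
				  const struct model_dev *dev,
				  bool shared_netns)
{
	const struct model_net *net;

	assert(t != NULL && dev != NULL);
	/* In shared mode the device has no single netns; use the default one. */
	net = shared_netns ? &model_init_net : dev->net;
	return model_ns_capable(t, net->user_ns);
}
```

In this model, a task whose credentials live in a child user namespace and
that holds CAP_NET_RAW there fails model_capable(), but passes
model_dev_has_raw_cap() when the device sits in a network namespace owned by
that child user namespace and shared mode is off. With shared mode on, the
check falls back to the initial user namespace and the same task is rejected.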
drivers/infiniband/core/device.c | 27 +++++++++++++++++++++++++++
drivers/infiniband/core/uverbs_cmd.c | 8 +++++---
include/rdma/ib_verbs.h | 2 ++
3 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 468ed6bd4722..79d8e6fce487 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -145,6 +145,33 @@ bool rdma_dev_access_netns(const struct ib_device *dev, const struct net *net)
}
EXPORT_SYMBOL(rdma_dev_access_netns);
+/**
+ * rdma_dev_has_raw_cap() - Returns whether the caller has CAP_NET_RAW
+ * capability over a specified rdma device or not.
+ *
+ * @dev: Pointer to rdma device whose owning user namespace is to be checked
+ *
+ * Returns true if the calling process has CAP_NET_RAW capability in the rdma
+ * device's owning user namespace, otherwise false. When the rdma subsystem is
+ * in legacy shared network namespace mode, the default net namespace is used.
+ */
+bool rdma_dev_has_raw_cap(const struct ib_device *dev)
+{
+ const struct net *net;
+
+ /* Network namespace is the resource whose user namespace
+ * is to be considered. When in shared mode, there is no reliable
+ * network namespace resource, so consider the default net namespace.
+ */
+ if (ib_devices_shared_netns)
+ net = &init_net;
+ else
+ net = read_pnet(&dev->coredev.rdma_net);
+
+ return ns_capable(net->user_ns, CAP_NET_RAW);
+}
+EXPORT_SYMBOL(rdma_dev_has_raw_cap);
+
/*
* xarray has this behavior where it won't iterate over NULL values stored in
* allocated arrays. So we need our own iterator to see all values stored in
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index bc9fe3ceca4d..08a738a2a1ff 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3225,9 +3225,6 @@ static int ib_uverbs_ex_create_flow(struct uverbs_attr_bundle *attrs)
if (cmd.comp_mask)
return -EINVAL;
- if (!capable(CAP_NET_RAW))
- return -EPERM;
-
if (cmd.flow_attr.flags >= IB_FLOW_ATTR_FLAGS_RESERVED)
return -EINVAL;
@@ -3272,6 +3269,11 @@ static int ib_uverbs_ex_create_flow(struct uverbs_attr_bundle *attrs)
goto err_free_attr;
}
+ if (!rdma_dev_has_raw_cap(uobj->context->device)) {
+ err = -EPERM;
+ goto err_uobj;
+ }
+
if (!rdma_is_port_valid(uobj->context->device, cmd.flow_attr.port)) {
err = -EINVAL;
goto err_uobj;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 38f68d245fa6..5e70a5cf35c3 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -4864,6 +4864,8 @@ static inline int ibdev_to_node(struct ib_device *ibdev)
bool rdma_dev_access_netns(const struct ib_device *device,
const struct net *net);
+bool rdma_dev_has_raw_cap(const struct ib_device *dev);
+
#define IB_ROCE_UDP_ENCAP_VALID_PORT_MIN (0xC000)
#define IB_ROCE_UDP_ENCAP_VALID_PORT_MAX (0xFFFF)
#define IB_GRH_FLOWLABEL_MASK (0x000FFFFF)
--
2.49.0
Thread overview: 9+ messages
2025-06-26 12:05 [PATCH rdma-next v1 0/7] Check CAP_NET_RAW in right namespace Leon Romanovsky
2025-06-26 12:05 ` Leon Romanovsky [this message]
2025-06-26 12:05 ` [PATCH rdma-next v1 2/7] RDMA/uverbs: Check CAP_NET_RAW in user namespace for QP create Leon Romanovsky
2025-06-26 12:05 ` [PATCH rdma-next v1 3/7] RDMA/mlx5: Check CAP_NET_RAW in user namespace for flow create Leon Romanovsky
2025-06-26 12:05 ` [PATCH rdma-next v1 4/7] RDMA/mlx5: Check CAP_NET_RAW in user namespace for anchor create Leon Romanovsky
2025-06-26 12:05 ` [PATCH rdma-next v1 5/7] RDMA/mlx5: Check CAP_NET_RAW in user namespace for devx create Leon Romanovsky
2025-06-26 12:05 ` [PATCH rdma-next v1 6/7] RDMA/counter: Check CAP_NET_RAW check in user namespace for RDMA counters Leon Romanovsky
2025-06-26 12:05 ` [PATCH rdma-next v1 7/7] RDMA/nldev: Check CAP_NET_RAW in user namespace for QP modify Leon Romanovsky
2025-06-26 12:20 ` [PATCH rdma-next v1 0/7] Check CAP_NET_RAW in right namespace Leon Romanovsky