* [PATCH v6 0/4] Handle ARPHRD_NONE devices for siw
@ 2023-07-17 15:12 Chuck Lever
2023-07-17 15:12 ` [PATCH v6 1/4] RDMA/siw: Fabricate a GID on tun and loopback devices Chuck Lever
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Chuck Lever @ 2023-07-17 15:12 UTC
To: leon, jgg; +Cc: Chuck Lever, Tom Talpey, Bernard Metzler, BMT, tom, linux-rdma
Here's a series that implements support for siw on tunnel devices,
based on suggestions from Jason Gunthorpe and Tom Talpey.
Changes since v5:
- Refine comment in cma_validate_port()
Changes since v4:
- Address review comments from Tom Talpey
Changes since v3:
- Clean up RCU dereference in cma_validate_port()
Changes since v2:
- Split into multiple patches
- Pre-initialize gid_attr::ndev for iWARP devices
---
Chuck Lever (4):
RDMA/siw: Fabricate a GID on tun and loopback devices
RDMA/core: Set gid_attr.ndev for iWARP devices
RDMA/cma: Deduplicate error flow in cma_validate_port()
RDMA/cma: Avoid GID lookups on iWARP devices
drivers/infiniband/core/cache.c | 11 +++++++++
drivers/infiniband/core/cma.c | 32 ++++++++++++++++++++++-----
drivers/infiniband/sw/siw/siw.h | 1 +
drivers/infiniband/sw/siw/siw_main.c | 22 +++++++-----------
drivers/infiniband/sw/siw/siw_verbs.c | 4 ++--
5 files changed, 49 insertions(+), 21 deletions(-)
--
Chuck Lever
* [PATCH v6 1/4] RDMA/siw: Fabricate a GID on tun and loopback devices
From: Chuck Lever @ 2023-07-17 15:12 UTC
To: leon, jgg
Cc: Tom Talpey, Bernard Metzler, Tom Talpey, Chuck Lever, BMT, tom,
linux-rdma
From: Chuck Lever <chuck.lever@oracle.com>
LOOPBACK and NONE (tunnel) devices have all-zero MAC addresses.
Currently, siw_device_create() falls back to copying the IB device's
name in those cases, because an all-zero MAC address breaks the RDMA
core address resolution mechanism.
However, when siw_device_create() constructs the GID, the
ib_device::name field has not yet been initialized, so the address
copied from it is still all zeroes.
Fabricate a random artificial GID for such devices, and ensure this
artificial GID is returned for all device query operations.
Reported-by: Tom Talpey <tom@talpey.com>
Link: https://lore.kernel.org/linux-rdma/SA0PR15MB391986C07C4D41E107E79659994FA@SA0PR15MB3919.namprd15.prod.outlook.com/T/#t
Fixes: a2d36b02c15d ("RDMA/siw: Enable siw on tunnel devices")
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
drivers/infiniband/sw/siw/siw.h | 1 +
drivers/infiniband/sw/siw/siw_main.c | 22 ++++++++--------------
drivers/infiniband/sw/siw/siw_verbs.c | 4 ++--
3 files changed, 11 insertions(+), 16 deletions(-)
diff --git a/drivers/infiniband/sw/siw/siw.h b/drivers/infiniband/sw/siw/siw.h
index 2f3a9cda3850..8b4a710b82bc 100644
--- a/drivers/infiniband/sw/siw/siw.h
+++ b/drivers/infiniband/sw/siw/siw.h
@@ -74,6 +74,7 @@ struct siw_device {
u32 vendor_part_id;
int numa_node;
+ char raw_gid[ETH_ALEN];
/* physical port state (only one port per device) */
enum ib_port_state state;
diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
index 65b5cda5457b..f45600d169ae 100644
--- a/drivers/infiniband/sw/siw/siw_main.c
+++ b/drivers/infiniband/sw/siw/siw_main.c
@@ -75,8 +75,7 @@ static int siw_device_register(struct siw_device *sdev, const char *name)
return rv;
}
- siw_dbg(base_dev, "HWaddr=%pM\n", sdev->netdev->dev_addr);
-
+ siw_dbg(base_dev, "HWaddr=%pM\n", sdev->raw_gid);
return 0;
}
@@ -313,24 +312,19 @@ static struct siw_device *siw_device_create(struct net_device *netdev)
return NULL;
base_dev = &sdev->base_dev;
-
sdev->netdev = netdev;
- if (netdev->type != ARPHRD_LOOPBACK && netdev->type != ARPHRD_NONE) {
- addrconf_addr_eui48((unsigned char *)&base_dev->node_guid,
- netdev->dev_addr);
+ if (netdev->addr_len) {
+ memcpy(sdev->raw_gid, netdev->dev_addr,
+ min_t(unsigned int, netdev->addr_len, ETH_ALEN));
} else {
/*
- * This device does not have a HW address,
- * but connection mangagement lib expects gid != 0
+ * This device does not have a HW address, but
+ * connection management requires a unique gid.
*/
- size_t len = min_t(size_t, strlen(base_dev->name), 6);
- char addr[6] = { };
-
- memcpy(addr, base_dev->name, len);
- addrconf_addr_eui48((unsigned char *)&base_dev->node_guid,
- addr);
+ eth_random_addr(sdev->raw_gid);
}
+ addrconf_addr_eui48((u8 *)&base_dev->node_guid, sdev->raw_gid);
base_dev->uverbs_cmd_mask |= BIT_ULL(IB_USER_VERBS_CMD_POST_SEND);
diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
index 398ec13db624..32b0befd25e2 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.c
+++ b/drivers/infiniband/sw/siw/siw_verbs.c
@@ -157,7 +157,7 @@ int siw_query_device(struct ib_device *base_dev, struct ib_device_attr *attr,
attr->vendor_part_id = sdev->vendor_part_id;
addrconf_addr_eui48((u8 *)&attr->sys_image_guid,
- sdev->netdev->dev_addr);
+ sdev->raw_gid);
return 0;
}
@@ -218,7 +218,7 @@ int siw_query_gid(struct ib_device *base_dev, u32 port, int idx,
/* subnet_prefix == interface_id == 0; */
memset(gid, 0, sizeof(*gid));
- memcpy(&gid->raw[0], sdev->netdev->dev_addr, 6);
+ memcpy(gid->raw, sdev->raw_gid, ETH_ALEN);
return 0;
}
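
The address handling in this patch can be modelled in userspace. The following is a minimal sketch, not the kernel code: the sketch_* names are hypothetical, rand() stands in for the kernel's random source, and the two helpers imitate what eth_random_addr() and addrconf_addr_eui48() are understood to do (force a unicast, locally administered address; insert ff:fe and flip the universal/local bit).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6

/* Stand-in for eth_random_addr(): random bytes, then force a unicast,
 * locally administered address. rand() is only a placeholder here. */
static void sketch_random_addr(uint8_t addr[ETH_ALEN])
{
	for (int i = 0; i < ETH_ALEN; i++)
		addr[i] = (uint8_t)(rand() & 0xff);
	addr[0] &= 0xfe;	/* clear the multicast bit */
	addr[0] |= 0x02;	/* set the locally administered bit */
}

/* Stand-in for addrconf_addr_eui48(): expand a 48-bit address into a
 * 64-bit EUI by inserting ff:fe and flipping the universal/local bit. */
static void sketch_addr_eui48(uint8_t eui[8], const uint8_t addr[ETH_ALEN])
{
	memcpy(eui, addr, 3);
	eui[3] = 0xff;
	eui[4] = 0xfe;
	memcpy(eui + 5, addr + 3, 3);
	eui[0] ^= 0x02;
}

/* Mirrors the choice made in siw_device_create(): copy the HW address
 * when the device has one, otherwise fabricate a random one. The memset
 * models the assumption that the device structure starts out zeroed. */
static void sketch_fill_raw_gid(uint8_t raw_gid[ETH_ALEN],
				const uint8_t *dev_addr, unsigned int addr_len)
{
	assert(addr_len == 0 || dev_addr != NULL);

	memset(raw_gid, 0, ETH_ALEN);
	if (addr_len)
		memcpy(raw_gid, dev_addr,
		       addr_len < ETH_ALEN ? addr_len : ETH_ALEN);
	else
		sketch_random_addr(raw_gid);
}
```

Because the fabricated address is stored once in raw_gid, every later query (node GUID, sys_image_guid, GID table entry) derives from the same bytes, which is the point of the new siw_device field.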
* [PATCH v6 2/4] RDMA/core: Set gid_attr.ndev for iWARP devices
From: Chuck Lever @ 2023-07-17 15:12 UTC
To: leon, jgg; +Cc: Tom Talpey, Chuck Lever, BMT, tom, linux-rdma
From: Chuck Lever <chuck.lever@oracle.com>
Have the iWARP side properly set the ndev in the device's sgid_attrs
so that address resolution can treat it more like a RoCE device.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
drivers/infiniband/core/cache.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index 2e91d8879326..33f9d02f9b60 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -1457,6 +1457,17 @@ static int config_non_roce_gid_cache(struct ib_device *device,
i);
goto err;
}
+
+ if (rdma_protocol_iwarp(device, port)) {
+ struct net_device *ndev;
+
+ ndev = ib_device_get_netdev(device, port);
+ if (!ndev)
+ continue;
+ RCU_INIT_POINTER(gid_attr.ndev, ndev);
+ dev_put(ndev);
+ }
+
gid_attr.index = i;
tprops->subnet_prefix =
be64_to_cpu(gid_attr.gid.global.subnet_prefix);
* [PATCH v6 3/4] RDMA/cma: Deduplicate error flow in cma_validate_port()
From: Chuck Lever @ 2023-07-17 15:12 UTC
To: leon, jgg; +Cc: Chuck Lever, BMT, tom, linux-rdma
From: Chuck Lever <chuck.lever@oracle.com>
Clean up: route every early -ENODEV return in cma_validate_port()
through a single exit label, to prepare for the addition of new logic.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
drivers/infiniband/core/cma.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 1ee87c3aaeab..da54167723d6 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -686,30 +686,31 @@ cma_validate_port(struct ib_device *device, u32 port,
struct rdma_id_private *id_priv)
{
struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
+ const struct ib_gid_attr *sgid_attr = ERR_PTR(-ENODEV);
int bound_if_index = dev_addr->bound_dev_if;
- const struct ib_gid_attr *sgid_attr;
int dev_type = dev_addr->dev_type;
struct net_device *ndev = NULL;
if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
- return ERR_PTR(-ENODEV);
+ goto out;
if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
- return ERR_PTR(-ENODEV);
+ goto out;
if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port))
- return ERR_PTR(-ENODEV);
+ goto out;
if (dev_type == ARPHRD_ETHER && rdma_protocol_roce(device, port)) {
ndev = dev_get_by_index(dev_addr->net, bound_if_index);
if (!ndev)
- return ERR_PTR(-ENODEV);
+ goto out;
} else {
gid_type = IB_GID_TYPE_IB;
}
sgid_attr = rdma_find_gid_by_port(device, gid, gid_type, port, ndev);
dev_put(ndev);
+out:
return sgid_attr;
}
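
The shape of this cleanup (pre-initialize the result to an error value, then funnel every failure through one label) can be sketched in userspace. This is an illustration only: the sketch_* helpers are hypothetical stand-ins for ERR_PTR(), IS_ERR() and PTR_ERR(), and the two boolean parameters replace the real namespace and device-type checks.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's error-pointer helpers. */
static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int sketch_is_err(const void *p)
{
	/* Error pointers occupy the top 4095 values of the address space. */
	return (uintptr_t)p >= (uintptr_t)-4095;
}

static inline long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct sketch_attr {
	int index;
};

/* Single-exit validation: the result starts out as -ENODEV, so each
 * failed check only needs "goto out" rather than its own return. */
static inline const struct sketch_attr *
sketch_validate(int netns_ok, int type_ok, const struct sketch_attr *entry)
{
	const struct sketch_attr *attr = sketch_err_ptr(-ENODEV);

	if (!netns_ok)
		goto out;
	if (!type_ok)
		goto out;

	attr = entry;	/* the only path that replaces the error value */
out:
	return attr;
}
```

With this layout, a later patch can add a new early-exit branch without repeating the error value.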
* [PATCH v6 4/4] RDMA/cma: Avoid GID lookups on iWARP devices
From: Chuck Lever @ 2023-07-17 15:12 UTC
To: leon, jgg; +Cc: Chuck Lever, BMT, tom, linux-rdma
From: Chuck Lever <chuck.lever@oracle.com>
We would like to enable the use of siw on top of a VPN that is
constructed and managed via a tun device. That hasn't worked up
until now because ARPHRD_NONE devices (such as tun devices) have
no GID for the RDMA/core to look up.
But it turns out that the egress device has already been picked for
us -- no GID is necessary. addr_handler() just has to do the right
thing with it.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
drivers/infiniband/core/cma.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index da54167723d6..8bd6cb867381 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -700,6 +700,27 @@ cma_validate_port(struct ib_device *device, u32 port,
if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port))
goto out;
+ /*
+ * For drivers that do not associate more than one net device with
+ * their gid tables, such as iWARP drivers, it is sufficient to
+ * return the first table entry.
+ *
+ * Other driver classes might be included in the future.
+ */
+ if (rdma_protocol_iwarp(device, port)) {
+ sgid_attr = rdma_get_gid_attr(device, port, 0);
+ if (IS_ERR(sgid_attr))
+ goto out;
+
+ rcu_read_lock();
+ ndev = rcu_dereference(sgid_attr->ndev);
+ if (!net_eq(dev_net(ndev), dev_addr->net) ||
+ ndev->ifindex != bound_if_index)
+ sgid_attr = ERR_PTR(-ENODEV);
+ rcu_read_unlock();
+ goto out;
+ }
+
if (dev_type == ARPHRD_ETHER && rdma_protocol_roce(device, port)) {
ndev = dev_get_by_index(dev_addr->net, bound_if_index);
if (!ndev)
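
The iWARP branch added above reduces to "take table entry 0 and accept it only if its net device is the one address resolution already bound to". A rough userspace model of that decision follows. All sketch_* types are hypothetical, an integer stands in for the network namespace, NULL stands in for ERR_PTR(-ENODEV), and the NULL check on ndev is a defensive addition in this sketch that the patch itself does not make. Reference counting and RCU are omitted.

```c
#include <stddef.h>

/* Hypothetical, simplified views of the kernel objects involved. */
struct sketch_netdev {
	int ifindex;	/* interface index */
	int netns_id;	/* stand-in for the device's network namespace */
};

struct sketch_gid_attr {
	const struct sketch_netdev *ndev;
};

/* Return entry 0 when its net device matches the bound namespace and
 * interface, NULL otherwise. */
static inline const struct sketch_gid_attr *
sketch_iwarp_sgid(const struct sketch_gid_attr *table, size_t len,
		  int netns_id, int bound_if_index)
{
	const struct sketch_netdev *ndev;

	if (len == 0)
		return NULL;

	ndev = table[0].ndev;
	if (!ndev)	/* defensive check, not part of the patch */
		return NULL;

	if (ndev->netns_id != netns_id || ndev->ifindex != bound_if_index)
		return NULL;

	return &table[0];
}
```

No GID comparison appears anywhere in the model, which matches the commit message: the egress device has already been chosen, so only the device identity needs checking.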
* Re: [PATCH v6 0/4] Handle ARPHRD_NONE devices for siw
From: Jason Gunthorpe @ 2023-07-21 19:08 UTC
To: Chuck Lever; +Cc: leon, Chuck Lever, Tom Talpey, Bernard Metzler, linux-rdma
On Mon, Jul 17, 2023 at 11:12:05AM -0400, Chuck Lever wrote:
> Here's a series that implements support for siw on tunnel devices,
> based on suggestions from Jason Gunthorpe and Tom Talpey.
>
>
> Changes since v5:
> - Refine comment in cma_validate_port()
>
> Changes since v4:
> - Address review comments from Tom Talpey
>
> Changes since v3:
> - Clean up RCU dereference in cma_validate_port()
>
> Changes since v2:
> - Split into multiple patches
> - Pre-initialize gid_attr::ndev for iWARP devices
>
> ---
>
> Chuck Lever (4):
> RDMA/siw: Fabricate a GID on tun and loopback devices
> RDMA/core: Set gid_attr.ndev for iWARP devices
> RDMA/cma: Deduplicate error flow in cma_validate_port()
> RDMA/cma: Avoid GID lookups on iWARP devices
Applied to for-next, thanks
Jason