* [PATCH v3 0/4] Handle ARPHRD_NONE devices for siw
@ 2023-06-14 14:00 Chuck Lever
2023-06-14 14:00 ` [PATCH v3 1/4] RDMA/siw: Fabricate a GID on tun and loopback devices Chuck Lever
` (4 more replies)
0 siblings, 5 replies; 15+ messages in thread
From: Chuck Lever @ 2023-06-14 14:00 UTC (permalink / raw)
To: jgg; +Cc: Tom Talpey, Chuck Lever, linux-rdma, tom, BMT
Here's a series that implements support for siw on tunnel devices,
based on suggestions from Jason Gunthorpe and Tom Talpey.
This series does not address a similar issue with rxe because RoCE
GID resolution behaves differently than it does for iWARP devices.
An independent solution is likely to be required for rxe.
Changes since v2:
- Split into multiple patches
- Pre-initialize gid_attr::ndev for iWARP devices
---
Chuck Lever (4):
RDMA/siw: Fabricate a GID on tun and loopback devices
RDMA/core: Set gid_attr.ndev for iWARP devices
RDMA/cma: Deduplicate error flow in cma_validate_port()
RDMA/cma: Avoid GID lookups on iWARP devices
drivers/infiniband/core/cache.c | 12 ++++++++++++
drivers/infiniband/core/cma.c | 24 +++++++++++++++++++-----
drivers/infiniband/sw/siw/siw.h | 1 +
drivers/infiniband/sw/siw/siw_main.c | 22 ++++++++--------------
drivers/infiniband/sw/siw/siw_verbs.c | 4 ++--
5 files changed, 42 insertions(+), 21 deletions(-)
--
Chuck Lever
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v3 1/4] RDMA/siw: Fabricate a GID on tun and loopback devices
2023-06-14 14:00 [PATCH v3 0/4] Handle ARPHRD_NONE devices for siw Chuck Lever
@ 2023-06-14 14:00 ` Chuck Lever
2023-06-15 8:18 ` Bernard Metzler
2023-06-23 18:25 ` Tom Talpey
2023-06-14 14:00 ` [PATCH v3 2/4] RDMA/core: Set gid_attr.ndev for iWARP devices Chuck Lever
` (3 subsequent siblings)
4 siblings, 2 replies; 15+ messages in thread
From: Chuck Lever @ 2023-06-14 14:00 UTC (permalink / raw)
To: jgg; +Cc: Tom Talpey, Chuck Lever, linux-rdma, tom, BMT
From: Chuck Lever <chuck.lever@oracle.com>
LOOPBACK and NONE (tunnel) devices have all-zero MAC addresses.
Currently, siw_device_create() falls back to copying the IB device's
name in those cases, because an all-zero MAC address breaks the RDMA
core address resolution mechanism.
However, at the point when siw_device_create() constructs a GID, the
ib_device::name field is uninitialized, leaving the MAC address to
remain in an all-zero state.
Fabricate a random artificial GID for such devices, and ensure that
artificial GID is returned for all device query operations.
Reported-by: Tom Talpey <tom@talpey.com>
Link: https://lore.kernel.org/linux-rdma/SA0PR15MB391986C07C4D41E107E79659994FA@SA0PR15MB3919.namprd15.prod.outlook.com/T/#t
Fixes: a2d36b02c15d ("RDMA/siw: Enable siw on tunnel devices")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
drivers/infiniband/sw/siw/siw.h | 1 +
drivers/infiniband/sw/siw/siw_main.c | 22 ++++++++--------------
drivers/infiniband/sw/siw/siw_verbs.c | 4 ++--
3 files changed, 11 insertions(+), 16 deletions(-)
diff --git a/drivers/infiniband/sw/siw/siw.h b/drivers/infiniband/sw/siw/siw.h
index 2f3a9cda3850..8b4a710b82bc 100644
--- a/drivers/infiniband/sw/siw/siw.h
+++ b/drivers/infiniband/sw/siw/siw.h
@@ -74,6 +74,7 @@ struct siw_device {
u32 vendor_part_id;
int numa_node;
+ char raw_gid[ETH_ALEN];
/* physical port state (only one port per device) */
enum ib_port_state state;
diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
index 65b5cda5457b..f45600d169ae 100644
--- a/drivers/infiniband/sw/siw/siw_main.c
+++ b/drivers/infiniband/sw/siw/siw_main.c
@@ -75,8 +75,7 @@ static int siw_device_register(struct siw_device *sdev, const char *name)
return rv;
}
- siw_dbg(base_dev, "HWaddr=%pM\n", sdev->netdev->dev_addr);
-
+ siw_dbg(base_dev, "HWaddr=%pM\n", sdev->raw_gid);
return 0;
}
@@ -313,24 +312,19 @@ static struct siw_device *siw_device_create(struct net_device *netdev)
return NULL;
base_dev = &sdev->base_dev;
-
sdev->netdev = netdev;
- if (netdev->type != ARPHRD_LOOPBACK && netdev->type != ARPHRD_NONE) {
- addrconf_addr_eui48((unsigned char *)&base_dev->node_guid,
- netdev->dev_addr);
+ if (netdev->addr_len) {
+ memcpy(sdev->raw_gid, netdev->dev_addr,
+ min_t(unsigned int, netdev->addr_len, ETH_ALEN));
} else {
/*
- * This device does not have a HW address,
- * but connection mangagement lib expects gid != 0
+ * This device does not have a HW address, but
+ * connection mangagement requires a unique gid.
*/
- size_t len = min_t(size_t, strlen(base_dev->name), 6);
- char addr[6] = { };
-
- memcpy(addr, base_dev->name, len);
- addrconf_addr_eui48((unsigned char *)&base_dev->node_guid,
- addr);
+ eth_random_addr(sdev->raw_gid);
}
+ addrconf_addr_eui48((u8 *)&base_dev->node_guid, sdev->raw_gid);
base_dev->uverbs_cmd_mask |= BIT_ULL(IB_USER_VERBS_CMD_POST_SEND);
diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
index 398ec13db624..32b0befd25e2 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.c
+++ b/drivers/infiniband/sw/siw/siw_verbs.c
@@ -157,7 +157,7 @@ int siw_query_device(struct ib_device *base_dev, struct ib_device_attr *attr,
attr->vendor_part_id = sdev->vendor_part_id;
addrconf_addr_eui48((u8 *)&attr->sys_image_guid,
- sdev->netdev->dev_addr);
+ sdev->raw_gid);
return 0;
}
@@ -218,7 +218,7 @@ int siw_query_gid(struct ib_device *base_dev, u32 port, int idx,
/* subnet_prefix == interface_id == 0; */
memset(gid, 0, sizeof(*gid));
- memcpy(&gid->raw[0], sdev->netdev->dev_addr, 6);
+ memcpy(gid->raw, sdev->raw_gid, ETH_ALEN);
return 0;
}
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v3 2/4] RDMA/core: Set gid_attr.ndev for iWARP devices
2023-06-14 14:00 [PATCH v3 0/4] Handle ARPHRD_NONE devices for siw Chuck Lever
2023-06-14 14:00 ` [PATCH v3 1/4] RDMA/siw: Fabricate a GID on tun and loopback devices Chuck Lever
@ 2023-06-14 14:00 ` Chuck Lever
2023-06-23 18:37 ` Tom Talpey
2023-06-14 14:00 ` [PATCH v3 3/4] RDMA/cma: Deduplicate error flow in cma_validate_port() Chuck Lever
` (2 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Chuck Lever @ 2023-06-14 14:00 UTC (permalink / raw)
To: jgg; +Cc: Chuck Lever, linux-rdma, tom, BMT
From: Chuck Lever <chuck.lever@oracle.com>
Have the iwarp side properly set the ndev in the device's sgid_attrs
so that address resolution can treat it more like a RoCE device.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
drivers/infiniband/core/cache.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index 2e91d8879326..717524fe8a39 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -1439,6 +1439,7 @@ static int config_non_roce_gid_cache(struct ib_device *device,
{
struct ib_gid_attr gid_attr = {};
struct ib_gid_table *table;
+ struct net_device *ndev;
int ret = 0;
int i;
@@ -1457,10 +1458,21 @@ static int config_non_roce_gid_cache(struct ib_device *device,
i);
goto err;
}
+
+ ndev = NULL;
+ if (rdma_protocol_iwarp(device, port)) {
+ ndev = ib_device_get_netdev(device, port);
+ if (!ndev)
+ continue;
+ RCU_INIT_POINTER(gid_attr.ndev, ndev);
+ }
+
gid_attr.index = i;
tprops->subnet_prefix =
be64_to_cpu(gid_attr.gid.global.subnet_prefix);
add_modify_gid(table, &gid_attr);
+
+ dev_put(ndev);
}
err:
mutex_unlock(&table->lock);
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v3 3/4] RDMA/cma: Deduplicate error flow in cma_validate_port()
2023-06-14 14:00 [PATCH v3 0/4] Handle ARPHRD_NONE devices for siw Chuck Lever
2023-06-14 14:00 ` [PATCH v3 1/4] RDMA/siw: Fabricate a GID on tun and loopback devices Chuck Lever
2023-06-14 14:00 ` [PATCH v3 2/4] RDMA/core: Set gid_attr.ndev for iWARP devices Chuck Lever
@ 2023-06-14 14:00 ` Chuck Lever
2023-06-14 14:00 ` [PATCH v3 4/4] RDMA/cma: Avoid GID lookups on iWARP devices Chuck Lever
2023-06-15 11:59 ` [PATCH v3 0/4] Handle ARPHRD_NONE devices for siw Zhu Yanjun
4 siblings, 0 replies; 15+ messages in thread
From: Chuck Lever @ 2023-06-14 14:00 UTC (permalink / raw)
To: jgg; +Cc: Chuck Lever, linux-rdma, tom, BMT
From: Chuck Lever <chuck.lever@oracle.com>
Clean up to prepare for the addition of new logic.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
drivers/infiniband/core/cma.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 93a1c48d0c32..a1756ed1faa1 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -686,30 +686,31 @@ cma_validate_port(struct ib_device *device, u32 port,
struct rdma_id_private *id_priv)
{
struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
+ const struct ib_gid_attr *sgid_attr = ERR_PTR(-ENODEV);
int bound_if_index = dev_addr->bound_dev_if;
- const struct ib_gid_attr *sgid_attr;
int dev_type = dev_addr->dev_type;
struct net_device *ndev = NULL;
if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
- return ERR_PTR(-ENODEV);
+ goto out;
if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
- return ERR_PTR(-ENODEV);
+ goto out;
if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port))
- return ERR_PTR(-ENODEV);
+ goto out;
if (dev_type == ARPHRD_ETHER && rdma_protocol_roce(device, port)) {
ndev = dev_get_by_index(dev_addr->net, bound_if_index);
if (!ndev)
- return ERR_PTR(-ENODEV);
+ goto out;
} else {
gid_type = IB_GID_TYPE_IB;
}
sgid_attr = rdma_find_gid_by_port(device, gid, gid_type, port, ndev);
dev_put(ndev);
+out:
return sgid_attr;
}
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v3 4/4] RDMA/cma: Avoid GID lookups on iWARP devices
2023-06-14 14:00 [PATCH v3 0/4] Handle ARPHRD_NONE devices for siw Chuck Lever
` (2 preceding siblings ...)
2023-06-14 14:00 ` [PATCH v3 3/4] RDMA/cma: Deduplicate error flow in cma_validate_port() Chuck Lever
@ 2023-06-14 14:00 ` Chuck Lever
2023-06-22 15:13 ` Jason Gunthorpe
2023-06-15 11:59 ` [PATCH v3 0/4] Handle ARPHRD_NONE devices for siw Zhu Yanjun
4 siblings, 1 reply; 15+ messages in thread
From: Chuck Lever @ 2023-06-14 14:00 UTC (permalink / raw)
To: jgg; +Cc: Chuck Lever, linux-rdma, tom, BMT
From: Chuck Lever <chuck.lever@oracle.com>
We would like to enable the use of siw on top of a VPN that is
constructed and managed via a tun device. That hasn't worked up
until now because ARPHRD_NONE devices (such as tun devices) have
no GID for the RDMA/core to look up.
But it turns out that the egress device has already been picked for
us -- no GID is necessary. addr_handler() just has to do the right
thing with it.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
drivers/infiniband/core/cma.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index a1756ed1faa1..50b8da2c4720 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -700,6 +700,19 @@ cma_validate_port(struct ib_device *device, u32 port,
if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port))
goto out;
+ if (rdma_protocol_iwarp(device, port)) {
+ sgid_attr = rdma_get_gid_attr(device, port, 0);
+ if (IS_ERR(sgid_attr))
+ goto out;
+
+ /* XXX: I don't think this is RCU-safe, but does it need to be? */
+ ndev = rcu_dereference(sgid_attr->ndev);
+ if (!net_eq(dev_net(ndev), dev_addr->net) ||
+ ndev->ifindex != bound_if_index)
+ sgid_attr = ERR_PTR(-ENODEV);
+ goto out;
+ }
+
if (dev_type == ARPHRD_ETHER && rdma_protocol_roce(device, port)) {
ndev = dev_get_by_index(dev_addr->net, bound_if_index);
if (!ndev)
^ permalink raw reply related [flat|nested] 15+ messages in thread
* RE: [PATCH v3 1/4] RDMA/siw: Fabricate a GID on tun and loopback devices
2023-06-14 14:00 ` [PATCH v3 1/4] RDMA/siw: Fabricate a GID on tun and loopback devices Chuck Lever
@ 2023-06-15 8:18 ` Bernard Metzler
2023-06-23 18:25 ` Tom Talpey
1 sibling, 0 replies; 15+ messages in thread
From: Bernard Metzler @ 2023-06-15 8:18 UTC (permalink / raw)
To: Chuck Lever, jgg@nvidia.com
Cc: Tom Talpey, Chuck Lever, linux-rdma@vger.kernel.org,
tom@talpey.com
> -----Original Message-----
> From: Chuck Lever <cel@kernel.org>
> Sent: Wednesday, 14 June 2023 16:01
> To: jgg@nvidia.com
> Cc: Tom Talpey <tom@talpey.com>; Chuck Lever <chuck.lever@oracle.com>;
> linux-rdma@vger.kernel.org; tom@talpey.com; Bernard Metzler
> <BMT@zurich.ibm.com>
> Subject: [EXTERNAL] [PATCH v3 1/4] RDMA/siw: Fabricate a GID on tun and
> loopback devices
>
> From: Chuck Lever <chuck.lever@oracle.com>
>
> LOOPBACK and NONE (tunnel) devices have all-zero MAC addresses.
> Currently, siw_device_create() falls back to copying the IB device's
> name in those cases, because an all-zero MAC address breaks the RDMA
> core address resolution mechanism.
>
> However, at the point when siw_device_create() constructs a GID, the
> ib_device::name field is uninitialized, leaving the MAC address to
> remain in an all-zero state.
>
> Fabricate a random artificial GID for such devices, and ensure that
> artificial GID is returned for all device query operations.
>
> Reported-by: Tom Talpey <tom@talpey.com>
> Link: INVALID URI REMOVED
> 3A__lore.kernel.org_linux-
> 2Drdma_SA0PR15MB391986C07C4D41E107E79659994FA-
> 40SA0PR15MB3919.namprd15.prod.outlook.com_T_-
> 23t&d=DwICaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=2TaYXQ0T-
> r8ZO1PP1alNwU_QJcRRLfmYTAgd3QCvqSc&m=w7mQNFzAaTSOyd34VFHqKq
> LkkCgAazZP2N4AAIhvYSeGp1Yql6zFl5sRSI5me4ZP&s=h4oUDwpMECpIMGmo
> nuXTcpqpDRIvWBQWq32ALVPSQjk&e=
> Fixes: a2d36b02c15d ("RDMA/siw: Enable siw on tunnel devices")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> drivers/infiniband/sw/siw/siw.h | 1 +
> drivers/infiniband/sw/siw/siw_main.c | 22 ++++++++--------------
> drivers/infiniband/sw/siw/siw_verbs.c | 4 ++--
> 3 files changed, 11 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/infiniband/sw/siw/siw.h
> b/drivers/infiniband/sw/siw/siw.h
> index 2f3a9cda3850..8b4a710b82bc 100644
> --- a/drivers/infiniband/sw/siw/siw.h
> +++ b/drivers/infiniband/sw/siw/siw.h
> @@ -74,6 +74,7 @@ struct siw_device {
>
> u32 vendor_part_id;
> int numa_node;
> + char raw_gid[ETH_ALEN];
>
> /* physical port state (only one port per device) */
> enum ib_port_state state;
> diff --git a/drivers/infiniband/sw/siw/siw_main.c
> b/drivers/infiniband/sw/siw/siw_main.c
> index 65b5cda5457b..f45600d169ae 100644
> --- a/drivers/infiniband/sw/siw/siw_main.c
> +++ b/drivers/infiniband/sw/siw/siw_main.c
> @@ -75,8 +75,7 @@ static int siw_device_register(struct siw_device *sdev,
> const char *name)
> return rv;
> }
>
> - siw_dbg(base_dev, "HWaddr=%pM\n", sdev->netdev->dev_addr);
> -
> + siw_dbg(base_dev, "HWaddr=%pM\n", sdev->raw_gid);
> return 0;
> }
>
> @@ -313,24 +312,19 @@ static struct siw_device *siw_device_create(struct
> net_device *netdev)
> return NULL;
>
> base_dev = &sdev->base_dev;
> -
> sdev->netdev = netdev;
>
> - if (netdev->type != ARPHRD_LOOPBACK && netdev->type !=
> ARPHRD_NONE) {
> - addrconf_addr_eui48((unsigned char *)&base_dev-
> >node_guid,
> - netdev->dev_addr);
> + if (netdev->addr_len) {
> + memcpy(sdev->raw_gid, netdev->dev_addr,
> + min_t(unsigned int, netdev->addr_len, ETH_ALEN));
> } else {
> /*
> - * This device does not have a HW address,
> - * but connection mangagement lib expects gid != 0
> + * This device does not have a HW address, but
> + * connection mangagement requires a unique gid.
> */
> - size_t len = min_t(size_t, strlen(base_dev->name), 6);
> - char addr[6] = { };
> -
> - memcpy(addr, base_dev->name, len);
> - addrconf_addr_eui48((unsigned char *)&base_dev-
> >node_guid,
> - addr);
> + eth_random_addr(sdev->raw_gid);
> }
> + addrconf_addr_eui48((u8 *)&base_dev->node_guid, sdev->raw_gid);
>
> base_dev->uverbs_cmd_mask |=
> BIT_ULL(IB_USER_VERBS_CMD_POST_SEND);
>
> diff --git a/drivers/infiniband/sw/siw/siw_verbs.c
> b/drivers/infiniband/sw/siw/siw_verbs.c
> index 398ec13db624..32b0befd25e2 100644
> --- a/drivers/infiniband/sw/siw/siw_verbs.c
> +++ b/drivers/infiniband/sw/siw/siw_verbs.c
> @@ -157,7 +157,7 @@ int siw_query_device(struct ib_device *base_dev,
> struct ib_device_attr *attr,
> attr->vendor_part_id = sdev->vendor_part_id;
>
> addrconf_addr_eui48((u8 *)&attr->sys_image_guid,
> - sdev->netdev->dev_addr);
> + sdev->raw_gid);
>
> return 0;
> }
> @@ -218,7 +218,7 @@ int siw_query_gid(struct ib_device *base_dev, u32
> port, int idx,
>
> /* subnet_prefix == interface_id == 0; */
> memset(gid, 0, sizeof(*gid));
> - memcpy(&gid->raw[0], sdev->netdev->dev_addr, 6);
> + memcpy(gid->raw, sdev->raw_gid, ETH_ALEN);
>
> return 0;
> }
>
Thank you Chuck. Looks good to me!
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 0/4] Handle ARPHRD_NONE devices for siw
2023-06-14 14:00 [PATCH v3 0/4] Handle ARPHRD_NONE devices for siw Chuck Lever
` (3 preceding siblings ...)
2023-06-14 14:00 ` [PATCH v3 4/4] RDMA/cma: Avoid GID lookups on iWARP devices Chuck Lever
@ 2023-06-15 11:59 ` Zhu Yanjun
2023-06-15 13:28 ` Chuck Lever III
4 siblings, 1 reply; 15+ messages in thread
From: Zhu Yanjun @ 2023-06-15 11:59 UTC (permalink / raw)
To: Chuck Lever, jgg; +Cc: Tom Talpey, Chuck Lever, linux-rdma, BMT
在 2023/6/14 22:00, Chuck Lever 写道:
> Here's a series that implements support for siw on tunnel devices,
> based on suggestions from Jason Gunthorpe and Tom Talpey.
>
> This series does not address a similar issue with rxe because RoCE
> GID resolution behaves differently than it does for iWARP devices.
> An independent solution is likely to be required for rxe.
Thanks a lot for letting me know. Look forward to the independent
solution for rxe.
Zhu Yanjun
>
> Changes since v2:
> - Split into multiple patches
> - Pre-initialize gid_attr::ndev for iWARP devices
>
> ---
>
> Chuck Lever (4):
> RDMA/siw: Fabricate a GID on tun and loopback devices
> RDMA/core: Set gid_attr.ndev for iWARP devices
> RDMA/cma: Deduplicate error flow in cma_validate_port()
> RDMA/cma: Avoid GID lookups on iWARP devices
>
>
> drivers/infiniband/core/cache.c | 12 ++++++++++++
> drivers/infiniband/core/cma.c | 24 +++++++++++++++++++-----
> drivers/infiniband/sw/siw/siw.h | 1 +
> drivers/infiniband/sw/siw/siw_main.c | 22 ++++++++--------------
> drivers/infiniband/sw/siw/siw_verbs.c | 4 ++--
> 5 files changed, 42 insertions(+), 21 deletions(-)
>
> --
> Chuck Lever
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 0/4] Handle ARPHRD_NONE devices for siw
2023-06-15 11:59 ` [PATCH v3 0/4] Handle ARPHRD_NONE devices for siw Zhu Yanjun
@ 2023-06-15 13:28 ` Chuck Lever III
0 siblings, 0 replies; 15+ messages in thread
From: Chuck Lever III @ 2023-06-15 13:28 UTC (permalink / raw)
To: Zhu Yanjun
Cc: Chuck Lever, Jason Gunthorpe, Tom Talpey, linux-rdma,
BMT@zurich.ibm.com
> On Jun 15, 2023, at 7:59 AM, Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>
> 在 2023/6/14 22:00, Chuck Lever 写道:
>> Here's a series that implements support for siw on tunnel devices,
>> based on suggestions from Jason Gunthorpe and Tom Talpey.
>> This series does not address a similar issue with rxe because RoCE
>> GID resolution behaves differently than it does for iWARP devices.
>> An independent solution is likely to be required for rxe.
>
> Thanks a lot for letting me know.
Ooops, I should have Cc'd you on this. I just mindlessly copied the
to/cc from the previous version. I will try to remember next time.
> Look forward to the independent solution for rxe.
>
> Zhu Yanjun
>
>> Changes since v2:
>> - Split into multiple patches
>> - Pre-initialize gid_attr::ndev for iWARP devices
>> ---
>> Chuck Lever (4):
>> RDMA/siw: Fabricate a GID on tun and loopback devices
>> RDMA/core: Set gid_attr.ndev for iWARP devices
>> RDMA/cma: Deduplicate error flow in cma_validate_port()
>> RDMA/cma: Avoid GID lookups on iWARP devices
>> drivers/infiniband/core/cache.c | 12 ++++++++++++
>> drivers/infiniband/core/cma.c | 24 +++++++++++++++++++-----
>> drivers/infiniband/sw/siw/siw.h | 1 +
>> drivers/infiniband/sw/siw/siw_main.c | 22 ++++++++--------------
>> drivers/infiniband/sw/siw/siw_verbs.c | 4 ++--
>> 5 files changed, 42 insertions(+), 21 deletions(-)
>> --
>> Chuck Lever
>
--
Chuck Lever
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 4/4] RDMA/cma: Avoid GID lookups on iWARP devices
2023-06-14 14:00 ` [PATCH v3 4/4] RDMA/cma: Avoid GID lookups on iWARP devices Chuck Lever
@ 2023-06-22 15:13 ` Jason Gunthorpe
2023-06-23 17:02 ` Chuck Lever III
0 siblings, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2023-06-22 15:13 UTC (permalink / raw)
To: Chuck Lever; +Cc: Chuck Lever, linux-rdma, tom, BMT
On Wed, Jun 14, 2023 at 10:00:59AM -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> We would like to enable the use of siw on top of a VPN that is
> constructed and managed via a tun device. That hasn't worked up
> until now because ARPHRD_NONE devices (such as tun devices) have
> no GID for the RDMA/core to look up.
>
> But it turns out that the egress device has already been picked for
> us -- no GID is necessary. addr_handler() just has to do the right
> thing with it.
>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> drivers/infiniband/core/cma.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index a1756ed1faa1..50b8da2c4720 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -700,6 +700,19 @@ cma_validate_port(struct ib_device *device, u32 port,
> if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port))
> goto out;
>
> + if (rdma_protocol_iwarp(device, port)) {
> + sgid_attr = rdma_get_gid_attr(device, port, 0);
> + if (IS_ERR(sgid_attr))
> + goto out;
> +
> + /* XXX: I don't think this is RCU-safe, but does it need to be? */
Maybe for subtle reasons related to iwarp it is safe, but it should
make a lockdep splat since you are not holding rcu?
Remove the comment and do a rcu_lock/unlock around this and the deref
Jason
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 4/4] RDMA/cma: Avoid GID lookups on iWARP devices
2023-06-22 15:13 ` Jason Gunthorpe
@ 2023-06-23 17:02 ` Chuck Lever III
2023-06-23 17:12 ` Jason Gunthorpe
0 siblings, 1 reply; 15+ messages in thread
From: Chuck Lever III @ 2023-06-23 17:02 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Chuck Lever, linux-rdma, Tom Talpey, BMT@zurich.ibm.com
> On Jun 22, 2023, at 11:13 AM, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Jun 14, 2023 at 10:00:59AM -0400, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> We would like to enable the use of siw on top of a VPN that is
>> constructed and managed via a tun device. That hasn't worked up
>> until now because ARPHRD_NONE devices (such as tun devices) have
>> no GID for the RDMA/core to look up.
>>
>> But it turns out that the egress device has already been picked for
>> us -- no GID is necessary. addr_handler() just has to do the right
>> thing with it.
>>
>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> drivers/infiniband/core/cma.c | 13 +++++++++++++
>> 1 file changed, 13 insertions(+)
>>
>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>> index a1756ed1faa1..50b8da2c4720 100644
>> --- a/drivers/infiniband/core/cma.c
>> +++ b/drivers/infiniband/core/cma.c
>> @@ -700,6 +700,19 @@ cma_validate_port(struct ib_device *device, u32 port,
>> if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port))
>> goto out;
>>
>> + if (rdma_protocol_iwarp(device, port)) {
>> + sgid_attr = rdma_get_gid_attr(device, port, 0);
>> + if (IS_ERR(sgid_attr))
>> + goto out;
>> +
>> + /* XXX: I don't think this is RCU-safe, but does it need to be? */
>
> Maybe for subtle reasons related to iwarp it is safe, but it should
> make a lockdep splat since you are not holding rcu?
>
> Remove the comment and do a rcu_lock/unlock around this and the deref
Done, v4 posted.
The reason I was unsure:
CC drivers/infiniband/core/roce_gid_mgmt.o
CHECK /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:292:23: warning: incorrect type in assignment (different address spaces)
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:292:23: expected struct net_device [noderef] __rcu *[addressable] ndev
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:292:23: got struct net_device *ndev
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:386:48: warning: incorrect type in initializer (different address spaces)
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:386:48: expected struct net_device [noderef] __rcu *ndev
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:386:48: got struct net_device *ndev
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:811:48: warning: incorrect type in argument 2 (different address spaces)
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:811:48: expected void *filter_cookie
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:811:48: got struct net_device [noderef] __rcu *ndev
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:814:31: warning: incorrect type in argument 1 (different address spaces)
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:814:31: expected struct net_device *dev
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:814:31: got struct net_device [noderef] __rcu *ndev
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:851:31: warning: incorrect type in assignment (different address spaces)
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:851:31: expected struct net_device [noderef] __rcu *ndev
/home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:851:31: got struct net_device *ndev
--
Chuck Lever
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 4/4] RDMA/cma: Avoid GID lookups on iWARP devices
2023-06-23 17:02 ` Chuck Lever III
@ 2023-06-23 17:12 ` Jason Gunthorpe
0 siblings, 0 replies; 15+ messages in thread
From: Jason Gunthorpe @ 2023-06-23 17:12 UTC (permalink / raw)
To: Chuck Lever III; +Cc: Chuck Lever, linux-rdma, Tom Talpey, BMT@zurich.ibm.com
On Fri, Jun 23, 2023 at 05:02:23PM +0000, Chuck Lever III wrote:
>
>
> > On Jun 22, 2023, at 11:13 AM, Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Wed, Jun 14, 2023 at 10:00:59AM -0400, Chuck Lever wrote:
> >> From: Chuck Lever <chuck.lever@oracle.com>
> >>
> >> We would like to enable the use of siw on top of a VPN that is
> >> constructed and managed via a tun device. That hasn't worked up
> >> until now because ARPHRD_NONE devices (such as tun devices) have
> >> no GID for the RDMA/core to look up.
> >>
> >> But it turns out that the egress device has already been picked for
> >> us -- no GID is necessary. addr_handler() just has to do the right
> >> thing with it.
> >>
> >> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> >> ---
> >> drivers/infiniband/core/cma.c | 13 +++++++++++++
> >> 1 file changed, 13 insertions(+)
> >>
> >> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> >> index a1756ed1faa1..50b8da2c4720 100644
> >> --- a/drivers/infiniband/core/cma.c
> >> +++ b/drivers/infiniband/core/cma.c
> >> @@ -700,6 +700,19 @@ cma_validate_port(struct ib_device *device, u32 port,
> >> if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port))
> >> goto out;
> >>
> >> + if (rdma_protocol_iwarp(device, port)) {
> >> + sgid_attr = rdma_get_gid_attr(device, port, 0);
> >> + if (IS_ERR(sgid_attr))
> >> + goto out;
> >> +
> >> + /* XXX: I don't think this is RCU-safe, but does it need to be? */
> >
> > Maybe for subtle reasons related to iwarp it is safe, but it should
> > make a lockdep splat since you are not holding rcu?
> >
> > Remove the comment and do a rcu_lock/unlock around this and the deref
>
> Done, v4 posted.
>
> The reason I was unsure:
>
> CC drivers/infiniband/core/roce_gid_mgmt.o
> CHECK /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:292:23: warning: incorrect type in assignment (different address spaces)
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:292:23: expected struct net_device [noderef] __rcu *[addressable] ndev
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:292:23: got struct net_device *ndev
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:386:48: warning: incorrect type in initializer (different address spaces)
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:386:48: expected struct net_device [noderef] __rcu *ndev
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:386:48: got struct net_device *ndev
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:811:48: warning: incorrect type in argument 2 (different address spaces)
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:811:48: expected void *filter_cookie
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:811:48: got struct net_device [noderef] __rcu *ndev
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:814:31: warning: incorrect type in argument 1 (different address spaces)
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:814:31: expected struct net_device *dev
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:814:31: got struct net_device [noderef] __rcu *ndev
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:851:31: warning: incorrect type in assignment (different address spaces)
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:851:31: expected struct net_device [noderef] __rcu *ndev
> /home/cel/src/linux/linux/drivers/infiniband/core/roce_gid_mgmt.c:851:31: got struct net_device *ndev
Ah, nobody has cleaned that stuff :\
Jason
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 1/4] RDMA/siw: Fabricate a GID on tun and loopback devices
2023-06-14 14:00 ` [PATCH v3 1/4] RDMA/siw: Fabricate a GID on tun and loopback devices Chuck Lever
2023-06-15 8:18 ` Bernard Metzler
@ 2023-06-23 18:25 ` Tom Talpey
1 sibling, 0 replies; 15+ messages in thread
From: Tom Talpey @ 2023-06-23 18:25 UTC (permalink / raw)
To: Chuck Lever, jgg; +Cc: Chuck Lever, linux-rdma, BMT
On 6/14/2023 10:00 AM, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> LOOPBACK and NONE (tunnel) devices have all-zero MAC addresses.
> Currently, siw_device_create() falls back to copying the IB device's
> name in those cases, because an all-zero MAC address breaks the RDMA
> core address resolution mechanism.
>
> However, at the point when siw_device_create() constructs a GID, the
> ib_device::name field is uninitialized, leaving the MAC address to
> remain in an all-zero state.
>
> Fabricate a random artificial GID for such devices, and ensure that
> artificial GID is returned for all device query operations.
Just one wording suggestion on the above:
"Fabricate a random artificial GID for such devices, and ensure that
the same artificial GID is returned for all device query operations."
Reviewed-by: Tom Talpey <tom@talpey.com>
> Reported-by: Tom Talpey <tom@talpey.com>
> Link: https://lore.kernel.org/linux-rdma/SA0PR15MB391986C07C4D41E107E79659994FA@SA0PR15MB3919.namprd15.prod.outlook.com/T/#t
> Fixes: a2d36b02c15d ("RDMA/siw: Enable siw on tunnel devices")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> drivers/infiniband/sw/siw/siw.h | 1 +
> drivers/infiniband/sw/siw/siw_main.c | 22 ++++++++--------------
> drivers/infiniband/sw/siw/siw_verbs.c | 4 ++--
> 3 files changed, 11 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/infiniband/sw/siw/siw.h b/drivers/infiniband/sw/siw/siw.h
> index 2f3a9cda3850..8b4a710b82bc 100644
> --- a/drivers/infiniband/sw/siw/siw.h
> +++ b/drivers/infiniband/sw/siw/siw.h
> @@ -74,6 +74,7 @@ struct siw_device {
>
> u32 vendor_part_id;
> int numa_node;
> + char raw_gid[ETH_ALEN];
>
> /* physical port state (only one port per device) */
> enum ib_port_state state;
> diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
> index 65b5cda5457b..f45600d169ae 100644
> --- a/drivers/infiniband/sw/siw/siw_main.c
> +++ b/drivers/infiniband/sw/siw/siw_main.c
> @@ -75,8 +75,7 @@ static int siw_device_register(struct siw_device *sdev, const char *name)
> return rv;
> }
>
> - siw_dbg(base_dev, "HWaddr=%pM\n", sdev->netdev->dev_addr);
> -
> + siw_dbg(base_dev, "HWaddr=%pM\n", sdev->raw_gid);
> return 0;
> }
>
> @@ -313,24 +312,19 @@ static struct siw_device *siw_device_create(struct net_device *netdev)
> return NULL;
>
> base_dev = &sdev->base_dev;
> -
> sdev->netdev = netdev;
>
> - if (netdev->type != ARPHRD_LOOPBACK && netdev->type != ARPHRD_NONE) {
> - addrconf_addr_eui48((unsigned char *)&base_dev->node_guid,
> - netdev->dev_addr);
> + if (netdev->addr_len) {
> + memcpy(sdev->raw_gid, netdev->dev_addr,
> + min_t(unsigned int, netdev->addr_len, ETH_ALEN));
> } else {
> /*
> - * This device does not have a HW address,
> - * but connection mangagement lib expects gid != 0
> + * This device does not have a HW address, but
> + * connection mangagement requires a unique gid.
> */
> - size_t len = min_t(size_t, strlen(base_dev->name), 6);
> - char addr[6] = { };
> -
> - memcpy(addr, base_dev->name, len);
> - addrconf_addr_eui48((unsigned char *)&base_dev->node_guid,
> - addr);
> + eth_random_addr(sdev->raw_gid);
> }
> + addrconf_addr_eui48((u8 *)&base_dev->node_guid, sdev->raw_gid);
>
> base_dev->uverbs_cmd_mask |= BIT_ULL(IB_USER_VERBS_CMD_POST_SEND);
>
> diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
> index 398ec13db624..32b0befd25e2 100644
> --- a/drivers/infiniband/sw/siw/siw_verbs.c
> +++ b/drivers/infiniband/sw/siw/siw_verbs.c
> @@ -157,7 +157,7 @@ int siw_query_device(struct ib_device *base_dev, struct ib_device_attr *attr,
> attr->vendor_part_id = sdev->vendor_part_id;
>
> addrconf_addr_eui48((u8 *)&attr->sys_image_guid,
> - sdev->netdev->dev_addr);
> + sdev->raw_gid);
>
> return 0;
> }
> @@ -218,7 +218,7 @@ int siw_query_gid(struct ib_device *base_dev, u32 port, int idx,
>
> /* subnet_prefix == interface_id == 0; */
> memset(gid, 0, sizeof(*gid));
> - memcpy(&gid->raw[0], sdev->netdev->dev_addr, 6);
> + memcpy(gid->raw, sdev->raw_gid, ETH_ALEN);
>
> return 0;
> }
>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 2/4] RDMA/core: Set gid_attr.ndev for iWARP devices
2023-06-14 14:00 ` [PATCH v3 2/4] RDMA/core: Set gid_attr.ndev for iWARP devices Chuck Lever
@ 2023-06-23 18:37 ` Tom Talpey
2023-06-23 18:40 ` Chuck Lever III
0 siblings, 1 reply; 15+ messages in thread
From: Tom Talpey @ 2023-06-23 18:37 UTC (permalink / raw)
To: Chuck Lever, jgg; +Cc: Chuck Lever, linux-rdma, BMT
On 6/14/2023 10:00 AM, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> Have the iwarp side properly set the ndev in the device's sgid_attrs
> so that address resolution can treat it more like a RoCE device.
>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> drivers/infiniband/core/cache.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index 2e91d8879326..717524fe8a39 100644
> --- a/drivers/infiniband/core/cache.c
> +++ b/drivers/infiniband/core/cache.c
> @@ -1439,6 +1439,7 @@ static int config_non_roce_gid_cache(struct ib_device *device,
> {
> struct ib_gid_attr gid_attr = {};
> struct ib_gid_table *table;
> + struct net_device *ndev;
> int ret = 0;
> int i;
>
> @@ -1457,10 +1458,21 @@ static int config_non_roce_gid_cache(struct ib_device *device,
> i);
> goto err;
> }
> +
> + ndev = NULL;
> + if (rdma_protocol_iwarp(device, port)) {
> + ndev = ib_device_get_netdev(device, port);
> + if (!ndev)
> + continue;
> + RCU_INIT_POINTER(gid_attr.ndev, ndev);
> + }
> +
> gid_attr.index = i;
> tprops->subnet_prefix =
> be64_to_cpu(gid_attr.gid.global.subnet_prefix);
> add_modify_gid(table, &gid_attr);
> +
> + dev_put(ndev);
I'm not sure about two things here...
1) is it correct to dev_put(ndev) when returning? It seems that this
may risk the device may vanish when it's still present in the cache.
Feel free to tell me I'm confused.
2) Assuming yes, shouldn't the dev_put(ndev) move to within the new
if(iwarp) block just above?
Tom.
> }
> err:
> mutex_unlock(&table->lock);
>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 2/4] RDMA/core: Set gid_attr.ndev for iWARP devices
2023-06-23 18:37 ` Tom Talpey
@ 2023-06-23 18:40 ` Chuck Lever III
2023-06-23 18:46 ` Tom Talpey
0 siblings, 1 reply; 15+ messages in thread
From: Chuck Lever III @ 2023-06-23 18:40 UTC (permalink / raw)
To: Tom Talpey
Cc: Chuck Lever, jgg@nvidia.com, linux-rdma@vger.kernel.org,
BMT@zurich.ibm.com
> On Jun 23, 2023, at 2:37 PM, Tom Talpey <tom@talpey.com> wrote:
>
> On 6/14/2023 10:00 AM, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>> Have the iwarp side properly set the ndev in the device's sgid_attrs
>> so that address resolution can treat it more like a RoCE device.
>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> drivers/infiniband/core/cache.c | 12 ++++++++++++
>> 1 file changed, 12 insertions(+)
>> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
>> index 2e91d8879326..717524fe8a39 100644
>> --- a/drivers/infiniband/core/cache.c
>> +++ b/drivers/infiniband/core/cache.c
>> @@ -1439,6 +1439,7 @@ static int config_non_roce_gid_cache(struct ib_device *device,
>> {
>> struct ib_gid_attr gid_attr = {};
>> struct ib_gid_table *table;
>> + struct net_device *ndev;
>> int ret = 0;
>> int i;
>> @@ -1457,10 +1458,21 @@ static int config_non_roce_gid_cache(struct ib_device *device,
>> i);
>> goto err;
>> }
>> +
>> + ndev = NULL;
>> + if (rdma_protocol_iwarp(device, port)) {
>> + ndev = ib_device_get_netdev(device, port);
>> + if (!ndev)
>> + continue;
>> + RCU_INIT_POINTER(gid_attr.ndev, ndev);
>> + }
>> +
>> gid_attr.index = i;
>> tprops->subnet_prefix =
>> be64_to_cpu(gid_attr.gid.global.subnet_prefix);
>> add_modify_gid(table, &gid_attr);
>> +
>> + dev_put(ndev);
>
> I'm not sure about two things here...
>
> 1) is it correct to dev_put(ndev) when returning? It seems that this
> may risk the device may vanish when it's still present in the cache.
> Feel free to tell me I'm confused.
ib_device_get_netdev() increments the ndev's reference count
before returning. dev_put() releases that reference.
> 2) Assuming yes, shouldn't the dev_put(ndev) move to within the new
> if(iwarp) block just above?
If the "if (iwarp)" block isn't taken, then ndev is NULL and that
makes the dev_put() a no-op.
> Tom.
>
>> }
>> err:
>> mutex_unlock(&table->lock);
--
Chuck Lever
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 2/4] RDMA/core: Set gid_attr.ndev for iWARP devices
2023-06-23 18:40 ` Chuck Lever III
@ 2023-06-23 18:46 ` Tom Talpey
0 siblings, 0 replies; 15+ messages in thread
From: Tom Talpey @ 2023-06-23 18:46 UTC (permalink / raw)
To: Chuck Lever III
Cc: Chuck Lever, jgg@nvidia.com, linux-rdma@vger.kernel.org,
BMT@zurich.ibm.com
On 6/23/2023 2:40 PM, Chuck Lever III wrote:
>
>
>> On Jun 23, 2023, at 2:37 PM, Tom Talpey <tom@talpey.com> wrote:
>>
>> On 6/14/2023 10:00 AM, Chuck Lever wrote:
>>> From: Chuck Lever <chuck.lever@oracle.com>
>>> Have the iwarp side properly set the ndev in the device's sgid_attrs
>>> so that address resolution can treat it more like a RoCE device.
>>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>> drivers/infiniband/core/cache.c | 12 ++++++++++++
>>> 1 file changed, 12 insertions(+)
>>> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
>>> index 2e91d8879326..717524fe8a39 100644
>>> --- a/drivers/infiniband/core/cache.c
>>> +++ b/drivers/infiniband/core/cache.c
>>> @@ -1439,6 +1439,7 @@ static int config_non_roce_gid_cache(struct ib_device *device,
>>> {
>>> struct ib_gid_attr gid_attr = {};
>>> struct ib_gid_table *table;
>>> + struct net_device *ndev;
>>> int ret = 0;
>>> int i;
>>> @@ -1457,10 +1458,21 @@ static int config_non_roce_gid_cache(struct ib_device *device,
>>> i);
>>> goto err;
>>> }
>>> +
>>> + ndev = NULL;
>>> + if (rdma_protocol_iwarp(device, port)) {
>>> + ndev = ib_device_get_netdev(device, port);
>>> + if (!ndev)
>>> + continue;
>>> + RCU_INIT_POINTER(gid_attr.ndev, ndev);
>>> + }
>>> +
>>> gid_attr.index = i;
>>> tprops->subnet_prefix =
>>> be64_to_cpu(gid_attr.gid.global.subnet_prefix);
>>> add_modify_gid(table, &gid_attr);
>>> +
>>> + dev_put(ndev);
>>
>> I'm not sure about two things here...
>>
>> 1) is it correct to dev_put(ndev) when returning? It seems that this
>> may risk the device may vanish when it's still present in the cache.
>> Feel free to tell me I'm confused.
>
> ib_device_get_netdev() increments the ndev's reference count
> before returning. dev_put() releases that reference.
>
>
>> 2) Assuming yes, shouldn't the dev_put(ndev) move to within the new
>> if(iwarp) block just above?
>
> If the "if (iwarp)" block isn't taken, then ndev is NULL and that
> makes the dev_put() a no-op.
I still think it would be clearer and less side-effect sensitive to
put the call inside the if(iwarp) block.
Thanks for the dev_put clarification.
Reviewed-by: Tom Talpey <tom@talpey.com>
>
>
>> Tom.
>>
>>> }
>>> err:
>>> mutex_unlock(&table->lock);
>
>
> --
> Chuck Lever
>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2023-06-23 18:46 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-14 14:00 [PATCH v3 0/4] Handle ARPHRD_NONE devices for siw Chuck Lever
2023-06-14 14:00 ` [PATCH v3 1/4] RDMA/siw: Fabricate a GID on tun and loopback devices Chuck Lever
2023-06-15 8:18 ` Bernard Metzler
2023-06-23 18:25 ` Tom Talpey
2023-06-14 14:00 ` [PATCH v3 2/4] RDMA/core: Set gid_attr.ndev for iWARP devices Chuck Lever
2023-06-23 18:37 ` Tom Talpey
2023-06-23 18:40 ` Chuck Lever III
2023-06-23 18:46 ` Tom Talpey
2023-06-14 14:00 ` [PATCH v3 3/4] RDMA/cma: Deduplicate error flow in cma_validate_port() Chuck Lever
2023-06-14 14:00 ` [PATCH v3 4/4] RDMA/cma: Avoid GID lookups on iWARP devices Chuck Lever
2023-06-22 15:13 ` Jason Gunthorpe
2023-06-23 17:02 ` Chuck Lever III
2023-06-23 17:12 ` Jason Gunthorpe
2023-06-15 11:59 ` [PATCH v3 0/4] Handle ARPHRD_NONE devices for siw Zhu Yanjun
2023-06-15 13:28 ` Chuck Lever III
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).