* [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
@ 2024-10-25 7:23 Wenjia Zhang
2024-10-25 8:57 ` Halil Pasic
` (7 more replies)
0 siblings, 8 replies; 25+ messages in thread
From: Wenjia Zhang @ 2024-10-25 7:23 UTC (permalink / raw)
To: Wen Gu, D. Wythe, Tony Lu, David Miller, Jakub Kicinski,
Eric Dumazet, Paolo Abeni
Cc: netdev, linux-rdma, linux-s390, Heiko Carstens, Jan Karcher,
Gerd Bayer, Alexandra Winter, Halil Pasic, Nils Hoppmann,
Niklas Schnell, Thorsten Winkler, Karsten Graul, Stefan Raspl,
Wenjia Zhang, Aswin K
Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
alternative to get_netdev") introduced an API ib_device_get_netdev.
The SMC-R variant of the SMC protocol continued to use the old API
ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
ib_device_ops.get_netdev didn't work any more at least by using a mlx5
device driver. Thus, using ib_device_set_netdev() now became mandatory.
Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
Reported-by: Aswin K <aswin@linux.ibm.com>
Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
---
net/smc/smc_ib.c | 8 ++------
net/smc/smc_pnet.c | 4 +---
2 files changed, 3 insertions(+), 9 deletions(-)
diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 9297dc20bfe2..9c563cdbea90 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
struct ib_device *ibdev = smcibdev->ibdev;
struct net_device *ndev;
- if (!ibdev->ops.get_netdev)
- return;
- ndev = ibdev->ops.get_netdev(ibdev, port + 1);
+ ndev = ib_device_get_netdev(ibdev, port + 1);
if (ndev) {
smcibdev->ndev_ifidx[port] = ndev->ifindex;
dev_put(ndev);
@@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
port_cnt = smcibdev->ibdev->phys_port_cnt;
for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
libdev = smcibdev->ibdev;
- if (!libdev->ops.get_netdev)
- continue;
- lndev = libdev->ops.get_netdev(libdev, i + 1);
+ lndev = ib_device_get_netdev(libdev, i + 1);
dev_put(lndev);
if (lndev != ndev)
continue;
diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
index 1dd362326c0a..8566937c8903 100644
--- a/net/smc/smc_pnet.c
+++ b/net/smc/smc_pnet.c
@@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
for (i = 1; i <= SMC_MAX_PORTS; i++) {
if (!rdma_is_port_valid(ibdev->ibdev, i))
continue;
- if (!ibdev->ibdev->ops.get_netdev)
- continue;
- ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
+ ndev = ib_device_get_netdev(ibdev->ibdev, i);
if (!ndev)
continue;
dev_put(ndev);
--
2.43.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-10-25 7:23 [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev() Wenjia Zhang
@ 2024-10-25 8:57 ` Halil Pasic
2024-10-25 14:01 ` Simon Horman
` (6 subsequent siblings)
7 siblings, 0 replies; 25+ messages in thread
From: Halil Pasic @ 2024-10-25 8:57 UTC (permalink / raw)
To: Wenjia Zhang
Cc: Wen Gu, D. Wythe, Tony Lu, David Miller, Jakub Kicinski,
Eric Dumazet, Paolo Abeni, netdev, linux-rdma, linux-s390,
Heiko Carstens, Jan Karcher, Gerd Bayer, Alexandra Winter,
Nils Hoppmann, Niklas Schnell, Thorsten Winkler, Karsten Graul,
Stefan Raspl, Aswin K, Halil Pasic
On Fri, 25 Oct 2024 09:23:55 +0200
Wenjia Zhang <wenjia@linux.ibm.com> wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
>
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-10-25 7:23 [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev() Wenjia Zhang
2024-10-25 8:57 ` Halil Pasic
@ 2024-10-25 14:01 ` Simon Horman
2024-10-26 0:42 ` Dust Li
` (5 subsequent siblings)
7 siblings, 0 replies; 25+ messages in thread
From: Simon Horman @ 2024-10-25 14:01 UTC (permalink / raw)
To: Wenjia Zhang
Cc: Wen Gu, D. Wythe, Tony Lu, David Miller, Jakub Kicinski,
Eric Dumazet, Paolo Abeni, netdev, linux-rdma, linux-s390,
Heiko Carstens, Jan Karcher, Gerd Bayer, Alexandra Winter,
Halil Pasic, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K
On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
>
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-10-25 7:23 [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev() Wenjia Zhang
2024-10-25 8:57 ` Halil Pasic
2024-10-25 14:01 ` Simon Horman
@ 2024-10-26 0:42 ` Dust Li
2024-10-27 11:18 ` Wen Gu
` (4 subsequent siblings)
7 siblings, 0 replies; 25+ messages in thread
From: Dust Li @ 2024-10-26 0:42 UTC (permalink / raw)
To: Wenjia Zhang, Wen Gu, D. Wythe, Tony Lu, David Miller,
Jakub Kicinski, Eric Dumazet, Paolo Abeni
Cc: netdev, linux-rdma, linux-s390, Heiko Carstens, Jan Karcher,
Gerd Bayer, Alexandra Winter, Halil Pasic, Nils Hoppmann,
Niklas Schnell, Thorsten Winkler, Karsten Graul, Stefan Raspl,
Aswin K
On 2024-10-25 09:23:55, Wenjia Zhang wrote:
>Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
>alternative to get_netdev") introduced an API ib_device_get_netdev.
>The SMC-R variant of the SMC protocol continued to use the old API
>ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
>("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
>get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
>ib_device_ops.get_netdev didn't work any more at least by using a mlx5
>device driver. Thus, using ib_device_set_netdev() now became mandatory.
>
>Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>
>Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
>Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
>Reported-by: Aswin K <aswin@linux.ibm.com>
>Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
>Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
>---
> net/smc/smc_ib.c | 8 ++------
> net/smc/smc_pnet.c | 4 +---
> 2 files changed, 3 insertions(+), 9 deletions(-)
>
>diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
>index 9297dc20bfe2..9c563cdbea90 100644
>--- a/net/smc/smc_ib.c
>+++ b/net/smc/smc_ib.c
>@@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
> struct ib_device *ibdev = smcibdev->ibdev;
> struct net_device *ndev;
>
>- if (!ibdev->ops.get_netdev)
>- return;
>- ndev = ibdev->ops.get_netdev(ibdev, port + 1);
>+ ndev = ib_device_get_netdev(ibdev, port + 1);
> if (ndev) {
> smcibdev->ndev_ifidx[port] = ndev->ifindex;
> dev_put(ndev);
>@@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
> port_cnt = smcibdev->ibdev->phys_port_cnt;
> for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
> libdev = smcibdev->ibdev;
>- if (!libdev->ops.get_netdev)
>- continue;
>- lndev = libdev->ops.get_netdev(libdev, i + 1);
>+ lndev = ib_device_get_netdev(libdev, i + 1);
> dev_put(lndev);
> if (lndev != ndev)
> continue;
>diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
>index 1dd362326c0a..8566937c8903 100644
>--- a/net/smc/smc_pnet.c
>+++ b/net/smc/smc_pnet.c
>@@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
> for (i = 1; i <= SMC_MAX_PORTS; i++) {
> if (!rdma_is_port_valid(ibdev->ibdev, i))
> continue;
>- if (!ibdev->ibdev->ops.get_netdev)
>- continue;
>- ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
>+ ndev = ib_device_get_netdev(ibdev->ibdev, i);
> if (!ndev)
> continue;
> dev_put(ndev);
>--
>2.43.0
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-10-25 7:23 [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev() Wenjia Zhang
` (2 preceding siblings ...)
2024-10-26 0:42 ` Dust Li
@ 2024-10-27 11:18 ` Wen Gu
2024-10-27 19:28 ` Zhu Yanjun
` (3 subsequent siblings)
7 siblings, 0 replies; 25+ messages in thread
From: Wen Gu @ 2024-10-27 11:18 UTC (permalink / raw)
To: Wenjia Zhang, D. Wythe, Tony Lu, David Miller, Jakub Kicinski,
Eric Dumazet, Paolo Abeni
Cc: netdev, linux-rdma, linux-s390, Heiko Carstens, Jan Karcher,
Gerd Bayer, Alexandra Winter, Halil Pasic, Nils Hoppmann,
Niklas Schnell, Thorsten Winkler, Karsten Graul, Stefan Raspl,
Aswin K
On 2024/10/25 15:23, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
>
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
LGTM!
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
> ---
> net/smc/smc_ib.c | 8 ++------
> net/smc/smc_pnet.c | 4 +---
> 2 files changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 9297dc20bfe2..9c563cdbea90 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
> struct ib_device *ibdev = smcibdev->ibdev;
> struct net_device *ndev;
>
> - if (!ibdev->ops.get_netdev)
> - return;
> - ndev = ibdev->ops.get_netdev(ibdev, port + 1);
> + ndev = ib_device_get_netdev(ibdev, port + 1);
> if (ndev) {
> smcibdev->ndev_ifidx[port] = ndev->ifindex;
> dev_put(ndev);
> @@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
> port_cnt = smcibdev->ibdev->phys_port_cnt;
> for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
> libdev = smcibdev->ibdev;
> - if (!libdev->ops.get_netdev)
> - continue;
> - lndev = libdev->ops.get_netdev(libdev, i + 1);
> + lndev = ib_device_get_netdev(libdev, i + 1);
> dev_put(lndev);
> if (lndev != ndev)
> continue;
> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
> index 1dd362326c0a..8566937c8903 100644
> --- a/net/smc/smc_pnet.c
> +++ b/net/smc/smc_pnet.c
> @@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
> for (i = 1; i <= SMC_MAX_PORTS; i++) {
> if (!rdma_is_port_valid(ibdev->ibdev, i))
> continue;
> - if (!ibdev->ibdev->ops.get_netdev)
> - continue;
> - ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
> + ndev = ib_device_get_netdev(ibdev->ibdev, i);
> if (!ndev)
> continue;
> dev_put(ndev);
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-10-25 7:23 [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev() Wenjia Zhang
` (3 preceding siblings ...)
2024-10-27 11:18 ` Wen Gu
@ 2024-10-27 19:28 ` Zhu Yanjun
2024-10-27 20:18 ` Leon Romanovsky
` (2 subsequent siblings)
7 siblings, 0 replies; 25+ messages in thread
From: Zhu Yanjun @ 2024-10-27 19:28 UTC (permalink / raw)
To: Wenjia Zhang, Wen Gu, D. Wythe, Tony Lu, David Miller,
Jakub Kicinski, Eric Dumazet, Paolo Abeni
Cc: netdev, linux-rdma, linux-s390, Heiko Carstens, Jan Karcher,
Gerd Bayer, Alexandra Winter, Halil Pasic, Nils Hoppmann,
Niklas Schnell, Thorsten Winkler, Karsten Graul, Stefan Raspl,
Aswin K
在 2024/10/25 9:23, Wenjia Zhang 写道:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
Thanks a lot.
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Because the commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and
get_netdev functions") removes the get_netdev callback from
mlx5_ib_dev_common_roce_ops, in mlx4, get_netdev is still in
mlx4_ib_dev_ops. So the following commit will follow mlx5 to remove
get_netdev from mlx4 driver.
From a59f2e01428640a332a51b8d910ec166704aa441 Mon Sep 17 00:00:00 2001
From: Zhu Yanjun <yanjun.zhu@linux.dev>
Date: Sun, 27 Oct 2024 20:21:27 +0100
Subject: [PATCH 1/1] RDMA/mlx4: Use IB get_netdev functions and remove
get_netdev callback
In the commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
functions") removed the get_netdev callback from
mlx5_ib_dev_common_roce_ops, in mlx4, get_netdev callback should also
be removed.
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
compile successfully only
---
drivers/infiniband/hw/mlx4/main.c | 35 -------------------------------
1 file changed, 35 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/main.c
b/drivers/infiniband/hw/mlx4/main.c
index 529db874d67c..cf34d92de7b1 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -123,40 +123,6 @@ static int num_ib_ports(struct mlx4_dev *dev)
return ib_ports;
}
-static struct net_device *mlx4_ib_get_netdev(struct ib_device *device,
- u32 port_num)
-{
- struct mlx4_ib_dev *ibdev = to_mdev(device);
- struct net_device *dev, *ret = NULL;
-
- rcu_read_lock();
- for_each_netdev_rcu(&init_net, dev) {
- if (dev->dev.parent != ibdev->ib_dev.dev.parent ||
- dev->dev_port + 1 != port_num)
- continue;
-
- if (mlx4_is_bonded(ibdev->dev)) {
- struct net_device *upper;
-
- upper = netdev_master_upper_dev_get_rcu(dev);
- if (upper) {
- struct net_device *active;
-
- active =
bond_option_active_slave_get_rcu(netdev_priv(upper));
- if (active)
- dev = active;
- }
- }
-
- dev_hold(dev);
- ret = dev;
- break;
- }
-
- rcu_read_unlock();
- return ret;
-}
-
static int mlx4_ib_update_gids_v1(struct gid_entry *gids,
struct mlx4_ib_dev *ibdev,
u32 port_num)
@@ -2544,7 +2510,6 @@ static const struct ib_device_ops mlx4_ib_dev_ops = {
.get_dev_fw_str = get_fw_ver_str,
.get_dma_mr = mlx4_ib_get_dma_mr,
.get_link_layer = mlx4_ib_port_link_layer,
- .get_netdev = mlx4_ib_get_netdev,
.get_port_immutable = mlx4_port_immutable,
.map_mr_sg = mlx4_ib_map_mr_sg,
.mmap = mlx4_ib_mmap,
--
2.34.1
Zhu Yanjun
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
>
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
> ---
> net/smc/smc_ib.c | 8 ++------
> net/smc/smc_pnet.c | 4 +---
> 2 files changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 9297dc20bfe2..9c563cdbea90 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
> struct ib_device *ibdev = smcibdev->ibdev;
> struct net_device *ndev;
>
> - if (!ibdev->ops.get_netdev)
> - return;
> - ndev = ibdev->ops.get_netdev(ibdev, port + 1);
> + ndev = ib_device_get_netdev(ibdev, port + 1);
> if (ndev) {
> smcibdev->ndev_ifidx[port] = ndev->ifindex;
> dev_put(ndev);
> @@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
> port_cnt = smcibdev->ibdev->phys_port_cnt;
> for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
> libdev = smcibdev->ibdev;
> - if (!libdev->ops.get_netdev)
> - continue;
> - lndev = libdev->ops.get_netdev(libdev, i + 1);
> + lndev = ib_device_get_netdev(libdev, i + 1);
> dev_put(lndev);
> if (lndev != ndev)
> continue;
> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
> index 1dd362326c0a..8566937c8903 100644
> --- a/net/smc/smc_pnet.c
> +++ b/net/smc/smc_pnet.c
> @@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
> for (i = 1; i <= SMC_MAX_PORTS; i++) {
> if (!rdma_is_port_valid(ibdev->ibdev, i))
> continue;
> - if (!ibdev->ibdev->ops.get_netdev)
> - continue;
> - ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
> + ndev = ib_device_get_netdev(ibdev->ibdev, i);
> if (!ndev)
> continue;
> dev_put(ndev);
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-10-25 7:23 [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev() Wenjia Zhang
` (4 preceding siblings ...)
2024-10-27 19:28 ` Zhu Yanjun
@ 2024-10-27 20:18 ` Leon Romanovsky
2024-10-27 20:30 ` Leon Romanovsky
2024-11-05 9:50 ` Wenjia Zhang
2024-10-29 8:43 ` D. Wythe
2024-10-31 10:01 ` Paolo Abeni
7 siblings, 2 replies; 25+ messages in thread
From: Leon Romanovsky @ 2024-10-27 20:18 UTC (permalink / raw)
To: Wenjia Zhang
Cc: Wen Gu, D. Wythe, Tony Lu, David Miller, Jakub Kicinski,
Eric Dumazet, Paolo Abeni, netdev, linux-rdma, linux-s390,
Heiko Carstens, Jan Karcher, Gerd Bayer, Alexandra Winter,
Halil Pasic, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K
On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev.
I would say that calls to ibdev ops from ULPs was never been right
thing to do. The ib_device_set_netdev() was introduced for the drivers.
So the whole commit message is not accurate and better to be rewritten.
> As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver.
It is not a correct statement too. All modern drivers (for last 5 years)
don't have that .get_netdev() ops, so it is not mlx5 specific, but another
justification to say that SMC-R was doing it wrong.
> Thus, using ib_device_set_netdev() now became mandatory.
ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
with ULPs.
>
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
It is too late for me to do proper review for today, but I would say
that it is worth to pay attention to multiple dev_put() calls in the
functions around the ib_device_get_netdev().
>
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
It is not related to this change Fixes line.
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
> ---
> net/smc/smc_ib.c | 8 ++------
> net/smc/smc_pnet.c | 4 +---
> 2 files changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 9297dc20bfe2..9c563cdbea90 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
> struct ib_device *ibdev = smcibdev->ibdev;
> struct net_device *ndev;
>
> - if (!ibdev->ops.get_netdev)
> - return;
> - ndev = ibdev->ops.get_netdev(ibdev, port + 1);
> + ndev = ib_device_get_netdev(ibdev, port + 1);
> if (ndev) {
> smcibdev->ndev_ifidx[port] = ndev->ifindex;
> dev_put(ndev);
> @@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
> port_cnt = smcibdev->ibdev->phys_port_cnt;
> for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
> libdev = smcibdev->ibdev;
> - if (!libdev->ops.get_netdev)
> - continue;
> - lndev = libdev->ops.get_netdev(libdev, i + 1);
> + lndev = ib_device_get_netdev(libdev, i + 1);
> dev_put(lndev);
> if (lndev != ndev)
> continue;
> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
> index 1dd362326c0a..8566937c8903 100644
> --- a/net/smc/smc_pnet.c
> +++ b/net/smc/smc_pnet.c
> @@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
> for (i = 1; i <= SMC_MAX_PORTS; i++) {
> if (!rdma_is_port_valid(ibdev->ibdev, i))
> continue;
> - if (!ibdev->ibdev->ops.get_netdev)
> - continue;
> - ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
> + ndev = ib_device_get_netdev(ibdev->ibdev, i);
> if (!ndev)
> continue;
> dev_put(ndev);
> --
> 2.43.0
>
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-10-27 20:18 ` Leon Romanovsky
@ 2024-10-27 20:30 ` Leon Romanovsky
2024-11-05 9:50 ` Wenjia Zhang
1 sibling, 0 replies; 25+ messages in thread
From: Leon Romanovsky @ 2024-10-27 20:30 UTC (permalink / raw)
To: Wenjia Zhang, Jakub Kicinski
Cc: Wen Gu, D. Wythe, Tony Lu, David Miller, Eric Dumazet,
Paolo Abeni, netdev, linux-rdma, linux-s390, Heiko Carstens,
Jan Karcher, Gerd Bayer, Alexandra Winter, Halil Pasic,
Nils Hoppmann, Niklas Schnell, Thorsten Winkler, Karsten Graul,
Stefan Raspl, Aswin K
On Sun, Oct 27, 2024 at 10:18:57PM +0200, Leon Romanovsky wrote:
> On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> > Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> > alternative to get_netdev") introduced an API ib_device_get_netdev.
> > The SMC-R variant of the SMC protocol continued to use the old API
> > ib_device_ops.get_netdev() to lookup netdev.
>
> I would say that calls to ibdev ops from ULPs was never been right
> thing to do. The ib_device_set_netdev() was introduced for the drivers.
>
> So the whole commit message is not accurate and better to be rewritten.
>
> > As this commit 8d159eb2117b
> > ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> > get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> > ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> > device driver.
>
> It is not a correct statement too. All modern drivers (for last 5 years)
> don't have that .get_netdev() ops, so it is not mlx5 specific, but another
> justification to say that SMC-R was doing it wrong.
>
> > Thus, using ib_device_set_netdev() now became mandatory.
>
> ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
> with ULPs.
>
> >
> > Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>
> It is too late for me to do proper review for today, but I would say
> that it is worth to pay attention to multiple dev_put() calls in the
> functions around the ib_device_get_netdev().
>
> >
> > Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
Honestly, this patch in Fixes line doesn't look right to me. It pokes inside
of ib_device to get netdev index. For example call to smc_ib_ndev_change()
will return completely unpredictable results, due to races.
It is bad that RDMA ML wasn't even CCed back then, we would say NAK to
this patch.
https://lore.kernel.org/netdev/20201201192049.53517-6-kgraul@linux.ibm.com/
Thanks
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-10-25 7:23 [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev() Wenjia Zhang
` (5 preceding siblings ...)
2024-10-27 20:18 ` Leon Romanovsky
@ 2024-10-29 8:43 ` D. Wythe
2024-10-31 10:01 ` Paolo Abeni
7 siblings, 0 replies; 25+ messages in thread
From: D. Wythe @ 2024-10-29 8:43 UTC (permalink / raw)
To: Wenjia Zhang, Wen Gu, Tony Lu, David Miller, Jakub Kicinski,
Eric Dumazet, Paolo Abeni
Cc: netdev, linux-rdma, linux-s390, Heiko Carstens, Jan Karcher,
Gerd Bayer, Alexandra Winter, Halil Pasic, Nils Hoppmann,
Niklas Schnell, Thorsten Winkler, Karsten Graul, Stefan Raspl,
Aswin K
On 10/25/24 3:23 PM, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
>
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
> ---
> net/smc/smc_ib.c | 8 ++------
> net/smc/smc_pnet.c | 4 +---
> 2 files changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 9297dc20bfe2..9c563cdbea90 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
> struct ib_device *ibdev = smcibdev->ibdev;
> struct net_device *ndev;
>
> - if (!ibdev->ops.get_netdev)
> - return;
> - ndev = ibdev->ops.get_netdev(ibdev, port + 1);
> + ndev = ib_device_get_netdev(ibdev, port + 1);
> if (ndev) {
> smcibdev->ndev_ifidx[port] = ndev->ifindex;
> dev_put(ndev);
> @@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
> port_cnt = smcibdev->ibdev->phys_port_cnt;
> for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
> libdev = smcibdev->ibdev;
> - if (!libdev->ops.get_netdev)
> - continue;
> - lndev = libdev->ops.get_netdev(libdev, i + 1);
> + lndev = ib_device_get_netdev(libdev, i + 1);
> dev_put(lndev);
> if (lndev != ndev)
> continue;
> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
> index 1dd362326c0a..8566937c8903 100644
> --- a/net/smc/smc_pnet.c
> +++ b/net/smc/smc_pnet.c
> @@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
> for (i = 1; i <= SMC_MAX_PORTS; i++) {
> if (!rdma_is_port_valid(ibdev->ibdev, i))
> continue;
> - if (!ibdev->ibdev->ops.get_netdev)
> - continue;
> - ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
> + ndev = ib_device_get_netdev(ibdev->ibdev, i);
> if (!ndev)
> continue;
> dev_put(ndev);
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-10-25 7:23 [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev() Wenjia Zhang
` (6 preceding siblings ...)
2024-10-29 8:43 ` D. Wythe
@ 2024-10-31 10:01 ` Paolo Abeni
2024-11-05 9:53 ` Wenjia Zhang
7 siblings, 1 reply; 25+ messages in thread
From: Paolo Abeni @ 2024-10-31 10:01 UTC (permalink / raw)
To: Wenjia Zhang, Wen Gu, D. Wythe, Tony Lu, David Miller,
Jakub Kicinski, Eric Dumazet
Cc: netdev, linux-rdma, linux-s390, Heiko Carstens, Jan Karcher,
Gerd Bayer, Alexandra Winter, Halil Pasic, Nils Hoppmann,
Niklas Schnell, Thorsten Winkler, Karsten Graul, Stefan Raspl,
Aswin K
On 10/25/24 09:23, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
>
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
Please adjust the commit message as per Leon suggestion. You can retain
all the ack collected so far.
Thanks,
Paolo
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-10-27 20:18 ` Leon Romanovsky
2024-10-27 20:30 ` Leon Romanovsky
@ 2024-11-05 9:50 ` Wenjia Zhang
2024-11-05 11:23 ` Leon Romanovsky
1 sibling, 1 reply; 25+ messages in thread
From: Wenjia Zhang @ 2024-11-05 9:50 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Wen Gu, D. Wythe, Tony Lu, David Miller, Jakub Kicinski,
Eric Dumazet, Paolo Abeni, netdev, linux-rdma, linux-s390,
Heiko Carstens, Jan Karcher, Gerd Bayer, Alexandra Winter,
Halil Pasic, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K
On 27.10.24 21:18, Leon Romanovsky wrote:
> On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
>> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
>> alternative to get_netdev") introduced an API ib_device_get_netdev.
>> The SMC-R variant of the SMC protocol continued to use the old API
>> ib_device_ops.get_netdev() to lookup netdev.
>
> I would say that calls to ibdev ops from ULPs was never been right
> thing to do. The ib_device_set_netdev() was introduced for the drivers.
>
> So the whole commit message is not accurate and better to be rewritten.
>
>> As this commit 8d159eb2117b
>> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
>> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
>> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
>> device driver.
>
> It is not a correct statement too. All modern drivers (for last 5 years)
> don't have that .get_netdev() ops, so it is not mlx5 specific, but another
> justification to say that SMC-R was doing it wrong.
>
>> Thus, using ib_device_set_netdev() now became mandatory.
>
> ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
> with ULPs.
>
>>
>> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>
> It is too late for me to do proper review for today, but I would say
> that it is worth to pay attention to multiple dev_put() calls in the
> functions around the ib_device_get_netdev().
>
>>
>> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
>> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
>
> It is not related to this change Fixes line.
>
Hi Leon,
Thank you for the review! I agree that SMC could do better. However, we
should fix it and give enough information and reference on the changes,
since the code has already existed and didn't work with the old way. I
can rewrite the commit message.
What about:
"
The SMC-R variant of the SMC protocol still called
ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device
driver to run SMC-R, it failed to find a device, because in mlx5_ib the
internal net device management for retrieving net devices was replaced
by a common interface ib_device_get_netdev() in commit 8d159eb2117b
("RDMA/mlx5: Use IB set_netdev and get_netdev functions"). Thus, replace
ib_device_ops.get_netdev() with ib_device_get_netdev() in SMC.
"
Thanks,
Wenjia
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-10-31 10:01 ` Paolo Abeni
@ 2024-11-05 9:53 ` Wenjia Zhang
0 siblings, 0 replies; 25+ messages in thread
From: Wenjia Zhang @ 2024-11-05 9:53 UTC (permalink / raw)
To: Paolo Abeni, Wen Gu, D. Wythe, Tony Lu, David Miller,
Jakub Kicinski, Eric Dumazet
Cc: netdev, linux-rdma, linux-s390, Heiko Carstens, Jan Karcher,
Gerd Bayer, Alexandra Winter, Halil Pasic, Nils Hoppmann,
Niklas Schnell, Thorsten Winkler, Karsten Graul, Stefan Raspl,
Aswin K
On 31.10.24 11:01, Paolo Abeni wrote:
> On 10/25/24 09:23, Wenjia Zhang wrote:
>> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
>> alternative to get_netdev") introduced an API ib_device_get_netdev.
>> The SMC-R variant of the SMC protocol continued to use the old API
>> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
>> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
>> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
>> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
>> device driver. Thus, using ib_device_set_netdev() now became mandatory.
>>
>> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>>
>> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
>> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
>> Reported-by: Aswin K <aswin@linux.ibm.com>
>> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
>> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
>
> Please adjust the commit message as per Leon suggestion. You can retain
> all the ack collected so far.
>
> Thanks,
>
> Paolo
>
Hi Paolo,
thank you for the reminder!
Sure, I'll do it.
Thanks,
Wenjia
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-11-05 9:50 ` Wenjia Zhang
@ 2024-11-05 11:23 ` Leon Romanovsky
2024-11-05 12:30 ` Wenjia Zhang
2024-11-06 9:24 ` Halil Pasic
0 siblings, 2 replies; 25+ messages in thread
From: Leon Romanovsky @ 2024-11-05 11:23 UTC (permalink / raw)
To: Wenjia Zhang
Cc: Wen Gu, D. Wythe, Tony Lu, David Miller, Jakub Kicinski,
Eric Dumazet, Paolo Abeni, netdev, linux-rdma, linux-s390,
Heiko Carstens, Jan Karcher, Gerd Bayer, Alexandra Winter,
Halil Pasic, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K
On Tue, Nov 05, 2024 at 10:50:45AM +0100, Wenjia Zhang wrote:
>
>
> On 27.10.24 21:18, Leon Romanovsky wrote:
> > On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> > > Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> > > alternative to get_netdev") introduced an API ib_device_get_netdev.
> > > The SMC-R variant of the SMC protocol continued to use the old API
> > > ib_device_ops.get_netdev() to lookup netdev.
> >
> > I would say that calls to ibdev ops from ULPs was never been right
> > thing to do. The ib_device_set_netdev() was introduced for the drivers.
> >
> > So the whole commit message is not accurate and better to be rewritten.
> >
> > > As this commit 8d159eb2117b
> > > ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> > > get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> > > ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> > > device driver.
> >
> > It is not a correct statement too. All modern drivers (for last 5 years)
> > don't have that .get_netdev() ops, so it is not mlx5 specific, but another
> > justification to say that SMC-R was doing it wrong.
> >
> > > Thus, using ib_device_set_netdev() now became mandatory.
> >
> > ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
> > with ULPs.
> >
> > >
> > > Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> >
> > It is too late for me to do proper review for today, but I would say
> > that it is worth to pay attention to multiple dev_put() calls in the
> > functions around the ib_device_get_netdev().
> >
> > >
> > > Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> > > Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> >
> > It is not related to this change Fixes line.
> >
>
> Hi Leon,
>
> Thank you for the review! I agree that SMC could do better. However, we
> should fix it and give enough information and reference on the changes,
> since the code has already existed and didn't work with the old way.
The code which you change worked by chance and was wrong from day one.
> I can rewrite the commit message.
>
> What about:
> "
> The SMC-R variant of the SMC protocol still called
> ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device driver
> to run SMC-R, it failed to find a device, because in mlx5_ib the internal
> net device management for retrieving net devices was replaced by a common
> interface ib_device_get_netdev() in commit 8d159eb2117b ("RDMA/mlx5: Use IB
> set_netdev and get_netdev functions"). Thus, replace
> ib_device_ops.get_netdev() with ib_device_get_netdev() in SMC.
> "
The SMC-R variant of the SMC protocol used direct call to ib_device_ops.get_netdev()
function to lookup netdev. Such direct accesses are not correct for any
usage outside of RDMA core code.
RDMA subsystem provides ib_device_get_netdev() function that works on
all RDMA drivers returns valid netdev with proper locking an reference
counting. The commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
functions") exposed that SMC-R didn't use that function.
So update the SMC-R to use proper API,
Thanks
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-11-05 11:23 ` Leon Romanovsky
@ 2024-11-05 12:30 ` Wenjia Zhang
2024-11-05 13:39 ` Leon Romanovsky
2024-11-06 9:24 ` Halil Pasic
1 sibling, 1 reply; 25+ messages in thread
From: Wenjia Zhang @ 2024-11-05 12:30 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Wen Gu, D. Wythe, Tony Lu, David Miller, Jakub Kicinski,
Eric Dumazet, Paolo Abeni, netdev, linux-rdma, linux-s390,
Heiko Carstens, Jan Karcher, Gerd Bayer, Alexandra Winter,
Halil Pasic, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K
On 05.11.24 12:23, Leon Romanovsky wrote:
> On Tue, Nov 05, 2024 at 10:50:45AM +0100, Wenjia Zhang wrote:
>>
>>
>> On 27.10.24 21:18, Leon Romanovsky wrote:
>>> On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
>>>> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
>>>> alternative to get_netdev") introduced an API ib_device_get_netdev.
>>>> The SMC-R variant of the SMC protocol continued to use the old API
>>>> ib_device_ops.get_netdev() to lookup netdev.
>>>
>>> I would say that calls to ibdev ops from ULPs was never been right
>>> thing to do. The ib_device_set_netdev() was introduced for the drivers.
>>>
>>> So the whole commit message is not accurate and better to be rewritten.
>>>
>>>> As this commit 8d159eb2117b
>>>> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
>>>> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
>>>> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
>>>> device driver.
>>>
>>> It is not a correct statement too. All modern drivers (for last 5 years)
>>> don't have that .get_netdev() ops, so it is not mlx5 specific, but another
>>> justification to say that SMC-R was doing it wrong.
>>>
>>>> Thus, using ib_device_set_netdev() now became mandatory.
>>>
>>> ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
>>> with ULPs.
>>>
>>>>
>>>> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>>>
>>> It is too late for me to do proper review for today, but I would say
>>> that it is worth to pay attention to multiple dev_put() calls in the
>>> functions around the ib_device_get_netdev().
>>>
>>>>
>>>> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
>>>> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
>>>
>>> It is not related to this change Fixes line.
>>>
>>
>> Hi Leon,
>>
>> Thank you for the review! I agree that SMC could do better. However, we
>> should fix it and give enough information and reference on the changes,
>> since the code has already existed and didn't work with the old way.
>
> The code which you change worked by chance and was wrong from day one.
>
>> I can rewrite the commit message.
>>
>> What about:
>> "
>> The SMC-R variant of the SMC protocol still called
>> ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device driver
>> to run SMC-R, it failed to find a device, because in mlx5_ib the internal
>> net device management for retrieving net devices was replaced by a common
>> interface ib_device_get_netdev() in commit 8d159eb2117b ("RDMA/mlx5: Use IB
>> set_netdev and get_netdev functions"). Thus, replace
>> ib_device_ops.get_netdev() with ib_device_get_netdev() in SMC.
>> "
>
> The SMC-R variant of the SMC protocol used direct call to ib_device_ops.get_netdev()
> function to lookup netdev. Such direct accesses are not correct for any
> usage outside of RDMA core code.
>
Is such an absolute statement documented somewhere? If not, I don't
think it's convenient that I use it. Maybe you guys as RDMA core
maintainer can, not I.
> RDMA subsystem provides ib_device_get_netdev() function that works on
> all RDMA drivers returns valid netdev with proper locking an reference
> counting. The commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
> functions") exposed that SMC-R didn't use that function.
>
> So update the SMC-R to use proper API,
>
> Thanks
>
mhhh, I'd like to stick to my version, which sounds more neutral IMO. I
think the purpose is the same.
Thanks,
Wenjia
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-11-05 12:30 ` Wenjia Zhang
@ 2024-11-05 13:39 ` Leon Romanovsky
2024-11-05 14:14 ` Wenjia Zhang
0 siblings, 1 reply; 25+ messages in thread
From: Leon Romanovsky @ 2024-11-05 13:39 UTC (permalink / raw)
To: Wenjia Zhang
Cc: Wen Gu, D. Wythe, Tony Lu, David Miller, Jakub Kicinski,
Eric Dumazet, Paolo Abeni, netdev, linux-rdma, linux-s390,
Heiko Carstens, Jan Karcher, Gerd Bayer, Alexandra Winter,
Halil Pasic, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K
On Tue, Nov 05, 2024 at 01:30:24PM +0100, Wenjia Zhang wrote:
>
>
> On 05.11.24 12:23, Leon Romanovsky wrote:
> > On Tue, Nov 05, 2024 at 10:50:45AM +0100, Wenjia Zhang wrote:
> > >
> > >
> > > On 27.10.24 21:18, Leon Romanovsky wrote:
> > > > On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> > > > > Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> > > > > alternative to get_netdev") introduced an API ib_device_get_netdev.
> > > > > The SMC-R variant of the SMC protocol continued to use the old API
> > > > > ib_device_ops.get_netdev() to lookup netdev.
> > > >
> > > > I would say that calls to ibdev ops from ULPs was never been right
> > > > thing to do. The ib_device_set_netdev() was introduced for the drivers.
> > > >
> > > > So the whole commit message is not accurate and better to be rewritten.
> > > >
> > > > > As this commit 8d159eb2117b
> > > > > ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> > > > > get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> > > > > ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> > > > > device driver.
> > > >
> > > > It is not a correct statement too. All modern drivers (for last 5 years)
> > > > don't have that .get_netdev() ops, so it is not mlx5 specific, but another
> > > > justification to say that SMC-R was doing it wrong.
> > > >
> > > > > Thus, using ib_device_set_netdev() now became mandatory.
> > > >
> > > > ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
> > > > with ULPs.
> > > >
> > > > >
> > > > > Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> > > >
> > > > It is too late for me to do proper review for today, but I would say
> > > > that it is worth to pay attention to multiple dev_put() calls in the
> > > > functions around the ib_device_get_netdev().
> > > >
> > > > >
> > > > > Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> > > > > Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> > > >
> > > > It is not related to this change Fixes line.
> > > >
> > >
> > > Hi Leon,
> > >
> > > Thank you for the review! I agree that SMC could do better. However, we
> > > should fix it and give enough information and reference on the changes,
> > > since the code has already existed and didn't work with the old way.
> >
> > The code which you change worked by chance and was wrong from day one.
> >
> > > I can rewrite the commit message.
> > >
> > > What about:
> > > "
> > > The SMC-R variant of the SMC protocol still called
> > > ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device driver
> > > to run SMC-R, it failed to find a device, because in mlx5_ib the internal
> > > net device management for retrieving net devices was replaced by a common
> > > interface ib_device_get_netdev() in commit 8d159eb2117b ("RDMA/mlx5: Use IB
> > > set_netdev and get_netdev functions"). Thus, replace
> > > ib_device_ops.get_netdev() with ib_device_get_netdev() in SMC.
> > > "
> >
> > The SMC-R variant of the SMC protocol used direct call to ib_device_ops.get_netdev()
> > function to lookup netdev. Such direct accesses are not correct for any
> > usage outside of RDMA core code.
> >
> Is such an absolute statement documented somewhere? If not, I don't think
> it's convenient that I use it. Maybe you guys as RDMA core maintainer can,
> not I.
You can too as it is very clear. All functions which can be used have
EXPORT_SYMBOL near them, ops.get_netdev() has nothing like that.
> > RDMA subsystem provides ib_device_get_netdev() function that works on
> > all RDMA drivers returns valid netdev with proper locking an reference
> > counting. The commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
> > functions") exposed that SMC-R didn't use that function.
> >
> > So update the SMC-R to use proper API,
> >
> > Thanks
> >
> mhhh, I'd like to stick to my version, which sounds more neutral IMO. I
> think the purpose is the same.
I don't want to argue about the words, my point is that get_netdev() was
never been the right interface.
Thanks
>
> Thanks,
> Wenjia
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-11-05 13:39 ` Leon Romanovsky
@ 2024-11-05 14:14 ` Wenjia Zhang
0 siblings, 0 replies; 25+ messages in thread
From: Wenjia Zhang @ 2024-11-05 14:14 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Wen Gu, D. Wythe, Tony Lu, David Miller, Jakub Kicinski,
Eric Dumazet, Paolo Abeni, netdev, linux-rdma, linux-s390,
Heiko Carstens, Jan Karcher, Gerd Bayer, Alexandra Winter,
Halil Pasic, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K
On 05.11.24 14:39, Leon Romanovsky wrote:
> On Tue, Nov 05, 2024 at 01:30:24PM +0100, Wenjia Zhang wrote:
>>
>>
>> On 05.11.24 12:23, Leon Romanovsky wrote:
>>> On Tue, Nov 05, 2024 at 10:50:45AM +0100, Wenjia Zhang wrote:
>>>>
>>>>
>>>> On 27.10.24 21:18, Leon Romanovsky wrote:
>>>>> On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
>>>>>> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
>>>>>> alternative to get_netdev") introduced an API ib_device_get_netdev.
>>>>>> The SMC-R variant of the SMC protocol continued to use the old API
>>>>>> ib_device_ops.get_netdev() to lookup netdev.
>>>>>
>>>>> I would say that calls to ibdev ops from ULPs was never been right
>>>>> thing to do. The ib_device_set_netdev() was introduced for the drivers.
>>>>>
>>>>> So the whole commit message is not accurate and better to be rewritten.
>>>>>
>>>>>> As this commit 8d159eb2117b
>>>>>> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
>>>>>> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
>>>>>> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
>>>>>> device driver.
>>>>>
>>>>> It is not a correct statement too. All modern drivers (for last 5 years)
>>>>> don't have that .get_netdev() ops, so it is not mlx5 specific, but another
>>>>> justification to say that SMC-R was doing it wrong.
>>>>>
>>>>>> Thus, using ib_device_set_netdev() now became mandatory.
>>>>>
>>>>> ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
>>>>> with ULPs.
>>>>>
>>>>>>
>>>>>> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>>>>>
>>>>> It is too late for me to do proper review for today, but I would say
>>>>> that it is worth to pay attention to multiple dev_put() calls in the
>>>>> functions around the ib_device_get_netdev().
>>>>>
>>>>>>
>>>>>> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
>>>>>> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
>>>>>
>>>>> It is not related to this change Fixes line.
>>>>>
>>>>
>>>> Hi Leon,
>>>>
>>>> Thank you for the review! I agree that SMC could do better. However, we
>>>> should fix it and give enough information and reference on the changes,
>>>> since the code has already existed and didn't work with the old way.
>>>
>>> The code which you change worked by chance and was wrong from day one.
>>>
>>>> I can rewrite the commit message.
>>>>
>>>> What about:
>>>> "
>>>> The SMC-R variant of the SMC protocol still called
>>>> ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device driver
>>>> to run SMC-R, it failed to find a device, because in mlx5_ib the internal
>>>> net device management for retrieving net devices was replaced by a common
>>>> interface ib_device_get_netdev() in commit 8d159eb2117b ("RDMA/mlx5: Use IB
>>>> set_netdev and get_netdev functions"). Thus, replace
>>>> ib_device_ops.get_netdev() with ib_device_get_netdev() in SMC.
>>>> "
>>>
>>> The SMC-R variant of the SMC protocol used direct call to ib_device_ops.get_netdev()
>>> function to lookup netdev. Such direct accesses are not correct for any
>>> usage outside of RDMA core code.
>>>
>> Is such an absolute statement documented somewhere? If not, I don't think
>> it's convenient that I use it. Maybe you guys as RDMA core maintainer can,
>> not I.
>
> You can too as it is very clear. All functions which can be used have
> EXPORT_SYMBOL near them, ops.get_netdev() has nothing like that.
>
>>> RDMA subsystem provides ib_device_get_netdev() function that works on
>>> all RDMA drivers returns valid netdev with proper locking an reference
>>> counting. The commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
>>> functions") exposed that SMC-R didn't use that function.
>>>
>>> So update the SMC-R to use proper API,
>>>
>>> Thanks
>>>
>> mhhh, I'd like to stick to my version, which sounds more neutral IMO. I
>> think the purpose is the same.
>
> I don't want to argue about the words, my point is that get_netdev() was
> never been the right interface.
>
> Thanks
>
Ok, I got you. I'll send v2 with a proper commit message.
Thanks,
Wenjia
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-11-05 11:23 ` Leon Romanovsky
2024-11-05 12:30 ` Wenjia Zhang
@ 2024-11-06 9:24 ` Halil Pasic
2024-11-06 13:59 ` Leon Romanovsky
1 sibling, 1 reply; 25+ messages in thread
From: Halil Pasic @ 2024-11-06 9:24 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Wenjia Zhang, Wen Gu, D. Wythe, Tony Lu, David Miller,
Jakub Kicinski, Eric Dumazet, Paolo Abeni, netdev, linux-rdma,
linux-s390, Heiko Carstens, Jan Karcher, Gerd Bayer,
Alexandra Winter, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K, Halil Pasic
On Tue, 5 Nov 2024 13:23:13 +0200
Leon Romanovsky <leon@kernel.org> wrote:
> On Tue, Nov 05, 2024 at 10:50:45AM +0100, Wenjia Zhang wrote:
> >
> >
> > On 27.10.24 21:18, Leon Romanovsky wrote:
> > > On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> > > > Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as
> > > > an alternative to get_netdev") introduced an API
> > > > ib_device_get_netdev. The SMC-R variant of the SMC protocol
> > > > continued to use the old API ib_device_ops.get_netdev() to
> > > > lookup netdev.
> > >
> > > I would say that calls to ibdev ops from ULPs was never been right
> > > thing to do. The ib_device_set_netdev() was introduced for the
> > > drivers.
> > >
> > > So the whole commit message is not accurate and better to be
> > > rewritten.
> > > > As this commit 8d159eb2117b
> > > > ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> > > > removed the get_netdev callback from
> > > > mlx5_ib_dev_common_roce_ops, calling ib_device_ops.get_netdev
> > > > didn't work any more at least by using a mlx5 device driver.
> > >
> > > It is not a correct statement too. All modern drivers (for last 5
> > > years) don't have that .get_netdev() ops, so it is not mlx5
> > > specific, but another justification to say that SMC-R was doing it
> > > wrong.
> > > > Thus, using ib_device_set_netdev() now became mandatory.
> > >
> > > ib_device_set_netdev() is mandatory for the drivers, it is nothing
> > > to do with ULPs.
> > >
> > > >
> > > > Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> > >
> > > It is too late for me to do proper review for today, but I would
> > > say that it is worth to pay attention to multiple dev_put() calls
> > > in the functions around the ib_device_get_netdev().
> > >
> > > >
> > > > Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> > > > Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and
> > > > get_netdev functions")
> > >
> > > It is not related to this change Fixes line.
> > >
> >
> > Hi Leon,
> >
> > Thank you for the review! I agree that SMC could do better. However,
> > we should fix it and give enough information and reference on the
> > changes, since the code has already existed and didn't work with the
> > old way.
>
> The code which you change worked by chance and was wrong from day one.
I absolutely agree with that statement. But please notice that the
commit date of commit c2261dd76b54 ("RDMA/device: Add
ib_device_set_netdev() as an alternative to get_netdev") predates the
commit date of commit 54903572c23c ("net/smc: allow pnetid-less
configuration") only by 9 days. And before commit c2261dd76b54
("RDMA/device: Add ib_device_set_netdev() as an alternative to
get_netdev") there was no
ib_device_get_netdev() AFAICT.
Maybe the two patches crossed mid air so to say.
>
> > I can rewrite the commit message.
> >
> > What about:
> > "
> > The SMC-R variant of the SMC protocol still called
> > ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device
> > driver to run SMC-R, it failed to find a device, because in mlx5_ib
> > the internal net device management for retrieving net devices was
> > replaced by a common interface ib_device_get_netdev() in commit
> > 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
> > functions"). Thus, replace ib_device_ops.get_netdev() with
> > ib_device_get_netdev() in SMC. "
>
> The SMC-R variant of the SMC protocol used direct call to
> ib_device_ops.get_netdev() function to lookup netdev. Such direct
> accesses are not correct for any usage outside of RDMA core code.
>
I agree, it is not correct since c2261dd76b54 ("RDMA/device: Add
ib_device_set_netdev() as an alternative to get_netdev").
Does fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?
I would guess it is not, and I would not actually mind sending a patch
but I have trouble figuring out the logic behind commit ecce70cf17d9
("ksmbd: fix missing RDMA-capable flag for IPoIB device in
ksmbd_rdma_capable_netdev()").
> RDMA subsystem provides ib_device_get_netdev() function that works on
> all RDMA drivers returns valid netdev with proper locking an reference
> counting. The commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and
> get_netdev functions") exposed that SMC-R didn't use that function.
>
I believe the intention was this all along. I think the commit message
was written with the idea that 54903572c23c happened before c2261dd76b54
which is not the case.
> So update the SMC-R to use proper API,
>
I believe this is exactly what the patch does! And I agree we need to
improve on the commit message.
Regards,
Halil
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-11-06 9:24 ` Halil Pasic
@ 2024-11-06 13:59 ` Leon Romanovsky
2024-11-07 11:47 ` Halil Pasic
2024-11-07 11:56 ` Halil Pasic
0 siblings, 2 replies; 25+ messages in thread
From: Leon Romanovsky @ 2024-11-06 13:59 UTC (permalink / raw)
To: Halil Pasic
Cc: Wenjia Zhang, Wen Gu, D. Wythe, Tony Lu, David Miller,
Jakub Kicinski, Eric Dumazet, Paolo Abeni, netdev, linux-rdma,
linux-s390, Heiko Carstens, Jan Karcher, Gerd Bayer,
Alexandra Winter, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K
On Wed, Nov 06, 2024 at 10:24:39AM +0100, Halil Pasic wrote:
> On Tue, 5 Nov 2024 13:23:13 +0200
> Leon Romanovsky <leon@kernel.org> wrote:
>
> > On Tue, Nov 05, 2024 at 10:50:45AM +0100, Wenjia Zhang wrote:
> > >
> > >
> > > On 27.10.24 21:18, Leon Romanovsky wrote:
> > > > On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> > > > > Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as
> > > > > an alternative to get_netdev") introduced an API
> > > > > ib_device_get_netdev. The SMC-R variant of the SMC protocol
> > > > > continued to use the old API ib_device_ops.get_netdev() to
> > > > > lookup netdev.
> > > >
> > > > I would say that calls to ibdev ops from ULPs was never been right
> > > > thing to do. The ib_device_set_netdev() was introduced for the
> > > > drivers.
> > > >
> > > > So the whole commit message is not accurate and better to be
> > > > rewritten.
> > > > > As this commit 8d159eb2117b
> > > > > ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> > > > > removed the get_netdev callback from
> > > > > mlx5_ib_dev_common_roce_ops, calling ib_device_ops.get_netdev
> > > > > didn't work any more at least by using a mlx5 device driver.
> > > >
> > > > It is not a correct statement too. All modern drivers (for last 5
> > > > years) don't have that .get_netdev() ops, so it is not mlx5
> > > > specific, but another justification to say that SMC-R was doing it
> > > > wrong.
> > > > > Thus, using ib_device_set_netdev() now became mandatory.
> > > >
> > > > ib_device_set_netdev() is mandatory for the drivers, it is nothing
> > > > to do with ULPs.
> > > >
> > > > >
> > > > > Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> > > >
> > > > It is too late for me to do proper review for today, but I would
> > > > say that it is worth to pay attention to multiple dev_put() calls
> > > > in the functions around the ib_device_get_netdev().
> > > >
> > > > >
> > > > > Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> > > > > Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and
> > > > > get_netdev functions")
> > > >
> > > > It is not related to this change Fixes line.
> > > >
> > >
> > > Hi Leon,
> > >
> > > Thank you for the review! I agree that SMC could do better. However,
> > > we should fix it and give enough information and reference on the
> > > changes, since the code has already existed and didn't work with the
> > > old way.
> >
> > The code which you change worked by chance and was wrong from day one.
>
> I absolutely agree with that statement. But please notice that the
> commit date of commit c2261dd76b54 ("RDMA/device: Add
> ib_device_set_netdev() as an alternative to get_netdev") predates the
> commit date of commit 54903572c23c ("net/smc: allow pnetid-less
> configuration") only by 9 days. And before commit c2261dd76b54
> ("RDMA/device: Add ib_device_set_netdev() as an alternative to
> get_netdev") there was no
> ib_device_get_netdev() AFAICT.
It doesn't make it right.
1. While commit c2261dd76b54 was submitted and discussed, RDMA was not CCed.
2. Author didn't try to add his version of ib_device_get_netdev() as it
is done for all APIs exposed by RDMA core.
>
> Maybe the two patches crossed mid air so to say.
>
> >
> > > I can rewrite the commit message.
> > >
> > > What about:
> > > "
> > > The SMC-R variant of the SMC protocol still called
> > > ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device
> > > driver to run SMC-R, it failed to find a device, because in mlx5_ib
> > > the internal net device management for retrieving net devices was
> > > replaced by a common interface ib_device_get_netdev() in commit
> > > 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
> > > functions"). Thus, replace ib_device_ops.get_netdev() with
> > > ib_device_get_netdev() in SMC. "
> >
> > The SMC-R variant of the SMC protocol used direct call to
> > ib_device_ops.get_netdev() function to lookup netdev. Such direct
> > accesses are not correct for any usage outside of RDMA core code.
> >
>
> I agree, it is not correct since c2261dd76b54 ("RDMA/device: Add
> ib_device_set_netdev() as an alternative to get_netdev").
>
> Does fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?
RDMA core code is drivers/infiniband/core/*.
> I would guess it is not, and I would not actually mind sending a patch
> but I have trouble figuring out the logic behind commit ecce70cf17d9
> ("ksmbd: fix missing RDMA-capable flag for IPoIB device in
> ksmbd_rdma_capable_netdev()").
It is strange version of RDMA-CM. All other ULPs use RDMA-CM to avoid
GID, netdev and fabric complexity.
>
>
> > RDMA subsystem provides ib_device_get_netdev() function that works on
> > all RDMA drivers returns valid netdev with proper locking an reference
> > counting. The commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and
> > get_netdev functions") exposed that SMC-R didn't use that function.
> >
>
> I believe the intention was this all along. I think the commit message
> was written with the idea that 54903572c23c happened before c2261dd76b54
> which is not the case.
>
> > So update the SMC-R to use proper API,
> >
>
> I believe this is exactly what the patch does! And I agree we need to
> improve on the commit message.
>
> Regards,
> Halil
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-11-06 13:59 ` Leon Romanovsky
@ 2024-11-07 11:47 ` Halil Pasic
2024-11-07 12:08 ` Leon Romanovsky
2024-11-07 11:56 ` Halil Pasic
1 sibling, 1 reply; 25+ messages in thread
From: Halil Pasic @ 2024-11-07 11:47 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Wenjia Zhang, Wen Gu, D. Wythe, Tony Lu, David Miller,
Jakub Kicinski, Eric Dumazet, Paolo Abeni, netdev, linux-rdma,
linux-s390, Heiko Carstens, Jan Karcher, Gerd Bayer,
Alexandra Winter, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K, Halil Pasic
On Wed, 6 Nov 2024 15:59:10 +0200
Leon Romanovsky <leon@kernel.org> wrote:
> > I absolutely agree with that statement. But please notice that the
> > commit date of commit c2261dd76b54 ("RDMA/device: Add
> > ib_device_set_netdev() as an alternative to get_netdev") predates the
> > commit date of commit 54903572c23c ("net/smc: allow pnetid-less
> > configuration") only by 9 days. And before commit c2261dd76b54
> > ("RDMA/device: Add ib_device_set_netdev() as an alternative to
> > get_netdev") there was no
> > ib_device_get_netdev() AFAICT.
>
> It doesn't make it right.
I agree!
>
> 1. While commit c2261dd76b54 was submitted and discussed, RDMA was not
> CCed.
Would the RDMA community agree with adding
L: linux-rdma@vger.kernel.org
to the "SHARED MEMORY COMMUNICATIONS (SMC) SOCKETS" section of the
MAINTAINERS file, so that get_maintainer.pl tells contributors to cc
RDMA?
In my personal opinion SMC would have benefited greatly from review by
the RDMA community, and this is not the first time where the RDMA
community was not included where it should have been.
> 2. Author didn't try to add his version of ib_device_get_netdev() as it
> is done for all APIs exposed by RDMA core.
I understand now that direct access to ops callbacks is off limits for
ULPs. I'm not sure I understand all the details, but I hope I don't have
to.
Regards,
Halil
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-11-06 13:59 ` Leon Romanovsky
2024-11-07 11:47 ` Halil Pasic
@ 2024-11-07 11:56 ` Halil Pasic
2024-11-07 12:13 ` Leon Romanovsky
2024-11-07 23:40 ` Namjae Jeon
1 sibling, 2 replies; 25+ messages in thread
From: Halil Pasic @ 2024-11-07 11:56 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Wenjia Zhang, Wen Gu, D. Wythe, Tony Lu, David Miller,
Jakub Kicinski, Eric Dumazet, Paolo Abeni, netdev, linux-rdma,
linux-s390, Heiko Carstens, Jan Karcher, Gerd Bayer,
Alexandra Winter, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K, Halil Pasic, linux-cifs
On Wed, 6 Nov 2024 15:59:10 +0200
Leon Romanovsky <leon@kernel.org> wrote:
> > Does fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?
>
> RDMA core code is drivers/infiniband/core/*.
Understood. So this is a violation of the no direct access to the
callbacks rule.
>
> > I would guess it is not, and I would not actually mind sending a patch
> > but I have trouble figuring out the logic behind commit ecce70cf17d9
> > ("ksmbd: fix missing RDMA-capable flag for IPoIB device in
> > ksmbd_rdma_capable_netdev()").
>
> It is strange version of RDMA-CM. All other ULPs use RDMA-CM to avoid
> GID, netdev and fabric complexity.
I'm not familiar enough with either of the subsystems. Based on your
answer my guess is that it ain't outright bugous but still a layering
violation. Copying linux-cifs@vger.kernel.org so that
the smb are aware.
Thank you very much for all the explanations!
Regards,
Halil
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-11-07 11:47 ` Halil Pasic
@ 2024-11-07 12:08 ` Leon Romanovsky
0 siblings, 0 replies; 25+ messages in thread
From: Leon Romanovsky @ 2024-11-07 12:08 UTC (permalink / raw)
To: Halil Pasic
Cc: Wenjia Zhang, Wen Gu, D. Wythe, Tony Lu, David Miller,
Jakub Kicinski, Eric Dumazet, Paolo Abeni, netdev, linux-rdma,
linux-s390, Heiko Carstens, Jan Karcher, Gerd Bayer,
Alexandra Winter, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K
On Thu, Nov 07, 2024 at 12:47:11PM +0100, Halil Pasic wrote:
> On Wed, 6 Nov 2024 15:59:10 +0200
> Leon Romanovsky <leon@kernel.org> wrote:
>
> > > I absolutely agree with that statement. But please notice that the
> > > commit date of commit c2261dd76b54 ("RDMA/device: Add
> > > ib_device_set_netdev() as an alternative to get_netdev") predates the
> > > commit date of commit 54903572c23c ("net/smc: allow pnetid-less
> > > configuration") only by 9 days. And before commit c2261dd76b54
> > > ("RDMA/device: Add ib_device_set_netdev() as an alternative to
> > > get_netdev") there was no
> > > ib_device_get_netdev() AFAICT.
> >
> > It doesn't make it right.
>
> I agree!
> >
> > 1. While commit c2261dd76b54 was submitted and discussed, RDMA was not
> > CCed.
>
> Would the RDMA community agree with adding
> L: linux-rdma@vger.kernel.org
> to the "SHARED MEMORY COMMUNICATIONS (SMC) SOCKETS" section of the
> MAINTAINERS file, so that get_maintainer.pl tells contributors to cc
> RDMA?
Yes, of course. We always curious to see how our in-kernel API works and
if it needs adjustments rather than ULP hacks to overcome its limitations.
>
> In my personal opinion SMC would have benefited greatly from review by
> the RDMA community, and this is not the first time where the RDMA
> community was not included where it should have been.
Jakub pushes SMC authors to CC RDMA, which is great, but it wasn't in
the past and your idea of adding new entry to MAINTAINERS file will
help.
>
> > 2. Author didn't try to add his version of ib_device_get_netdev() as it
> > is done for all APIs exposed by RDMA core.
>
> I understand now that direct access to ops callbacks is off limits for
> ULPs. I'm not sure I understand all the details, but I hope I don't have
> to.
>
> Regards,
> Halil
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-11-07 11:56 ` Halil Pasic
@ 2024-11-07 12:13 ` Leon Romanovsky
2024-11-07 23:40 ` Namjae Jeon
1 sibling, 0 replies; 25+ messages in thread
From: Leon Romanovsky @ 2024-11-07 12:13 UTC (permalink / raw)
To: Halil Pasic
Cc: Wenjia Zhang, Wen Gu, D. Wythe, Tony Lu, David Miller,
Jakub Kicinski, Eric Dumazet, Paolo Abeni, netdev, linux-rdma,
linux-s390, Heiko Carstens, Jan Karcher, Gerd Bayer,
Alexandra Winter, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K, linux-cifs
On Thu, Nov 07, 2024 at 12:56:43PM +0100, Halil Pasic wrote:
> On Wed, 6 Nov 2024 15:59:10 +0200
> Leon Romanovsky <leon@kernel.org> wrote:
>
> > > Does fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?
> >
> > RDMA core code is drivers/infiniband/core/*.
>
> Understood. So this is a violation of the no direct access to the
> callbacks rule.
It is not rule, but more common sense. Callbacks don't provide any
module reference counting, module autoload e.t.c
It is very rare situation where you call device callbacks from one subsystem
in another. I'm not familiar with such situations.
>
> >
> > > I would guess it is not, and I would not actually mind sending a patch
> > > but I have trouble figuring out the logic behind commit ecce70cf17d9
> > > ("ksmbd: fix missing RDMA-capable flag for IPoIB device in
> > > ksmbd_rdma_capable_netdev()").
> >
> > It is strange version of RDMA-CM. All other ULPs use RDMA-CM to avoid
> > GID, netdev and fabric complexity.
>
> I'm not familiar enough with either of the subsystems. Based on your
> answer my guess is that it ain't outright bugous but still a layering
> violation. Copying linux-cifs@vger.kernel.org so that
> the smb are aware.
>
> Thank you very much for all the explanations!
>
> Regards,
> Halil
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-11-07 11:56 ` Halil Pasic
2024-11-07 12:13 ` Leon Romanovsky
@ 2024-11-07 23:40 ` Namjae Jeon
2024-11-08 17:59 ` Leon Romanovsky
1 sibling, 1 reply; 25+ messages in thread
From: Namjae Jeon @ 2024-11-07 23:40 UTC (permalink / raw)
To: Halil Pasic
Cc: Leon Romanovsky, Wenjia Zhang, Wen Gu, D. Wythe, Tony Lu,
David Miller, Jakub Kicinski, Eric Dumazet, Paolo Abeni, netdev,
linux-rdma, linux-s390, Heiko Carstens, Jan Karcher, Gerd Bayer,
Alexandra Winter, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K, linux-cifs, Kangjing Huang
On Thu, Nov 7, 2024 at 9:00 PM Halil Pasic <pasic@linux.ibm.com> wrote:
>
> On Wed, 6 Nov 2024 15:59:10 +0200
> Leon Romanovsky <leon@kernel.org> wrote:
>
> > > Does fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?
> >
> > RDMA core code is drivers/infiniband/core/*.
>
> Understood. So this is a violation of the no direct access to the
> callbacks rule.
>
> >
> > > I would guess it is not, and I would not actually mind sending a patch
> > > but I have trouble figuring out the logic behind commit ecce70cf17d9
> > > ("ksmbd: fix missing RDMA-capable flag for IPoIB device in
> > > ksmbd_rdma_capable_netdev()").
> >
> > It is strange version of RDMA-CM. All other ULPs use RDMA-CM to avoid
> > GID, netdev and fabric complexity.
>
> I'm not familiar enough with either of the subsystems. Based on your
> answer my guess is that it ain't outright bugous but still a layering
> violation. Copying linux-cifs@vger.kernel.org so that
> the smb are aware.
Could you please elaborate what the violation is ?
I would also appreciate it if you could suggest to me how to fix this.
Thanks.
>
> Thank you very much for all the explanations!
>
> Regards,
> Halil
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-11-07 23:40 ` Namjae Jeon
@ 2024-11-08 17:59 ` Leon Romanovsky
2024-11-09 5:32 ` Namjae Jeon
0 siblings, 1 reply; 25+ messages in thread
From: Leon Romanovsky @ 2024-11-08 17:59 UTC (permalink / raw)
To: Namjae Jeon
Cc: Halil Pasic, Wenjia Zhang, Wen Gu, D. Wythe, Tony Lu,
David Miller, Jakub Kicinski, Eric Dumazet, Paolo Abeni, netdev,
linux-rdma, linux-s390, Heiko Carstens, Jan Karcher, Gerd Bayer,
Alexandra Winter, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K, linux-cifs, Kangjing Huang
On Fri, Nov 08, 2024 at 08:40:40AM +0900, Namjae Jeon wrote:
> On Thu, Nov 7, 2024 at 9:00 PM Halil Pasic <pasic@linux.ibm.com> wrote:
> >
> > On Wed, 6 Nov 2024 15:59:10 +0200
> > Leon Romanovsky <leon@kernel.org> wrote:
> >
> > > > Does fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?
> > >
> > > RDMA core code is drivers/infiniband/core/*.
> >
> > Understood. So this is a violation of the no direct access to the
> > callbacks rule.
> >
> > >
> > > > I would guess it is not, and I would not actually mind sending a patch
> > > > but I have trouble figuring out the logic behind commit ecce70cf17d9
> > > > ("ksmbd: fix missing RDMA-capable flag for IPoIB device in
> > > > ksmbd_rdma_capable_netdev()").
> > >
> > > It is strange version of RDMA-CM. All other ULPs use RDMA-CM to avoid
> > > GID, netdev and fabric complexity.
> >
> > I'm not familiar enough with either of the subsystems. Based on your
> > answer my guess is that it ain't outright bugous but still a layering
> > violation. Copying linux-cifs@vger.kernel.org so that
> > the smb are aware.
> Could you please elaborate what the violation is ?
There are many, but the most screaming is that ksmbd has logic to
differentiate IPoIB devices. These devices are pure netdev devices
and should be treated like that. ULPs should treat them exactly
as they treat netdev devices.
> I would also appreciate it if you could suggest to me how to fix this.
>
> Thanks.
> >
> > Thank you very much for all the explanations!
> >
> > Regards,
> > Halil
> >
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()
2024-11-08 17:59 ` Leon Romanovsky
@ 2024-11-09 5:32 ` Namjae Jeon
0 siblings, 0 replies; 25+ messages in thread
From: Namjae Jeon @ 2024-11-09 5:32 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Halil Pasic, Wenjia Zhang, Wen Gu, D. Wythe, Tony Lu,
David Miller, Jakub Kicinski, Eric Dumazet, Paolo Abeni, netdev,
linux-rdma, linux-s390, Heiko Carstens, Jan Karcher, Gerd Bayer,
Alexandra Winter, Nils Hoppmann, Niklas Schnell, Thorsten Winkler,
Karsten Graul, Stefan Raspl, Aswin K, linux-cifs, Kangjing Huang
On Sat, Nov 9, 2024 at 2:59 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Fri, Nov 08, 2024 at 08:40:40AM +0900, Namjae Jeon wrote:
> > On Thu, Nov 7, 2024 at 9:00 PM Halil Pasic <pasic@linux.ibm.com> wrote:
> > >
> > > On Wed, 6 Nov 2024 15:59:10 +0200
> > > Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > > > Does fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?
> > > >
> > > > RDMA core code is drivers/infiniband/core/*.
> > >
> > > Understood. So this is a violation of the no direct access to the
> > > callbacks rule.
> > >
> > > >
> > > > > I would guess it is not, and I would not actually mind sending a patch
> > > > > but I have trouble figuring out the logic behind commit ecce70cf17d9
> > > > > ("ksmbd: fix missing RDMA-capable flag for IPoIB device in
> > > > > ksmbd_rdma_capable_netdev()").
> > > >
> > > > It is strange version of RDMA-CM. All other ULPs use RDMA-CM to avoid
> > > > GID, netdev and fabric complexity.
> > >
> > > I'm not familiar enough with either of the subsystems. Based on your
> > > answer my guess is that it ain't outright bugous but still a layering
> > > violation. Copying linux-cifs@vger.kernel.org so that
> > > the smb are aware.
> > Could you please elaborate what the violation is ?
>
> There are many, but the most screaming is that ksmbd has logic to
> differentiate IPoIB devices. These devices are pure netdev devices
> and should be treated like that. ULPs should treat them exactly
> as they treat netdev devices.
Okay, I'll discuss with Kangjing if there's another way to avoid this issue.
If not, I'll revert the patch.
Thanks.
>
> > I would also appreciate it if you could suggest to me how to fix this.
> >
> > Thanks.
> > >
> > > Thank you very much for all the explanations!
> > >
> > > Regards,
> > > Halil
> > >
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2024-11-09 5:32 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-25 7:23 [PATCH net] net/smc: Fix lookup of netdev by using ib_device_get_netdev() Wenjia Zhang
2024-10-25 8:57 ` Halil Pasic
2024-10-25 14:01 ` Simon Horman
2024-10-26 0:42 ` Dust Li
2024-10-27 11:18 ` Wen Gu
2024-10-27 19:28 ` Zhu Yanjun
2024-10-27 20:18 ` Leon Romanovsky
2024-10-27 20:30 ` Leon Romanovsky
2024-11-05 9:50 ` Wenjia Zhang
2024-11-05 11:23 ` Leon Romanovsky
2024-11-05 12:30 ` Wenjia Zhang
2024-11-05 13:39 ` Leon Romanovsky
2024-11-05 14:14 ` Wenjia Zhang
2024-11-06 9:24 ` Halil Pasic
2024-11-06 13:59 ` Leon Romanovsky
2024-11-07 11:47 ` Halil Pasic
2024-11-07 12:08 ` Leon Romanovsky
2024-11-07 11:56 ` Halil Pasic
2024-11-07 12:13 ` Leon Romanovsky
2024-11-07 23:40 ` Namjae Jeon
2024-11-08 17:59 ` Leon Romanovsky
2024-11-09 5:32 ` Namjae Jeon
2024-10-29 8:43 ` D. Wythe
2024-10-31 10:01 ` Paolo Abeni
2024-11-05 9:53 ` Wenjia Zhang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).