* [PATCH net-next 0/4] Make SMC-R can work with rxe devices
@ 2024-08-09 8:31 Liu Jian
2024-08-09 8:31 ` [PATCH net-next 1/4] rdma/device: export ib_device_get_netdev() Liu Jian
` (5 more replies)
0 siblings, 6 replies; 20+ messages in thread
From: Liu Jian @ 2024-08-09 8:31 UTC (permalink / raw)
To: linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, jaka, alibuda, tonylu, guwen,
davem, edumazet, kuba, pabeni, liujian56
Make SMC-R can work with rxe devices. This allows us to easily test and
learn the SMC-R protocol without relying on a physical RoCE NIC.
Liu Jian (4):
rdma/device: export ib_device_get_netdev()
net/smc: use ib_device_get_netdev() helper to get netdev info
net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
RDMA/rxe: Set queue pair cur_qp_state when being queried
drivers/infiniband/core/core_priv.h | 3 ---
drivers/infiniband/core/device.c | 1 +
drivers/infiniband/sw/rxe/rxe_verbs.c | 2 ++
include/rdma/ib_verbs.h | 2 ++
net/smc/smc_ib.c | 10 +++++-----
net/smc/smc_pnet.c | 6 +-----
6 files changed, 11 insertions(+), 13 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH net-next 1/4] rdma/device: export ib_device_get_netdev()
2024-08-09 8:31 [PATCH net-next 0/4] Make SMC-R can work with rxe devices Liu Jian
@ 2024-08-09 8:31 ` Liu Jian
2024-08-09 8:31 ` [PATCH net-next 2/4] net/smc: use ib_device_get_netdev() helper to get netdev info Liu Jian
` (4 subsequent siblings)
5 siblings, 0 replies; 20+ messages in thread
From: Liu Jian @ 2024-08-09 8:31 UTC (permalink / raw)
To: linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, jaka, alibuda, tonylu, guwen,
davem, edumazet, kuba, pabeni, liujian56
Many drivers do not implement the ib_device_ops.get_netdev callback
function; for them, use the generic helper function ib_device_get_netdev()
enough. Therefore, this patch exports ib_device_get_netdev() helper allows
it to be used in other modules.
Signed-off-by: Liu Jian <liujian56@huawei.com>
---
drivers/infiniband/core/core_priv.h | 3 ---
drivers/infiniband/core/device.c | 1 +
include/rdma/ib_verbs.h | 2 ++
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index dd7715ba9fd1..693b3ded69e1 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -87,9 +87,6 @@ typedef void (*roce_netdev_callback)(struct ib_device *device, u32 port,
typedef bool (*roce_netdev_filter)(struct ib_device *device, u32 port,
struct net_device *idev, void *cookie);
-struct net_device *ib_device_get_netdev(struct ib_device *ib_dev,
- u32 port);
-
void ib_enum_roce_netdev(struct ib_device *ib_dev,
roce_netdev_filter filter,
void *filter_cookie,
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 0290aca18d26..d25923285447 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2263,6 +2263,7 @@ struct net_device *ib_device_get_netdev(struct ib_device *ib_dev,
return res;
}
+EXPORT_SYMBOL(ib_device_get_netdev);
/**
* ib_device_get_by_netdev - Find an IB device associated with a netdev
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 6c5712ae559d..0fe252c32d60 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -4453,6 +4453,8 @@ struct net_device *ib_get_net_dev_by_params(struct ib_device *dev, u32 port,
const struct sockaddr *addr);
int ib_device_set_netdev(struct ib_device *ib_dev, struct net_device *ndev,
unsigned int port);
+struct net_device *ib_device_get_netdev(struct ib_device *ib_dev,
+ u32 port);
struct ib_wq *ib_create_wq(struct ib_pd *pd,
struct ib_wq_init_attr *init_attr);
int ib_destroy_wq_user(struct ib_wq *wq, struct ib_udata *udata);
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH net-next 2/4] net/smc: use ib_device_get_netdev() helper to get netdev info
2024-08-09 8:31 [PATCH net-next 0/4] Make SMC-R can work with rxe devices Liu Jian
2024-08-09 8:31 ` [PATCH net-next 1/4] rdma/device: export ib_device_get_netdev() Liu Jian
@ 2024-08-09 8:31 ` Liu Jian
2024-08-09 14:59 ` Dust Li
2024-08-09 8:31 ` [PATCH net-next 3/4] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync() Liu Jian
` (3 subsequent siblings)
5 siblings, 1 reply; 20+ messages in thread
From: Liu Jian @ 2024-08-09 8:31 UTC (permalink / raw)
To: linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, jaka, alibuda, tonylu, guwen,
davem, edumazet, kuba, pabeni, liujian56
Currently, in the SMC protocol, network devices are obtained by calling
ib_device_ops.get_netdev(). But for some drivers, this callback function
is not implemented separately. Therefore, here I modified to use
ib_device_get_netdev() to get net_device.
For rdma devices that do not implement ib_device_ops.get_netdev(), one of
the issues addressed is as follows:
before:
smcr device
Net-Dev IB-Dev IB-P IB-State Type Crit #Links PNET-ID
rxee 1 ACTIVE 0 No 0
after:
smcr device
Net-Dev IB-Dev IB-P IB-State Type Crit #Links PNET-ID
enp1s0f1 rxee 1 ACTIVE 0 No 0
Signed-off-by: Liu Jian <liujian56@huawei.com>
---
net/smc/smc_ib.c | 8 +++-----
net/smc/smc_pnet.c | 6 +-----
2 files changed, 4 insertions(+), 10 deletions(-)
diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 9297dc20bfe2..382351ac9434 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
struct ib_device *ibdev = smcibdev->ibdev;
struct net_device *ndev;
- if (!ibdev->ops.get_netdev)
- return;
- ndev = ibdev->ops.get_netdev(ibdev, port + 1);
+ ndev = ib_device_get_netdev(ibdev, port + 1);
if (ndev) {
smcibdev->ndev_ifidx[port] = ndev->ifindex;
dev_put(ndev);
@@ -921,9 +919,9 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
port_cnt = smcibdev->ibdev->phys_port_cnt;
for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
libdev = smcibdev->ibdev;
- if (!libdev->ops.get_netdev)
+ lndev = ib_device_get_netdev(libdev, i + 1);
+ if (!lndev)
continue;
- lndev = libdev->ops.get_netdev(libdev, i + 1);
dev_put(lndev);
if (lndev != ndev)
continue;
diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
index 2adb92b8c469..a55a697a48de 100644
--- a/net/smc/smc_pnet.c
+++ b/net/smc/smc_pnet.c
@@ -1055,11 +1055,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
continue;
for (i = 1; i <= SMC_MAX_PORTS; i++) {
- if (!rdma_is_port_valid(ibdev->ibdev, i))
- continue;
- if (!ibdev->ibdev->ops.get_netdev)
- continue;
- ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
+ ndev = ib_device_get_netdev(ibdev->ibdev, i);
if (!ndev)
continue;
dev_put(ndev);
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH net-next 3/4] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2024-08-09 8:31 [PATCH net-next 0/4] Make SMC-R can work with rxe devices Liu Jian
2024-08-09 8:31 ` [PATCH net-next 1/4] rdma/device: export ib_device_get_netdev() Liu Jian
2024-08-09 8:31 ` [PATCH net-next 2/4] net/smc: use ib_device_get_netdev() helper to get netdev info Liu Jian
@ 2024-08-09 8:31 ` Liu Jian
2024-08-09 10:01 ` Wen Gu
` (2 more replies)
2024-08-09 8:31 ` [PATCH net-next 4/4] RDMA/rxe: Set queue pair cur_qp_state when being queried Liu Jian
` (2 subsequent siblings)
5 siblings, 3 replies; 20+ messages in thread
From: Liu Jian @ 2024-08-09 8:31 UTC (permalink / raw)
To: linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, jaka, alibuda, tonylu, guwen,
davem, edumazet, kuba, pabeni, liujian56
BUG: kernel NULL pointer dereference, address: 0000000000000238
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 3 PID: 289 Comm: kworker/3:1 Kdump: loaded Tainted: G OE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
Workqueue: smc_hs_wq smc_listen_work [smc]
RIP: 0010:dma_need_sync+0x5/0x60
...
Call Trace:
<TASK>
? dma_need_sync+0x5/0x60
? smc_ib_is_sg_need_sync+0x61/0xf0 [smc]
smcr_buf_map_link+0x24a/0x380 [smc]
__smc_buf_create+0x483/0xb10 [smc]
smc_buf_create+0x21/0xe0 [smc]
smc_listen_work+0xf11/0x14f0 [smc]
? smc_tcp_listen_work+0x364/0x520 [smc]
process_one_work+0x18d/0x3f0
worker_thread+0x304/0x440
kthread+0xe4/0x110
ret_from_fork+0x47/0x70
ret_from_fork_asm+0x1a/0x30
</TASK>
If the software RoCE device is used, ibdev->dma_device is a null pointer.
As a result, the problem occurs. Null pointer detection is added to
prevent problems.
Signed-off-by: Liu Jian <liujian56@huawei.com>
---
net/smc/smc_ib.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 382351ac9434..059822cc3fde 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -748,6 +748,8 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
buf_slot->sgt[lnk->link_idx].nents, i) {
if (!sg_dma_len(sg))
break;
+ if (!lnk->smcibdev->ibdev->dma_device)
+ break;
if (dma_need_sync(lnk->smcibdev->ibdev->dma_device,
sg_dma_address(sg))) {
ret = true;
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH net-next 4/4] RDMA/rxe: Set queue pair cur_qp_state when being queried
2024-08-09 8:31 [PATCH net-next 0/4] Make SMC-R can work with rxe devices Liu Jian
` (2 preceding siblings ...)
2024-08-09 8:31 ` [PATCH net-next 3/4] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync() Liu Jian
@ 2024-08-09 8:31 ` Liu Jian
2024-08-09 11:06 ` Zhu Yanjun
2024-08-11 6:05 ` [PATCH net-next 0/4] Make SMC-R can work with rxe devices Jan Karcher
2024-08-20 13:16 ` Jan Karcher
5 siblings, 1 reply; 20+ messages in thread
From: Liu Jian @ 2024-08-09 8:31 UTC (permalink / raw)
To: linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, jaka, alibuda, tonylu, guwen,
davem, edumazet, kuba, pabeni, liujian56
Same with commit e375b9c92985 ("RDMA/cxgb4: Set queue pair state when
being queried"). The API for ib_query_qp requires the driver to set
cur_qp_state on return, add the missing set.
Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Liu Jian <liujian56@huawei.com>
---
drivers/infiniband/sw/rxe/rxe_verbs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 5c18f7e342f2..699b4b315336 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -634,6 +634,8 @@ static int rxe_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
rxe_qp_to_init(qp, init);
rxe_qp_to_attr(qp, attr, mask);
+ attr->cur_qp_state = qp->attr.qp_state;
+
return 0;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 3/4] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2024-08-09 8:31 ` [PATCH net-next 3/4] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync() Liu Jian
@ 2024-08-09 10:01 ` Wen Gu
2024-08-09 14:59 ` Dust Li
2024-08-12 3:30 ` D. Wythe
2 siblings, 0 replies; 20+ messages in thread
From: Wen Gu @ 2024-08-09 10:01 UTC (permalink / raw)
To: Liu Jian, linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, jaka, alibuda, tonylu, davem,
edumazet, kuba, pabeni
On 2024/8/9 16:31, Liu Jian wrote:
> BUG: kernel NULL pointer dereference, address: 0000000000000238
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 3 PID: 289 Comm: kworker/3:1 Kdump: loaded Tainted: G OE
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> Workqueue: smc_hs_wq smc_listen_work [smc]
> RIP: 0010:dma_need_sync+0x5/0x60
> ...
> Call Trace:
> <TASK>
> ? dma_need_sync+0x5/0x60
> ? smc_ib_is_sg_need_sync+0x61/0xf0 [smc]
> smcr_buf_map_link+0x24a/0x380 [smc]
> __smc_buf_create+0x483/0xb10 [smc]
> smc_buf_create+0x21/0xe0 [smc]
> smc_listen_work+0xf11/0x14f0 [smc]
> ? smc_tcp_listen_work+0x364/0x520 [smc]
> process_one_work+0x18d/0x3f0
> worker_thread+0x304/0x440
> kthread+0xe4/0x110
> ret_from_fork+0x47/0x70
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
> If the software RoCE device is used, ibdev->dma_device is a null pointer.
> As a result, the problem occurs. Null pointer detection is added to
> prevent problems.
>
> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
> net/smc/smc_ib.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 382351ac9434..059822cc3fde 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -748,6 +748,8 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> buf_slot->sgt[lnk->link_idx].nents, i) {
> if (!sg_dma_len(sg))
> break;
> + if (!lnk->smcibdev->ibdev->dma_device)
> + break;
LGTM.
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
> if (dma_need_sync(lnk->smcibdev->ibdev->dma_device,
> sg_dma_address(sg))) {
> ret = true;
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 4/4] RDMA/rxe: Set queue pair cur_qp_state when being queried
2024-08-09 8:31 ` [PATCH net-next 4/4] RDMA/rxe: Set queue pair cur_qp_state when being queried Liu Jian
@ 2024-08-09 11:06 ` Zhu Yanjun
2024-08-11 8:31 ` Leon Romanovsky
0 siblings, 1 reply; 20+ messages in thread
From: Zhu Yanjun @ 2024-08-09 11:06 UTC (permalink / raw)
To: Liu Jian, linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, jaka, alibuda, tonylu, guwen,
davem, edumazet, kuba, pabeni
在 2024/8/9 16:31, Liu Jian 写道:
> Same with commit e375b9c92985 ("RDMA/cxgb4: Set queue pair state when
> being queried"). The API for ib_query_qp requires the driver to set
> cur_qp_state on return, add the missing set.
>
Add the following?
Cc: stable@vger.kernel.org
> Fixes: 8700e3e7c485 ("Soft RoCE driver")
> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
> drivers/infiniband/sw/rxe/rxe_verbs.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 5c18f7e342f2..699b4b315336 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -634,6 +634,8 @@ static int rxe_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
> rxe_qp_to_init(qp, init);
> rxe_qp_to_attr(qp, attr, mask);
>
> + attr->cur_qp_state = qp->attr.qp_state;
I am fine with this commit.
But I think this "attr->cur_qp_state = qp->attr.qp_state;" should be put
into this function rxe_qp_to_attr.
And the access to qp->attr.qp_state should be protected by spin lock
qp->state_lock.
So the following is better.
Any way, thanks.
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c
b/drivers/infiniband/sw/rxe/rxe_qp.c
index d2f7b5195c19..da723b9690e5 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -782,6 +782,10 @@ int rxe_qp_to_attr(struct rxe_qp *qp, struct
ib_qp_attr *attr, int mask)
spin_unlock_irqrestore(&qp->state_lock, flags);
}
+ spin_lock_irqsave(&qp->state_lock, flags);
+ attr->cur_qp_state = qp_state(qp);
+ spin_unlock_irqrestore(&qp->state_lock, flags);
+
return 0;
}
Best Regards,
Zhu Yanjun
> +
> return 0;
> }
>
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 2/4] net/smc: use ib_device_get_netdev() helper to get netdev info
2024-08-09 8:31 ` [PATCH net-next 2/4] net/smc: use ib_device_get_netdev() helper to get netdev info Liu Jian
@ 2024-08-09 14:59 ` Dust Li
2024-08-12 2:07 ` liujian (CE)
0 siblings, 1 reply; 20+ messages in thread
From: Dust Li @ 2024-08-09 14:59 UTC (permalink / raw)
To: Liu Jian, linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, jaka, alibuda, tonylu, guwen,
davem, edumazet, kuba, pabeni
On 2024-08-09 16:31:46, Liu Jian wrote:
>Currently, in the SMC protocol, network devices are obtained by calling
>ib_device_ops.get_netdev(). But for some drivers, this callback function
>is not implemented separately. Therefore, here I modified to use
>ib_device_get_netdev() to get net_device.
>
>For rdma devices that do not implement ib_device_ops.get_netdev(), one of
>the issues addressed is as follows:
>before:
>smcr device
>Net-Dev IB-Dev IB-P IB-State Type Crit #Links PNET-ID
> rxee 1 ACTIVE 0 No 0
>
>after:
>smcr device
>Net-Dev IB-Dev IB-P IB-State Type Crit #Links PNET-ID
>enp1s0f1 rxee 1 ACTIVE 0 No 0
>
>Signed-off-by: Liu Jian <liujian56@huawei.com>
>---
> net/smc/smc_ib.c | 8 +++-----
> net/smc/smc_pnet.c | 6 +-----
> 2 files changed, 4 insertions(+), 10 deletions(-)
>
>diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
>index 9297dc20bfe2..382351ac9434 100644
>--- a/net/smc/smc_ib.c
>+++ b/net/smc/smc_ib.c
>@@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
> struct ib_device *ibdev = smcibdev->ibdev;
> struct net_device *ndev;
>
>- if (!ibdev->ops.get_netdev)
>- return;
>- ndev = ibdev->ops.get_netdev(ibdev, port + 1);
>+ ndev = ib_device_get_netdev(ibdev, port + 1);
> if (ndev) {
> smcibdev->ndev_ifidx[port] = ndev->ifindex;
> dev_put(ndev);
>@@ -921,9 +919,9 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
> port_cnt = smcibdev->ibdev->phys_port_cnt;
> for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
> libdev = smcibdev->ibdev;
>- if (!libdev->ops.get_netdev)
>+ lndev = ib_device_get_netdev(libdev, i + 1);
>+ if (!lndev)
> continue;
>- lndev = libdev->ops.get_netdev(libdev, i + 1);
> dev_put(lndev);
> if (lndev != ndev)
> continue;
>diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
>index 2adb92b8c469..a55a697a48de 100644
>--- a/net/smc/smc_pnet.c
>+++ b/net/smc/smc_pnet.c
>@@ -1055,11 +1055,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
> continue;
>
> for (i = 1; i <= SMC_MAX_PORTS; i++) {
>- if (!rdma_is_port_valid(ibdev->ibdev, i))
>- continue;
Why remove this check ?
Best regard,
Dust
>- if (!ibdev->ibdev->ops.get_netdev)
>- continue;
>- ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
>+ ndev = ib_device_get_netdev(ibdev->ibdev, i);
> if (!ndev)
> continue;
> dev_put(ndev);
>--
>2.34.1
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 3/4] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2024-08-09 8:31 ` [PATCH net-next 3/4] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync() Liu Jian
2024-08-09 10:01 ` Wen Gu
@ 2024-08-09 14:59 ` Dust Li
2024-08-12 3:30 ` D. Wythe
2 siblings, 0 replies; 20+ messages in thread
From: Dust Li @ 2024-08-09 14:59 UTC (permalink / raw)
To: Liu Jian, linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, jaka, alibuda, tonylu, guwen,
davem, edumazet, kuba, pabeni
On 2024-08-09 16:31:47, Liu Jian wrote:
>BUG: kernel NULL pointer dereference, address: 0000000000000238
>PGD 0 P4D 0
>Oops: 0000 [#1] PREEMPT SMP PTI
>CPU: 3 PID: 289 Comm: kworker/3:1 Kdump: loaded Tainted: G OE
>Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
>Workqueue: smc_hs_wq smc_listen_work [smc]
>RIP: 0010:dma_need_sync+0x5/0x60
>...
>Call Trace:
> <TASK>
> ? dma_need_sync+0x5/0x60
> ? smc_ib_is_sg_need_sync+0x61/0xf0 [smc]
> smcr_buf_map_link+0x24a/0x380 [smc]
> __smc_buf_create+0x483/0xb10 [smc]
> smc_buf_create+0x21/0xe0 [smc]
> smc_listen_work+0xf11/0x14f0 [smc]
> ? smc_tcp_listen_work+0x364/0x520 [smc]
> process_one_work+0x18d/0x3f0
> worker_thread+0x304/0x440
> kthread+0xe4/0x110
> ret_from_fork+0x47/0x70
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
>If the software RoCE device is used, ibdev->dma_device is a null pointer.
>As a result, the problem occurs. Null pointer detection is added to
>prevent problems.
>
>Signed-off-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Best regard,
Dust
>---
> net/smc/smc_ib.c | 2 ++
> 1 file changed, 2 insertions(+)
>
>diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
>index 382351ac9434..059822cc3fde 100644
>--- a/net/smc/smc_ib.c
>+++ b/net/smc/smc_ib.c
>@@ -748,6 +748,8 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> buf_slot->sgt[lnk->link_idx].nents, i) {
> if (!sg_dma_len(sg))
> break;
>+ if (!lnk->smcibdev->ibdev->dma_device)
>+ break;
> if (dma_need_sync(lnk->smcibdev->ibdev->dma_device,
> sg_dma_address(sg))) {
> ret = true;
>--
>2.34.1
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 0/4] Make SMC-R can work with rxe devices
2024-08-09 8:31 [PATCH net-next 0/4] Make SMC-R can work with rxe devices Liu Jian
` (3 preceding siblings ...)
2024-08-09 8:31 ` [PATCH net-next 4/4] RDMA/rxe: Set queue pair cur_qp_state when being queried Liu Jian
@ 2024-08-11 6:05 ` Jan Karcher
2024-08-20 13:16 ` Jan Karcher
5 siblings, 0 replies; 20+ messages in thread
From: Jan Karcher @ 2024-08-11 6:05 UTC (permalink / raw)
To: Liu Jian, linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, alibuda, tonylu, guwen, davem,
edumazet, kuba, pabeni
On 09/08/2024 10:31, Liu Jian wrote:
> Make SMC-R can work with rxe devices. This allows us to easily test and
> learn the SMC-R protocol without relying on a physical RoCE NIC.
Hi Liu,
thank you for your contribution. Please give me some time to review and
test it in the next couple of days.
Thanks
- J
>
> Liu Jian (4):
> rdma/device: export ib_device_get_netdev()
> net/smc: use ib_device_get_netdev() helper to get netdev info
> net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
> RDMA/rxe: Set queue pair cur_qp_state when being queried
>
> drivers/infiniband/core/core_priv.h | 3 ---
> drivers/infiniband/core/device.c | 1 +
> drivers/infiniband/sw/rxe/rxe_verbs.c | 2 ++
> include/rdma/ib_verbs.h | 2 ++
> net/smc/smc_ib.c | 10 +++++-----
> net/smc/smc_pnet.c | 6 +-----
> 6 files changed, 11 insertions(+), 13 deletions(-)
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 4/4] RDMA/rxe: Set queue pair cur_qp_state when being queried
2024-08-09 11:06 ` Zhu Yanjun
@ 2024-08-11 8:31 ` Leon Romanovsky
[not found] ` <bb0a5de8-f2d3-4276-ada4-f788f574b798@linux.dev>
0 siblings, 1 reply; 20+ messages in thread
From: Leon Romanovsky @ 2024-08-11 8:31 UTC (permalink / raw)
To: Zhu Yanjun
Cc: Liu Jian, linux-rdma, linux-s390, netdev, jgg, zyjzyj2000, wenjia,
jaka, alibuda, tonylu, guwen, davem, edumazet, kuba, pabeni
On Fri, Aug 09, 2024 at 07:06:34PM +0800, Zhu Yanjun wrote:
> 在 2024/8/9 16:31, Liu Jian 写道:
> > Same with commit e375b9c92985 ("RDMA/cxgb4: Set queue pair state when
> > being queried"). The API for ib_query_qp requires the driver to set
> > cur_qp_state on return, add the missing set.
> >
>
> Add the following?
> Cc: stable@vger.kernel.org
There is no need to add stable tag for RXE driver. Distros are not
supporting this driver, which is used for development and not for
production.
Thanks
>
> > Fixes: 8700e3e7c485 ("Soft RoCE driver")
> > Signed-off-by: Liu Jian <liujian56@huawei.com>
> > ---
> > drivers/infiniband/sw/rxe/rxe_verbs.c | 2 ++
> > 1 file changed, 2 insertions(+)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 4/4] RDMA/rxe: Set queue pair cur_qp_state when being queried
[not found] ` <bb0a5de8-f2d3-4276-ada4-f788f574b798@linux.dev>
@ 2024-08-11 10:24 ` Zhu Yanjun
0 siblings, 0 replies; 20+ messages in thread
From: Zhu Yanjun @ 2024-08-11 10:24 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Liu Jian, linux-rdma, linux-s390, netdev, jgg, zyjzyj2000, wenjia,
jaka, alibuda, tonylu, guwen, davem, edumazet, kuba, pabeni
在 2024/8/11 18:22, Zhu Yanjun 写道:
>
>
> 在 2024/8/11 16:31, Leon Romanovsky 写道:
>> On Fri, Aug 09, 2024 at 07:06:34PM +0800, Zhu Yanjun wrote:
>>> 在 2024/8/9 16:31, Liu Jian 写道:
>>>> Same with commit e375b9c92985 ("RDMA/cxgb4: Set queue pair state when
>>>> being queried"). The API for ib_query_qp requires the driver to set
>>>> cur_qp_state on return, add the missing set.
>>>>
>>> Add the following?
>>> Cc:stable@vger.kernel.org
>> There is no need to add stable tag for RXE driver. Distros are not
>> supporting this driver, which is used for development and not for
>> production.
>
> I do not mean that this driver is supported in Distros.
>
> I mean that QP cur_qp_state is not set when being queried in rxe. In
> other rdma drivers, this is set.
>
> This should be a problem in RXE. So I suggest to add "CC:
> stable@vger.kernel.org".
>
> As such, the stable branch will backport this commit to fix this
> problem in RXE.
>
Sorry. My mistakes.
No need to add this "CC: stable@vger.kernel.org"
Zhu Yanjun
> Anyway, thanks a lot for your reporting and fixing this problem in RXE.
>
> Best Regards,
>
> Zhu Yanjun
>
>> Thanks
>>
>>>> Fixes: 8700e3e7c485 ("Soft RoCE driver")
>>>> Signed-off-by: Liu Jian<liujian56@huawei.com>
>>>> ---
>>>> drivers/infiniband/sw/rxe/rxe_verbs.c | 2 ++
>>>> 1 file changed, 2 insertions(+)
> --
> Best Regards,
> Yanjun.Zhu
--
Best Regards,
Yanjun.Zhu
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 2/4] net/smc: use ib_device_get_netdev() helper to get netdev info
2024-08-09 14:59 ` Dust Li
@ 2024-08-12 2:07 ` liujian (CE)
2024-08-12 2:43 ` Dust Li
0 siblings, 1 reply; 20+ messages in thread
From: liujian (CE) @ 2024-08-12 2:07 UTC (permalink / raw)
To: dust.li, linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, jaka, alibuda, tonylu, guwen,
davem, edumazet, kuba, pabeni
在 2024/8/9 22:59, Dust Li 写道:
> On 2024-08-09 16:31:46, Liu Jian wrote:
>> Currently, in the SMC protocol, network devices are obtained by calling
>> ib_device_ops.get_netdev(). But for some drivers, this callback function
>> is not implemented separately. Therefore, here I modified to use
>> ib_device_get_netdev() to get net_device.
>>
>> For rdma devices that do not implement ib_device_ops.get_netdev(), one of
>> the issues addressed is as follows:
>> before:
>> smcr device
>> Net-Dev IB-Dev IB-P IB-State Type Crit #Links PNET-ID
>> rxee 1 ACTIVE 0 No 0
>>
>> after:
>> smcr device
>> Net-Dev IB-Dev IB-P IB-State Type Crit #Links PNET-ID
>> enp1s0f1 rxee 1 ACTIVE 0 No 0
>>
>> Signed-off-by: Liu Jian <liujian56@huawei.com>
>> ---
>> net/smc/smc_ib.c | 8 +++-----
>> net/smc/smc_pnet.c | 6 +-----
>> 2 files changed, 4 insertions(+), 10 deletions(-)
>>
>> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
>> index 9297dc20bfe2..382351ac9434 100644
>> --- a/net/smc/smc_ib.c
>> +++ b/net/smc/smc_ib.c
>> @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
>> struct ib_device *ibdev = smcibdev->ibdev;
>> struct net_device *ndev;
>>
>> - if (!ibdev->ops.get_netdev)
>> - return;
>> - ndev = ibdev->ops.get_netdev(ibdev, port + 1);
>> + ndev = ib_device_get_netdev(ibdev, port + 1);
>> if (ndev) {
>> smcibdev->ndev_ifidx[port] = ndev->ifindex;
>> dev_put(ndev);
>> @@ -921,9 +919,9 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
>> port_cnt = smcibdev->ibdev->phys_port_cnt;
>> for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
>> libdev = smcibdev->ibdev;
>> - if (!libdev->ops.get_netdev)
>> + lndev = ib_device_get_netdev(libdev, i + 1);
>> + if (!lndev)
>> continue;
>> - lndev = libdev->ops.get_netdev(libdev, i + 1);
>> dev_put(lndev);
>> if (lndev != ndev)
>> continue;
>> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
>> index 2adb92b8c469..a55a697a48de 100644
>> --- a/net/smc/smc_pnet.c
>> +++ b/net/smc/smc_pnet.c
>> @@ -1055,11 +1055,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
>> continue;
>>
>> for (i = 1; i <= SMC_MAX_PORTS; i++) {
>> - if (!rdma_is_port_valid(ibdev->ibdev, i))
>> - continue;
>
> Why remove this check ?
>
Hi, Dust,
The same check is already in ib_device_get_netdev().
> Best regard,
> Dust
>
>
>> - if (!ibdev->ibdev->ops.get_netdev)
>> - continue;
>> - ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
>> + ndev = ib_device_get_netdev(ibdev->ibdev, i);
>> if (!ndev)
>> continue;
>> dev_put(ndev);
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 2/4] net/smc: use ib_device_get_netdev() helper to get netdev info
2024-08-12 2:07 ` liujian (CE)
@ 2024-08-12 2:43 ` Dust Li
0 siblings, 0 replies; 20+ messages in thread
From: Dust Li @ 2024-08-12 2:43 UTC (permalink / raw)
To: liujian (CE), linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, jaka, alibuda, tonylu, guwen,
davem, edumazet, kuba, pabeni
On 2024-08-12 10:07:42, liujian (CE) wrote:
>
>
>在 2024/8/9 22:59, Dust Li 写道:
>> On 2024-08-09 16:31:46, Liu Jian wrote:
>> > Currently, in the SMC protocol, network devices are obtained by calling
>> > ib_device_ops.get_netdev(). But for some drivers, this callback function
>> > is not implemented separately. Therefore, here I modified to use
>> > ib_device_get_netdev() to get net_device.
>> >
>> > For rdma devices that do not implement ib_device_ops.get_netdev(), one of
>> > the issues addressed is as follows:
>> > before:
>> > smcr device
>> > Net-Dev IB-Dev IB-P IB-State Type Crit #Links PNET-ID
>> > rxee 1 ACTIVE 0 No 0
>> >
>> > after:
>> > smcr device
>> > Net-Dev IB-Dev IB-P IB-State Type Crit #Links PNET-ID
>> > enp1s0f1 rxee 1 ACTIVE 0 No 0
>> >
>> > Signed-off-by: Liu Jian <liujian56@huawei.com>
>> > ---
>> > net/smc/smc_ib.c | 8 +++-----
>> > net/smc/smc_pnet.c | 6 +-----
>> > 2 files changed, 4 insertions(+), 10 deletions(-)
>> >
>> > diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
>> > index 9297dc20bfe2..382351ac9434 100644
>> > --- a/net/smc/smc_ib.c
>> > +++ b/net/smc/smc_ib.c
>> > @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
>> > struct ib_device *ibdev = smcibdev->ibdev;
>> > struct net_device *ndev;
>> >
>> > - if (!ibdev->ops.get_netdev)
>> > - return;
>> > - ndev = ibdev->ops.get_netdev(ibdev, port + 1);
>> > + ndev = ib_device_get_netdev(ibdev, port + 1);
>> > if (ndev) {
>> > smcibdev->ndev_ifidx[port] = ndev->ifindex;
>> > dev_put(ndev);
>> > @@ -921,9 +919,9 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
>> > port_cnt = smcibdev->ibdev->phys_port_cnt;
>> > for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
>> > libdev = smcibdev->ibdev;
>> > - if (!libdev->ops.get_netdev)
>> > + lndev = ib_device_get_netdev(libdev, i + 1);
>> > + if (!lndev)
>> > continue;
>> > - lndev = libdev->ops.get_netdev(libdev, i + 1);
>> > dev_put(lndev);
>> > if (lndev != ndev)
>> > continue;
>> > diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
>> > index 2adb92b8c469..a55a697a48de 100644
>> > --- a/net/smc/smc_pnet.c
>> > +++ b/net/smc/smc_pnet.c
>> > @@ -1055,11 +1055,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
>> > continue;
>> >
>> > for (i = 1; i <= SMC_MAX_PORTS; i++) {
>> > - if (!rdma_is_port_valid(ibdev->ibdev, i))
>> > - continue;
>>
>> Why remove this check ?
>>
>Hi, Dust,
>The same check is already in ib_device_get_netdev().
Oh I see, thanks !
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
>> Best regard,
>> Dust
>>
>>
>> > - if (!ibdev->ibdev->ops.get_netdev)
>> > - continue;
>> > - ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
>> > + ndev = ib_device_get_netdev(ibdev->ibdev, i);
>> > if (!ndev)
>> > continue;
>> > dev_put(ndev);
>> > --
>> > 2.34.1
>> >
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 3/4] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2024-08-09 8:31 ` [PATCH net-next 3/4] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync() Liu Jian
2024-08-09 10:01 ` Wen Gu
2024-08-09 14:59 ` Dust Li
@ 2024-08-12 3:30 ` D. Wythe
2 siblings, 0 replies; 20+ messages in thread
From: D. Wythe @ 2024-08-12 3:30 UTC (permalink / raw)
To: Liu Jian, linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, jaka, tonylu, guwen, davem,
edumazet, kuba, pabeni
On 8/9/24 4:31 PM, Liu Jian wrote:
> BUG: kernel NULL pointer dereference, address: 0000000000000238
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 3 PID: 289 Comm: kworker/3:1 Kdump: loaded Tainted: G OE
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> Workqueue: smc_hs_wq smc_listen_work [smc]
> RIP: 0010:dma_need_sync+0x5/0x60
> ...
> Call Trace:
> <TASK>
> ? dma_need_sync+0x5/0x60
> ? smc_ib_is_sg_need_sync+0x61/0xf0 [smc]
> smcr_buf_map_link+0x24a/0x380 [smc]
> __smc_buf_create+0x483/0xb10 [smc]
> smc_buf_create+0x21/0xe0 [smc]
> smc_listen_work+0xf11/0x14f0 [smc]
> ? smc_tcp_listen_work+0x364/0x520 [smc]
> process_one_work+0x18d/0x3f0
> worker_thread+0x304/0x440
> kthread+0xe4/0x110
> ret_from_fork+0x47/0x70
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
> If the software RoCE device is used, ibdev->dma_device is a null pointer.
> As a result, the problem occurs. Null pointer detection is added to
> prevent problems.
>
> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
> net/smc/smc_ib.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 382351ac9434..059822cc3fde 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -748,6 +748,8 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> buf_slot->sgt[lnk->link_idx].nents, i) {
> if (!sg_dma_len(sg))
> break;
> + if (!lnk->smcibdev->ibdev->dma_device)
> + break;
> if (dma_need_sync(lnk->smcibdev->ibdev->dma_device,
> sg_dma_address(sg))) {
> ret = true;
Maybe you need add a fix tag ?
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 0/4] Make SMC-R can work with rxe devices
2024-08-09 8:31 [PATCH net-next 0/4] Make SMC-R can work with rxe devices Liu Jian
` (4 preceding siblings ...)
2024-08-11 6:05 ` [PATCH net-next 0/4] Make SMC-R can work with rxe devices Jan Karcher
@ 2024-08-20 13:16 ` Jan Karcher
2024-08-21 1:03 ` Dust Li
5 siblings, 1 reply; 20+ messages in thread
From: Jan Karcher @ 2024-08-20 13:16 UTC (permalink / raw)
To: Liu Jian, linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, alibuda, tonylu, guwen, davem,
edumazet, kuba, pabeni
On 09/08/2024 10:31, Liu Jian wrote:
> Make SMC-R can work with rxe devices. This allows us to easily test and
> learn the SMC-R protocol without relying on a physical RoCE NIC.
Hi Liu,
sorry for taking quite some time to answer.
Looking into this i cannot accept this series at the given point of time.
FWIU, RXE is mainly for testing and development and i agree that it
would be a nice thing to have for SMC-R.
The problem is that there is no clean layer for different RoCE devices
currently. Adding RXE to it works but isn't clean.
Also we have no way to do a "test" build which would have such a device
supported and a "prod" build which would not support it.
Please give us time to investigate how to solve this in a neat way
without building up to much technical debt.
Thanks for your contribution and making us aware of this area of improvment.
- Jan
>
> Liu Jian (4):
> rdma/device: export ib_device_get_netdev()
> net/smc: use ib_device_get_netdev() helper to get netdev info
> net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
> RDMA/rxe: Set queue pair cur_qp_state when being queried
>
> drivers/infiniband/core/core_priv.h | 3 ---
> drivers/infiniband/core/device.c | 1 +
> drivers/infiniband/sw/rxe/rxe_verbs.c | 2 ++
> include/rdma/ib_verbs.h | 2 ++
> net/smc/smc_ib.c | 10 +++++-----
> net/smc/smc_pnet.c | 6 +-----
> 6 files changed, 11 insertions(+), 13 deletions(-)
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 0/4] Make SMC-R can work with rxe devices
2024-08-20 13:16 ` Jan Karcher
@ 2024-08-21 1:03 ` Dust Li
2024-08-24 10:04 ` liujian (CE)
0 siblings, 1 reply; 20+ messages in thread
From: Dust Li @ 2024-08-21 1:03 UTC (permalink / raw)
To: Jan Karcher, Liu Jian, linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, alibuda, tonylu, guwen, davem,
edumazet, kuba, pabeni
On 2024-08-20 15:16:57, Jan Karcher wrote:
>
>
>On 09/08/2024 10:31, Liu Jian wrote:
>> Make SMC-R can work with rxe devices. This allows us to easily test and
>> learn the SMC-R protocol without relying on a physical RoCE NIC.
>
>Hi Liu,
>
>sorry for taking quite some time to answer.
>
>Looking into this i cannot accept this series at the given point of time.
>
>FWIU, RXE is mainly for testing and development and i agree that it would be
>a nice thing to have for SMC-R.
>The problem is that there is no clean layer for different RoCE devices
>currently. Adding RXE to it works but isn't clean.
Hi jan,
>Also we have no way to do a "test" build which would have such a device
>supported and a "prod" build which would not support it.
I don't quite understand what you mean here, Maybe I missed something ?
IIUC, we can control whether to use RXE by simpling insmod or rmmod rdma_rxe.ko
I believe having RXE support is beneficial for testing, especially in
simple physical networking setups where many corner cases are unlikely
to occur. By using RXE, we can easily configure unusual scenarios with
the existing iptables/netfilter infrastructure to simulate real-world
situations, such as packet dropping or network retransmission. This
approach can be advantageous for finding hidden bugs.
Best regards,
Dust
>
>Please give us time to investigate how to solve this in a neat way without
>building up to much technical debt.
>
>Thanks for your contribution and making us aware of this area of improvment.
>- Jan
>
>>
>> Liu Jian (4):
>> rdma/device: export ib_device_get_netdev()
>> net/smc: use ib_device_get_netdev() helper to get netdev info
>> net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
>> RDMA/rxe: Set queue pair cur_qp_state when being queried
>>
>> drivers/infiniband/core/core_priv.h | 3 ---
>> drivers/infiniband/core/device.c | 1 +
>> drivers/infiniband/sw/rxe/rxe_verbs.c | 2 ++
>> include/rdma/ib_verbs.h | 2 ++
>> net/smc/smc_ib.c | 10 +++++-----
>> net/smc/smc_pnet.c | 6 +-----
>> 6 files changed, 11 insertions(+), 13 deletions(-)
>>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 0/4] Make SMC-R can work with rxe devices
2024-08-21 1:03 ` Dust Li
@ 2024-08-24 10:04 ` liujian (CE)
2024-08-26 19:04 ` Jan Karcher
0 siblings, 1 reply; 20+ messages in thread
From: liujian (CE) @ 2024-08-24 10:04 UTC (permalink / raw)
To: dust.li, Jan Karcher, linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, alibuda, tonylu, guwen, davem,
edumazet, kuba, pabeni
在 2024/8/21 9:03, Dust Li 写道:
> On 2024-08-20 15:16:57, Jan Karcher wrote:
>>
>>
>> On 09/08/2024 10:31, Liu Jian wrote:
>>> Make SMC-R can work with rxe devices. This allows us to easily test and
>>> learn the SMC-R protocol without relying on a physical RoCE NIC.
>>
>> Hi Liu,
>>
>> sorry for taking quite some time to answer.
>>
>> Looking into this i cannot accept this series at the given point of time.
>>
>> FWIU, RXE is mainly for testing and development and i agree that it would be
>> a nice thing to have for SMC-R.
>> The problem is that there is no clean layer for different RoCE devices
>> currently. Adding RXE to it works but isn't clean.
>
> Hi jan,
>
>> Also we have no way to do a "test" build which would have such a device
>> supported and a "prod" build which would not support it.
> > I don't quite understand what you mean here, Maybe I missed something ?
> IIUC, we can control whether to use RXE by simpling insmod or rmmod rdma_rxe.ko
>
Yes, in the "prod" environment, we can completely turn off CONFIG_RDMA_RXE.
> I believe having RXE support is beneficial for testing, especially in
> simple physical networking setups where many corner cases are unlikely
> to occur. By using RXE, we can easily configure unusual scenarios with
> the existing iptables/netfilter infrastructure to simulate real-world
> situations, such as packet dropping or network retransmission. This
> approach can be advantageous for finding hidden bugs.
>
Yes, one of my main original intentions was to make testing smc-r
easier. This change is relatively simple, mainly patch2 and patch4, and
there are no logical changes.
> Best regards,
> Dust
>
>
>>
>> Please give us time to investigate how to solve this in a neat way without
>> building up to much technical debt.
>>
>> Thanks for your contribution and making us aware of this area of improvment.
>> - Jan
>>
>>>
>>> Liu Jian (4):
>>> rdma/device: export ib_device_get_netdev()
>>> net/smc: use ib_device_get_netdev() helper to get netdev info
>>> net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
>>> RDMA/rxe: Set queue pair cur_qp_state when being queried
>>>
>>> drivers/infiniband/core/core_priv.h | 3 ---
>>> drivers/infiniband/core/device.c | 1 +
>>> drivers/infiniband/sw/rxe/rxe_verbs.c | 2 ++
>>> include/rdma/ib_verbs.h | 2 ++
>>> net/smc/smc_ib.c | 10 +++++-----
>>> net/smc/smc_pnet.c | 6 +-----
>>> 6 files changed, 11 insertions(+), 13 deletions(-)
>>>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 0/4] Make SMC-R can work with rxe devices
2024-08-24 10:04 ` liujian (CE)
@ 2024-08-26 19:04 ` Jan Karcher
2024-08-28 10:29 ` liujian (CE)
0 siblings, 1 reply; 20+ messages in thread
From: Jan Karcher @ 2024-08-26 19:04 UTC (permalink / raw)
To: liujian (CE), dust.li, linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, alibuda, tonylu, guwen, davem,
edumazet, kuba, pabeni
On 24/08/2024 12:04, liujian (CE) wrote:
>
>
> 在 2024/8/21 9:03, Dust Li 写道:
>> On 2024-08-20 15:16:57, Jan Karcher wrote:
>>>
>>>
>>> On 09/08/2024 10:31, Liu Jian wrote:
>>>> Make SMC-R can work with rxe devices. This allows us to easily test and
>>>> learn the SMC-R protocol without relying on a physical RoCE NIC.
>>>
>>> Hi Liu,
>>>
>>> sorry for taking quite some time to answer.
>>>
>>> Looking into this i cannot accept this series at the given point of
>>> time.
>>>
>>> FWIU, RXE is mainly for testing and development and i agree that it
>>> would be
>>> a nice thing to have for SMC-R.
>>> The problem is that there is no clean layer for different RoCE devices
>>> currently. Adding RXE to it works but isn't clean.
>>
>> Hi jan,
>>
>>> Also we have no way to do a "test" build which would have such a device
>>> supported and a "prod" build which would not support it.
>> > I don't quite understand what you mean here, Maybe I missed
>> something ?
>> IIUC, we can control whether to use RXE by simpling insmod or rmmod
>> rdma_rxe.ko
Hi,
Yes that enables RXE in general, but not the use of RXE in SMC.
>>
> Yes, in the "prod" environment, we can completely turn off CONFIG_RDMA_RXE.
Same as above + this is a compile time switch that is enabled for
distros like rh. Simply disabling it won't work here.
>
>> I believe having RXE support is beneficial for testing, especially in
>> simple physical networking setups where many corner cases are unlikely
>> to occur. By using RXE, we can easily configure unusual scenarios with
>> the existing iptables/netfilter infrastructure to simulate real-world
>> situations, such as packet dropping or network retransmission. This
>> approach can be advantageous for finding hidden bugs.
>>
> Yes, one of my main original intentions was to make testing smc-r
> easier. This change is relatively simple, mainly patch2 and patch4, and
> there are no logical changes.
I agree with you. It would be beneficial for testing.
This is not a never, this is a not right now.
If you want to push this forward as something you need now, feel free to
encapsulate it and introduce a vendor specific experimental option as
defined in the v2.1 protocol version [1] for it. This would be
compromise for me at the current time.
Thanks
- Jan
[1]
https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Shared%20Memory%20Communications%20Version%202.1%20Emulated-ISM_0.pdf
>> Best regards,
>> Dust
>>
>>
>>>
>>> Please give us time to investigate how to solve this in a neat way
>>> without
>>> building up to much technical debt.
>>>
>>> Thanks for your contribution and making us aware of this area of
>>> improvment.
>>> - Jan
>>>
>>>>
>>>> Liu Jian (4):
>>>> rdma/device: export ib_device_get_netdev()
>>>> net/smc: use ib_device_get_netdev() helper to get netdev info
>>>> net/smc: fix one NULL pointer dereference in
>>>> smc_ib_is_sg_need_sync()
>>>> RDMA/rxe: Set queue pair cur_qp_state when being queried
>>>>
>>>> drivers/infiniband/core/core_priv.h | 3 ---
>>>> drivers/infiniband/core/device.c | 1 +
>>>> drivers/infiniband/sw/rxe/rxe_verbs.c | 2 ++
>>>> include/rdma/ib_verbs.h | 2 ++
>>>> net/smc/smc_ib.c | 10 +++++-----
>>>> net/smc/smc_pnet.c | 6 +-----
>>>> 6 files changed, 11 insertions(+), 13 deletions(-)
>>>>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 0/4] Make SMC-R can work with rxe devices
2024-08-26 19:04 ` Jan Karcher
@ 2024-08-28 10:29 ` liujian (CE)
0 siblings, 0 replies; 20+ messages in thread
From: liujian (CE) @ 2024-08-28 10:29 UTC (permalink / raw)
To: Jan Karcher, dust.li, linux-rdma, linux-s390, netdev
Cc: jgg, leon, zyjzyj2000, wenjia, alibuda, tonylu, guwen, davem,
edumazet, kuba, pabeni
在 2024/8/27 3:04, Jan Karcher 写道:
>
>
> On 24/08/2024 12:04, liujian (CE) wrote:
>>
>>
>> 在 2024/8/21 9:03, Dust Li 写道:
>>> On 2024-08-20 15:16:57, Jan Karcher wrote:
>>>>
>>>>
>>>> On 09/08/2024 10:31, Liu Jian wrote:
>>>>> Make SMC-R can work with rxe devices. This allows us to easily test
>>>>> and
>>>>> learn the SMC-R protocol without relying on a physical RoCE NIC.
>>>>
>>>> Hi Liu,
>>>>
>>>> sorry for taking quite some time to answer.
>>>>
>>>> Looking into this i cannot accept this series at the given point of
>>>> time.
>>>>
>>>> FWIU, RXE is mainly for testing and development and i agree that it
>>>> would be
>>>> a nice thing to have for SMC-R.
>>>> The problem is that there is no clean layer for different RoCE devices
>>>> currently. Adding RXE to it works but isn't clean.
>>>
>>> Hi jan,
>>>
>>>> Also we have no way to do a "test" build which would have such a device
>>>> supported and a "prod" build which would not support it.
>>> > I don't quite understand what you mean here, Maybe I missed
>>> something ?
>>> IIUC, we can control whether to use RXE by simpling insmod or rmmod
>>> rdma_rxe.ko
>
> Hi,
>
> Yes that enables RXE in general, but not the use of RXE in SMC.
> >>>
>> Yes, in the "prod" environment, we can completely turn off
>> CONFIG_RDMA_RXE.
>
> Same as above + this is a compile time switch that is enabled for
> distros like rh. Simply disabling it won't work here.
>
Got it, I misunderstood what you meant above. Thank you.
>>
>>> I believe having RXE support is beneficial for testing, especially in
>>> simple physical networking setups where many corner cases are unlikely
>>> to occur. By using RXE, we can easily configure unusual scenarios with
>>> the existing iptables/netfilter infrastructure to simulate real-world
>>> situations, such as packet dropping or network retransmission. This
>>> approach can be advantageous for finding hidden bugs.
>>>
>> Yes, one of my main original intentions was to make testing smc-r
>> easier. This change is relatively simple, mainly patch2 and patch4,
>> and there are no logical changes.
>
> I agree with you. It would be beneficial for testing.
> This is not a never, this is a not right now.
>
> If you want to push this forward as something you need now, feel free to
> encapsulate it and introduce a vendor specific experimental option as
> defined in the v2.1 protocol version [1] for it. This would be
> compromise for me at the current time.
>
Got it, thanks.
> Thanks
> - Jan
>
> [1]
> https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Shared%20Memory%20Communications%20Version%202.1%20Emulated-ISM_0.pdf
> >>> Best regards,
>>> Dust
>>>
>>>
>>>>
>>>> Please give us time to investigate how to solve this in a neat way
>>>> without
>>>> building up to much technical debt.
>>>>
>>>> Thanks for your contribution and making us aware of this area of
>>>> improvment.
>>>> - Jan
>>>>
>>>>>
>>>>> Liu Jian (4):
>>>>> rdma/device: export ib_device_get_netdev()
>>>>> net/smc: use ib_device_get_netdev() helper to get netdev info
>>>>> net/smc: fix one NULL pointer dereference in
>>>>> smc_ib_is_sg_need_sync()
>>>>> RDMA/rxe: Set queue pair cur_qp_state when being queried
>>>>>
>>>>> drivers/infiniband/core/core_priv.h | 3 ---
>>>>> drivers/infiniband/core/device.c | 1 +
>>>>> drivers/infiniband/sw/rxe/rxe_verbs.c | 2 ++
>>>>> include/rdma/ib_verbs.h | 2 ++
>>>>> net/smc/smc_ib.c | 10 +++++-----
>>>>> net/smc/smc_pnet.c | 6 +-----
>>>>> 6 files changed, 11 insertions(+), 13 deletions(-)
>>>>>
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2024-08-28 10:29 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-09 8:31 [PATCH net-next 0/4] Make SMC-R can work with rxe devices Liu Jian
2024-08-09 8:31 ` [PATCH net-next 1/4] rdma/device: export ib_device_get_netdev() Liu Jian
2024-08-09 8:31 ` [PATCH net-next 2/4] net/smc: use ib_device_get_netdev() helper to get netdev info Liu Jian
2024-08-09 14:59 ` Dust Li
2024-08-12 2:07 ` liujian (CE)
2024-08-12 2:43 ` Dust Li
2024-08-09 8:31 ` [PATCH net-next 3/4] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync() Liu Jian
2024-08-09 10:01 ` Wen Gu
2024-08-09 14:59 ` Dust Li
2024-08-12 3:30 ` D. Wythe
2024-08-09 8:31 ` [PATCH net-next 4/4] RDMA/rxe: Set queue pair cur_qp_state when being queried Liu Jian
2024-08-09 11:06 ` Zhu Yanjun
2024-08-11 8:31 ` Leon Romanovsky
[not found] ` <bb0a5de8-f2d3-4276-ada4-f788f574b798@linux.dev>
2024-08-11 10:24 ` Zhu Yanjun
2024-08-11 6:05 ` [PATCH net-next 0/4] Make SMC-R can work with rxe devices Jan Karcher
2024-08-20 13:16 ` Jan Karcher
2024-08-21 1:03 ` Dust Li
2024-08-24 10:04 ` liujian (CE)
2024-08-26 19:04 ` Jan Karcher
2024-08-28 10:29 ` liujian (CE)
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).