* [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
@ 2025-08-28 12:41 Liu Jian
2025-08-28 13:04 ` Guangguan Wang
` (4 more replies)
0 siblings, 5 replies; 12+ messages in thread
From: Liu Jian @ 2025-08-28 12:41 UTC (permalink / raw)
To: alibuda, dust.li, sidraya, wenjia, mjambigi, tonylu, guwen, davem,
edumazet, kuba, pabeni, horms, guangguan.wang
Cc: linux-rdma, linux-s390, netdev
BUG: kernel NULL pointer dereference, address: 00000000000002ec
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP PTI
CPU: 28 UID: 0 PID: 343 Comm: kworker/28:1 Kdump: loaded Tainted: G OE 6.17.0-rc2+ #9 NONE
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
Workqueue: smc_hs_wq smc_listen_work [smc]
RIP: 0010:smc_ib_is_sg_need_sync+0x9e/0xd0 [smc]
...
Call Trace:
<TASK>
smcr_buf_map_link+0x211/0x2a0 [smc]
__smc_buf_create+0x522/0x970 [smc]
smc_buf_create+0x3a/0x110 [smc]
smc_find_rdma_v2_device_serv+0x18f/0x240 [smc]
? smc_vlan_by_tcpsk+0x7e/0xe0 [smc]
smc_listen_find_device+0x1dd/0x2b0 [smc]
smc_listen_work+0x30f/0x580 [smc]
process_one_work+0x18c/0x340
worker_thread+0x242/0x360
kthread+0xe7/0x220
ret_from_fork+0x13a/0x160
ret_from_fork_asm+0x1a/0x30
</TASK>
If the software RoCE device is used, ibdev->dma_device is a null pointer.
As a result, the problem occurs. Null pointer detection is added to
prevent problems.
Fixes: 0ef69e788411c ("net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu")
Signed-off-by: Liu Jian <liujian56@huawei.com>
---
v1->v2:
move the check outside of loop.
net/smc/smc_ib.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 53828833a3f7..a42ef3f77b96 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -742,6 +742,9 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
unsigned int i;
bool ret = false;
+ if (!lnk->smcibdev->ibdev->dma_device)
+ return ret;
+
/* for now there is just one DMA address */
for_each_sg(buf_slot->sgt[lnk->link_idx].sgl, sg,
buf_slot->sgt[lnk->link_idx].nents, i) {
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2025-08-28 12:41 [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync() Liu Jian
@ 2025-08-28 13:04 ` Guangguan Wang
2025-08-28 18:24 ` yanjun.zhu
` (3 subsequent siblings)
4 siblings, 0 replies; 12+ messages in thread
From: Guangguan Wang @ 2025-08-28 13:04 UTC (permalink / raw)
To: Liu Jian, alibuda, dust.li, sidraya, wenjia, mjambigi, tonylu,
guwen, davem, edumazet, kuba, pabeni, horms
Cc: linux-rdma, linux-s390, netdev
在 2025/8/28 20:41, Liu Jian 写道:
> BUG: kernel NULL pointer dereference, address: 00000000000002ec
> PGD 0 P4D 0
> Oops: Oops: 0000 [#1] SMP PTI
> CPU: 28 UID: 0 PID: 343 Comm: kworker/28:1 Kdump: loaded Tainted: G OE 6.17.0-rc2+ #9 NONE
> Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> Workqueue: smc_hs_wq smc_listen_work [smc]
> RIP: 0010:smc_ib_is_sg_need_sync+0x9e/0xd0 [smc]
> ...
> Call Trace:
> <TASK>
> smcr_buf_map_link+0x211/0x2a0 [smc]
> __smc_buf_create+0x522/0x970 [smc]
> smc_buf_create+0x3a/0x110 [smc]
> smc_find_rdma_v2_device_serv+0x18f/0x240 [smc]
> ? smc_vlan_by_tcpsk+0x7e/0xe0 [smc]
> smc_listen_find_device+0x1dd/0x2b0 [smc]
> smc_listen_work+0x30f/0x580 [smc]
> process_one_work+0x18c/0x340
> worker_thread+0x242/0x360
> kthread+0xe7/0x220
> ret_from_fork+0x13a/0x160
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
> If the software RoCE device is used, ibdev->dma_device is a null pointer.
> As a result, the problem occurs. Null pointer detection is added to
> prevent problems.
>
> Fixes: 0ef69e788411c ("net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu")
> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
> v1->v2:
> move the check outside of loop.
> net/smc/smc_ib.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 53828833a3f7..a42ef3f77b96 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -742,6 +742,9 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> unsigned int i;
> bool ret = false;
>
> + if (!lnk->smcibdev->ibdev->dma_device)
> + return ret;
> +
> /* for now there is just one DMA address */
> for_each_sg(buf_slot->sgt[lnk->link_idx].sgl, sg,
> buf_slot->sgt[lnk->link_idx].nents, i) {
LGTM.
Reviewed-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2025-08-28 12:41 [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync() Liu Jian
2025-08-28 13:04 ` Guangguan Wang
@ 2025-08-28 18:24 ` yanjun.zhu
2025-08-29 3:32 ` D. Wythe
` (2 subsequent siblings)
4 siblings, 0 replies; 12+ messages in thread
From: yanjun.zhu @ 2025-08-28 18:24 UTC (permalink / raw)
To: Liu Jian, alibuda, dust.li, sidraya, wenjia, mjambigi, tonylu,
guwen, davem, edumazet, kuba, pabeni, horms, guangguan.wang
Cc: linux-rdma, linux-s390, netdev
On 8/28/25 5:41 AM, Liu Jian wrote:
> BUG: kernel NULL pointer dereference, address: 00000000000002ec
> PGD 0 P4D 0
> Oops: Oops: 0000 [#1] SMP PTI
> CPU: 28 UID: 0 PID: 343 Comm: kworker/28:1 Kdump: loaded Tainted: G OE 6.17.0-rc2+ #9 NONE
> Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> Workqueue: smc_hs_wq smc_listen_work [smc]
> RIP: 0010:smc_ib_is_sg_need_sync+0x9e/0xd0 [smc]
> ...
> Call Trace:
> <TASK>
> smcr_buf_map_link+0x211/0x2a0 [smc]
> __smc_buf_create+0x522/0x970 [smc]
> smc_buf_create+0x3a/0x110 [smc]
> smc_find_rdma_v2_device_serv+0x18f/0x240 [smc]
> ? smc_vlan_by_tcpsk+0x7e/0xe0 [smc]
> smc_listen_find_device+0x1dd/0x2b0 [smc]
> smc_listen_work+0x30f/0x580 [smc]
> process_one_work+0x18c/0x340
> worker_thread+0x242/0x360
> kthread+0xe7/0x220
> ret_from_fork+0x13a/0x160
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
> If the software RoCE device is used, ibdev->dma_device is a null pointer.
> As a result, the problem occurs. Null pointer detection is added to
> prevent problems.
Normally, SoftRoCE relies on the DMA capabilities of the underlying NIC
device. In SMC, if DMA will play an important role in the subsequent
operations, we can leverage the NIC device’s DMA.
If DMA is not required for the upcoming actions, then this kind of
checking is sufficient.
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Zhu Yanjun
>
> Fixes: 0ef69e788411c ("net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu")
> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
> v1->v2:
> move the check outside of loop.
> net/smc/smc_ib.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 53828833a3f7..a42ef3f77b96 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -742,6 +742,9 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> unsigned int i;
> bool ret = false;
>
> + if (!lnk->smcibdev->ibdev->dma_device)
> + return ret;
> +
> /* for now there is just one DMA address */
> for_each_sg(buf_slot->sgt[lnk->link_idx].sgl, sg,
> buf_slot->sgt[lnk->link_idx].nents, i) {
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2025-08-28 12:41 [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync() Liu Jian
2025-08-28 13:04 ` Guangguan Wang
2025-08-28 18:24 ` yanjun.zhu
@ 2025-08-29 3:32 ` D. Wythe
2025-09-02 9:20 ` patchwork-bot+netdevbpf
2025-09-09 9:45 ` Leon Romanovsky
4 siblings, 0 replies; 12+ messages in thread
From: D. Wythe @ 2025-08-29 3:32 UTC (permalink / raw)
To: Liu Jian
Cc: alibuda, dust.li, sidraya, wenjia, mjambigi, tonylu, guwen, davem,
edumazet, kuba, pabeni, horms, guangguan.wang, linux-rdma,
linux-s390, netdev
On Thu, Aug 28, 2025 at 08:41:17PM +0800, Liu Jian wrote:
> BUG: kernel NULL pointer dereference, address: 00000000000002ec
> PGD 0 P4D 0
> Oops: Oops: 0000 [#1] SMP PTI
> CPU: 28 UID: 0 PID: 343 Comm: kworker/28:1 Kdump: loaded Tainted: G OE 6.17.0-rc2+ #9 NONE
> Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> Workqueue: smc_hs_wq smc_listen_work [smc]
> RIP: 0010:smc_ib_is_sg_need_sync+0x9e/0xd0 [smc]
> ...
> Call Trace:
> <TASK>
> smcr_buf_map_link+0x211/0x2a0 [smc]
> __smc_buf_create+0x522/0x970 [smc]
> smc_buf_create+0x3a/0x110 [smc]
> smc_find_rdma_v2_device_serv+0x18f/0x240 [smc]
> ? smc_vlan_by_tcpsk+0x7e/0xe0 [smc]
> smc_listen_find_device+0x1dd/0x2b0 [smc]
> smc_listen_work+0x30f/0x580 [smc]
> process_one_work+0x18c/0x340
> worker_thread+0x242/0x360
> kthread+0xe7/0x220
> ret_from_fork+0x13a/0x160
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
> If the software RoCE device is used, ibdev->dma_device is a null pointer.
> As a result, the problem occurs. Null pointer detection is added to
> prevent problems.
>
> Fixes: 0ef69e788411c ("net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu")
> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
> v1->v2:
> move the check outside of loop.
> net/smc/smc_ib.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 53828833a3f7..a42ef3f77b96 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -742,6 +742,9 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> unsigned int i;
> bool ret = false;
>
> + if (!lnk->smcibdev->ibdev->dma_device)
> + return ret;
> +
> /* for now there is just one DMA address */
> for_each_sg(buf_slot->sgt[lnk->link_idx].sgl, sg,
> buf_slot->sgt[lnk->link_idx].nents, i) {
LGTM, thanks.
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2025-08-28 12:41 [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync() Liu Jian
` (2 preceding siblings ...)
2025-08-29 3:32 ` D. Wythe
@ 2025-09-02 9:20 ` patchwork-bot+netdevbpf
2025-09-09 9:45 ` Leon Romanovsky
4 siblings, 0 replies; 12+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-09-02 9:20 UTC (permalink / raw)
To: Liu Jian
Cc: alibuda, dust.li, sidraya, wenjia, mjambigi, tonylu, guwen, davem,
edumazet, kuba, pabeni, horms, guangguan.wang, linux-rdma,
linux-s390, netdev
Hello:
This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Thu, 28 Aug 2025 20:41:17 +0800 you wrote:
> BUG: kernel NULL pointer dereference, address: 00000000000002ec
> PGD 0 P4D 0
> Oops: Oops: 0000 [#1] SMP PTI
> CPU: 28 UID: 0 PID: 343 Comm: kworker/28:1 Kdump: loaded Tainted: G OE 6.17.0-rc2+ #9 NONE
> Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> Workqueue: smc_hs_wq smc_listen_work [smc]
> RIP: 0010:smc_ib_is_sg_need_sync+0x9e/0xd0 [smc]
> ...
> Call Trace:
> <TASK>
> smcr_buf_map_link+0x211/0x2a0 [smc]
> __smc_buf_create+0x522/0x970 [smc]
> smc_buf_create+0x3a/0x110 [smc]
> smc_find_rdma_v2_device_serv+0x18f/0x240 [smc]
> ? smc_vlan_by_tcpsk+0x7e/0xe0 [smc]
> smc_listen_find_device+0x1dd/0x2b0 [smc]
> smc_listen_work+0x30f/0x580 [smc]
> process_one_work+0x18c/0x340
> worker_thread+0x242/0x360
> kthread+0xe7/0x220
> ret_from_fork+0x13a/0x160
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
> [...]
Here is the summary with links:
- [net,v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
https://git.kernel.org/netdev/net/c/ba1e9421cf1a
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2025-08-28 12:41 [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync() Liu Jian
` (3 preceding siblings ...)
2025-09-02 9:20 ` patchwork-bot+netdevbpf
@ 2025-09-09 9:45 ` Leon Romanovsky
2026-01-26 4:45 ` D. Wythe
4 siblings, 1 reply; 12+ messages in thread
From: Leon Romanovsky @ 2025-09-09 9:45 UTC (permalink / raw)
To: Liu Jian
Cc: alibuda, dust.li, sidraya, wenjia, mjambigi, tonylu, guwen, davem,
edumazet, kuba, pabeni, horms, guangguan.wang, linux-rdma,
linux-s390, netdev
On Thu, Aug 28, 2025 at 08:41:17PM +0800, Liu Jian wrote:
> BUG: kernel NULL pointer dereference, address: 00000000000002ec
> PGD 0 P4D 0
> Oops: Oops: 0000 [#1] SMP PTI
> CPU: 28 UID: 0 PID: 343 Comm: kworker/28:1 Kdump: loaded Tainted: G OE 6.17.0-rc2+ #9 NONE
> Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> Workqueue: smc_hs_wq smc_listen_work [smc]
> RIP: 0010:smc_ib_is_sg_need_sync+0x9e/0xd0 [smc]
> ...
> Call Trace:
> <TASK>
> smcr_buf_map_link+0x211/0x2a0 [smc]
> __smc_buf_create+0x522/0x970 [smc]
> smc_buf_create+0x3a/0x110 [smc]
> smc_find_rdma_v2_device_serv+0x18f/0x240 [smc]
> ? smc_vlan_by_tcpsk+0x7e/0xe0 [smc]
> smc_listen_find_device+0x1dd/0x2b0 [smc]
> smc_listen_work+0x30f/0x580 [smc]
> process_one_work+0x18c/0x340
> worker_thread+0x242/0x360
> kthread+0xe7/0x220
> ret_from_fork+0x13a/0x160
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
> If the software RoCE device is used, ibdev->dma_device is a null pointer.
> As a result, the problem occurs. Null pointer detection is added to
> prevent problems.
>
> Fixes: 0ef69e788411c ("net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu")
> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
> v1->v2:
> move the check outside of loop.
> net/smc/smc_ib.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 53828833a3f7..a42ef3f77b96 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -742,6 +742,9 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> unsigned int i;
> bool ret = false;
>
> + if (!lnk->smcibdev->ibdev->dma_device)
> + return ret;
Please use ib_uses_virt_dma() function for that.
It is clearly stated in the code:
2784 struct ib_device {
2785 /* Do not access @dma_device directly from ULP nor from HW drivers. */
2786 struct device *dma_device;
Thanks
> +
> /* for now there is just one DMA address */
> for_each_sg(buf_slot->sgt[lnk->link_idx].sgl, sg,
> buf_slot->sgt[lnk->link_idx].nents, i) {
> --
> 2.34.1
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2025-09-09 9:45 ` Leon Romanovsky
@ 2026-01-26 4:45 ` D. Wythe
2026-01-27 12:00 ` Leon Romanovsky
0 siblings, 1 reply; 12+ messages in thread
From: D. Wythe @ 2026-01-26 4:45 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Liu Jian, alibuda, dust.li, sidraya, wenjia, mjambigi, tonylu,
guwen, davem, edumazet, kuba, pabeni, horms, guangguan.wang,
linux-rdma, linux-s390, netdev
On Tue, Sep 09, 2025 at 12:45:32PM +0300, Leon Romanovsky wrote:
> On Thu, Aug 28, 2025 at 08:41:17PM +0800, Liu Jian wrote:
> > BUG: kernel NULL pointer dereference, address: 00000000000002ec
> > PGD 0 P4D 0
> > Oops: Oops: 0000 [#1] SMP PTI
> > CPU: 28 UID: 0 PID: 343 Comm: kworker/28:1 Kdump: loaded Tainted: G OE 6.17.0-rc2+ #9 NONE
> > Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> > Workqueue: smc_hs_wq smc_listen_work [smc]
> > RIP: 0010:smc_ib_is_sg_need_sync+0x9e/0xd0 [smc]
> > ...
> > Call Trace:
> > <TASK>
> > smcr_buf_map_link+0x211/0x2a0 [smc]
> > __smc_buf_create+0x522/0x970 [smc]
> > smc_buf_create+0x3a/0x110 [smc]
> > smc_find_rdma_v2_device_serv+0x18f/0x240 [smc]
> > ? smc_vlan_by_tcpsk+0x7e/0xe0 [smc]
> > smc_listen_find_device+0x1dd/0x2b0 [smc]
> > smc_listen_work+0x30f/0x580 [smc]
> > process_one_work+0x18c/0x340
> > worker_thread+0x242/0x360
> > kthread+0xe7/0x220
> > ret_from_fork+0x13a/0x160
> > ret_from_fork_asm+0x1a/0x30
> > </TASK>
> >
> > If the software RoCE device is used, ibdev->dma_device is a null pointer.
> > As a result, the problem occurs. Null pointer detection is added to
> > prevent problems.
> >
> > Fixes: 0ef69e788411c ("net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu")
> > Signed-off-by: Liu Jian <liujian56@huawei.com>
> > ---
> > v1->v2:
> > move the check outside of loop.
> > net/smc/smc_ib.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> > index 53828833a3f7..a42ef3f77b96 100644
> > --- a/net/smc/smc_ib.c
> > +++ b/net/smc/smc_ib.c
> > @@ -742,6 +742,9 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> > unsigned int i;
> > bool ret = false;
> >
> > + if (!lnk->smcibdev->ibdev->dma_device)
> > + return ret;
>
> Please use ib_uses_virt_dma() function for that.
>
> It is clearly stated in the code:
> 2784 struct ib_device {
> 2785 /* Do not access @dma_device directly from ULP nor from HW drivers. */
> 2786 struct device *dma_device;
>
> Thanks
>
>
Hi Leon,
Sorry for the late reply, I just noticed this and thank you for your review.
I agree completely with your feedback: we should not be accessing ibdev->dma_device
directly. Following your advice, replacing the
if (!lnk->smcibdev->ibdev->dma_device)
check with ib_uses_virt_dma() is straightforward and absolutely the correct
thing to do for that part of the logic.
However, this has led me to a further challenge. The main purpose of the
smc_ib_is_sg_need_sync() function is to get the result of dma_need_sync().
This means that even after correctly using ib_uses_virt_dma(), the function
still needs to call dma_need_sync(ibdev->dma_device, ...), which forces us
right back into the direct access pattern we are trying to eliminate.
Since the RDMA doesn't currently offer a helper for this "check" functionality,
I'd like to propose adding one. What are your thoughts on a new helper like
the one below, which would allow us to solve this cleanly?
/* ib_verbs.h */
static inline bool ib_dma_need_sync(struct ib_device *dev, dma_addr_t dma_addr) {
return !ib_uses_virt_dma(dev) && dma_need_sync(dev->dma_device, dma_addr);
}
Best wishes,
D. Wythe
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2026-01-26 4:45 ` D. Wythe
@ 2026-01-27 12:00 ` Leon Romanovsky
2026-01-28 5:50 ` D. Wythe
0 siblings, 1 reply; 12+ messages in thread
From: Leon Romanovsky @ 2026-01-27 12:00 UTC (permalink / raw)
To: D. Wythe
Cc: Liu Jian, dust.li, sidraya, wenjia, mjambigi, tonylu, guwen,
davem, edumazet, kuba, pabeni, horms, guangguan.wang, linux-rdma,
linux-s390, netdev, Christoph Hellwig, Jason Gunthorpe
On Mon, Jan 26, 2026 at 12:45:01PM +0800, D. Wythe wrote:
> On Tue, Sep 09, 2025 at 12:45:32PM +0300, Leon Romanovsky wrote:
> > On Thu, Aug 28, 2025 at 08:41:17PM +0800, Liu Jian wrote:
> > > BUG: kernel NULL pointer dereference, address: 00000000000002ec
> > > PGD 0 P4D 0
> > > Oops: Oops: 0000 [#1] SMP PTI
> > > CPU: 28 UID: 0 PID: 343 Comm: kworker/28:1 Kdump: loaded Tainted: G OE 6.17.0-rc2+ #9 NONE
> > > Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> > > Workqueue: smc_hs_wq smc_listen_work [smc]
> > > RIP: 0010:smc_ib_is_sg_need_sync+0x9e/0xd0 [smc]
> > > ...
> > > Call Trace:
> > > <TASK>
> > > smcr_buf_map_link+0x211/0x2a0 [smc]
> > > __smc_buf_create+0x522/0x970 [smc]
> > > smc_buf_create+0x3a/0x110 [smc]
> > > smc_find_rdma_v2_device_serv+0x18f/0x240 [smc]
> > > ? smc_vlan_by_tcpsk+0x7e/0xe0 [smc]
> > > smc_listen_find_device+0x1dd/0x2b0 [smc]
> > > smc_listen_work+0x30f/0x580 [smc]
> > > process_one_work+0x18c/0x340
> > > worker_thread+0x242/0x360
> > > kthread+0xe7/0x220
> > > ret_from_fork+0x13a/0x160
> > > ret_from_fork_asm+0x1a/0x30
> > > </TASK>
> > >
> > > If the software RoCE device is used, ibdev->dma_device is a null pointer.
> > > As a result, the problem occurs. Null pointer detection is added to
> > > prevent problems.
> > >
> > > Fixes: 0ef69e788411c ("net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu")
> > > Signed-off-by: Liu Jian <liujian56@huawei.com>
> > > ---
> > > v1->v2:
> > > move the check outside of loop.
> > > net/smc/smc_ib.c | 3 +++
> > > 1 file changed, 3 insertions(+)
> > >
> > > diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> > > index 53828833a3f7..a42ef3f77b96 100644
> > > --- a/net/smc/smc_ib.c
> > > +++ b/net/smc/smc_ib.c
> > > @@ -742,6 +742,9 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> > > unsigned int i;
> > > bool ret = false;
> > >
> > > + if (!lnk->smcibdev->ibdev->dma_device)
> > > + return ret;
> >
> > Please use ib_uses_virt_dma() function for that.
> >
> > It is clearly stated in the code:
> > 2784 struct ib_device {
> > 2785 /* Do not access @dma_device directly from ULP nor from HW drivers. */
> > 2786 struct device *dma_device;
> >
> > Thanks
> >
> >
>
> Hi Leon,
>
> Sorry for the late reply, I just noticed this and thank you for your review.
> I agree completely with your feedback: we should not be accessing ibdev->dma_device
> directly. Following your advice, replacing the
>
> if (!lnk->smcibdev->ibdev->dma_device)
>
> check with ib_uses_virt_dma() is straightforward and absolutely the correct
> thing to do for that part of the logic.
>
> However, this has led me to a further challenge. The main purpose of the
> smc_ib_is_sg_need_sync() function is to get the result of dma_need_sync().
> This means that even after correctly using ib_uses_virt_dma(), the function
> still needs to call dma_need_sync(ibdev->dma_device, ...), which forces us
> right back into the direct access pattern we are trying to eliminate.
I would like to challenge the use of dma_need_sync() in smc_ib.c. From
what I see, it is used to avoid calls to dma_sync_single_*_device()
which will be NOP anyway in that case.
Why did you copy that logic from XSK?
>
> Since the RDMA doesn't currently offer a helper for this "check" functionality,
> I'd like to propose adding one. What are your thoughts on a new helper like
> the one below, which would allow us to solve this cleanly?
>
> /* ib_verbs.h */
> static inline bool ib_dma_need_sync(struct ib_device *dev, dma_addr_t dma_addr) {
> return !ib_uses_virt_dma(dev) && dma_need_sync(dev->dma_device, dma_addr);
> }
If we're discussing wrappers, it's likely better to provide a wrapper around
dma_sync_sg_for_cpu() for use in ib_dma_sync_sg_for_cpu(), rather than
open‑coding the logic.
Thanks
>
> Best wishes,
> D. Wythe
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2026-01-27 12:00 ` Leon Romanovsky
@ 2026-01-28 5:50 ` D. Wythe
2026-01-28 11:21 ` Leon Romanovsky
0 siblings, 1 reply; 12+ messages in thread
From: D. Wythe @ 2026-01-28 5:50 UTC (permalink / raw)
To: Leon Romanovsky
Cc: D. Wythe, Liu Jian, dust.li, sidraya, wenjia, mjambigi, tonylu,
guwen, davem, edumazet, kuba, pabeni, horms, guangguan.wang,
linux-rdma, linux-s390, netdev, Christoph Hellwig,
Jason Gunthorpe
On Tue, Jan 27, 2026 at 02:00:04PM +0200, Leon Romanovsky wrote:
> On Mon, Jan 26, 2026 at 12:45:01PM +0800, D. Wythe wrote:
> > On Tue, Sep 09, 2025 at 12:45:32PM +0300, Leon Romanovsky wrote:
> > > On Thu, Aug 28, 2025 at 08:41:17PM +0800, Liu Jian wrote:
> > > > BUG: kernel NULL pointer dereference, address: 00000000000002ec
> > > > PGD 0 P4D 0
> > > > Oops: Oops: 0000 [#1] SMP PTI
> > > > CPU: 28 UID: 0 PID: 343 Comm: kworker/28:1 Kdump: loaded Tainted: G OE 6.17.0-rc2+ #9 NONE
> > > > Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> > > > Workqueue: smc_hs_wq smc_listen_work [smc]
> > > > RIP: 0010:smc_ib_is_sg_need_sync+0x9e/0xd0 [smc]
> > > > ...
> > > > Call Trace:
> > > > <TASK>
> > > > smcr_buf_map_link+0x211/0x2a0 [smc]
> > > > __smc_buf_create+0x522/0x970 [smc]
> > > > smc_buf_create+0x3a/0x110 [smc]
> > > > smc_find_rdma_v2_device_serv+0x18f/0x240 [smc]
> > > > ? smc_vlan_by_tcpsk+0x7e/0xe0 [smc]
> > > > smc_listen_find_device+0x1dd/0x2b0 [smc]
> > > > smc_listen_work+0x30f/0x580 [smc]
> > > > process_one_work+0x18c/0x340
> > > > worker_thread+0x242/0x360
> > > > kthread+0xe7/0x220
> > > > ret_from_fork+0x13a/0x160
> > > > ret_from_fork_asm+0x1a/0x30
> > > > </TASK>
> > > >
> > > > If the software RoCE device is used, ibdev->dma_device is a null pointer.
> > > > As a result, the problem occurs. Null pointer detection is added to
> > > > prevent problems.
> > > >
> > > > Fixes: 0ef69e788411c ("net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu")
> > > > Signed-off-by: Liu Jian <liujian56@huawei.com>
> > > > ---
> > > > v1->v2:
> > > > move the check outside of loop.
> > > > net/smc/smc_ib.c | 3 +++
> > > > 1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> > > > index 53828833a3f7..a42ef3f77b96 100644
> > > > --- a/net/smc/smc_ib.c
> > > > +++ b/net/smc/smc_ib.c
> > > > @@ -742,6 +742,9 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> > > > unsigned int i;
> > > > bool ret = false;
> > > >
> > > > + if (!lnk->smcibdev->ibdev->dma_device)
> > > > + return ret;
> > >
> > > Please use ib_uses_virt_dma() function for that.
> > >
> > > It is clearly stated in the code:
> > > 2784 struct ib_device {
> > > 2785 /* Do not access @dma_device directly from ULP nor from HW drivers. */
> > > 2786 struct device *dma_device;
> > >
> > > Thanks
> > >
> > >
> >
> > Hi Leon,
> >
> > Sorry for the late reply, I just noticed this and thank you for your review.
> > I agree completely with your feedback: we should not be accessing ibdev->dma_device
> > directly. Following your advice, replacing the
> >
> > if (!lnk->smcibdev->ibdev->dma_device)
> >
> > check with ib_uses_virt_dma() is straightforward and absolutely the correct
> > thing to do for that part of the logic.
> >
> > However, this has led me to a further challenge. The main purpose of the
> > smc_ib_is_sg_need_sync() function is to get the result of dma_need_sync().
> > This means that even after correctly using ib_uses_virt_dma(), the function
> > still needs to call dma_need_sync(ibdev->dma_device, ...), which forces us
> > right back into the direct access pattern we are trying to eliminate.
>
> I would like to challenge the use of dma_need_sync() in smc_ib.c. From
> what I see, it is used to avoid calls to dma_sync_single_*_device()
> which will be NOP anyway in that case.
>
> Why did you copy that logic from XSK?
>
You are right. I just noticed that the DMA has already introduced a
similar optimization.
The check in SMC was added back in 2022, while the DMA introduced
the internal "skip redundant sync" mechanism in 2024 (commit f406c8e4b770).
Given this, it might be better to simply remove this redundant check
from SMC now, which would also eliminate the need for direct access to
ibdev->dma_device. I need to perform some tests to confirm this.
Thanks for your feedback.
D. Wythe
> >
> > Since the RDMA doesn't currently offer a helper for this "check" functionality,
> > I'd like to propose adding one. What are your thoughts on a new helper like
> > the one below, which would allow us to solve this cleanly?
> >
> > /* ib_verbs.h */
> > static inline bool ib_dma_need_sync(struct ib_device *dev, dma_addr_t dma_addr) {
> > return !ib_uses_virt_dma(dev) && dma_need_sync(dev->dma_device, dma_addr);
> > }
>
> If we're discussing wrappers, it's likely better to provide a wrapper around
> dma_sync_sg_for_cpu() for use in ib_dma_sync_sg_for_cpu(), rather than
> open‑coding the logic.
>
> Thanks
>
> >
> > Best wishes,
> > D. Wythe
> >
> >
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2026-01-28 5:50 ` D. Wythe
@ 2026-01-28 11:21 ` Leon Romanovsky
2026-01-28 12:50 ` D. Wythe
2026-01-30 9:18 ` D. Wythe
0 siblings, 2 replies; 12+ messages in thread
From: Leon Romanovsky @ 2026-01-28 11:21 UTC (permalink / raw)
To: D. Wythe
Cc: Liu Jian, dust.li, sidraya, wenjia, mjambigi, tonylu, guwen,
davem, edumazet, kuba, pabeni, horms, guangguan.wang, linux-rdma,
linux-s390, netdev, Christoph Hellwig, Jason Gunthorpe
On Wed, Jan 28, 2026 at 01:50:51PM +0800, D. Wythe wrote:
> On Tue, Jan 27, 2026 at 02:00:04PM +0200, Leon Romanovsky wrote:
> > On Mon, Jan 26, 2026 at 12:45:01PM +0800, D. Wythe wrote:
> > > On Tue, Sep 09, 2025 at 12:45:32PM +0300, Leon Romanovsky wrote:
> > > > On Thu, Aug 28, 2025 at 08:41:17PM +0800, Liu Jian wrote:
> > > > > BUG: kernel NULL pointer dereference, address: 00000000000002ec
> > > > > PGD 0 P4D 0
> > > > > Oops: Oops: 0000 [#1] SMP PTI
> > > > > CPU: 28 UID: 0 PID: 343 Comm: kworker/28:1 Kdump: loaded Tainted: G OE 6.17.0-rc2+ #9 NONE
> > > > > Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> > > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> > > > > Workqueue: smc_hs_wq smc_listen_work [smc]
> > > > > RIP: 0010:smc_ib_is_sg_need_sync+0x9e/0xd0 [smc]
> > > > > ...
> > > > > Call Trace:
> > > > > <TASK>
> > > > > smcr_buf_map_link+0x211/0x2a0 [smc]
> > > > > __smc_buf_create+0x522/0x970 [smc]
> > > > > smc_buf_create+0x3a/0x110 [smc]
> > > > > smc_find_rdma_v2_device_serv+0x18f/0x240 [smc]
> > > > > ? smc_vlan_by_tcpsk+0x7e/0xe0 [smc]
> > > > > smc_listen_find_device+0x1dd/0x2b0 [smc]
> > > > > smc_listen_work+0x30f/0x580 [smc]
> > > > > process_one_work+0x18c/0x340
> > > > > worker_thread+0x242/0x360
> > > > > kthread+0xe7/0x220
> > > > > ret_from_fork+0x13a/0x160
> > > > > ret_from_fork_asm+0x1a/0x30
> > > > > </TASK>
> > > > >
> > > > > If the software RoCE device is used, ibdev->dma_device is a null pointer.
> > > > > As a result, the problem occurs. Null pointer detection is added to
> > > > > prevent problems.
> > > > >
> > > > > Fixes: 0ef69e788411c ("net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu")
> > > > > Signed-off-by: Liu Jian <liujian56@huawei.com>
> > > > > ---
> > > > > v1->v2:
> > > > > move the check outside of loop.
> > > > > net/smc/smc_ib.c | 3 +++
> > > > > 1 file changed, 3 insertions(+)
> > > > >
> > > > > diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> > > > > index 53828833a3f7..a42ef3f77b96 100644
> > > > > --- a/net/smc/smc_ib.c
> > > > > +++ b/net/smc/smc_ib.c
> > > > > @@ -742,6 +742,9 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> > > > > unsigned int i;
> > > > > bool ret = false;
> > > > >
> > > > > + if (!lnk->smcibdev->ibdev->dma_device)
> > > > > + return ret;
> > > >
> > > > Please use ib_uses_virt_dma() function for that.
> > > >
> > > > It is clearly stated in the code:
> > > > 2784 struct ib_device {
> > > > 2785 /* Do not access @dma_device directly from ULP nor from HW drivers. */
> > > > 2786 struct device *dma_device;
> > > >
> > > > Thanks
> > > >
> > > >
> > >
> > > Hi Leon,
> > >
> > > Sorry for the late reply, I just noticed this and thank you for your review.
> > > I agree completely with your feedback: we should not be accessing ibdev->dma_device
> > > directly. Following your advice, replacing the
> > >
> > > if (!lnk->smcibdev->ibdev->dma_device)
> > >
> > > check with ib_uses_virt_dma() is straightforward and absolutely the correct
> > > thing to do for that part of the logic.
> > >
> > > However, this has led me to a further challenge. The main purpose of the
> > > smc_ib_is_sg_need_sync() function is to get the result of dma_need_sync().
> > > This means that even after correctly using ib_uses_virt_dma(), the function
> > > still needs to call dma_need_sync(ibdev->dma_device, ...), which forces us
> > > right back into the direct access pattern we are trying to eliminate.
> >
> > I would like to challenge the use of dma_need_sync() in smc_ib.c. From
> > what I see, it is used to avoid calls to dma_sync_single_*_device()
> > which will be NOP anyway in that case.
> >
> > Why did you copy that logic from XSK?
> >
>
> You are right. I just noticed that the DMA has already introduced a
> similar optimization.
>
> The check in SMC was added back in 2022, while the DMA introduced
> the internal "skip redundant sync" mechanism in 2024 (commit f406c8e4b770).
>
> Given this, it might be better to simply remove this redundant check
> from SMC now, which would also eliminate the need for direct access to
> ibdev->dma_device. I need to perform some tests to confirm this.
It may also be worth looking at this series from Chuck, which reuses the
recently introduced DMA API to remove the SG conversions:
https://lore.kernel.org/linux-rdma/20260128005400.25147-1-cel@kernel.org/
You might be able to apply a similar approach and drop SG from your
datapath entirely.
Thanks.
>
> Thanks for your feedback.
>
> D. Wythe
>
> > >
> > > Since the RDMA doesn't currently offer a helper for this "check" functionality,
> > > I'd like to propose adding one. What are your thoughts on a new helper like
> > > the one below, which would allow us to solve this cleanly?
> > >
> > > /* ib_verbs.h */
> > > static inline bool ib_dma_need_sync(struct ib_device *dev, dma_addr_t dma_addr) {
> > > return !ib_uses_virt_dma(dev) && dma_need_sync(dev->dma_device, dma_addr);
> > > }
> >
> > If we're discussing wrappers, it's likely better to provide a wrapper around
> > dma_sync_sg_for_cpu() for use in ib_dma_sync_sg_for_cpu(), rather than
> > open‑coding the logic.
> >
> > Thanks
> >
> > >
> > > Best wishes,
> > > D. Wythe
> > >
> > >
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2026-01-28 11:21 ` Leon Romanovsky
@ 2026-01-28 12:50 ` D. Wythe
2026-01-30 9:18 ` D. Wythe
1 sibling, 0 replies; 12+ messages in thread
From: D. Wythe @ 2026-01-28 12:50 UTC (permalink / raw)
To: Leon Romanovsky
Cc: D. Wythe, Liu Jian, dust.li, sidraya, wenjia, mjambigi, tonylu,
guwen, davem, edumazet, kuba, pabeni, horms, guangguan.wang,
linux-rdma, linux-s390, netdev, Christoph Hellwig,
Jason Gunthorpe
On Wed, Jan 28, 2026 at 01:21:22PM +0200, Leon Romanovsky wrote:
> On Wed, Jan 28, 2026 at 01:50:51PM +0800, D. Wythe wrote:
> > On Tue, Jan 27, 2026 at 02:00:04PM +0200, Leon Romanovsky wrote:
> > > On Mon, Jan 26, 2026 at 12:45:01PM +0800, D. Wythe wrote:
> > > > On Tue, Sep 09, 2025 at 12:45:32PM +0300, Leon Romanovsky wrote:
> > > > > On Thu, Aug 28, 2025 at 08:41:17PM +0800, Liu Jian wrote:
> > > > > > BUG: kernel NULL pointer dereference, address: 00000000000002ec
> > > > > > PGD 0 P4D 0
> > > > > > Oops: Oops: 0000 [#1] SMP PTI
> > > > > > CPU: 28 UID: 0 PID: 343 Comm: kworker/28:1 Kdump: loaded Tainted: G OE 6.17.0-rc2+ #9 NONE
> > > > > > Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> > > > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> > > > > > Workqueue: smc_hs_wq smc_listen_work [smc]
> > > > > > RIP: 0010:smc_ib_is_sg_need_sync+0x9e/0xd0 [smc]
> > > > > > ...
> > > > > > Call Trace:
> > > > > > <TASK>
> > > > > > smcr_buf_map_link+0x211/0x2a0 [smc]
> > > > > > __smc_buf_create+0x522/0x970 [smc]
> > > > > > smc_buf_create+0x3a/0x110 [smc]
> > > > > > smc_find_rdma_v2_device_serv+0x18f/0x240 [smc]
> > > > > > ? smc_vlan_by_tcpsk+0x7e/0xe0 [smc]
> > > > > > smc_listen_find_device+0x1dd/0x2b0 [smc]
> > > > > > smc_listen_work+0x30f/0x580 [smc]
> > > > > > process_one_work+0x18c/0x340
> > > > > > worker_thread+0x242/0x360
> > > > > > kthread+0xe7/0x220
> > > > > > ret_from_fork+0x13a/0x160
> > > > > > ret_from_fork_asm+0x1a/0x30
> > > > > > </TASK>
> > > > > >
> > > > > > If the software RoCE device is used, ibdev->dma_device is a null pointer.
> > > > > > As a result, the problem occurs. Null pointer detection is added to
> > > > > > prevent problems.
> > > > > >
> > > > > > Fixes: 0ef69e788411c ("net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu")
> > > > > > Signed-off-by: Liu Jian <liujian56@huawei.com>
> > > > > > ---
> > > > > > v1->v2:
> > > > > > move the check outside of loop.
> > > > > > net/smc/smc_ib.c | 3 +++
> > > > > > 1 file changed, 3 insertions(+)
> > > > > >
> > > > > > diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> > > > > > index 53828833a3f7..a42ef3f77b96 100644
> > > > > > --- a/net/smc/smc_ib.c
> > > > > > +++ b/net/smc/smc_ib.c
> > > > > > @@ -742,6 +742,9 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> > > > > > unsigned int i;
> > > > > > bool ret = false;
> > > > > >
> > > > > > + if (!lnk->smcibdev->ibdev->dma_device)
> > > > > > + return ret;
> > > > >
> > > > > Please use ib_uses_virt_dma() function for that.
> > > > >
> > > > > It is clearly stated in the code:
> > > > > 2784 struct ib_device {
> > > > > 2785 /* Do not access @dma_device directly from ULP nor from HW drivers. */
> > > > > 2786 struct device *dma_device;
> > > > >
> > > > > Thanks
> > > > >
> > > > >
> > > >
> > > > Hi Leon,
> > > >
> > > > Sorry for the late reply, I just noticed this and thank you for your review.
> > > > I agree completely with your feedback: we should not be accessing ibdev->dma_device
> > > > directly. Following your advice, replacing the
> > > >
> > > > if (!lnk->smcibdev->ibdev->dma_device)
> > > >
> > > > check with ib_uses_virt_dma() is straightforward and absolutely the correct
> > > > thing to do for that part of the logic.
> > > >
> > > > However, this has led me to a further challenge. The main purpose of the
> > > > smc_ib_is_sg_need_sync() function is to get the result of dma_need_sync().
> > > > This means that even after correctly using ib_uses_virt_dma(), the function
> > > > still needs to call dma_need_sync(ibdev->dma_device, ...), which forces us
> > > > right back into the direct access pattern we are trying to eliminate.
> > >
> > > I would like to challenge the use of dma_need_sync() in smc_ib.c. From
> > > what I see, it is used to avoid calls to dma_sync_single_*_device()
> > > which will be NOP anyway in that case.
> > >
> > > Why did you copy that logic from XSK?
> > >
> >
> > You are right. I just noticed that the DMA has already introduced a
> > similar optimization.
> >
> > The check in SMC was added back in 2022, while the DMA introduced
> > the internal "skip redundant sync" mechanism in 2024 (commit f406c8e4b770).
> >
> > Given this, it might be better to simply remove this redundant check
> > from SMC now, which would also eliminate the need for direct access to
> > ibdev->dma_device. I need to perform some tests to confirm this.
>
> It may also be worth looking at this series from Chuck, which reuses the
> recently introduced DMA API to remove the SG conversions:
> https://lore.kernel.org/linux-rdma/20260128005400.25147-1-cel@kernel.org/
>
> You might be able to apply a similar approach and drop SG from your
> datapath entirely.
>
> Thanks.
>
Thanks for the pointer. I'll take some time to look into this series and
see if we can adopt a similar approach.
> >
> > Thanks for your feedback.
> >
> > D. Wythe
> >
> > > >
> > > > Since the RDMA doesn't currently offer a helper for this "check" functionality,
> > > > I'd like to propose adding one. What are your thoughts on a new helper like
> > > > the one below, which would allow us to solve this cleanly?
> > > >
> > > > /* ib_verbs.h */
> > > > static inline bool ib_dma_need_sync(struct ib_device *dev, dma_addr_t dma_addr) {
> > > > return !ib_uses_virt_dma(dev) && dma_need_sync(dev->dma_device, dma_addr);
> > > > }
> > >
> > > If we're discussing wrappers, it's likely better to provide a wrapper around
> > > dma_sync_sg_for_cpu() for use in ib_dma_sync_sg_for_cpu(), rather than
> > > open‑coding the logic.
> > >
> > > Thanks
> > >
> > > >
> > > > Best wishes,
> > > > D. Wythe
> > > >
> > > >
> >
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync()
2026-01-28 11:21 ` Leon Romanovsky
2026-01-28 12:50 ` D. Wythe
@ 2026-01-30 9:18 ` D. Wythe
1 sibling, 0 replies; 12+ messages in thread
From: D. Wythe @ 2026-01-30 9:18 UTC (permalink / raw)
To: Leon Romanovsky
Cc: D. Wythe, Liu Jian, dust.li, sidraya, wenjia, mjambigi, tonylu,
guwen, davem, edumazet, kuba, pabeni, horms, guangguan.wang,
linux-rdma, linux-s390, netdev, Christoph Hellwig,
Jason Gunthorpe
On Wed, Jan 28, 2026 at 01:21:22PM +0200, Leon Romanovsky wrote:
> On Wed, Jan 28, 2026 at 01:50:51PM +0800, D. Wythe wrote:
> > On Tue, Jan 27, 2026 at 02:00:04PM +0200, Leon Romanovsky wrote:
> > > On Mon, Jan 26, 2026 at 12:45:01PM +0800, D. Wythe wrote:
> > > > On Tue, Sep 09, 2025 at 12:45:32PM +0300, Leon Romanovsky wrote:
> > > > > On Thu, Aug 28, 2025 at 08:41:17PM +0800, Liu Jian wrote:
> > > > > > BUG: kernel NULL pointer dereference, address: 00000000000002ec
> > > > > > PGD 0 P4D 0
> > > > > > Oops: Oops: 0000 [#1] SMP PTI
> > > > > > CPU: 28 UID: 0 PID: 343 Comm: kworker/28:1 Kdump: loaded Tainted: G OE 6.17.0-rc2+ #9 NONE
> > > > > > Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> > > > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> > > > > > Workqueue: smc_hs_wq smc_listen_work [smc]
> > > > > > RIP: 0010:smc_ib_is_sg_need_sync+0x9e/0xd0 [smc]
> > > > > > ...
> > > > > > Call Trace:
> > > > > > <TASK>
> > > > > > smcr_buf_map_link+0x211/0x2a0 [smc]
> > > > > > __smc_buf_create+0x522/0x970 [smc]
> > > > > > smc_buf_create+0x3a/0x110 [smc]
> > > > > > smc_find_rdma_v2_device_serv+0x18f/0x240 [smc]
> > > > > > ? smc_vlan_by_tcpsk+0x7e/0xe0 [smc]
> > > > > > smc_listen_find_device+0x1dd/0x2b0 [smc]
> > > > > > smc_listen_work+0x30f/0x580 [smc]
> > > > > > process_one_work+0x18c/0x340
> > > > > > worker_thread+0x242/0x360
> > > > > > kthread+0xe7/0x220
> > > > > > ret_from_fork+0x13a/0x160
> > > > > > ret_from_fork_asm+0x1a/0x30
> > > > > > </TASK>
> > > > > >
> > > > > > If the software RoCE device is used, ibdev->dma_device is a null pointer.
> > > > > > As a result, the problem occurs. Null pointer detection is added to
> > > > > > prevent problems.
> > > > > >
> > > > > > Fixes: 0ef69e788411c ("net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu")
> > > > > > Signed-off-by: Liu Jian <liujian56@huawei.com>
> > > > > > ---
> > > > > > v1->v2:
> > > > > > move the check outside of loop.
> > > > > > net/smc/smc_ib.c | 3 +++
> > > > > > 1 file changed, 3 insertions(+)
> > > > > >
> > > > > > diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> > > > > > index 53828833a3f7..a42ef3f77b96 100644
> > > > > > --- a/net/smc/smc_ib.c
> > > > > > +++ b/net/smc/smc_ib.c
> > > > > > @@ -742,6 +742,9 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> > > > > > unsigned int i;
> > > > > > bool ret = false;
> > > > > >
> > > > > > + if (!lnk->smcibdev->ibdev->dma_device)
> > > > > > + return ret;
> > > > >
> > > > > Please use ib_uses_virt_dma() function for that.
> > > > >
> > > > > It is clearly stated in the code:
> > > > > 2784 struct ib_device {
> > > > > 2785 /* Do not access @dma_device directly from ULP nor from HW drivers. */
> > > > > 2786 struct device *dma_device;
> > > > >
> > > > > Thanks
> > > > >
> > > > >
> > > >
> > > > Hi Leon,
> > > >
> > > > Sorry for the late reply, I just noticed this and thank you for your review.
> > > > I agree completely with your feedback: we should not be accessing ibdev->dma_device
> > > > directly. Following your advice, replacing the
> > > >
> > > > if (!lnk->smcibdev->ibdev->dma_device)
> > > >
> > > > check with ib_uses_virt_dma() is straightforward and absolutely the correct
> > > > thing to do for that part of the logic.
> > > >
> > > > However, this has led me to a further challenge. The main purpose of the
> > > > smc_ib_is_sg_need_sync() function is to get the result of dma_need_sync().
> > > > This means that even after correctly using ib_uses_virt_dma(), the function
> > > > still needs to call dma_need_sync(ibdev->dma_device, ...), which forces us
> > > > right back into the direct access pattern we are trying to eliminate.
> > >
> > > I would like to challenge the use of dma_need_sync() in smc_ib.c. From
> > > what I see, it is used to avoid calls to dma_sync_single_*_device()
> > > which will be NOP anyway in that case.
> > >
> > > Why did you copy that logic from XSK?
> > >
> >
> > You are right. I just noticed that the DMA has already introduced a
> > similar optimization.
> >
> > The check in SMC was added back in 2022, while the DMA introduced
> > the internal "skip redundant sync" mechanism in 2024 (commit f406c8e4b770).
> >
> > Given this, it might be better to simply remove this redundant check
> > from SMC now, which would also eliminate the need for direct access to
> > ibdev->dma_device. I need to perform some tests to confirm this.
>
> It may also be worth looking at this series from Chuck, which reuses the
> recently introduced DMA API to remove the SG conversions:
> https://lore.kernel.org/linux-rdma/20260128005400.25147-1-cel@kernel.org/
>
> You might be able to apply a similar approach and drop SG from your
> datapath entirely.
>
> Thanks.
>
After a quick look at the rdma_rw_ctx, but it doesn't seem
suitable for SMC. SMC uses a ring buffer for its transmit path and
requires frequent offset adjustments during data transmission.
The rdma_rw_ctx model is likely not flexible enough for this
kind of circular buffer logic.
D. Wythe
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2026-01-30 9:18 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-28 12:41 [PATCH net v2] net/smc: fix one NULL pointer dereference in smc_ib_is_sg_need_sync() Liu Jian
2025-08-28 13:04 ` Guangguan Wang
2025-08-28 18:24 ` yanjun.zhu
2025-08-29 3:32 ` D. Wythe
2025-09-02 9:20 ` patchwork-bot+netdevbpf
2025-09-09 9:45 ` Leon Romanovsky
2026-01-26 4:45 ` D. Wythe
2026-01-27 12:00 ` Leon Romanovsky
2026-01-28 5:50 ` D. Wythe
2026-01-28 11:21 ` Leon Romanovsky
2026-01-28 12:50 ` D. Wythe
2026-01-30 9:18 ` D. Wythe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox