[PATCH net 0/6] several fixes for smc

public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH net 0/6] several fixes for smc
@ 2024-11-28 12:14 Guangguan Wang
  2024-11-28 12:14 ` [PATCH net 1/6] net/smc: protect link down work from execute after lgr freed Guangguan Wang
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Guangguan Wang @ 2024-11-28 12:14 UTC (permalink / raw)
  To: wenjia, jaka, alibuda, tonylu, guwen, davem, edumazet, kuba,
	pabeni, horms
  Cc: linux-rdma, linux-s390, netdev, linux-kernel

several fixes for smc

Guangguan Wang (6):
  net/smc: protect link down work from execute after lgr freed
  net/smc: set SOCK_NOSPACE when send_remaining but no sndbuf_space left
  net/smc: check iparea_offset and ipv6_prefixes_cnt when receiving
    proposal msg
  net/smc: check v2_ext_offset/eid_cnt/ism_gid_cnt when receiving
    proposal msg
  net/smc: check smcd_v2_ext_offset when receiving proposal msg
  net/smc: check return value of sock_recvmsg when draining clc data

 net/smc/af_smc.c   | 11 +++++++++--
 net/smc/smc_clc.c  | 17 ++++++++++++++++-
 net/smc/smc_clc.h  | 22 +++++++++++++++++++---
 net/smc/smc_core.c |  9 +++++++--
 net/smc/smc_tx.c   |  5 ++++-
 5 files changed, 55 insertions(+), 9 deletions(-)

-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH net 1/6] net/smc: protect link down work from execute after lgr freed
  2024-11-28 12:14 [PATCH net 0/6] several fixes for smc Guangguan Wang
@ 2024-11-28 12:14 ` Guangguan Wang
  2024-11-28 12:14 ` [PATCH net 2/6] net/smc: set SOCK_NOSPACE when send_remaining but no sndbuf_space left Guangguan Wang
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Guangguan Wang @ 2024-11-28 12:14 UTC (permalink / raw)
  To: wenjia, jaka, alibuda, tonylu, guwen, davem, edumazet, kuba,
	pabeni, horms
  Cc: linux-rdma, linux-s390, netdev, linux-kernel

link down work may be scheduled before lgr freed but execute
after lgr freed, which may result in crash. So it is need to
hold a reference before shedule link down work, and put the
reference after work executed or canceled.

The relevant crash call stack as follows:
 list_del corruption. prev->next should be ffffb638c9c0fe20,
    but was 0000000000000000
 ------------[ cut here ]------------
 kernel BUG at lib/list_debug.c:51!
 invalid opcode: 0000 [#1] SMP NOPTI
 CPU: 6 PID: 978112 Comm: kworker/6:119 Kdump: loaded Tainted: G #1
 Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 2221b89 04/01/2014
 Workqueue: events smc_link_down_work [smc]
 RIP: 0010:__list_del_entry_valid.cold+0x31/0x47
 RSP: 0018:ffffb638c9c0fdd8 EFLAGS: 00010086
 RAX: 0000000000000054 RBX: ffff942fb75e5128 RCX: 0000000000000000
 RDX: ffff943520930aa0 RSI: ffff94352091fc80 RDI: ffff94352091fc80
 RBP: 0000000000000000 R08: 0000000000000000 R09: ffffb638c9c0fc38
 R10: ffffb638c9c0fc30 R11: ffffffffa015eb28 R12: 0000000000000002
 R13: ffffb638c9c0fe20 R14: 0000000000000001 R15: ffff942f9cd051c0
 FS:  0000000000000000(0000) GS:ffff943520900000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f4f25214000 CR3: 000000025fbae004 CR4: 00000000007706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  rwsem_down_write_slowpath+0x17e/0x470
  smc_link_down_work+0x3c/0x60 [smc]
  process_one_work+0x1ac/0x350
  worker_thread+0x49/0x2f0
  ? rescuer_thread+0x360/0x360
  kthread+0x118/0x140
  ? __kthread_bind_mask+0x60/0x60
  ret_from_fork+0x1f/0x30

Fixes: 541afa10c126 ("net/smc: add smcr_port_err() and smcr_link_down() processing")
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
---
 net/smc/smc_core.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 500952c2e67b..3b125d348b4a 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -1818,7 +1818,9 @@ void smcr_link_down_cond_sched(struct smc_link *lnk)
 {
 	if (smc_link_downing(&lnk->state)) {
 		trace_smcr_link_down(lnk, __builtin_return_address(0));
-		schedule_work(&lnk->link_down_wrk);
+		smcr_link_hold(lnk); /* smcr_link_put in link_down_wrk */
+		if (!schedule_work(&lnk->link_down_wrk))
+			smcr_link_put(lnk);
 	}
 }
 
@@ -1850,11 +1852,14 @@ static void smc_link_down_work(struct work_struct *work)
 	struct smc_link_group *lgr = link->lgr;
 
 	if (list_empty(&lgr->list))
-		return;
+		goto out;
 	wake_up_all(&lgr->llc_msg_waiter);
 	down_write(&lgr->llc_conf_mutex);
 	smcr_link_down(link);
 	up_write(&lgr->llc_conf_mutex);
+
+out:
+	smcr_link_put(link); /* smcr_link_hold by schedulers of link_down_work */
 }
 
 static int smc_vlan_by_tcpsk_walk(struct net_device *lower_dev,
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 2/6] net/smc: set SOCK_NOSPACE when send_remaining but no sndbuf_space left
  2024-11-28 12:14 [PATCH net 0/6] several fixes for smc Guangguan Wang
  2024-11-28 12:14 ` [PATCH net 1/6] net/smc: protect link down work from execute after lgr freed Guangguan Wang
@ 2024-11-28 12:14 ` Guangguan Wang
  2024-12-03 10:04   ` Paolo Abeni
  2024-11-28 12:14 ` [PATCH net 3/6] net/smc: check iparea_offset and ipv6_prefixes_cnt when receiving proposal msg Guangguan Wang
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 9+ messages in thread
From: Guangguan Wang @ 2024-11-28 12:14 UTC (permalink / raw)
  To: wenjia, jaka, alibuda, tonylu, guwen, davem, edumazet, kuba,
	pabeni, horms
  Cc: linux-rdma, linux-s390, netdev, linux-kernel

When application sending data more than sndbuf_space, there have chances
application will sleep in epoll_wait, and will never be wakeup again. This
is caused by a race between smc_poll and smc_cdc_tx_handler.

application                                      tasklet
smc_tx_sendmsg(len > sndbuf_space)   |
epoll_wait for EPOLL_OUT,timeout=0   |
  smc_poll                           |
    if (!smc->conn.sndbuf_space)     |
                                     |  smc_cdc_tx_handler
                                     |    atomic_add sndbuf_space
                                     |    smc_tx_sndbuf_nonfull
                                     |      if (!test_bit SOCK_NOSPACE)
                                     |        do not sk_write_space;
      set_bit SOCK_NOSPACE;          |
    return mask=0;                   |

Application will sleep in epoll_wait as smc_poll returns 0. And
smc_cdc_tx_handler will not call sk_write_space because the SOCK_NOSPACE
has not be set. If there is no inflight cdc msg, sk_write_space will not be
called any more, and application will sleep in epoll_wait forever.
So set SOCK_NOSPACE when send_remaining but no sndbuf_space left in
smc_tx_sendmsg, to ensure call sk_write_space in smc_cdc_tx_handler
even when the above race happens.

Fixes: 6889b36da78a ("net/smc: don't wait for send buffer space when data was already sent")
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
---
 net/smc/smc_tx.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 214ac3cbcf9a..60cfec8eb255 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -222,8 +222,11 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
 			conn->local_tx_ctrl.prod_flags.urg_data_pending = 1;
 
 		if (!atomic_read(&conn->sndbuf_space) || conn->urg_tx_pend) {
-			if (send_done)
+			if (send_done) {
+				sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk);
+				set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
 				return send_done;
+			}
 			rc = smc_tx_wait(smc, msg->msg_flags);
 			if (rc)
 				goto out_err;
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net 2/6] net/smc: set SOCK_NOSPACE when send_remaining but no sndbuf_space left
  2024-11-28 12:14 ` [PATCH net 2/6] net/smc: set SOCK_NOSPACE when send_remaining but no sndbuf_space left Guangguan Wang
@ 2024-12-03 10:04   ` Paolo Abeni
  2024-12-04  7:12     ` Guangguan Wang
  0 siblings, 1 reply; 9+ messages in thread
From: Paolo Abeni @ 2024-12-03 10:04 UTC (permalink / raw)
  To: Guangguan Wang, wenjia, jaka, alibuda, tonylu, guwen, davem,
	edumazet, kuba, horms
  Cc: linux-rdma, linux-s390, netdev, linux-kernel



On 11/28/24 13:14, Guangguan Wang wrote:
> When application sending data more than sndbuf_space, there have chances
> application will sleep in epoll_wait, and will never be wakeup again. This
> is caused by a race between smc_poll and smc_cdc_tx_handler.
> 
> application                                      tasklet
> smc_tx_sendmsg(len > sndbuf_space)   |
> epoll_wait for EPOLL_OUT,timeout=0   |
>   smc_poll                           |
>     if (!smc->conn.sndbuf_space)     |
>                                      |  smc_cdc_tx_handler
>                                      |    atomic_add sndbuf_space
>                                      |    smc_tx_sndbuf_nonfull
>                                      |      if (!test_bit SOCK_NOSPACE)
>                                      |        do not sk_write_space;
>       set_bit SOCK_NOSPACE;          |
>     return mask=0;                   |
> 
> Application will sleep in epoll_wait as smc_poll returns 0. And
> smc_cdc_tx_handler will not call sk_write_space because the SOCK_NOSPACE
> has not be set. If there is no inflight cdc msg, sk_write_space will not be
> called any more, and application will sleep in epoll_wait forever.
> So set SOCK_NOSPACE when send_remaining but no sndbuf_space left in
> smc_tx_sendmsg, to ensure call sk_write_space in smc_cdc_tx_handler
> even when the above race happens.

I think it should be preferable to address the mentioned race the same
way as tcp_poll(). i.e. checking again smc->conn.sndbuf_space after
setting the NOSPACE bit with appropriate barrier, see:

https://elixir.bootlin.com/linux/v6.12.1/source/net/ipv4/tcp.c#L590

that will avoid additional, possibly unneeded atomic operation in the tx
path (the application could do the next sendmsg()/poll() call after that
the send buf has been freed) and will avoid some code duplication.

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net 2/6] net/smc: set SOCK_NOSPACE when send_remaining but no sndbuf_space left
  2024-12-03 10:04   ` Paolo Abeni
@ 2024-12-04  7:12     ` Guangguan Wang
  0 siblings, 0 replies; 9+ messages in thread
From: Guangguan Wang @ 2024-12-04  7:12 UTC (permalink / raw)
  To: Paolo Abeni, wenjia, jaka, alibuda, tonylu, guwen, davem,
	edumazet, kuba, horms
  Cc: linux-rdma, linux-s390, netdev, linux-kernel



On 2024/12/3 18:04, Paolo Abeni wrote:
> 
> 
> On 11/28/24 13:14, Guangguan Wang wrote:
>> When application sending data more than sndbuf_space, there have chances
>> application will sleep in epoll_wait, and will never be wakeup again. This
>> is caused by a race between smc_poll and smc_cdc_tx_handler.
>>
>> application                                      tasklet
>> smc_tx_sendmsg(len > sndbuf_space)   |
>> epoll_wait for EPOLL_OUT,timeout=0   |
>>   smc_poll                           |
>>     if (!smc->conn.sndbuf_space)     |
>>                                      |  smc_cdc_tx_handler
>>                                      |    atomic_add sndbuf_space
>>                                      |    smc_tx_sndbuf_nonfull
>>                                      |      if (!test_bit SOCK_NOSPACE)
>>                                      |        do not sk_write_space;
>>       set_bit SOCK_NOSPACE;          |
>>     return mask=0;                   |
>>
>> Application will sleep in epoll_wait as smc_poll returns 0. And
>> smc_cdc_tx_handler will not call sk_write_space because the SOCK_NOSPACE
>> has not be set. If there is no inflight cdc msg, sk_write_space will not be
>> called any more, and application will sleep in epoll_wait forever.
>> So set SOCK_NOSPACE when send_remaining but no sndbuf_space left in
>> smc_tx_sendmsg, to ensure call sk_write_space in smc_cdc_tx_handler
>> even when the above race happens.
> 
> I think it should be preferable to address the mentioned race the same
> way as tcp_poll(). i.e. checking again smc->conn.sndbuf_space after
> setting the NOSPACE bit with appropriate barrier, see:
> 
> https://elixir.bootlin.com/linux/v6.12.1/source/net/ipv4/tcp.c#L590
> 
> that will avoid additional, possibly unneeded atomic operation in the tx
> path (the application could do the next sendmsg()/poll() call after that
> the send buf has been freed) and will avoid some code duplication.
> 
> Cheers,
> 
> Paolo

Hi, Paolo

Thanks for advice, and the way in tcp_poll() seems a better solution for this race.
I will retest it, and resend a new version of patch if it works.

Thanks,
Guangguan Wang

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH net 3/6] net/smc: check iparea_offset and ipv6_prefixes_cnt when receiving proposal msg
  2024-11-28 12:14 [PATCH net 0/6] several fixes for smc Guangguan Wang
  2024-11-28 12:14 ` [PATCH net 1/6] net/smc: protect link down work from execute after lgr freed Guangguan Wang
  2024-11-28 12:14 ` [PATCH net 2/6] net/smc: set SOCK_NOSPACE when send_remaining but no sndbuf_space left Guangguan Wang
@ 2024-11-28 12:14 ` Guangguan Wang
  2024-11-28 12:14 ` [PATCH net 4/6] net/smc: check v2_ext_offset/eid_cnt/ism_gid_cnt " Guangguan Wang
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Guangguan Wang @ 2024-11-28 12:14 UTC (permalink / raw)
  To: wenjia, jaka, alibuda, tonylu, guwen, davem, edumazet, kuba,
	pabeni, horms
  Cc: linux-rdma, linux-s390, netdev, linux-kernel

When receiving proposal msg in server, the field iparea_offset
and the field ipv6_prefixes_cnt in proposal msg are from the
remote client and can not be fully trusted. Especially the
field iparea_offset, once exceed the max value, there has the
chance to access wrong address, and crash may happen.

This patch checks iparea_offset and ipv6_prefixes_cnt before using them.

Fixes: e7b7a64a8493 ("smc: support variable CLC proposal messages")
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
---
 net/smc/af_smc.c  | 6 +++++-
 net/smc/smc_clc.c | 4 ++++
 net/smc/smc_clc.h | 6 +++++-
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 9d76e902fd77..8a2f196ab995 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -2032,6 +2032,8 @@ static int smc_listen_prfx_check(struct smc_sock *new_smc,
 	if (pclc->hdr.typev1 == SMC_TYPE_N)
 		return 0;
 	pclc_prfx = smc_clc_proposal_get_prefix(pclc);
+	if (!pclc_prfx)
+		return -EPROTO;
 	if (smc_clc_prfx_match(newclcsock, pclc_prfx))
 		return SMC_CLC_DECL_DIFFPREFIX;
 
@@ -2221,7 +2223,9 @@ static void smc_find_ism_v1_device_serv(struct smc_sock *new_smc,
 	int rc = 0;
 
 	/* check if ISM V1 is available */
-	if (!(ini->smcd_version & SMC_V1) || !smcd_indicated(ini->smc_type_v1))
+	if (!(ini->smcd_version & SMC_V1) ||
+	    !smcd_indicated(ini->smc_type_v1) ||
+	    !pclc_smcd)
 		goto not_found;
 	ini->is_smcd = true; /* prepare ISM check */
 	ini->ism_peer_gid[0].gid = ntohll(pclc_smcd->ism.gid);
diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c
index 33fa787c28eb..66a43b97eede 100644
--- a/net/smc/smc_clc.c
+++ b/net/smc/smc_clc.c
@@ -354,6 +354,10 @@ static bool smc_clc_msg_prop_valid(struct smc_clc_msg_proposal *pclc)
 
 	v2_ext = smc_get_clc_v2_ext(pclc);
 	pclc_prfx = smc_clc_proposal_get_prefix(pclc);
+	if (!pclc_prfx ||
+	    pclc_prfx->ipv6_prefixes_cnt > SMC_CLC_MAX_V6_PREFIX)
+		return false;
+
 	if (hdr->version == SMC_V1) {
 		if (hdr->typev1 == SMC_TYPE_N)
 			return false;
diff --git a/net/smc/smc_clc.h b/net/smc/smc_clc.h
index 5fd6f5b8ef03..ac8de6a177fa 100644
--- a/net/smc/smc_clc.h
+++ b/net/smc/smc_clc.h
@@ -336,8 +336,12 @@ struct smc_clc_msg_decline_v2 {	/* clc decline message */
 static inline struct smc_clc_msg_proposal_prefix *
 smc_clc_proposal_get_prefix(struct smc_clc_msg_proposal *pclc)
 {
+	u16 offset = ntohs(pclc->iparea_offset);
+
+	if (offset > sizeof(struct smc_clc_msg_smcd))
+		return NULL;
 	return (struct smc_clc_msg_proposal_prefix *)
-	       ((u8 *)pclc + sizeof(*pclc) + ntohs(pclc->iparea_offset));
+	       ((u8 *)pclc + sizeof(*pclc) + offset);
 }
 
 static inline bool smcr_indicated(int smc_type)
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 4/6] net/smc: check v2_ext_offset/eid_cnt/ism_gid_cnt when receiving proposal msg
  2024-11-28 12:14 [PATCH net 0/6] several fixes for smc Guangguan Wang
                   ` (2 preceding siblings ...)
  2024-11-28 12:14 ` [PATCH net 3/6] net/smc: check iparea_offset and ipv6_prefixes_cnt when receiving proposal msg Guangguan Wang
@ 2024-11-28 12:14 ` Guangguan Wang
  2024-11-28 12:14 ` [PATCH net 5/6] net/smc: check smcd_v2_ext_offset " Guangguan Wang
  2024-11-28 12:14 ` [PATCH net 6/6] net/smc: check return value of sock_recvmsg when draining clc data Guangguan Wang
  5 siblings, 0 replies; 9+ messages in thread
From: Guangguan Wang @ 2024-11-28 12:14 UTC (permalink / raw)
  To: wenjia, jaka, alibuda, tonylu, guwen, davem, edumazet, kuba,
	pabeni, horms
  Cc: linux-rdma, linux-s390, netdev, linux-kernel

When receiving proposal msg in server, the fields v2_ext_offset/
eid_cnt/ism_gid_cnt in proposal msg are from the remote client
and can not be fully trusted. Especially the field v2_ext_offset,
once exceed the max value, there has the chance to access wrong
address, and crash may happen.

This patch checks the fields v2_ext_offset/eid_cnt/ism_gid_cnt
before using them.

Fixes: 8c3dca341aea ("net/smc: build and send V2 CLC proposal")
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
---
 net/smc/af_smc.c  | 3 ++-
 net/smc/smc_clc.c | 8 +++++++-
 net/smc/smc_clc.h | 8 +++++++-
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 8a2f196ab995..5bfd38eaee3a 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -2276,7 +2276,8 @@ static void smc_find_rdma_v2_device_serv(struct smc_sock *new_smc,
 		goto not_found;
 
 	smc_v2_ext = smc_get_clc_v2_ext(pclc);
-	if (!smc_clc_match_eid(ini->negotiated_eid, smc_v2_ext, NULL, NULL))
+	if (!smc_v2_ext ||
+	    !smc_clc_match_eid(ini->negotiated_eid, smc_v2_ext, NULL, NULL))
 		goto not_found;
 
 	/* prepare RDMA check */
diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c
index 66a43b97eede..f721d03efcbd 100644
--- a/net/smc/smc_clc.c
+++ b/net/smc/smc_clc.c
@@ -352,7 +352,6 @@ static bool smc_clc_msg_prop_valid(struct smc_clc_msg_proposal *pclc)
 	struct smc_clc_msg_hdr *hdr = &pclc->hdr;
 	struct smc_clc_v2_extension *v2_ext;
 
-	v2_ext = smc_get_clc_v2_ext(pclc);
 	pclc_prfx = smc_clc_proposal_get_prefix(pclc);
 	if (!pclc_prfx ||
 	    pclc_prfx->ipv6_prefixes_cnt > SMC_CLC_MAX_V6_PREFIX)
@@ -369,6 +368,13 @@ static bool smc_clc_msg_prop_valid(struct smc_clc_msg_proposal *pclc)
 			sizeof(struct smc_clc_msg_trail))
 			return false;
 	} else {
+		v2_ext = smc_get_clc_v2_ext(pclc);
+		if ((hdr->typev2 != SMC_TYPE_N &&
+		     (!v2_ext || v2_ext->hdr.eid_cnt > SMC_CLC_MAX_UEID)) ||
+		    (smcd_indicated(hdr->typev2) &&
+		     v2_ext->hdr.ism_gid_cnt > SMCD_CLC_MAX_V2_GID_ENTRIES))
+			return false;
+
 		if (ntohs(hdr->length) !=
 			sizeof(*pclc) +
 			sizeof(struct smc_clc_msg_smcd) +
diff --git a/net/smc/smc_clc.h b/net/smc/smc_clc.h
index ac8de6a177fa..23afa4df862e 100644
--- a/net/smc/smc_clc.h
+++ b/net/smc/smc_clc.h
@@ -380,8 +380,14 @@ static inline struct smc_clc_v2_extension *
 smc_get_clc_v2_ext(struct smc_clc_msg_proposal *prop)
 {
 	struct smc_clc_msg_smcd *prop_smcd = smc_get_clc_msg_smcd(prop);
+	u16 max_offset;
 
-	if (!prop_smcd || !ntohs(prop_smcd->v2_ext_offset))
+	max_offset = offsetof(struct smc_clc_msg_proposal_area, pclc_v2_ext) -
+		     offsetof(struct smc_clc_msg_proposal_area, pclc_smcd) -
+		     offsetofend(struct smc_clc_msg_smcd, v2_ext_offset);
+
+	if (!prop_smcd || !ntohs(prop_smcd->v2_ext_offset) ||
+	    ntohs(prop_smcd->v2_ext_offset) > max_offset)
 		return NULL;
 
 	return (struct smc_clc_v2_extension *)
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 5/6] net/smc: check smcd_v2_ext_offset when receiving proposal msg
  2024-11-28 12:14 [PATCH net 0/6] several fixes for smc Guangguan Wang
                   ` (3 preceding siblings ...)
  2024-11-28 12:14 ` [PATCH net 4/6] net/smc: check v2_ext_offset/eid_cnt/ism_gid_cnt " Guangguan Wang
@ 2024-11-28 12:14 ` Guangguan Wang
  2024-11-28 12:14 ` [PATCH net 6/6] net/smc: check return value of sock_recvmsg when draining clc data Guangguan Wang
  5 siblings, 0 replies; 9+ messages in thread
From: Guangguan Wang @ 2024-11-28 12:14 UTC (permalink / raw)
  To: wenjia, jaka, alibuda, tonylu, guwen, davem, edumazet, kuba,
	pabeni, horms
  Cc: linux-rdma, linux-s390, netdev, linux-kernel

When receiving proposal msg in server, the field smcd_v2_ext_offset in
proposal msg is from the remote client and can not be fully trusted.
Once the value of smcd_v2_ext_offset exceed the max value, there has
the chance to access wrong address, and crash may happen.

This patch checks the value of smcd_v2_ext_offset before using it.

Fixes: 5c21c4ccafe8 ("net/smc: determine accepted ISM devices")
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
---
 net/smc/af_smc.c  | 2 ++
 net/smc/smc_clc.h | 8 +++++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 5bfd38eaee3a..ef4e0ff6beed 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -2147,6 +2147,8 @@ static void smc_find_ism_v2_device_serv(struct smc_sock *new_smc,
 	pclc_smcd = smc_get_clc_msg_smcd(pclc);
 	smc_v2_ext = smc_get_clc_v2_ext(pclc);
 	smcd_v2_ext = smc_get_clc_smcd_v2_ext(smc_v2_ext);
+	if (!pclc_smcd || !smc_v2_ext || !smcd_v2_ext)
+		goto not_found;
 
 	mutex_lock(&smcd_dev_list.mutex);
 	if (pclc_smcd->ism.chid) {
diff --git a/net/smc/smc_clc.h b/net/smc/smc_clc.h
index 23afa4df862e..767289925410 100644
--- a/net/smc/smc_clc.h
+++ b/net/smc/smc_clc.h
@@ -400,9 +400,15 @@ smc_get_clc_v2_ext(struct smc_clc_msg_proposal *prop)
 static inline struct smc_clc_smcd_v2_extension *
 smc_get_clc_smcd_v2_ext(struct smc_clc_v2_extension *prop_v2ext)
 {
+	u16 max_offset = offsetof(struct smc_clc_msg_proposal_area, pclc_smcd_v2_ext) -
+		offsetof(struct smc_clc_msg_proposal_area, pclc_v2_ext) -
+		offsetof(struct smc_clc_v2_extension, hdr) -
+		offsetofend(struct smc_clnt_opts_area_hdr, smcd_v2_ext_offset);
+
 	if (!prop_v2ext)
 		return NULL;
-	if (!ntohs(prop_v2ext->hdr.smcd_v2_ext_offset))
+	if (!ntohs(prop_v2ext->hdr.smcd_v2_ext_offset) ||
+	    ntohs(prop_v2ext->hdr.smcd_v2_ext_offset) > max_offset)
 		return NULL;
 
 	return (struct smc_clc_smcd_v2_extension *)
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 6/6] net/smc: check return value of sock_recvmsg when draining clc data
  2024-11-28 12:14 [PATCH net 0/6] several fixes for smc Guangguan Wang
                   ` (4 preceding siblings ...)
  2024-11-28 12:14 ` [PATCH net 5/6] net/smc: check smcd_v2_ext_offset " Guangguan Wang
@ 2024-11-28 12:14 ` Guangguan Wang
  5 siblings, 0 replies; 9+ messages in thread
From: Guangguan Wang @ 2024-11-28 12:14 UTC (permalink / raw)
  To: wenjia, jaka, alibuda, tonylu, guwen, davem, edumazet, kuba,
	pabeni, horms
  Cc: linux-rdma, linux-s390, netdev, linux-kernel

When receiving clc msg, the field length in smc_clc_msg_hdr indicates the
length of msg should be received from network and the value should not be
fully trusted as it is from the network. Once the value of length exceeds
the value of buflen in function smc_clc_wait_msg it may run into deadloop
when trying to drain the remaining data exceeding buflen.

This patch checks the return value of sock_recvmsg when draining data in
case of deadloop in draining.

Fixes: fb4f79264c0f ("net/smc: tolerate future SMCD versions")
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
---
 net/smc/smc_clc.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c
index f721d03efcbd..521f5df80e10 100644
--- a/net/smc/smc_clc.c
+++ b/net/smc/smc_clc.c
@@ -774,6 +774,11 @@ int smc_clc_wait_msg(struct smc_sock *smc, void *buf, int buflen,
 						SMC_CLC_RECV_BUF_LEN : datlen;
 		iov_iter_kvec(&msg.msg_iter, ITER_DEST, &vec, 1, recvlen);
 		len = sock_recvmsg(smc->clcsock, &msg, krflags);
+		if (len < recvlen) {
+			smc->sk.sk_err = EPROTO;
+			reason_code = -EPROTO;
+			goto out;
+		}
 		datlen -= len;
 	}
 	if (clcm->type == SMC_CLC_DECLINE) {
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-12-04  7:12 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-28 12:14 [PATCH net 0/6] several fixes for smc Guangguan Wang
2024-11-28 12:14 ` [PATCH net 1/6] net/smc: protect link down work from execute after lgr freed Guangguan Wang
2024-11-28 12:14 ` [PATCH net 2/6] net/smc: set SOCK_NOSPACE when send_remaining but no sndbuf_space left Guangguan Wang
2024-12-03 10:04   ` Paolo Abeni
2024-12-04  7:12     ` Guangguan Wang
2024-11-28 12:14 ` [PATCH net 3/6] net/smc: check iparea_offset and ipv6_prefixes_cnt when receiving proposal msg Guangguan Wang
2024-11-28 12:14 ` [PATCH net 4/6] net/smc: check v2_ext_offset/eid_cnt/ism_gid_cnt " Guangguan Wang
2024-11-28 12:14 ` [PATCH net 5/6] net/smc: check smcd_v2_ext_offset " Guangguan Wang
2024-11-28 12:14 ` [PATCH net 6/6] net/smc: check return value of sock_recvmsg when draining clc data Guangguan Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox