* [PATCH v3 0/3] net/smc: bound wire-controlled CDC cursors against the local buffers
@ 2026-06-14 8:23 ` Bryam Vargas
0 siblings, 0 replies; 19+ messages in thread
From: Bryam Vargas via B4 Relay @ 2026-06-14 8:23 UTC (permalink / raw)
To: Wenjia Zhang, Dust Li, D. Wythe, Sidraya Jayagond
Cc: Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Jakub Kicinski, Tony Lu
A peer's CDC producer/consumer cursors are copied from the wire and used,
without an upper bound against the local buffers, as (a) a raw index into the
RMB on the urgent path, (b) the receive length in smc_rx_recvmsg(), and (c) the
send length in smc_tx_sendmsg() on the SMC-D DMB-merge path. A malicious or
buggy peer can forge a cursor so each of these runs past the relevant buffer:
an out-of-bounds read of adjacent kernel memory (disclosed to the peer) on the
receive/urgent side, and an out-of-bounds write of attacker-influenced length
and content on the send side.
This series bounds each wire-controlled value at its point of use against the
local buffer, enforcing invariants the code already documents
("0 <= bytes_to_rcv <= rmb_desc->len", "0 <= sndbuf_space <= sndbuf_desc->len").
Conforming peers always keep these values in range, so the bounds are no-ops in
normal operation.
1/3 bounds the producer cursor count to rmb_desc->len at the SMC-R/SMC-D
conversion boundary (the urgent-path raw index). The bound is applied to
the producer cursor only -- the consumer cursor indexes the peer's RMB and
is bounded by peer_rmbe_size, so clamping it to our rmb_desc->len would
under-credit peer_rmbe_space and stall transmit to a peer with a larger
RMB.
2/3 bounds the readable count in smc_rx_recvmsg() so the wrap-around copy
cannot read past the RMB.
3/3 bounds the write space in smc_tx_sendmsg() so the wrap-around copy cannot
write past the send buffer.
This supersedes two separately-posted patches and folds them into one series
together with the producer-cursor fix, after review feedback that they share a
root cause:
- net/smc: bound peer producer cursor and bytes_to_rcv on SMC-D CDC receive
https://lore.kernel.org/netdev/20260610084803.186516-1-hexlabsecurity@proton.me/
- net/smc: bound sndbuf_space on the SMC-D DMB-merge receive path
https://lore.kernel.org/netdev/20260610090928.192177-1-hexlabsecurity@proton.me/
Changes since those postings (addressing the review):
- The receive/send bounds were previously applied in the CDC receive tasklet,
after the atomic_add(). As the review noted, that read-then-set is not
atomic, and a recvmsg()/sendmsg() on another CPU can observe the inflated
value in the window between the atomic_add() and the clamp: recvmsg() runs
under lock_sock(), which leaves the slock free, so it is not serialized
against the bh_lock_sock() CDC tasklet. The bound now lives at the consumer,
where the value is used to size the copy, which is race-free.
- The bounds now also reject a negative value (if (x < 0 || x > len)): across
many forged CDC messages the signed accumulator can wrap negative, which a
plain "> len" check misses and min_t(size_t, ...) then turns into a huge
length.
- The SMC-R producer-cursor bound is applied only to the producer cursor at
the call site, not in the shared smc_cdc_cursor_to_host() helper, so the
consumer cursor (bounded by peer_rmbe_size) is no longer truncated.
Verified with an in-kernel KASAN A/B matrix on x86-64 (SMC-D loopback,
CONFIG_SMC_LO; no special hardware): each sink produces a slab-out-of-bounds
read/write for a forged cursor and is clean with the patch, and both the
cross-CPU race and the negative-accumulator case are reproduced and closed.
Logs available on request.
---
Bryam Vargas (3):
net/smc: bound the wire-controlled producer cursor to the RMB
net/smc: bound the receive length to the RMB in smc_rx_recvmsg()
net/smc: bound the send length to the send buffer in smc_tx_sendmsg()
net/smc/smc_cdc.h | 27 ++++++++++++++++++++++++---
net/smc/smc_rx.c | 12 ++++++++++++
net/smc/smc_tx.c | 13 +++++++++++++
3 files changed, 49 insertions(+), 3 deletions(-)
---
base-commit: 8e65320d91cdc3b241d4b94855c88459b91abf66
change-id: 20260614-b4-disp-edd64be9-b094cf67fded
Best regards,
--
Bryam Vargas <hexlabsecurity@proton.me>
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v3 0/3] net/smc: bound wire-controlled CDC cursors against the local buffers
@ 2026-06-14 8:23 ` Bryam Vargas
0 siblings, 0 replies; 19+ messages in thread
From: Bryam Vargas @ 2026-06-14 8:23 UTC (permalink / raw)
To: Wenjia Zhang, Dust Li, D. Wythe, Sidraya Jayagond
Cc: Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Jakub Kicinski, Tony Lu
A peer's CDC producer/consumer cursors are copied from the wire and used,
without an upper bound against the local buffers, as (a) a raw index into the
RMB on the urgent path, (b) the receive length in smc_rx_recvmsg(), and (c) the
send length in smc_tx_sendmsg() on the SMC-D DMB-merge path. A malicious or
buggy peer can forge a cursor so each of these runs past the relevant buffer:
an out-of-bounds read of adjacent kernel memory (disclosed to the peer) on the
receive/urgent side, and an out-of-bounds write of attacker-influenced length
and content on the send side.
This series bounds each wire-controlled value at its point of use against the
local buffer, enforcing invariants the code already documents
("0 <= bytes_to_rcv <= rmb_desc->len", "0 <= sndbuf_space <= sndbuf_desc->len").
Conforming peers always keep these values in range, so the bounds are no-ops in
normal operation.
1/3 bounds the producer cursor count to rmb_desc->len at the SMC-R/SMC-D
conversion boundary (the urgent-path raw index). The bound is applied to
the producer cursor only -- the consumer cursor indexes the peer's RMB and
is bounded by peer_rmbe_size, so clamping it to our rmb_desc->len would
under-credit peer_rmbe_space and stall transmit to a peer with a larger
RMB.
2/3 bounds the readable count in smc_rx_recvmsg() so the wrap-around copy
cannot read past the RMB.
3/3 bounds the write space in smc_tx_sendmsg() so the wrap-around copy cannot
write past the send buffer.
This supersedes two separately-posted patches and folds them into one series
together with the producer-cursor fix, after review feedback that they share a
root cause:
- net/smc: bound peer producer cursor and bytes_to_rcv on SMC-D CDC receive
https://lore.kernel.org/netdev/20260610084803.186516-1-hexlabsecurity@proton.me/
- net/smc: bound sndbuf_space on the SMC-D DMB-merge receive path
https://lore.kernel.org/netdev/20260610090928.192177-1-hexlabsecurity@proton.me/
Changes since those postings (addressing the review):
- The receive/send bounds were previously applied in the CDC receive tasklet,
after the atomic_add(). As the review noted, that read-then-set is not
atomic, and a recvmsg()/sendmsg() on another CPU can observe the inflated
value in the window between the atomic_add() and the clamp: recvmsg() runs
under lock_sock(), which leaves the slock free, so it is not serialized
against the bh_lock_sock() CDC tasklet. The bound now lives at the consumer,
where the value is used to size the copy, which is race-free.
- The bounds now also reject a negative value (if (x < 0 || x > len)): across
many forged CDC messages the signed accumulator can wrap negative, which a
plain "> len" check misses and min_t(size_t, ...) then turns into a huge
length.
- The SMC-R producer-cursor bound is applied only to the producer cursor at
the call site, not in the shared smc_cdc_cursor_to_host() helper, so the
consumer cursor (bounded by peer_rmbe_size) is no longer truncated.
Verified with an in-kernel KASAN A/B matrix on x86-64 (SMC-D loopback,
CONFIG_SMC_LO; no special hardware): each sink produces a slab-out-of-bounds
read/write for a forged cursor and is clean with the patch, and both the
cross-CPU race and the negative-accumulator case are reproduced and closed.
Logs available on request.
---
Bryam Vargas (3):
net/smc: bound the wire-controlled producer cursor to the RMB
net/smc: bound the receive length to the RMB in smc_rx_recvmsg()
net/smc: bound the send length to the send buffer in smc_tx_sendmsg()
net/smc/smc_cdc.h | 27 ++++++++++++++++++++++++---
net/smc/smc_rx.c | 12 ++++++++++++
net/smc/smc_tx.c | 13 +++++++++++++
3 files changed, 49 insertions(+), 3 deletions(-)
---
base-commit: 8e65320d91cdc3b241d4b94855c88459b91abf66
change-id: 20260614-b4-disp-edd64be9-b094cf67fded
Best regards,
--
Bryam Vargas <hexlabsecurity@proton.me>
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v3 1/3] net/smc: bound the wire-controlled producer cursor to the RMB
2026-06-14 8:23 ` Bryam Vargas
@ 2026-06-14 8:23 ` Bryam Vargas
-1 siblings, 0 replies; 19+ messages in thread
From: Bryam Vargas via B4 Relay @ 2026-06-14 8:23 UTC (permalink / raw)
To: Wenjia Zhang, Dust Li, D. Wythe, Sidraya Jayagond
Cc: Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Jakub Kicinski, Tony Lu
From: Bryam Vargas <hexlabsecurity@proton.me>
smc_cdc_cursor_to_host() (SMC-R) and smcd_cdc_msg_to_host() (SMC-D)
import the peer's producer cursor from the wire into the local
connection cursor with no upper bound against the receive buffer (RMB).
The urgent path then uses that count as a raw index:
base = conn->rmb_desc->cpu_addr + conn->rx_off;
conn->urg_rx_byte = *(base + conn->urg_curs.count - 1);
so a peer that advertises a producer cursor past rmb_desc->len reads
out of bounds of the RMB allocation in the receive tasklet (softirq).
Bound the producer cursor count to rmb_desc->len at the conversion
boundary, for both SMC-R and SMC-D. Apply the bound to the producer
cursor only: the consumer cursor indexes the peer's RMB and is bounded
by peer_rmbe_size, so clamping it to our rmb_desc->len would
under-credit peer_rmbe_space and stall transmit to a peer whose RMB is
larger than ours.
Fixes: de8474eb9d50 ("net/smc: urgent data support")
Cc: stable@vger.kernel.org
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
net/smc/smc_cdc.h | 27 ++++++++++++++++++++++++---
1 file changed, 24 insertions(+), 3 deletions(-)
diff --git a/net/smc/smc_cdc.h b/net/smc/smc_cdc.h
index 696cc11f2303..ca76ef630356 100644
--- a/net/smc/smc_cdc.h
+++ b/net/smc/smc_cdc.h
@@ -221,7 +221,8 @@ static inline void smc_host_msg_to_cdc(struct smc_cdc_msg *peer,
static inline void smc_cdc_cursor_to_host(union smc_host_cursor *local,
union smc_cdc_cursor *peer,
- struct smc_connection *conn)
+ struct smc_connection *conn,
+ int max_count)
{
union smc_host_cursor temp, old;
union smc_cdc_cursor net;
@@ -235,6 +236,15 @@ static inline void smc_cdc_cursor_to_host(union smc_host_cursor *local,
if ((old.wrap == temp.wrap) &&
(old.count > temp.count))
return;
+ /* The peer producer cursor is wire-controlled and is later used as a
+ * raw index into our RMB by the urgent path; bound its count to the
+ * RMB. max_count == 0 leaves the consumer cursor unbounded here: it
+ * indexes the peer's RMB (bounded by peer_rmbe_size, not our
+ * rmb_desc->len), so clamping it to rmb_desc->len would under-credit
+ * peer_rmbe_space and stall transmit to peers with a larger RMB.
+ */
+ if (max_count && temp.count > max_count)
+ temp.count = max_count;
smc_curs_copy(local, &temp, conn);
}
@@ -246,8 +256,13 @@ static inline void smcr_cdc_msg_to_host(struct smc_host_cdc_msg *local,
local->len = peer->len;
local->seqno = ntohs(peer->seqno);
local->token = ntohl(peer->token);
- smc_cdc_cursor_to_host(&local->prod, &peer->prod, conn);
- smc_cdc_cursor_to_host(&local->cons, &peer->cons, conn);
+ /* bound the wire-controlled producer cursor to our RMB (used as a raw
+ * index by the urgent path); leave the consumer cursor unbounded -- it
+ * indexes the peer's RMB and is bounded by peer_rmbe_size.
+ */
+ smc_cdc_cursor_to_host(&local->prod, &peer->prod, conn,
+ conn->rmb_desc->len);
+ smc_cdc_cursor_to_host(&local->cons, &peer->cons, conn, 0);
local->prod_flags = peer->prod_flags;
local->conn_state_flags = peer->conn_state_flags;
}
@@ -260,6 +275,12 @@ static inline void smcd_cdc_msg_to_host(struct smc_host_cdc_msg *local,
temp.wrap = peer->prod.wrap;
temp.count = peer->prod.count;
+ /* the peer producer cursor is wire-controlled and is used as a raw
+ * index into our RMB by the urgent path; bound it to the RMB. The
+ * consumer cursor below indexes the peer's RMB and is left unbounded.
+ */
+ if (temp.count > conn->rmb_desc->len)
+ temp.count = conn->rmb_desc->len;
smc_curs_copy(&local->prod, &temp, conn);
temp.wrap = peer->cons.wrap;
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v3 1/3] net/smc: bound the wire-controlled producer cursor to the RMB
@ 2026-06-14 8:23 ` Bryam Vargas
0 siblings, 0 replies; 19+ messages in thread
From: Bryam Vargas @ 2026-06-14 8:23 UTC (permalink / raw)
To: Wenjia Zhang, Dust Li, D. Wythe, Sidraya Jayagond
Cc: Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Jakub Kicinski, Tony Lu
smc_cdc_cursor_to_host() (SMC-R) and smcd_cdc_msg_to_host() (SMC-D)
import the peer's producer cursor from the wire into the local
connection cursor with no upper bound against the receive buffer (RMB).
The urgent path then uses that count as a raw index:
base = conn->rmb_desc->cpu_addr + conn->rx_off;
conn->urg_rx_byte = *(base + conn->urg_curs.count - 1);
so a peer that advertises a producer cursor past rmb_desc->len reads
out of bounds of the RMB allocation in the receive tasklet (softirq).
Bound the producer cursor count to rmb_desc->len at the conversion
boundary, for both SMC-R and SMC-D. Apply the bound to the producer
cursor only: the consumer cursor indexes the peer's RMB and is bounded
by peer_rmbe_size, so clamping it to our rmb_desc->len would
under-credit peer_rmbe_space and stall transmit to a peer whose RMB is
larger than ours.
Fixes: de8474eb9d50 ("net/smc: urgent data support")
Cc: stable@vger.kernel.org
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
net/smc/smc_cdc.h | 27 ++++++++++++++++++++++++---
1 file changed, 24 insertions(+), 3 deletions(-)
diff --git a/net/smc/smc_cdc.h b/net/smc/smc_cdc.h
index 696cc11f2303..ca76ef630356 100644
--- a/net/smc/smc_cdc.h
+++ b/net/smc/smc_cdc.h
@@ -221,7 +221,8 @@ static inline void smc_host_msg_to_cdc(struct smc_cdc_msg *peer,
static inline void smc_cdc_cursor_to_host(union smc_host_cursor *local,
union smc_cdc_cursor *peer,
- struct smc_connection *conn)
+ struct smc_connection *conn,
+ int max_count)
{
union smc_host_cursor temp, old;
union smc_cdc_cursor net;
@@ -235,6 +236,15 @@ static inline void smc_cdc_cursor_to_host(union smc_host_cursor *local,
if ((old.wrap == temp.wrap) &&
(old.count > temp.count))
return;
+ /* The peer producer cursor is wire-controlled and is later used as a
+ * raw index into our RMB by the urgent path; bound its count to the
+ * RMB. max_count == 0 leaves the consumer cursor unbounded here: it
+ * indexes the peer's RMB (bounded by peer_rmbe_size, not our
+ * rmb_desc->len), so clamping it to rmb_desc->len would under-credit
+ * peer_rmbe_space and stall transmit to peers with a larger RMB.
+ */
+ if (max_count && temp.count > max_count)
+ temp.count = max_count;
smc_curs_copy(local, &temp, conn);
}
@@ -246,8 +256,13 @@ static inline void smcr_cdc_msg_to_host(struct smc_host_cdc_msg *local,
local->len = peer->len;
local->seqno = ntohs(peer->seqno);
local->token = ntohl(peer->token);
- smc_cdc_cursor_to_host(&local->prod, &peer->prod, conn);
- smc_cdc_cursor_to_host(&local->cons, &peer->cons, conn);
+ /* bound the wire-controlled producer cursor to our RMB (used as a raw
+ * index by the urgent path); leave the consumer cursor unbounded -- it
+ * indexes the peer's RMB and is bounded by peer_rmbe_size.
+ */
+ smc_cdc_cursor_to_host(&local->prod, &peer->prod, conn,
+ conn->rmb_desc->len);
+ smc_cdc_cursor_to_host(&local->cons, &peer->cons, conn, 0);
local->prod_flags = peer->prod_flags;
local->conn_state_flags = peer->conn_state_flags;
}
@@ -260,6 +275,12 @@ static inline void smcd_cdc_msg_to_host(struct smc_host_cdc_msg *local,
temp.wrap = peer->prod.wrap;
temp.count = peer->prod.count;
+ /* the peer producer cursor is wire-controlled and is used as a raw
+ * index into our RMB by the urgent path; bound it to the RMB. The
+ * consumer cursor below indexes the peer's RMB and is left unbounded.
+ */
+ if (temp.count > conn->rmb_desc->len)
+ temp.count = conn->rmb_desc->len;
smc_curs_copy(&local->prod, &temp, conn);
temp.wrap = peer->cons.wrap;
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v3 2/3] net/smc: bound the receive length to the RMB in smc_rx_recvmsg()
2026-06-14 8:23 ` Bryam Vargas
@ 2026-06-14 8:23 ` Bryam Vargas
-1 siblings, 0 replies; 19+ messages in thread
From: Bryam Vargas via B4 Relay @ 2026-06-14 8:23 UTC (permalink / raw)
To: Wenjia Zhang, Dust Li, D. Wythe, Sidraya Jayagond
Cc: Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Jakub Kicinski, Tony Lu
From: Bryam Vargas <hexlabsecurity@proton.me>
conn->bytes_to_rcv is accumulated in the receive tasklet from the peer's
producer cursor:
diff_prod = smc_curs_diff(rmb_desc->len, &prod_old, &prod_new);
atomic_add(diff_prod, &conn->bytes_to_rcv);
smc_curs_diff()'s differing-wrap branch returns (size - old.count) +
new.count, which exceeds rmb_desc->len for a forged producer cursor and
accumulates across CDC messages, so bytes_to_rcv can grow past the RMB
(and across many messages can overflow the signed counter negative).
smc_rx_recvmsg() reads it as the number of readable bytes and performs a
wrap-around copy whose second chunk (copylen - first_chunk, read from
ring offset 0) is never re-bounded to rmb_desc->len, reading past the
RMB into adjacent kernel memory and disclosing it to the peer.
Bound the readable count to rmb_desc->len where it is consumed, treating
a negative (sign-overflowed) value as out of range too, so the copy
length can never exceed the ring. This enforces the documented
0 <= bytes_to_rcv <= rmb_desc->len invariant at the consumer, where it
is race-free against the producer update that runs in the receive
tasklet.
Fixes: 952310ccf2d8 ("smc: receive data from RMBE")
Cc: stable@vger.kernel.org
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
net/smc/smc_rx.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/net/smc/smc_rx.c b/net/smc/smc_rx.c
index c1d9b923938d..f461cf10b085 100644
--- a/net/smc/smc_rx.c
+++ b/net/smc/smc_rx.c
@@ -442,6 +442,18 @@ int smc_rx_recvmsg(struct smc_sock *smc, struct msghdr *msg,
/* initialize variables for 1st iteration of subsequent loop */
/* could be just 1 byte, even after waiting on data above */
readable = smc_rx_data_available(conn, peeked_bytes);
+ /* bytes_to_rcv is accumulated from the peer's wire-controlled
+ * producer cursor; a forged cursor can drive it past the RMB,
+ * or overflow the signed accumulator to a negative value across
+ * many CDC messages (which a plain "> len" check would miss
+ * before the size_t cast below turns it huge). Bound it to the
+ * RMB in either case so the wrap-around copy cannot run past
+ * rmb_desc->len. This enforces the documented
+ * 0 <= bytes_to_rcv <= rmb_desc->len invariant at the consumer,
+ * race-free against the producer update in the receive tasklet.
+ */
+ if (readable < 0 || readable > conn->rmb_desc->len)
+ readable = conn->rmb_desc->len;
splbytes = atomic_read(&conn->splice_pending);
if (!readable || (msg && splbytes)) {
if (splbytes)
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v3 2/3] net/smc: bound the receive length to the RMB in smc_rx_recvmsg()
@ 2026-06-14 8:23 ` Bryam Vargas
0 siblings, 0 replies; 19+ messages in thread
From: Bryam Vargas @ 2026-06-14 8:23 UTC (permalink / raw)
To: Wenjia Zhang, Dust Li, D. Wythe, Sidraya Jayagond
Cc: Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Jakub Kicinski, Tony Lu
conn->bytes_to_rcv is accumulated in the receive tasklet from the peer's
producer cursor:
diff_prod = smc_curs_diff(rmb_desc->len, &prod_old, &prod_new);
atomic_add(diff_prod, &conn->bytes_to_rcv);
smc_curs_diff()'s differing-wrap branch returns (size - old.count) +
new.count, which exceeds rmb_desc->len for a forged producer cursor and
accumulates across CDC messages, so bytes_to_rcv can grow past the RMB
(and across many messages can overflow the signed counter negative).
smc_rx_recvmsg() reads it as the number of readable bytes and performs a
wrap-around copy whose second chunk (copylen - first_chunk, read from
ring offset 0) is never re-bounded to rmb_desc->len, reading past the
RMB into adjacent kernel memory and disclosing it to the peer.
Bound the readable count to rmb_desc->len where it is consumed, treating
a negative (sign-overflowed) value as out of range too, so the copy
length can never exceed the ring. This enforces the documented
0 <= bytes_to_rcv <= rmb_desc->len invariant at the consumer, where it
is race-free against the producer update that runs in the receive
tasklet.
Fixes: 952310ccf2d8 ("smc: receive data from RMBE")
Cc: stable@vger.kernel.org
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
net/smc/smc_rx.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/net/smc/smc_rx.c b/net/smc/smc_rx.c
index c1d9b923938d..f461cf10b085 100644
--- a/net/smc/smc_rx.c
+++ b/net/smc/smc_rx.c
@@ -442,6 +442,18 @@ int smc_rx_recvmsg(struct smc_sock *smc, struct msghdr *msg,
/* initialize variables for 1st iteration of subsequent loop */
/* could be just 1 byte, even after waiting on data above */
readable = smc_rx_data_available(conn, peeked_bytes);
+ /* bytes_to_rcv is accumulated from the peer's wire-controlled
+ * producer cursor; a forged cursor can drive it past the RMB,
+ * or overflow the signed accumulator to a negative value across
+ * many CDC messages (which a plain "> len" check would miss
+ * before the size_t cast below turns it huge). Bound it to the
+ * RMB in either case so the wrap-around copy cannot run past
+ * rmb_desc->len. This enforces the documented
+ * 0 <= bytes_to_rcv <= rmb_desc->len invariant at the consumer,
+ * race-free against the producer update in the receive tasklet.
+ */
+ if (readable < 0 || readable > conn->rmb_desc->len)
+ readable = conn->rmb_desc->len;
splbytes = atomic_read(&conn->splice_pending);
if (!readable || (msg && splbytes)) {
if (splbytes)
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v3 3/3] net/smc: bound the send length to the send buffer in smc_tx_sendmsg()
2026-06-14 8:23 ` Bryam Vargas
@ 2026-06-14 8:23 ` Bryam Vargas
-1 siblings, 0 replies; 19+ messages in thread
From: Bryam Vargas via B4 Relay @ 2026-06-14 8:23 UTC (permalink / raw)
To: Wenjia Zhang, Dust Li, D. Wythe, Sidraya Jayagond
Cc: Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Jakub Kicinski, Tony Lu
From: Bryam Vargas <hexlabsecurity@proton.me>
On the SMC-D DMB-merge (nocopy) path, smc_cdc_msg_recv_action() advances
conn->sndbuf_space from the peer's consumer cursor:
diff_tx = smc_curs_diff(sndbuf_desc->len, &tx_curs_fin,
&local_rx_ctrl.cons);
atomic_add(diff_tx, &conn->sndbuf_space);
The consumer cursor is wire-controlled and unvalidated, and
smc_curs_diff()'s differing-wrap branch can return more than
sndbuf_desc->len, so a forged cursor drives sndbuf_space past the send
buffer (and across many CDC messages can overflow the signed counter
negative). smc_tx_sendmsg() reads it as the available write space and
performs a wrap-around copy whose second chunk (copylen - first_chunk,
written at ring offset 0) is never re-bounded to sndbuf_desc->len,
writing user data past the send buffer -- a heap out-of-bounds write of
attacker-influenced length and content.
Bound the write space to sndbuf_desc->len where it is consumed, treating
a negative (sign-overflowed) value as out of range too, so the copy
length can never exceed the ring. This enforces the documented
0 <= sndbuf_space <= sndbuf_desc->len invariant at the producer,
race-free against the CDC tasklet that advances sndbuf_space.
Fixes: cc0ab806fc52 ("net/smc: adapt cursor update when sndbuf and peer DMB are merged")
Cc: stable@vger.kernel.org
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
net/smc/smc_tx.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 3144b4b1fe29..5916f02060fb 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -233,6 +233,19 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
/* initialize variables for 1st iteration of subsequent loop */
/* could be just 1 byte, even after smc_tx_wait above */
writespace = atomic_read(&conn->sndbuf_space);
+ /* sndbuf_space is advanced from the peer's wire-controlled
+ * consumer cursor on the SMC-D DMB-merge path; a forged cursor
+ * can inflate it past the send buffer, or overflow the signed
+ * accumulator to a negative value across many CDC messages
+ * (which a plain "> len" check would miss before the size_t
+ * cast below turns it huge). Bound it to the send buffer in
+ * either case so the wrap-around write cannot run past
+ * sndbuf_desc->len. This enforces the documented
+ * 0 <= sndbuf_space <= sndbuf_desc->len invariant at the
+ * producer, race-free against the CDC tasklet.
+ */
+ if (writespace < 0 || writespace > conn->sndbuf_desc->len)
+ writespace = conn->sndbuf_desc->len;
/* not more than what user space asked for */
copylen = min_t(size_t, send_remaining, writespace);
/* determine start of sndbuf */
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v3 3/3] net/smc: bound the send length to the send buffer in smc_tx_sendmsg()
@ 2026-06-14 8:23 ` Bryam Vargas
0 siblings, 0 replies; 19+ messages in thread
From: Bryam Vargas @ 2026-06-14 8:23 UTC (permalink / raw)
To: Wenjia Zhang, Dust Li, D. Wythe, Sidraya Jayagond
Cc: Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Jakub Kicinski, Tony Lu
On the SMC-D DMB-merge (nocopy) path, smc_cdc_msg_recv_action() advances
conn->sndbuf_space from the peer's consumer cursor:
diff_tx = smc_curs_diff(sndbuf_desc->len, &tx_curs_fin,
&local_rx_ctrl.cons);
atomic_add(diff_tx, &conn->sndbuf_space);
The consumer cursor is wire-controlled and unvalidated, and
smc_curs_diff()'s differing-wrap branch can return more than
sndbuf_desc->len, so a forged cursor drives sndbuf_space past the send
buffer (and across many CDC messages can overflow the signed counter
negative). smc_tx_sendmsg() reads it as the available write space and
performs a wrap-around copy whose second chunk (copylen - first_chunk,
written at ring offset 0) is never re-bounded to sndbuf_desc->len,
writing user data past the send buffer -- a heap out-of-bounds write of
attacker-influenced length and content.
Bound the write space to sndbuf_desc->len where it is consumed, treating
a negative (sign-overflowed) value as out of range too, so the copy
length can never exceed the ring. This enforces the documented
0 <= sndbuf_space <= sndbuf_desc->len invariant at the producer,
race-free against the CDC tasklet that advances sndbuf_space.
Fixes: cc0ab806fc52 ("net/smc: adapt cursor update when sndbuf and peer DMB are merged")
Cc: stable@vger.kernel.org
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
net/smc/smc_tx.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 3144b4b1fe29..5916f02060fb 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -233,6 +233,19 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
/* initialize variables for 1st iteration of subsequent loop */
/* could be just 1 byte, even after smc_tx_wait above */
writespace = atomic_read(&conn->sndbuf_space);
+ /* sndbuf_space is advanced from the peer's wire-controlled
+ * consumer cursor on the SMC-D DMB-merge path; a forged cursor
+ * can inflate it past the send buffer, or overflow the signed
+ * accumulator to a negative value across many CDC messages
+ * (which a plain "> len" check would miss before the size_t
+ * cast below turns it huge). Bound it to the send buffer in
+ * either case so the wrap-around write cannot run past
+ * sndbuf_desc->len. This enforces the documented
+ * 0 <= sndbuf_space <= sndbuf_desc->len invariant at the
+ * producer, race-free against the CDC tasklet.
+ */
+ if (writespace < 0 || writespace > conn->sndbuf_desc->len)
+ writespace = conn->sndbuf_desc->len;
/* not more than what user space asked for */
copylen = min_t(size_t, send_remaining, writespace);
/* determine start of sndbuf */
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v3 1/3] net/smc: bound the wire-controlled producer cursor to the RMB
2026-06-14 8:23 ` Bryam Vargas
(?)
@ 2026-06-15 8:23 ` sashiko-bot
-1 siblings, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-06-15 8:23 UTC (permalink / raw)
To: Bryam Vargas via B4 Relay
Cc: Heiko Carstens, linux-s390, Vasily Gorbik, Alexander Gordeev
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
Pre-existing issues:
- [High] Use-After-Free in connection lookup when receiving CDC messages.
- [High] Data Race on cursor `tx_curs_fin` leading to cursor corruption during link failover.
--
commit 0a889a6cc8b55d508b84bd0b15d47a8711c29162
Author: Bryam Vargas <hexlabsecurity@proton.me>
net/smc: bound the wire-controlled producer cursor to the RMB
This commit bounds the wire-controlled producer cursor count to rmb_desc->len
to prevent out-of-bounds reads in the receive tasklet. It applies this bound
only to the producer cursor to avoid under-crediting peer space.
> diff --git a/net/smc/smc_cdc.h b/net/smc/smc_cdc.h
> index 696cc11f2303b..ca76ef6303566 100644
> --- a/net/smc/smc_cdc.h
> +++ b/net/smc/smc_cdc.h
[ ... ]
> @@ -260,6 +275,12 @@ static inline void smcd_cdc_msg_to_host(struct smc_host_cdc_msg *local,
>
> temp.wrap = peer->prod.wrap;
> temp.count = peer->prod.count;
> + /* the peer producer cursor is wire-controlled and is used as a raw
> + * index into our RMB by the urgent path; bound it to the RMB. The
> + * consumer cursor below indexes the peer's RMB and is left unbounded.
> + */
> + if (temp.count > conn->rmb_desc->len)
> + temp.count = conn->rmb_desc->len;
> smc_curs_copy(&local->prod, &temp, conn);
>
> temp.wrap = peer->cons.wrap;
[Severity: High]
This isn't a bug introduced by this patch, but is it possible for a
use-after-free to occur in smc_cdc_rx_handler() during connection lookup?
Looking at net/smc/smc_cdc.c:smc_cdc_rx_handler(), the connection pointer is
retrieved under the read lock but accessed after the lock is dropped:
read_lock_bh(&lgr->conns_lock);
conn = smc_lgr_find_conn(ntohl(cdc->token), lgr);
read_unlock_bh(&lgr->conns_lock);
if (!conn || conn->out_of_sync)
return;
It seems a reference to the socket is not acquired before releasing the read
lock.
Could smc_conn_free() concurrently remove the connection and drop the final
socket reference, freeing the memory before the receive tasklet dereferences
conn->out_of_sync and calls smc_cdc_msg_recv() (which then performs
sock_hold() on freed memory)?
[Severity: High]
This is a pre-existing issue, but does the cursor tx_curs_fin suffer from a
data race during link failover?
In net/smc/smc_core.c:smc_switch_cursor(), the cursor is modified under
send_lock without the socket lock:
smc_curs_add(conn->sndbuf_desc->len,
&conn->tx_curs_fin, diff);
Concurrently, net/smc/smc_cdc.c:smc_cdc_tx_handler() overwrites the same
cursor under the socket lock without send_lock:
smc_curs_copy(&conn->tx_curs_fin, &cdcpend->cursor, conn);
Since smc_curs_add() performs non-atomic field updates on the 64-bit cursor
and they do not share a lock, could this concurrent read-modify-write and
store operation corrupt the cursor's wrap and count fields?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260614-b4-disp-edd64be9-v3-0-551fa514257e@proton.me?part=1
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v3 2/3] net/smc: bound the receive length to the RMB in smc_rx_recvmsg()
2026-06-14 8:23 ` Bryam Vargas
(?)
@ 2026-06-15 8:23 ` sashiko-bot
-1 siblings, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-06-15 8:23 UTC (permalink / raw)
To: Bryam Vargas via B4 Relay
Cc: Alexander Gordeev, Heiko Carstens, linux-s390, Vasily Gorbik
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
New issues:
- [High] Clamping a negative `bytes_to_rcv` to a positive length causes persistent phantom data delivery and infinite loops.
Pre-existing issues:
- [Critical] Unbounded `peeked_bytes` accumulation bypasses cursor wrap logic, leading to massive out-of-bounds read and kernel memory disclosure.
--
commit 7335ccb72e218a6f34e62494f13d66e4b8827bb3
Author: Bryam Vargas <hexlabsecurity@proton.me>
net/smc: bound the receive length to the RMB in smc_rx_recvmsg()
The patch clamps the readable bytes calculation in smc_rx_recvmsg() to the RMB
length. This aims to prevent out-of-bounds reads if a forged producer cursor
drives bytes_to_rcv past the RMB size or into negative values.
> diff --git a/net/smc/smc_rx.c b/net/smc/smc_rx.c
> index c1d9b923938da..f461cf10b0855 100644
> --- a/net/smc/smc_rx.c
> +++ b/net/smc/smc_rx.c
> @@ -442,6 +442,18 @@ int smc_rx_recvmsg(struct smc_sock *smc, struct msghdr *msg,
> /* initialize variables for 1st iteration of subsequent loop */
> /* could be just 1 byte, even after waiting on data above */
> readable = smc_rx_data_available(conn, peeked_bytes);
> + /* bytes_to_rcv is accumulated from the peer's wire-controlled
> + * producer cursor; a forged cursor can drive it past the RMB,
> + * or overflow the signed accumulator to a negative value across
> + * many CDC messages (which a plain "> len" check would miss
> + * before the size_t cast below turns it huge). Bound it to the
> + * RMB in either case so the wrap-around copy cannot run past
> + * rmb_desc->len. This enforces the documented
> + * 0 <= bytes_to_rcv <= rmb_desc->len invariant at the consumer,
> + * race-free against the producer update in the receive tasklet.
> + */
> + if (readable < 0 || readable > conn->rmb_desc->len)
> + readable = conn->rmb_desc->len;
[Severity: Critical]
This isn't a bug introduced by this patch, but does unbounded peeked_bytes
accumulation bypass the cursor wrap logic?
If bytes_to_rcv is negative, subtracting peeked_bytes in smc_rx_data_available()
guarantees readable is negative. Clamping this negative value to a positive
buffer length (conn->rmb_desc->len) allows the loop to iterate indefinitely,
growing peeked_bytes unboundedly.
When peeked_bytes exceeds 2 * conn->rmb_desc->len, is it correctly wrapped
when applied to the cursor?
net/smc/smc_cdc.h:smc_curs_add() {
curs->count += value;
if (curs->count >= size) {
curs->wrap++;
curs->count -= size;
}
}
Since smc_curs_add() only subtracts size once, cons.count will remain >= size
if value is massive.
This out-of-bounds cursor would then cause an underflow in the chunk length
calculation:
net/smc/smc_rx.c:smc_rx_recvmsg() {
...
/* determine chunks where to read from rcvbuf */
/* either unwrapped case, or 1st chunk of wrapped case */
chunk_len = min_t(size_t, copylen, conn->rmb_desc->len -
cons.count);
...
}
Could this underflow result in a massive chunk_len and a subsequent
out-of-bounds kernel memory read?
[Severity: High]
Will clamping a negative bytes_to_rcv to a positive length cause an infinite
stream of phantom data?
If readable is clamped to a positive value when bytes_to_rcv is negative, the
function will authorize the phantom read and copy rmb_desc->len bytes of
stale data.
At the end of the loop:
net/smc/smc_rx.c:smc_rx_recvmsg() {
...
/* update cursors */
if (!(flags & MSG_PEEK)) {
/* increased in recv tasklet smc_cdc_msg_rcv() */
smp_mb__before_atomic();
atomic_sub(copylen, &conn->bytes_to_rcv);
...
}
When atomic_sub() executes, subtracting copylen from the already negative
bytes_to_rcv makes it even more negative.
Since bytes_to_rcv is never zeroed or corrected, will the next recvmsg()
call see a negative value, clamp it to rmb_desc->len again, and deliver another
chunk of phantom data indefinitely?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260614-b4-disp-edd64be9-v3-0-551fa514257e@proton.me?part=2
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v3 0/3] net/smc: bound wire-controlled CDC cursors against the local buffers
2026-06-14 8:23 ` Bryam Vargas
` (3 preceding siblings ...)
(?)
@ 2026-06-17 23:24 ` Jakub Kicinski
-1 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2026-06-17 23:24 UTC (permalink / raw)
To: Bryam Vargas via B4 Relay
Cc: hexlabsecurity, Wenjia Zhang, Dust Li, D. Wythe, Sidraya Jayagond,
Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Tony Lu
On Sun, 14 Jun 2026 03:23:29 -0500 Bryam Vargas via B4 Relay wrote:
> A peer's CDC producer/consumer cursors are copied from the wire and used,
> without an upper bound against the local buffers, as (a) a raw index into the
> RMB on the urgent path, (b) the receive length in smc_rx_recvmsg(), and (c) the
> send length in smc_tx_sendmsg() on the SMC-D DMB-merge path. A malicious or
> buggy peer can forge a cursor so each of these runs past the relevant buffer:
> an out-of-bounds read of adjacent kernel memory (disclosed to the peer) on the
> receive/urgent side, and an out-of-bounds write of attacker-influenced length
> and content on the send side.
Once again, SMC maintainers -- please review.
--
mping: SHARED MEMORY COMMUNICATIONS (SMC) SOCKETS
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v3 1/3] net/smc: bound the wire-controlled producer cursor to the RMB
2026-06-14 8:23 ` Bryam Vargas
(?)
(?)
@ 2026-06-18 14:29 ` Dust Li
2026-06-18 22:11 ` Bryam Vargas
-1 siblings, 1 reply; 19+ messages in thread
From: Dust Li @ 2026-06-18 14:29 UTC (permalink / raw)
To: hexlabsecurity, Wenjia Zhang, D. Wythe, Sidraya Jayagond
Cc: Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Jakub Kicinski, Tony Lu
On 2026-06-14 03:23:30, Bryam Vargas via B4 Relay wrote:
>From: Bryam Vargas <hexlabsecurity@proton.me>
>
>smc_cdc_cursor_to_host() (SMC-R) and smcd_cdc_msg_to_host() (SMC-D)
>import the peer's producer cursor from the wire into the local
>connection cursor with no upper bound against the receive buffer (RMB).
>The urgent path then uses that count as a raw index:
>
> base = conn->rmb_desc->cpu_addr + conn->rx_off;
> conn->urg_rx_byte = *(base + conn->urg_curs.count - 1);
>
>so a peer that advertises a producer cursor past rmb_desc->len reads
>out of bounds of the RMB allocation in the receive tasklet (softirq).
>
>Bound the producer cursor count to rmb_desc->len at the conversion
>boundary, for both SMC-R and SMC-D. Apply the bound to the producer
>cursor only: the consumer cursor indexes the peer's RMB and is bounded
>by peer_rmbe_size, so clamping it to our rmb_desc->len would
>under-credit peer_rmbe_space and stall transmit to a peer whose RMB is
>larger than ours.
>
>Fixes: de8474eb9d50 ("net/smc: urgent data support")
>Cc: stable@vger.kernel.org
>Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
>---
> net/smc/smc_cdc.h | 27 ++++++++++++++++++++++++---
> 1 file changed, 24 insertions(+), 3 deletions(-)
>
>diff --git a/net/smc/smc_cdc.h b/net/smc/smc_cdc.h
>index 696cc11f2303..ca76ef630356 100644
>--- a/net/smc/smc_cdc.h
>+++ b/net/smc/smc_cdc.h
>@@ -221,7 +221,8 @@ static inline void smc_host_msg_to_cdc(struct smc_cdc_msg *peer,
>
> static inline void smc_cdc_cursor_to_host(union smc_host_cursor *local,
> union smc_cdc_cursor *peer,
>- struct smc_connection *conn)
>+ struct smc_connection *conn,
>+ int max_count)
> {
> union smc_host_cursor temp, old;
> union smc_cdc_cursor net;
>@@ -235,6 +236,15 @@ static inline void smc_cdc_cursor_to_host(union smc_host_cursor *local,
> if ((old.wrap == temp.wrap) &&
> (old.count > temp.count))
> return;
>+ /* The peer producer cursor is wire-controlled and is later used as a
>+ * raw index into our RMB by the urgent path; bound its count to the
>+ * RMB. max_count == 0 leaves the consumer cursor unbounded here: it
>+ * indexes the peer's RMB (bounded by peer_rmbe_size, not our
>+ * rmb_desc->len), so clamping it to rmb_desc->len would under-credit
>+ * peer_rmbe_space and stall transmit to peers with a larger RMB.
>+ */
>+ if (max_count && temp.count > max_count)
>+ temp.count = max_count;
> smc_curs_copy(local, &temp, conn);
> }
>
>@@ -246,8 +256,13 @@ static inline void smcr_cdc_msg_to_host(struct smc_host_cdc_msg *local,
> local->len = peer->len;
> local->seqno = ntohs(peer->seqno);
> local->token = ntohl(peer->token);
>- smc_cdc_cursor_to_host(&local->prod, &peer->prod, conn);
>- smc_cdc_cursor_to_host(&local->cons, &peer->cons, conn);
>+ /* bound the wire-controlled producer cursor to our RMB (used as a raw
>+ * index by the urgent path); leave the consumer cursor unbounded -- it
>+ * indexes the peer's RMB and is bounded by peer_rmbe_size.
>+ */
>+ smc_cdc_cursor_to_host(&local->prod, &peer->prod, conn,
>+ conn->rmb_desc->len);
>+ smc_cdc_cursor_to_host(&local->cons, &peer->cons, conn, 0);
> local->prod_flags = peer->prod_flags;
> local->conn_state_flags = peer->conn_state_flags;
> }
>@@ -260,6 +275,12 @@ static inline void smcd_cdc_msg_to_host(struct smc_host_cdc_msg *local,
>
> temp.wrap = peer->prod.wrap;
> temp.count = peer->prod.count;
>+ /* the peer producer cursor is wire-controlled and is used as a raw
>+ * index into our RMB by the urgent path; bound it to the RMB. The
>+ * consumer cursor below indexes the peer's RMB and is left unbounded.
>+ */
>+ if (temp.count > conn->rmb_desc->len)
>+ temp.count = conn->rmb_desc->len;
> smc_curs_copy(&local->prod, &temp, conn);
>
> temp.wrap = peer->cons.wrap;
Hi Bryam,
I agree the issue is real. SMC-R's original design didn't fully
account for misbehaving peers, which is the root cause behind a
number of similar issues we've seen. The good news is that this
class of problem isn't easy to hit in practice, so it isn't
particularly urgent.
On the approach itself: once we detect that the peer is misbehaving,
I think the right action is to abort the connection and record the
event, rather than silently clamp. An invalid CDC means the whole
communication state can no longer be trusted, so continuing on a
clamped value just papers over a peer bug.
I'd suggest we add a dedicated CDC message check, and route any
failure through the existing abort path, maybe something like bellow:
static bool smc_cdc_msg_check(struct smc_connection *conn,
struct smc_cdc_msg *cdc)
{
u32 prod_count = ntohs(cdc->prod.count);
u32 cons_count = ntohs(cdc->cons.count);
if (prod_count > conn->rmb_desc->len ||
cons_count > conn->peer_rmbe_size ||
cdc->prod.wrap > 1 || cdc->cons.wrap > 1) {
this_cpu_inc(net->smc.smc_stats->...cdc_inval);
net_ratelimited_function(pr_warn,
"smc: invalid CDC from peer (token=%u)\n",
ntohl(cdc->token));
return false;
}
return true;
}
For -stable, your current minimal patch is fine. For net-next, though, I'd prefer
the approach above: validate at the wire boundary, abort on violation, and
make the event observable via smc_stats and a ratelimited warning.
Best regards,
Dust
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v3 2/3] net/smc: bound the receive length to the RMB in smc_rx_recvmsg()
2026-06-14 8:23 ` Bryam Vargas
(?)
(?)
@ 2026-06-18 16:03 ` Dust Li
2026-06-18 22:11 ` Bryam Vargas
-1 siblings, 1 reply; 19+ messages in thread
From: Dust Li @ 2026-06-18 16:03 UTC (permalink / raw)
To: hexlabsecurity, Wenjia Zhang, D. Wythe, Sidraya Jayagond
Cc: Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Jakub Kicinski, Tony Lu
On 2026-06-14 03:23:31, Bryam Vargas via B4 Relay wrote:
>From: Bryam Vargas <hexlabsecurity@proton.me>
>
>conn->bytes_to_rcv is accumulated in the receive tasklet from the peer's
>producer cursor:
>
> diff_prod = smc_curs_diff(rmb_desc->len, &prod_old, &prod_new);
> atomic_add(diff_prod, &conn->bytes_to_rcv);
>
>smc_curs_diff()'s differing-wrap branch returns (size - old.count) +
>new.count, which exceeds rmb_desc->len for a forged producer cursor and
>accumulates across CDC messages, so bytes_to_rcv can grow past the RMB
>(and across many messages can overflow the signed counter negative).
>smc_rx_recvmsg() reads it as the number of readable bytes and performs a
>wrap-around copy whose second chunk (copylen - first_chunk, read from
>ring offset 0) is never re-bounded to rmb_desc->len, reading past the
>RMB into adjacent kernel memory and disclosing it to the peer.
Hi Bryam,
Once we validate the CDC message at the input boundary (as in the
previous patch), bytes_to_rcv can never exceed rmb_desc->len, so
this check becomes unreachable. So I don't think this patch is needed.
Best regards,
Dust
>
>Bound the readable count to rmb_desc->len where it is consumed, treating
>a negative (sign-overflowed) value as out of range too, so the copy
>length can never exceed the ring. This enforces the documented
>0 <= bytes_to_rcv <= rmb_desc->len invariant at the consumer, where it
>is race-free against the producer update that runs in the receive
>tasklet.
>
>Fixes: 952310ccf2d8 ("smc: receive data from RMBE")
>Cc: stable@vger.kernel.org
>Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
>---
> net/smc/smc_rx.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
>diff --git a/net/smc/smc_rx.c b/net/smc/smc_rx.c
>index c1d9b923938d..f461cf10b085 100644
>--- a/net/smc/smc_rx.c
>+++ b/net/smc/smc_rx.c
>@@ -442,6 +442,18 @@ int smc_rx_recvmsg(struct smc_sock *smc, struct msghdr *msg,
> /* initialize variables for 1st iteration of subsequent loop */
> /* could be just 1 byte, even after waiting on data above */
> readable = smc_rx_data_available(conn, peeked_bytes);
>+ /* bytes_to_rcv is accumulated from the peer's wire-controlled
>+ * producer cursor; a forged cursor can drive it past the RMB,
>+ * or overflow the signed accumulator to a negative value across
>+ * many CDC messages (which a plain "> len" check would miss
>+ * before the size_t cast below turns it huge). Bound it to the
>+ * RMB in either case so the wrap-around copy cannot run past
>+ * rmb_desc->len. This enforces the documented
>+ * 0 <= bytes_to_rcv <= rmb_desc->len invariant at the consumer,
>+ * race-free against the producer update in the receive tasklet.
>+ */
>+ if (readable < 0 || readable > conn->rmb_desc->len)
>+ readable = conn->rmb_desc->len;
> splbytes = atomic_read(&conn->splice_pending);
> if (!readable || (msg && splbytes)) {
> if (splbytes)
>
>--
>2.43.0
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v3 3/3] net/smc: bound the send length to the send buffer in smc_tx_sendmsg()
2026-06-14 8:23 ` Bryam Vargas
(?)
@ 2026-06-18 16:08 ` Dust Li
2026-06-18 22:11 ` Bryam Vargas
-1 siblings, 1 reply; 19+ messages in thread
From: Dust Li @ 2026-06-18 16:08 UTC (permalink / raw)
To: hexlabsecurity, Wenjia Zhang, D. Wythe, Sidraya Jayagond
Cc: Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Jakub Kicinski, Tony Lu
On 2026-06-14 03:23:32, Bryam Vargas via B4 Relay wrote:
>From: Bryam Vargas <hexlabsecurity@proton.me>
>
>On the SMC-D DMB-merge (nocopy) path, smc_cdc_msg_recv_action() advances
>conn->sndbuf_space from the peer's consumer cursor:
>
> diff_tx = smc_curs_diff(sndbuf_desc->len, &tx_curs_fin,
> &local_rx_ctrl.cons);
> atomic_add(diff_tx, &conn->sndbuf_space);
>
>The consumer cursor is wire-controlled and unvalidated, and
>smc_curs_diff()'s differing-wrap branch can return more than
>sndbuf_desc->len, so a forged cursor drives sndbuf_space past the send
>buffer (and across many CDC messages can overflow the signed counter
>negative). smc_tx_sendmsg() reads it as the available write space and
>performs a wrap-around copy whose second chunk (copylen - first_chunk,
>written at ring offset 0) is never re-bounded to sndbuf_desc->len,
>writing user data past the send buffer -- a heap out-of-bounds write of
>attacker-influenced length and content.
Hi Bryam,
I think this is the same as patch #2.
Best regards,
Dust
>
>Bound the write space to sndbuf_desc->len where it is consumed, treating
>a negative (sign-overflowed) value as out of range too, so the copy
>length can never exceed the ring. This enforces the documented
>0 <= sndbuf_space <= sndbuf_desc->len invariant at the producer,
>race-free against the CDC tasklet that advances sndbuf_space.
>
>Fixes: cc0ab806fc52 ("net/smc: adapt cursor update when sndbuf and peer DMB are merged")
>Cc: stable@vger.kernel.org
>Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
>---
> net/smc/smc_tx.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
>diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
>index 3144b4b1fe29..5916f02060fb 100644
>--- a/net/smc/smc_tx.c
>+++ b/net/smc/smc_tx.c
>@@ -233,6 +233,19 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
> /* initialize variables for 1st iteration of subsequent loop */
> /* could be just 1 byte, even after smc_tx_wait above */
> writespace = atomic_read(&conn->sndbuf_space);
>+ /* sndbuf_space is advanced from the peer's wire-controlled
>+ * consumer cursor on the SMC-D DMB-merge path; a forged cursor
>+ * can inflate it past the send buffer, or overflow the signed
>+ * accumulator to a negative value across many CDC messages
>+ * (which a plain "> len" check would miss before the size_t
>+ * cast below turns it huge). Bound it to the send buffer in
>+ * either case so the wrap-around write cannot run past
>+ * sndbuf_desc->len. This enforces the documented
>+ * 0 <= sndbuf_space <= sndbuf_desc->len invariant at the
>+ * producer, race-free against the CDC tasklet.
>+ */
>+ if (writespace < 0 || writespace > conn->sndbuf_desc->len)
>+ writespace = conn->sndbuf_desc->len;
> /* not more than what user space asked for */
> copylen = min_t(size_t, send_remaining, writespace);
> /* determine start of sndbuf */
>
>--
>2.43.0
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v3 1/3] net/smc: bound the wire-controlled producer cursor to the RMB
2026-06-18 14:29 ` Dust Li
@ 2026-06-18 22:11 ` Bryam Vargas
2026-06-19 3:26 ` Dust Li
0 siblings, 1 reply; 19+ messages in thread
From: Bryam Vargas @ 2026-06-18 22:11 UTC (permalink / raw)
To: Dust Li
Cc: Wenjia Zhang, D . Wythe, Sidraya Jayagond, Eric Dumazet,
David S . Miller, Mahanta Jambigi, Wen Gu, Simon Horman,
Ursula Braun, Stefan Raspl, Tony Lu, Paolo Abeni, Jakub Kicinski,
netdev, linux-s390, linux-rdma, linux-kernel
On Thu, 18 Jun 2026 22:29:20 +0800, Dust Li wrote:
> once we detect that the peer is misbehaving, I think the right action is
> to abort the connection and record the event, rather than silently clamp.
[...]
> u32 prod_count = ntohs(cdc->prod.count);
> ...
> cdc->prod.wrap > 1 || cdc->cons.wrap > 1) {
Thanks for taking a look, Dust. I'm on board with the direction for net-next --
aborting and recording a bad CDC is cleaner than clamping something we already know
we can't trust, and as you say, the clamp just papers over the peer bug. So: minimal
clamp stays for -stable, and net-next gets the wire-boundary check + abort (through
abort_work, with an smc_stats counter and a ratelimited warn).
A few things I ran into on the check itself, though:
- count is __be32, so it wants ntohl() rather than ntohs() -- ntohs() ends up reading
the wrong half.
- I'd drop the wrap > 1 tests. wrap is a free-running counter (smc_curs_add does
wrap++), so a connection that legitimately wraps its RMB ends up with wrap > 1; and
since it's a __be16 read raw, on little-endian wrap==1 already reads as 0x0100 and
we'd abort on the very first wrap. I don't think there's a sane upper bound to put
on wrap.
- the check is typed for SMC-R, but the SMC-D path hands a host-order smcd_cdc_msg to
smc_cdc_msg_recv() cast as smc_cdc_msg (smc_cdc.c:456), so ntohl/ntohs would
double-swap it there. The simplest thing I found is one check on the host cursor
right after smc_cdc_msg_to_host(), before the diff/atomic_add block -- that covers
SMC-R and SMC-D in one place.
Minor: >= len rather than > len (count is an offset in [0,len)), and peer_rmbe_size
is signed so worth guarding. The cons vs peer_rmbe_size bound looks right to me.
Happy to spin it whichever way you prefer.
Bryam
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v3 2/3] net/smc: bound the receive length to the RMB in smc_rx_recvmsg()
2026-06-18 16:03 ` Dust Li
@ 2026-06-18 22:11 ` Bryam Vargas
2026-06-19 3:31 ` Dust Li
0 siblings, 1 reply; 19+ messages in thread
From: Bryam Vargas @ 2026-06-18 22:11 UTC (permalink / raw)
To: Dust Li
Cc: Wenjia Zhang, D . Wythe, Sidraya Jayagond, Eric Dumazet,
David S . Miller, Mahanta Jambigi, Wen Gu, Simon Horman,
Ursula Braun, Stefan Raspl, Tony Lu, Paolo Abeni, Jakub Kicinski,
netdev, linux-s390, linux-rdma, linux-kernel
On Fri, 19 Jun 2026 00:03:17 +0800, Dust Li wrote:
> Once we validate the CDC message at the input boundary (as in the
> previous patch), bytes_to_rcv can never exceed rmb_desc->len, so
> this check becomes unreachable. So I don't think this patch is needed.
This one I'd actually like to keep, and let me walk through why -- I don't think the
boundary check closes it.
bytes_to_rcv isn't set to a cursor count, it's a running accumulator:
smc_cdc_msg_recv_action does atomic_add(diff_prod, &bytes_to_rcv), where
diff_prod = smc_curs_diff(rmb_desc->len, old, new). So bounding each cursor's count at
the boundary doesn't bound the sum of the deltas.
The differing-wrap branch of smc_curs_diff returns (len - old.count) + new.count,
which is up to 2*len-1 even when both cursors pass count <= len. With len=16, a prod
going (0,0) -> (1,15) gives diff=31, so bytes_to_rcv is already 31 > len after one
message; alternating wrap 0<->1 at count=15 keeps adding ~len and eventually wraps the
atomic_t negative. I have an A/B for this -- happy to send it along.
So to make this truly unreachable from the boundary check, we'd need to bound
prod - cons <= len there, not just the absolute count. The consumer-side clamp is two
lines and race-free against the tasklet, so my preference would be to keep it as a
backstop -- but if you'd rather fold it into a stronger boundary check instead, I'm
open to that.
Bryam
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v3 3/3] net/smc: bound the send length to the send buffer in smc_tx_sendmsg()
2026-06-18 16:08 ` Dust Li
@ 2026-06-18 22:11 ` Bryam Vargas
0 siblings, 0 replies; 19+ messages in thread
From: Bryam Vargas @ 2026-06-18 22:11 UTC (permalink / raw)
To: Dust Li
Cc: Wenjia Zhang, D . Wythe, Sidraya Jayagond, Eric Dumazet,
David S . Miller, Mahanta Jambigi, Wen Gu, Simon Horman,
Ursula Braun, Stefan Raspl, Tony Lu, Paolo Abeni, Jakub Kicinski,
netdev, linux-s390, linux-rdma, linux-kernel
On Fri, 19 Jun 2026 00:08:15 +0800, Dust Li wrote:
> I think this is the same as patch #2.
Same story as 2/3, just on the SMC-D send side: sndbuf_space accumulates
diff_tx = smc_curs_diff(sndbuf_desc->len, tx_curs_fin, cons) from the peer's consumer
cursor, so a cons alternating wrap 0<->1 walks it past sndbuf_desc->len (and negative
over time), and smc_tx_sendmsg's wrap-around write then runs off the end of the
buffer. The boundary count check doesn't bound diff_tx here either, so I'd keep the
same two-line bound. The same A/B covers it.
Bryam
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v3 1/3] net/smc: bound the wire-controlled producer cursor to the RMB
2026-06-18 22:11 ` Bryam Vargas
@ 2026-06-19 3:26 ` Dust Li
0 siblings, 0 replies; 19+ messages in thread
From: Dust Li @ 2026-06-19 3:26 UTC (permalink / raw)
To: Bryam Vargas
Cc: Wenjia Zhang, D . Wythe, Sidraya Jayagond, Eric Dumazet,
David S . Miller, Mahanta Jambigi, Wen Gu, Simon Horman,
Ursula Braun, Stefan Raspl, Tony Lu, Paolo Abeni, Jakub Kicinski,
netdev, linux-s390, linux-rdma, linux-kernel
On 2026-06-18 22:11:05, Bryam Vargas wrote:
>On Thu, 18 Jun 2026 22:29:20 +0800, Dust Li wrote:
>> once we detect that the peer is misbehaving, I think the right action is
>> to abort the connection and record the event, rather than silently clamp.
>[...]
>> u32 prod_count = ntohs(cdc->prod.count);
>> ...
>> cdc->prod.wrap > 1 || cdc->cons.wrap > 1) {
>
>Thanks for taking a look, Dust. I'm on board with the direction for net-next --
>aborting and recording a bad CDC is cleaner than clamping something we already know
>we can't trust, and as you say, the clamp just papers over the peer bug. So: minimal
>clamp stays for -stable, and net-next gets the wire-boundary check + abort (through
>abort_work, with an smc_stats counter and a ratelimited warn).
That's greate. Then I think we can move on in this direction.
>
>A few things I ran into on the check itself, though:
>
>- count is __be32, so it wants ntohl() rather than ntohs() -- ntohs() ends up reading
> the wrong half.
Right
>
>- I'd drop the wrap > 1 tests. wrap is a free-running counter (smc_curs_add does
> wrap++), so a connection that legitimately wraps its RMB ends up with wrap > 1; and
> since it's a __be16 read raw, on little-endian wrap==1 already reads as 0x0100 and
> we'd abort on the very first wrap. I don't think there's a sane upper bound to put
> on wrap.
Agree
>
>- the check is typed for SMC-R, but the SMC-D path hands a host-order smcd_cdc_msg to
> smc_cdc_msg_recv() cast as smc_cdc_msg (smc_cdc.c:456), so ntohl/ntohs would
> double-swap it there. The simplest thing I found is one check on the host cursor
> right after smc_cdc_msg_to_host(), before the diff/atomic_add block -- that covers
> SMC-R and SMC-D in one place.
Agree
>
>Minor: >= len rather than > len (count is an offset in [0,len)), and peer_rmbe_size
>is signed so worth guarding. The cons vs peer_rmbe_size bound looks right to me.
No problem
Best regards,
Dust
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v3 2/3] net/smc: bound the receive length to the RMB in smc_rx_recvmsg()
2026-06-18 22:11 ` Bryam Vargas
@ 2026-06-19 3:31 ` Dust Li
0 siblings, 0 replies; 19+ messages in thread
From: Dust Li @ 2026-06-19 3:31 UTC (permalink / raw)
To: Bryam Vargas
Cc: Wenjia Zhang, D . Wythe, Sidraya Jayagond, Eric Dumazet,
David S . Miller, Mahanta Jambigi, Wen Gu, Simon Horman,
Ursula Braun, Stefan Raspl, Tony Lu, Paolo Abeni, Jakub Kicinski,
netdev, linux-s390, linux-rdma, linux-kernel
On 2026-06-18 22:11:12, Bryam Vargas wrote:
>On Fri, 19 Jun 2026 00:03:17 +0800, Dust Li wrote:
>> Once we validate the CDC message at the input boundary (as in the
>> previous patch), bytes_to_rcv can never exceed rmb_desc->len, so
>> this check becomes unreachable. So I don't think this patch is needed.
>
>This one I'd actually like to keep, and let me walk through why -- I don't think the
>boundary check closes it.
>
>bytes_to_rcv isn't set to a cursor count, it's a running accumulator:
>smc_cdc_msg_recv_action does atomic_add(diff_prod, &bytes_to_rcv), where
>diff_prod = smc_curs_diff(rmb_desc->len, old, new). So bounding each cursor's count at
>the boundary doesn't bound the sum of the deltas.
>
>The differing-wrap branch of smc_curs_diff returns (len - old.count) + new.count,
>which is up to 2*len-1 even when both cursors pass count <= len. With len=16, a prod
>going (0,0) -> (1,15) gives diff=31, so bytes_to_rcv is already 31 > len after one
>message; alternating wrap 0<->1 at count=15 keeps adding ~len and eventually wraps the
>atomic_t negative. I have an A/B for this -- happy to send it along.
Glad to see you A/B test, I think we can decide after we see the real
issue.
>
>So to make this truly unreachable from the boundary check, we'd need to bound
>prod - cons <= len there, not just the absolute count. The consumer-side clamp is two
>lines and race-free against the tasklet, so my preference would be to keep it as a
>backstop -- but if you'd rather fold it into a stronger boundary check instead, I'm
>open to that.
Another thing I'd worry about is if this really happens, should we also
abort the connection like what we did in patch #1 ?
Best regards,
Dust
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2026-06-19 3:31 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-14 8:23 [PATCH v3 0/3] net/smc: bound wire-controlled CDC cursors against the local buffers Bryam Vargas via B4 Relay
2026-06-14 8:23 ` Bryam Vargas
2026-06-14 8:23 ` [PATCH v3 1/3] net/smc: bound the wire-controlled producer cursor to the RMB Bryam Vargas via B4 Relay
2026-06-14 8:23 ` Bryam Vargas
2026-06-15 8:23 ` sashiko-bot
2026-06-18 14:29 ` Dust Li
2026-06-18 22:11 ` Bryam Vargas
2026-06-19 3:26 ` Dust Li
2026-06-14 8:23 ` [PATCH v3 2/3] net/smc: bound the receive length to the RMB in smc_rx_recvmsg() Bryam Vargas via B4 Relay
2026-06-14 8:23 ` Bryam Vargas
2026-06-15 8:23 ` sashiko-bot
2026-06-18 16:03 ` Dust Li
2026-06-18 22:11 ` Bryam Vargas
2026-06-19 3:31 ` Dust Li
2026-06-14 8:23 ` [PATCH v3 3/3] net/smc: bound the send length to the send buffer in smc_tx_sendmsg() Bryam Vargas via B4 Relay
2026-06-14 8:23 ` Bryam Vargas
2026-06-18 16:08 ` Dust Li
2026-06-18 22:11 ` Bryam Vargas
2026-06-17 23:24 ` [PATCH v3 0/3] net/smc: bound wire-controlled CDC cursors against the local buffers Jakub Kicinski
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.