From: Jakub Kicinski <kuba@kernel.org>
To: hexlabsecurity@proton.me
Cc: Jakub Kicinski <kuba@kernel.org>,
alibuda@linux.alibaba.com, dust.li@linux.alibaba.com,
sidraya@linux.ibm.com, wenjia@linux.ibm.com,
mjambigi@linux.ibm.com, tonylu@linux.alibaba.com,
guwen@linux.alibaba.com, davem@davemloft.net,
edumazet@google.com, pabeni@redhat.com, horms@kernel.org,
raspl@linux.ibm.com, ubraun@linux.ibm.com,
linux-rdma@vger.kernel.org, linux-s390@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] net/smc: bound the peer producer cursor on SMC-D and SMC-R CDC receive
Date: Sat, 13 Jun 2026 17:31:11 -0700 [thread overview]
Message-ID: <20260614003111.383195-1-kuba@kernel.org> (raw)
In-Reply-To: <20260610084803.186516-1-hexlabsecurity@proton.me>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net/smc: bound the peer producer cursor on SMC-D and SMC-R CDC receive
This patch attempts to close an out-of-bounds read in the SMC CDC
receive path by clamping the peer producer cursor and the
bytes_to_rcv accumulator against rmb_desc->len. A couple of
questions on the approach below.
> diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
> index 619b3bab3824..738c45fd5cd0 100644
> --- a/net/smc/smc_cdc.c
> +++ b/net/smc/smc_cdc.c
> @@ -382,6 +382,8 @@ static void smc_cdc_msg_recv_action(struct smc_sock *smc,
> smp_mb__before_atomic();
> atomic_add(diff_prod, &conn->bytes_to_rcv);
> /* guarantee 0 <= bytes_to_rcv <= rmb_desc->len */
> + if (atomic_read(&conn->bytes_to_rcv) > conn->rmb_desc->len)
> + atomic_set(&conn->bytes_to_rcv, conn->rmb_desc->len);
[Critical]
Is this clamp safe against a concurrent smc_rx_recvmsg() on another
CPU?
smc_cdc_msg_recv() runs the BH path under bh_lock_sock() (slock with
BH disabled), while smc_rx_recvmsg() runs under lock_sock(), which
sets sk_lock.owned but releases the slock for the body of the
syscall. The body contains:
/* update cursors */
if (!(flags & MSG_PEEK)) {
smp_mb__before_atomic();
atomic_sub(copylen, &conn->bytes_to_rcv);
smp_mb__after_atomic();
...
}
so the BH on CPU0 and the recvmsg body on CPU1 can run concurrently
on the same connection.
The new clamp here is a non-atomic read-then-set:
atomic_add(diff_prod, &conn->bytes_to_rcv);
if (atomic_read(&conn->bytes_to_rcv) > conn->rmb_desc->len)
atomic_set(&conn->bytes_to_rcv, conn->rmb_desc->len);
With L = rmb_desc->len, the commit message notes that the
differing-wrap branch of smc_curs_diff() can return up to
(L - old.count) + new.count, i.e. close to 2L, even after the count
itself is clamped. Can the following interleaving happen?
CPU0 (BH) CPU1 (recvmsg)
atomic_add(diff_prod) -> X (~2L)
readable =
atomic_read(&bytes_to_rcv) = X
copylen = min(remaining, X)
chunk_len = min(copylen,
L - cons.count)
second chunk = copylen - chunk_len
memcpy_to_msg(rcvbuf_base + 0,
second_chunk_len)
atomic_read() -> X
atomic_set(&bytes_to_rcv, L)
With cons.count near L/2 and copylen near 2L, the second chunk reads
roughly 3L/2 bytes starting at rcvbuf_base + 0, walking past the end
of the RMB allocation. Is that not the same OOB read this patch is
meant to close?
A second variant of the same race seems to silently corrupt the
counter:
CPU0 (BH) CPU1 (recvmsg)
atomic_add(diff_prod) -> X > L
atomic_sub(Y) -> X - Y
atomic_read() -> X - Y
(X - Y still > L?)
atomic_set(L)
If X - Y is still above L, CPU0 overwrites the consumer's decrement
and bytes_to_rcv ends up at L even though Y bytes were already
consumed, leaving a persistent over-report that future recvmsg calls
use as readable.
smp_mb__after_atomic() does not turn this read-then-set into an RMW;
atomic_set() is a relaxed store. Would clamping diff_prod against
(L - atomic_read(&bytes_to_rcv)) before the atomic_add (or using an
atomic_cmpxchg loop) close both windows?
> smp_mb__after_atomic();
> smc->sk.sk_data_ready(&smc->sk);
> } else {
> diff --git a/net/smc/smc_cdc.h b/net/smc/smc_cdc.h
> index 696cc11f2303..7fa6e0d3817f 100644
> --- a/net/smc/smc_cdc.h
> +++ b/net/smc/smc_cdc.h
> @@ -230,6 +230,12 @@ static inline void smc_cdc_cursor_to_host(union smc_host_cursor *local,
> smc_curs_copy_net(&net, peer, conn);
> temp.count = ntohl(net.count);
> temp.wrap = ntohs(net.wrap);
> + /* the peer producer cursor is wire-controlled; bound the SMC-R count to
> + * our RMB before it is used as a raw index by the urgent path, mirroring
> + * the SMC-D conversion in smcd_cdc_msg_to_host().
> + */
> + if (temp.count > conn->rmb_desc->len)
> + temp.count = conn->rmb_desc->len;
[High]
Should this clamp live in the shared helper, or only on the prod
call site?
smc_cdc_cursor_to_host() is invoked twice from smcr_cdc_msg_to_host():
smc_cdc_cursor_to_host(&local->prod, &peer->prod, conn);
smc_cdc_cursor_to_host(&local->cons, &peer->cons, conn);
The producer cursor is bounded by our local RMB (conn->rmb_desc->len),
but the consumer cursor lives in the peer's RMB space and is bounded
by conn->peer_rmbe_size, which is set independently in
smcr_conn_save_peer_info() from the peer's CLC bufsize.
smc_cdc_msg_recv_action() confirms the asymmetry:
diff_cons = smc_curs_diff(conn->peer_rmbe_size, &cons_old,
&conn->local_rx_ctrl.cons);
...
atomic_add(diff_cons, &conn->peer_rmbe_space);
so peer_rmbe_size (not rmb_desc->len) is the right bound for the cons
cursor.
Whenever peer_rmbe_size > rmb_desc->len, legitimate peer cons.count
values in (rmb_desc->len, peer_rmbe_size] are silently truncated to
rmb_desc->len here, diff_cons is under-credited and
peer_rmbe_space stays artificially low, which can stall TX flow
control on perfectly well-behaved peers.
The SMC-D companion in this same patch is structured the way the
prod/cons asymmetry suggests -- it clamps prod and copies cons
unmodified:
> @@ -260,6 +266,12 @@ static inline void smcd_cdc_msg_to_host(struct smc_host_cdc_msg *local,
>
> temp.wrap = peer->prod.wrap;
> temp.count = peer->prod.count;
> + /* the peer producer cursor is wire-controlled; a count past our RMB is
> + * used as a raw index by the urgent path (smc_cdc_handle_urg_data_arrival)
> + * and as a length by the recv path. Bound it to the RMB.
> + */
> + if (temp.count > conn->rmb_desc->len)
> + temp.count = conn->rmb_desc->len;
> smc_curs_copy(&local->prod, &temp, conn);
>
> temp.wrap = peer->cons.wrap;
Should the SMC-R hunk be moved out of the shared smc_cdc_cursor_to_host()
helper and onto the prod call site in smcr_cdc_msg_to_host(), to match
the SMC-D shape?
--
pw-bot: cr
prev parent reply other threads:[~2026-06-14 0:31 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-10 8:48 [PATCH] net/smc: bound the peer producer cursor on SMC-D and SMC-R CDC receive Bryam Vargas
2026-06-14 0:31 ` Jakub Kicinski [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260614003111.383195-1-kuba@kernel.org \
--to=kuba@kernel.org \
--cc=alibuda@linux.alibaba.com \
--cc=davem@davemloft.net \
--cc=dust.li@linux.alibaba.com \
--cc=edumazet@google.com \
--cc=guwen@linux.alibaba.com \
--cc=hexlabsecurity@proton.me \
--cc=horms@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=mjambigi@linux.ibm.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=raspl@linux.ibm.com \
--cc=sidraya@linux.ibm.com \
--cc=tonylu@linux.alibaba.com \
--cc=ubraun@linux.ibm.com \
--cc=wenjia@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox