* Re: [PATCH net v3] net/smc: fix out-of-bounds read when sk_user_data holds a sk_psock
2026-06-29 9:51 [PATCH net v3] net/smc: fix out-of-bounds read when sk_user_data holds a sk_psock Sechang Lim
@ 2026-06-30 9:52 ` sashiko-bot
2026-07-01 15:27 ` Paolo Abeni
2026-07-02 3:14 ` D. Wythe
2 siblings, 0 replies; 5+ messages in thread
From: sashiko-bot @ 2026-06-30 9:52 UTC (permalink / raw)
To: Sechang Lim
Cc: Heiko Carstens, linux-s390, Vasily Gorbik, Alexander Gordeev, bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [Critical] Unconditional overwrite of sk_user_data during SMC fallback causes sk_psock memory leak and critical type confusion (control flow hijack).
--
commit c092e3d84654761aba7753030c182ca2840fc519
Author: Sechang Lim <rhkrqnwk98@gmail.com>
net/smc: fix out-of-bounds read when sk_user_data holds a sk_psock
A passive-open child inherits the listener's smc_clcsock_data_ready callback.
This commit reserves the child's sk_user_data with a NULL pointer tagged
SK_USER_DATA_NOCOPY to prevent sockmap from claiming the slot, and releases
it at accept.
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index b5db69073e20f..78f162344fe3a 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
[ ... ]
> @@ -1786,6 +1791,9 @@ static int smc_clcsock_accept(struct smc_sock *lsmc, struct smc_sock **new_smc)
> if (lsmc->clcsk_error_report)
> new_clcsock->sk->sk_error_report = lsmc->clcsk_error_report;
> }
> + /* release the slot reserved in smc_tcp_syn_recv_sock() */
> + rcu_assign_sk_user_data(new_clcsock->sk, NULL);
[Severity: Critical]
This is a pre-existing issue, but does clearing this reservation to NULL
explicitly re-open the window for a BPF sockmap program to attach to the
socket during the subsequent SMC handshake?
If a BPF SOCK_OPS program attaches the established TCP socket to a sockmap,
sk_psock_init() can set sk_user_data to a new sk_psock and hijack the socket
callbacks. Should the SMC handshake fail and fall back to TCP, the fallback
path appears to blindly overwrite sk_user_data:
net/smc/af_smc.c:smc_fback_replace_callbacks() {
...
write_lock_bh(&clcsk->sk_callback_lock);
__rcu_assign_sk_user_data_with_flags(clcsk, smc, SK_USER_DATA_NOCOPY);
...
}
Can this lead to a type confusion if the hijacked sockmap callbacks are left
active? When TCP data arrives, smc_fback_data_ready would forward to the
sockmap callback which dereferences sk_user_data expecting a struct sk_psock,
but receives a struct smc_sock instead.
> + write_unlock_bh(&new_clcsock->sk->sk_callback_lock);
>
> (*new_smc)->clcsock = new_clcsock;
> out:
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260629095140.679754-1-rhkrqnwk98@gmail.com?part=1
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH net v3] net/smc: fix out-of-bounds read when sk_user_data holds a sk_psock
2026-06-29 9:51 [PATCH net v3] net/smc: fix out-of-bounds read when sk_user_data holds a sk_psock Sechang Lim
2026-06-30 9:52 ` sashiko-bot
@ 2026-07-01 15:27 ` Paolo Abeni
2026-07-02 3:17 ` D. Wythe
2026-07-02 3:14 ` D. Wythe
2 siblings, 1 reply; 5+ messages in thread
From: Paolo Abeni @ 2026-07-01 15:27 UTC (permalink / raw)
To: Sechang Lim, D . Wythe, Dust Li, Sidraya Jayagond, Wenjia Zhang,
David S . Miller, Eric Dumazet, Jakub Kicinski
Cc: Jiayuan Chen, Mahanta Jambigi, Tony Lu, Wen Gu, Simon Horman,
Karsten Graul, Guvenc Gulce, Ursula Braun, linux-rdma, linux-s390,
netdev, linux-kernel, bpf
On 6/29/26 11:51 AM, Sechang Lim wrote:
> A passive-open child inherits the listener's smc_clcsock_data_ready().
> sk_clone_lock() clears its sk_user_data to NULL because the listener tagged
> it SK_USER_DATA_NOCOPY. Until accept restores the callback, a BPF sock_ops
> program can add the established child to a sockmap, and sk_psock_init()
> installs a sk_psock into the NULL sk_user_data. The inherited callback then
> reads it back through smc_clcsock_user_data(), which strips only NOCOPY,
> takes the sk_psock for an smc_sock, and dereferences a clcsk_* field past
> its end:
>
> BUG: KASAN: slab-out-of-bounds in smc_clcsock_data_ready+0x84/0x200 net/smc/af_smc.c:2637
> Read of size 8 at addr ffff8880013b8674 by task syz.6.12484/67930
> <IRQ>
> smc_clcsock_data_ready+0x84/0x200 net/smc/af_smc.c:2637
> tcp_urg+0x24d/0x360 net/ipv4/tcp_input.c:6264
> tcp_rcv_state_process+0x280d/0x4940 net/ipv4/tcp_input.c:7336
> tcp_child_process+0x371/0xa50 net/ipv4/tcp_minisocks.c:1002
> tcp_v4_rcv+0x1eaa/0x2a00 net/ipv4/tcp_ipv4.c:2186
> [...]
> </IRQ>
>
> Allocated by task 67930:
> sk_psock_init+0x142/0x740 net/core/skmsg.c:766
> sock_hash_update_common+0xd3/0x990 net/core/sock_map.c:1010
> bpf_sock_hash_update+0x114/0x170 net/core/sock_map.c:1229
> __cgroup_bpf_run_filter_sock_ops+0x74/0xa0 kernel/bpf/cgroup.c:1727
> tcp_init_transfer+0x1085/0x1100 net/ipv4/tcp_input.c:6693
> [...]
>
> Resolve the conflict on the write path. Reserve the child's sk_user_data
> with a NULL pointer tagged SK_USER_DATA_NOCOPY so sk_psock_init() returns
> -EBUSY, and release it at accept. smc_clcsock_user_data() still strips the
> tag to NULL, so the inherited callback stays a no-op.
>
> Fixes: a60a2b1e0af1 ("net/smc: reduce active tcp_listen workers")
> Signed-off-by: Sechang Lim <rhkrqnwk98@gmail.com>
> ---
> v3:
> - reserve sk_user_data on the write path instead of the read-side check (D. Wythe)
>
> v2:
> - https://lore.kernel.org/netdev/20260619150342.3626224-1-rhkrqnwk98@gmail.com/
>
> v1:
> - https://lore.kernel.org/netdev/20260614120931.4041687-1-rhkrqnwk98@gmail.com/
>
> net/smc/af_smc.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index b5db69073e20..78f162344fe3 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -154,7 +154,11 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> own_req, opt_child_init);
> /* child must not inherit smc or its ops */
> if (child) {
> - rcu_assign_sk_user_data(child, NULL);
> + /* reserve sk_user_data so sockmap cannot claim the slot */
> + write_lock_bh(&child->sk_callback_lock);
> + __rcu_assign_sk_user_data_with_flags(child, NULL,
> + SK_USER_DATA_NOCOPY);
> + write_unlock_bh(&child->sk_callback_lock);
>
> /* v4-mapped sockets don't inherit parent ops. Don't restore. */
> if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
> @@ -1773,6 +1777,7 @@ static int smc_clcsock_accept(struct smc_sock *lsmc, struct smc_sock **new_smc)
> /* new clcsock has inherited the smc listen-specific sk_data_ready
> * function; switch it back to the original sk_data_ready function
> */
> + write_lock_bh(&new_clcsock->sk->sk_callback_lock);
> new_clcsock->sk->sk_data_ready = lsmc->clcsk_data_ready;
>
> /* if new clcsock has also inherited the fallback-specific callback
> @@ -1786,6 +1791,9 @@ static int smc_clcsock_accept(struct smc_sock *lsmc, struct smc_sock **new_smc)
> if (lsmc->clcsk_error_report)
> new_clcsock->sk->sk_error_report = lsmc->clcsk_error_report;
> }
> + /* release the slot reserved in smc_tcp_syn_recv_sock() */
> + rcu_assign_sk_user_data(new_clcsock->sk, NULL);
> + write_unlock_bh(&new_clcsock->sk->sk_callback_lock);
Sashiko reports that this still cause problem on fallback.
@Wythe, I understand from previous discussion that you would prefer to
address such issues separately (and thus you are fine with the patch in
the current form). Could you please confirm?
/P
>
> (*new_smc)->clcsock = new_clcsock;
> out:
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH net v3] net/smc: fix out-of-bounds read when sk_user_data holds a sk_psock
2026-07-01 15:27 ` Paolo Abeni
@ 2026-07-02 3:17 ` D. Wythe
0 siblings, 0 replies; 5+ messages in thread
From: D. Wythe @ 2026-07-02 3:17 UTC (permalink / raw)
To: Paolo Abeni
Cc: Sechang Lim, D . Wythe, Dust Li, Sidraya Jayagond, Wenjia Zhang,
David S . Miller, Eric Dumazet, Jakub Kicinski, Jiayuan Chen,
Mahanta Jambigi, Tony Lu, Wen Gu, Simon Horman, Karsten Graul,
Guvenc Gulce, Ursula Braun, linux-rdma, linux-s390, netdev,
linux-kernel, bpf
On Wed, Jul 01, 2026 at 05:27:11PM +0200, Paolo Abeni wrote:
> On 6/29/26 11:51 AM, Sechang Lim wrote:
> > A passive-open child inherits the listener's smc_clcsock_data_ready().
> > sk_clone_lock() clears its sk_user_data to NULL because the listener tagged
> > it SK_USER_DATA_NOCOPY. Until accept restores the callback, a BPF sock_ops
> > program can add the established child to a sockmap, and sk_psock_init()
> > installs a sk_psock into the NULL sk_user_data. The inherited callback then
> > reads it back through smc_clcsock_user_data(), which strips only NOCOPY,
> > takes the sk_psock for an smc_sock, and dereferences a clcsk_* field past
> > its end:
> >
> > BUG: KASAN: slab-out-of-bounds in smc_clcsock_data_ready+0x84/0x200 net/smc/af_smc.c:2637
> > Read of size 8 at addr ffff8880013b8674 by task syz.6.12484/67930
> > <IRQ>
> > smc_clcsock_data_ready+0x84/0x200 net/smc/af_smc.c:2637
> > tcp_urg+0x24d/0x360 net/ipv4/tcp_input.c:6264
> > tcp_rcv_state_process+0x280d/0x4940 net/ipv4/tcp_input.c:7336
> > tcp_child_process+0x371/0xa50 net/ipv4/tcp_minisocks.c:1002
> > tcp_v4_rcv+0x1eaa/0x2a00 net/ipv4/tcp_ipv4.c:2186
> > [...]
> > </IRQ>
> >
> > Allocated by task 67930:
> > sk_psock_init+0x142/0x740 net/core/skmsg.c:766
> > sock_hash_update_common+0xd3/0x990 net/core/sock_map.c:1010
> > bpf_sock_hash_update+0x114/0x170 net/core/sock_map.c:1229
> > __cgroup_bpf_run_filter_sock_ops+0x74/0xa0 kernel/bpf/cgroup.c:1727
> > tcp_init_transfer+0x1085/0x1100 net/ipv4/tcp_input.c:6693
> > [...]
> >
> > Resolve the conflict on the write path. Reserve the child's sk_user_data
> > with a NULL pointer tagged SK_USER_DATA_NOCOPY so sk_psock_init() returns
> > -EBUSY, and release it at accept. smc_clcsock_user_data() still strips the
> > tag to NULL, so the inherited callback stays a no-op.
> >
> > Fixes: a60a2b1e0af1 ("net/smc: reduce active tcp_listen workers")
> > Signed-off-by: Sechang Lim <rhkrqnwk98@gmail.com>
> > ---
> > v3:
> > - reserve sk_user_data on the write path instead of the read-side check (D. Wythe)
> >
> > v2:
> > - https://lore.kernel.org/netdev/20260619150342.3626224-1-rhkrqnwk98@gmail.com/
> >
> > v1:
> > - https://lore.kernel.org/netdev/20260614120931.4041687-1-rhkrqnwk98@gmail.com/
> >
> > net/smc/af_smc.c | 10 +++++++++-
> > 1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> > index b5db69073e20..78f162344fe3 100644
> > --- a/net/smc/af_smc.c
> > +++ b/net/smc/af_smc.c
> > @@ -154,7 +154,11 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> > own_req, opt_child_init);
> > /* child must not inherit smc or its ops */
> > if (child) {
> > - rcu_assign_sk_user_data(child, NULL);
> > + /* reserve sk_user_data so sockmap cannot claim the slot */
> > + write_lock_bh(&child->sk_callback_lock);
> > + __rcu_assign_sk_user_data_with_flags(child, NULL,
> > + SK_USER_DATA_NOCOPY);
> > + write_unlock_bh(&child->sk_callback_lock);
> >
> > /* v4-mapped sockets don't inherit parent ops. Don't restore. */
> > if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
> > @@ -1773,6 +1777,7 @@ static int smc_clcsock_accept(struct smc_sock *lsmc, struct smc_sock **new_smc)
> > /* new clcsock has inherited the smc listen-specific sk_data_ready
> > * function; switch it back to the original sk_data_ready function
> > */
> > + write_lock_bh(&new_clcsock->sk->sk_callback_lock);
> > new_clcsock->sk->sk_data_ready = lsmc->clcsk_data_ready;
> >
> > /* if new clcsock has also inherited the fallback-specific callback
> > @@ -1786,6 +1791,9 @@ static int smc_clcsock_accept(struct smc_sock *lsmc, struct smc_sock **new_smc)
> > if (lsmc->clcsk_error_report)
> > new_clcsock->sk->sk_error_report = lsmc->clcsk_error_report;
> > }
> > + /* release the slot reserved in smc_tcp_syn_recv_sock() */
> > + rcu_assign_sk_user_data(new_clcsock->sk, NULL);
> > + write_unlock_bh(&new_clcsock->sk->sk_callback_lock);
>
> Sashiko reports that this still cause problem on fallback.
>
> @Wythe, I understand from previous discussion that you would prefer to
> address such issues separately (and thus you are fine with the patch in
> the current form). Could you please confirm?
No, I do not agree with this patch in any form. As I replied to it, I do
not think this is the right direction.
D. Wythe
>
> /P
> >
> > (*new_smc)->clcsock = new_clcsock;
> > out:
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net v3] net/smc: fix out-of-bounds read when sk_user_data holds a sk_psock
2026-06-29 9:51 [PATCH net v3] net/smc: fix out-of-bounds read when sk_user_data holds a sk_psock Sechang Lim
2026-06-30 9:52 ` sashiko-bot
2026-07-01 15:27 ` Paolo Abeni
@ 2026-07-02 3:14 ` D. Wythe
2 siblings, 0 replies; 5+ messages in thread
From: D. Wythe @ 2026-07-02 3:14 UTC (permalink / raw)
To: Sechang Lim
Cc: D . Wythe, Dust Li, Sidraya Jayagond, Wenjia Zhang,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Jiayuan Chen, Mahanta Jambigi, Tony Lu, Wen Gu, Simon Horman,
Karsten Graul, Guvenc Gulce, Ursula Braun, linux-rdma, linux-s390,
netdev, linux-kernel, bpf
On Mon, Jun 29, 2026 at 09:51:33AM +0000, Sechang Lim wrote:
> A passive-open child inherits the listener's smc_clcsock_data_ready().
> sk_clone_lock() clears its sk_user_data to NULL because the listener tagged
> it SK_USER_DATA_NOCOPY. Until accept restores the callback, a BPF sock_ops
> program can add the established child to a sockmap, and sk_psock_init()
> installs a sk_psock into the NULL sk_user_data. The inherited callback then
> reads it back through smc_clcsock_user_data(), which strips only NOCOPY,
> takes the sk_psock for an smc_sock, and dereferences a clcsk_* field past
> its end:
>
> BUG: KASAN: slab-out-of-bounds in smc_clcsock_data_ready+0x84/0x200 net/smc/af_smc.c:2637
> Read of size 8 at addr ffff8880013b8674 by task syz.6.12484/67930
> <IRQ>
> smc_clcsock_data_ready+0x84/0x200 net/smc/af_smc.c:2637
> tcp_urg+0x24d/0x360 net/ipv4/tcp_input.c:6264
> tcp_rcv_state_process+0x280d/0x4940 net/ipv4/tcp_input.c:7336
> tcp_child_process+0x371/0xa50 net/ipv4/tcp_minisocks.c:1002
> tcp_v4_rcv+0x1eaa/0x2a00 net/ipv4/tcp_ipv4.c:2186
> [...]
> </IRQ>
>
> Allocated by task 67930:
> sk_psock_init+0x142/0x740 net/core/skmsg.c:766
> sock_hash_update_common+0xd3/0x990 net/core/sock_map.c:1010
> bpf_sock_hash_update+0x114/0x170 net/core/sock_map.c:1229
> __cgroup_bpf_run_filter_sock_ops+0x74/0xa0 kernel/bpf/cgroup.c:1727
> tcp_init_transfer+0x1085/0x1100 net/ipv4/tcp_input.c:6693
> [...]
>
> Resolve the conflict on the write path. Reserve the child's sk_user_data
> with a NULL pointer tagged SK_USER_DATA_NOCOPY so sk_psock_init() returns
> -EBUSY, and release it at accept. smc_clcsock_user_data() still strips the
> tag to NULL, so the inherited callback stays a no-op.
>
> Fixes: a60a2b1e0af1 ("net/smc: reduce active tcp_listen workers")
> Signed-off-by: Sechang Lim <rhkrqnwk98@gmail.com>
> ---
> v3:
> - reserve sk_user_data on the write path instead of the read-side check (D. Wythe)
>
> v2:
> - https://lore.kernel.org/netdev/20260619150342.3626224-1-rhkrqnwk98@gmail.com/
>
> v1:
> - https://lore.kernel.org/netdev/20260614120931.4041687-1-rhkrqnwk98@gmail.com/
>
> net/smc/af_smc.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index b5db69073e20..78f162344fe3 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -154,7 +154,11 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> own_req, opt_child_init);
> /* child must not inherit smc or its ops */
> if (child) {
> - rcu_assign_sk_user_data(child, NULL);
> + /* reserve sk_user_data so sockmap cannot claim the slot */
> + write_lock_bh(&child->sk_callback_lock);
> + __rcu_assign_sk_user_data_with_flags(child, NULL,
> + SK_USER_DATA_NOCOPY);
> + write_unlock_bh(&child->sk_callback_lock);
>
> /* v4-mapped sockets don't inherit parent ops. Don't restore. */
> if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
> @@ -1773,6 +1777,7 @@ static int smc_clcsock_accept(struct smc_sock *lsmc, struct smc_sock **new_smc)
> /* new clcsock has inherited the smc listen-specific sk_data_ready
> * function; switch it back to the original sk_data_ready function
> */
> + write_lock_bh(&new_clcsock->sk->sk_callback_lock);
> new_clcsock->sk->sk_data_ready = lsmc->clcsk_data_ready;
>
> /* if new clcsock has also inherited the fallback-specific callback
> @@ -1786,6 +1791,9 @@ static int smc_clcsock_accept(struct smc_sock *lsmc, struct smc_sock **new_smc)
> if (lsmc->clcsk_error_report)
> new_clcsock->sk->sk_error_report = lsmc->clcsk_error_report;
> }
> + /* release the slot reserved in smc_tcp_syn_recv_sock() */
> + rcu_assign_sk_user_data(new_clcsock->sk, NULL);
> + write_unlock_bh(&new_clcsock->sk->sk_callback_lock);
>
> (*new_smc)->clcsock = new_clcsock;
> out:
> --
I do not think this is the right direction.
You have now sent three versions of essentially the same patch, but I
still do not see a clear explanation of why this approach is supposed to
solve the problem you described. At this point, it looks like you are
repeatedly posting changes without demonstrating that you fully
understand the ownership and lifetime rules involved here.
This is not the approach I suggested. My preference is to keep the
sk_user_data reserved from the time the clcsock is created, by not
setting NOCOPY, so that it remains inherited by the passive socket.
That avoids the release-then-reacquire pattern entirely. The problem
is more complicated than it may appear, and if you choose this direction,
you also need to account for how SMC fallback continues to work with
sockmap.
If you do not think you can fully address that approach, then feel free
to pursue other options that you are able to handle, but I will not ack
any workaround version. You may seek ack from other maintainers.
D. Wythe
> 2.43.0
^ permalink raw reply [flat|nested] 5+ messages in thread