* [PATCH net v2] net/rds: fix recursive lock in rds_tcp_conn_slots_available
@ 2026-02-17 22:38 Fernando Fernandez Mancera
2026-02-17 22:47 ` Gerd Rausch
2026-02-18 10:02 ` [net,v2] " Simon Horman
0 siblings, 2 replies; 6+ messages in thread
From: Fernando Fernandez Mancera @ 2026-02-17 22:38 UTC (permalink / raw)
To: netdev
Cc: rds-devel, linux-rdma, gerd.rausch, horms, pabeni, kuba, edumazet,
davem, allison.henderson, Fernando Fernandez Mancera,
syzbot+5efae91f60932839f0a5
syzbot reported a recursive lock warning in rds_tcp_get_peer_sport() as
it calls inet6_getname() which acquires the socket lock that was already
held by __release_sock().
kworker/u8:6/2985 is trying to acquire lock:
ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: inet6_getname+0x15d/0x650 net/ipv6/af_inet6.c:533
but task is already holding lock:
ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_sock_set_cork+0x2c/0x2e0 net/ipv4/tcp.c:3694
lock_sock_nested+0x48/0x100 net/core/sock.c:3780
lock_sock include/net/sock.h:1709 [inline]
inet6_getname+0x15d/0x650 net/ipv6/af_inet6.c:533
rds_tcp_get_peer_sport net/rds/tcp_listen.c:70 [inline]
rds_tcp_conn_slots_available+0x288/0x470 net/rds/tcp_listen.c:149
rds_recv_hs_exthdrs+0x60f/0x7c0 net/rds/recv.c:265
rds_recv_incoming+0x9f6/0x12d0 net/rds/recv.c:389
rds_tcp_data_recv+0x7f1/0xa40 net/rds/tcp_recv.c:243
__tcp_read_sock+0x196/0x970 net/ipv4/tcp.c:1702
rds_tcp_read_sock net/rds/tcp_recv.c:277 [inline]
rds_tcp_data_ready+0x369/0x950 net/rds/tcp_recv.c:331
tcp_rcv_established+0x19e9/0x2670 net/ipv4/tcp_input.c:6675
tcp_v6_do_rcv+0x8eb/0x1ba0 net/ipv6/tcp_ipv6.c:1609
sk_backlog_rcv include/net/sock.h:1185 [inline]
__release_sock+0x1b8/0x3a0 net/core/sock.c:3213
Reading from the socket struct directly is safe in both possible
paths. For rds_tcp_accept_one(), the socket was just accepted and no
concurrent access is possible yet. For rds_tcp_conn_slots_available()
the lock is already held because we are in the receiving path.
Note that it is also safe to call rds_tcp_conn_slots_available() from
rds_conn_shutdown() because the fan-out is disabled.
Fixes: 9d27a0fb122f ("net/rds: Trigger rds_send_ping() more than once")
Reported-by: syzbot+5efae91f60932839f0a5@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=5efae91f60932839f0a5
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
---
v2: clarified the commit message and added a comment around the
rds_conn_shutdown() path
---
net/rds/connection.c | 3 +++
net/rds/tcp_listen.c | 28 +++++-----------------------
2 files changed, 8 insertions(+), 23 deletions(-)
diff --git a/net/rds/connection.c b/net/rds/connection.c
index 185f73b01694..a542f94c0214 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -455,6 +455,9 @@ void rds_conn_shutdown(struct rds_conn_path *cp)
rcu_read_unlock();
}
+ /* we do not hold the socket lock here but it is safe because
+ * fan-out is disabled when calling conn_slots_available()
+ */
if (conn->c_trans->conn_slots_available)
conn->c_trans->conn_slots_available(conn, false);
}
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 6fb5c928b8fd..a36e5dfd6c66 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -59,30 +59,12 @@ void rds_tcp_keepalive(struct socket *sock)
static int
rds_tcp_get_peer_sport(struct socket *sock)
{
- union {
- struct sockaddr_storage storage;
- struct sockaddr addr;
- struct sockaddr_in sin;
- struct sockaddr_in6 sin6;
- } saddr;
- int sport;
-
- if (kernel_getpeername(sock, &saddr.addr) >= 0) {
- switch (saddr.addr.sa_family) {
- case AF_INET:
- sport = ntohs(saddr.sin.sin_port);
- break;
- case AF_INET6:
- sport = ntohs(saddr.sin6.sin6_port);
- break;
- default:
- sport = -1;
- }
- } else {
- sport = -1;
- }
+ struct sock *sk = sock->sk;
+
+ if (!sk)
+ return -1;
- return sport;
+ return ntohs(inet_sk(sk)->inet_dport);
}
/* rds_tcp_accept_one_path(): if accepting on cp_index > 0, make sure the
--
2.53.0
* Re: [PATCH net v2] net/rds: fix recursive lock in rds_tcp_conn_slots_available
2026-02-17 22:38 [PATCH net v2] net/rds: fix recursive lock in rds_tcp_conn_slots_available Fernando Fernandez Mancera
@ 2026-02-17 22:47 ` Gerd Rausch
2026-02-18 10:02 ` [net,v2] " Simon Horman
1 sibling, 0 replies; 6+ messages in thread
From: Gerd Rausch @ 2026-02-17 22:47 UTC (permalink / raw)
To: Fernando Fernandez Mancera, netdev
Cc: rds-devel, linux-rdma, horms, pabeni, kuba, edumazet, davem,
allison.henderson, syzbot+5efae91f60932839f0a5
On 2026-02-17 14:38, Fernando Fernandez Mancera wrote:
> syzbot reported a recursive lock warning in rds_tcp_get_peer_sport() as
> it calls inet6_getname() which acquires the socket lock that was already
> held by __release_sock().
>
> kworker/u8:6/2985 is trying to acquire lock:
> ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
> ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: inet6_getname+0x15d/0x650 net/ipv6/af_inet6.c:533
>
> but task is already holding lock:
> ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
> ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_sock_set_cork+0x2c/0x2e0 net/ipv4/tcp.c:3694
> lock_sock_nested+0x48/0x100 net/core/sock.c:3780
> lock_sock include/net/sock.h:1709 [inline]
> inet6_getname+0x15d/0x650 net/ipv6/af_inet6.c:533
> rds_tcp_get_peer_sport net/rds/tcp_listen.c:70 [inline]
> rds_tcp_conn_slots_available+0x288/0x470 net/rds/tcp_listen.c:149
> rds_recv_hs_exthdrs+0x60f/0x7c0 net/rds/recv.c:265
> rds_recv_incoming+0x9f6/0x12d0 net/rds/recv.c:389
> rds_tcp_data_recv+0x7f1/0xa40 net/rds/tcp_recv.c:243
> __tcp_read_sock+0x196/0x970 net/ipv4/tcp.c:1702
> rds_tcp_read_sock net/rds/tcp_recv.c:277 [inline]
> rds_tcp_data_ready+0x369/0x950 net/rds/tcp_recv.c:331
> tcp_rcv_established+0x19e9/0x2670 net/ipv4/tcp_input.c:6675
> tcp_v6_do_rcv+0x8eb/0x1ba0 net/ipv6/tcp_ipv6.c:1609
> sk_backlog_rcv include/net/sock.h:1185 [inline]
> __release_sock+0x1b8/0x3a0 net/core/sock.c:3213
>
> Reading from the socket struct directly is safe from both possible
> paths. For rds_tcp_accept_one(), the socket was just accepted and no
> concurrent access is possible yet. For rds_tcp_conn_slots_available()
> the lock is already held because we are in the receiving path.
>
> Note that it is also safe to call rds_tcp_conn_slots_available() from
> rds_conn_shutdown() because the fan-out is disabled.
>
> Fixes: 9d27a0fb122f ("net/rds: Trigger rds_send_ping() more than once")
> Reported-by: syzbot+5efae91f60932839f0a5@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=5efae91f60932839f0a5
> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
> ---
> v2: clarified the commit message and added a comment around the
> rds_conn_shutdown() path
> ---
> net/rds/connection.c | 3 +++
> net/rds/tcp_listen.c | 28 +++++-----------------------
> 2 files changed, 8 insertions(+), 23 deletions(-)
>
> diff --git a/net/rds/connection.c b/net/rds/connection.c
> index 185f73b01694..a542f94c0214 100644
> --- a/net/rds/connection.c
> +++ b/net/rds/connection.c
> @@ -455,6 +455,9 @@ void rds_conn_shutdown(struct rds_conn_path *cp)
> rcu_read_unlock();
> }
>
> + /* we do not hold the socket lock here but it is safe because
> + * fan-out is disabled when calling conn_slots_available()
> + */
> if (conn->c_trans->conn_slots_available)
> conn->c_trans->conn_slots_available(conn, false);
> }
> diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
> index 6fb5c928b8fd..a36e5dfd6c66 100644
> --- a/net/rds/tcp_listen.c
> +++ b/net/rds/tcp_listen.c
> @@ -59,30 +59,12 @@ void rds_tcp_keepalive(struct socket *sock)
> static int
> rds_tcp_get_peer_sport(struct socket *sock)
> {
> - union {
> - struct sockaddr_storage storage;
> - struct sockaddr addr;
> - struct sockaddr_in sin;
> - struct sockaddr_in6 sin6;
> - } saddr;
> - int sport;
> -
> - if (kernel_getpeername(sock, &saddr.addr) >= 0) {
> - switch (saddr.addr.sa_family) {
> - case AF_INET:
> - sport = ntohs(saddr.sin.sin_port);
> - break;
> - case AF_INET6:
> - sport = ntohs(saddr.sin6.sin6_port);
> - break;
> - default:
> - sport = -1;
> - }
> - } else {
> - sport = -1;
> - }
> + struct sock *sk = sock->sk;
> +
> + if (!sk)
> + return -1;
>
> - return sport;
> + return ntohs(inet_sk(sk)->inet_dport);
> }
>
> /* rds_tcp_accept_one_path(): if accepting on cp_index > 0, make sure the
Looks good to me.
Reviewed-by: Gerd Rausch <gerd.rausch@oracle.com>
Thanks,
Gerd
* Re: [net,v2] net/rds: fix recursive lock in rds_tcp_conn_slots_available
2026-02-17 22:38 [PATCH net v2] net/rds: fix recursive lock in rds_tcp_conn_slots_available Fernando Fernandez Mancera
2026-02-17 22:47 ` Gerd Rausch
@ 2026-02-18 10:02 ` Simon Horman
2026-02-18 17:13 ` Fernando Fernandez Mancera
1 sibling, 1 reply; 6+ messages in thread
From: Simon Horman @ 2026-02-18 10:02 UTC (permalink / raw)
To: fmancera
Cc: Simon Horman, pabeni, davem, allison.henderson,
syzbot+5efae91f60932839f0a5, kuba, rds-devel, gerd.rausch, netdev,
edumazet, linux-rdma
This is an AI-generated review of your patch. The human sending this
email says "I'm passing this on, not because of the first comment
regarding fan-out, which I think seems reasonable in its current
form. But because of the second comment, at the end, regarding the
receive path. I'm not sure about that one either way, but it seems
worth passing on to you."
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net/rds: fix recursive lock in rds_tcp_conn_slots_available
This commit addresses a recursive lock warning reported by syzbot in
rds_tcp_get_peer_sport() by replacing kernel_getpeername() calls with
direct socket struct access to read the peer port number.
Closes: https://syzkaller.appspot.com/bug?extid=5efae91f60932839f0a5
> diff --git a/net/rds/connection.c b/net/rds/connection.c
> index 185f73b01694..a542f94c0214 100644
> --- a/net/rds/connection.c
> +++ b/net/rds/connection.c
> @@ -455,6 +455,9 @@ void rds_conn_shutdown(struct rds_conn_path *cp)
> rcu_read_unlock();
> }
>
> + /* we do not hold the socket lock here but it is safe because
> + * fan-out is disabled when calling conn_slots_available()
> + */
Is this comment potentially misleading about the safety reasoning?
The comment says "we do not hold the socket lock here but it is safe
because fan-out is disabled." This is factually accurate - with fan_out
set to false, the code path that reads from the socket (via
rds_tcp_get_peer_sport) is never reached.
However, the phrasing could confuse readers about what happens if fan_out
were true but the lock weren't held. In that scenario, the new code
(reading inet_dport directly) would still be safe because inet_dport is an
atomic-width read of a stable field.
The comment conflates two independent safety properties: (1) the socket
isn't accessed at all when fan_out is false, and (2) if it were accessed,
the new implementation would still be safe without the lock.
Could the comment more precisely explain the safety reasoning?
> if (conn->c_trans->conn_slots_available)
> conn->c_trans->conn_slots_available(conn, false);
> }
[ ... ]
> @@ -59,30 +59,12 @@ void rds_tcp_keepalive(struct socket *sock)
> static int
> rds_tcp_get_peer_sport(struct socket *sock)
> {
> - union {
> - struct sockaddr_storage storage;
> - struct sockaddr addr;
> - struct sockaddr_in sin;
> - struct sockaddr_in6 sin6;
> - } saddr;
> - int sport;
> -
> - if (kernel_getpeername(sock, &saddr.addr) >= 0) {
> - switch (saddr.addr.sa_family) {
> - case AF_INET:
> - sport = ntohs(saddr.sin.sin_port);
> - break;
> - case AF_INET6:
> - sport = ntohs(saddr.sin6.sin6_port);
> - break;
> - default:
> - sport = -1;
> - }
> - } else {
> - sport = -1;
> - }
> + struct sock *sk = sock->sk;
> +
> + if (!sk)
> + return -1;
>
> - return sport;
> + return ntohs(inet_sk(sk)->inet_dport);
> }
The commit message states "For rds_tcp_conn_slots_available() the lock is
already held because we are in the receiving path." Is this claim accurate
across all receive paths?
Looking at the two receive sub-paths:
1. In the backlog processing path (shown in the syzbot stack trace), the
socket IS owned because __release_sock() runs while the lock_sock
ownership bit is still set:
tcp_sock_set_cork() -> lock_sock() -> release_sock() -> __release_sock()
-> sk_backlog_rcv() -> tcp_v6_do_rcv() -> tcp_rcv_established()
-> tcp_data_ready() -> sk->sk_data_ready() -> rds_tcp_data_ready()
-> rds_tcp_read_sock()
2. However, rds_tcp_data_ready() can also be called directly from the
normal softirq receive path via tcp_data_ready() -> sk->sk_data_ready(),
where the socket lock is NOT held. In this path, rds_tcp_read_sock()
calls tcp_read_sock() without lock_sock.
The fix is still correct in both cases because inet_dport is a stable
atomic-width field, but the claim "the lock is already held" is not
universally true for all receiving paths.
Should the commit message be more precise about when the lock is held?
* Re: [net,v2] net/rds: fix recursive lock in rds_tcp_conn_slots_available
2026-02-18 10:02 ` [net,v2] " Simon Horman
@ 2026-02-18 17:13 ` Fernando Fernandez Mancera
2026-02-18 20:08 ` Allison Henderson
2026-02-19 8:44 ` Simon Horman
0 siblings, 2 replies; 6+ messages in thread
From: Fernando Fernandez Mancera @ 2026-02-18 17:13 UTC (permalink / raw)
To: Simon Horman
Cc: pabeni, davem, allison.henderson, syzbot+5efae91f60932839f0a5,
kuba, rds-devel, gerd.rausch, netdev, edumazet, linux-rdma
On 2/18/26 11:02 AM, Simon Horman wrote:
> This is an AI-generated review of your patch. The human sending this
> email says "I'm passing this on, not because of the first comment
> regarding fan-out, which I think seems reasonable in its current
> form. But because of the second comment, at the end, regarding the
> receive path. I'm not sure about that one either way, but it seems
> worth passing on to you."
>
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> net/rds: fix recursive lock in rds_tcp_conn_slots_available
>
> This commit addresses a recursive lock warning reported by syzbot in
> rds_tcp_get_peer_sport() by replacing kernel_getpeername() calls with
> direct socket struct access to read the peer port number.
>
> Closes: https://syzkaller.appspot.com/bug?extid=5efae91f60932839f0a5
>
>> diff --git a/net/rds/connection.c b/net/rds/connection.c
>> index 185f73b01694..a542f94c0214 100644
>> --- a/net/rds/connection.c
>> +++ b/net/rds/connection.c
>> @@ -455,6 +455,9 @@ void rds_conn_shutdown(struct rds_conn_path *cp)
>> rcu_read_unlock();
>> }
>>
>> + /* we do not hold the socket lock here but it is safe because
>> + * fan-out is disabled when calling conn_slots_available()
>> + */
>
> Is this comment potentially misleading about the safety reasoning?
>
> The comment says "we do not hold the socket lock here but it is safe
> because fan-out is disabled." This is factually accurate - with fan_out
> set to false, the code path that reads from the socket (via
> rds_tcp_get_peer_sport) is never reached.
>
> However, the phrasing could confuse readers about what happens if fan_out
> were true but the lock weren't held. In that scenario, the new code
> (reading inet_dport directly) would still be safe because inet_dport is an
> atomic-width read of a stable field.
>
> The comment conflates two independent safety properties: (1) the socket
> isn't accessed at all when fan_out is false, and (2) if it were accessed,
> the new implementation would still be safe without the lock.
>
> Could the comment more precisely explain the safety reasoning?
>
>> if (conn->c_trans->conn_slots_available)
>> conn->c_trans->conn_slots_available(conn, false);
>> }
>
> [ ... ]
>
>> @@ -59,30 +59,12 @@ void rds_tcp_keepalive(struct socket *sock)
>> static int
>> rds_tcp_get_peer_sport(struct socket *sock)
>> {
>> - union {
>> - struct sockaddr_storage storage;
>> - struct sockaddr addr;
>> - struct sockaddr_in sin;
>> - struct sockaddr_in6 sin6;
>> - } saddr;
>> - int sport;
>> -
>> - if (kernel_getpeername(sock, &saddr.addr) >= 0) {
>> - switch (saddr.addr.sa_family) {
>> - case AF_INET:
>> - sport = ntohs(saddr.sin.sin_port);
>> - break;
>> - case AF_INET6:
>> - sport = ntohs(saddr.sin6.sin6_port);
>> - break;
>> - default:
>> - sport = -1;
>> - }
>> - } else {
>> - sport = -1;
>> - }
>> + struct sock *sk = sock->sk;
>> +
>> + if (!sk)
>> + return -1;
>>
>> - return sport;
>> + return ntohs(inet_sk(sk)->inet_dport);
>> }
>
> The commit message states "For rds_tcp_conn_slots_available() the lock is
> already held because we are in the receiving path." Is this claim accurate
> across all receive paths?
>
> Looking at the two receive sub-paths:
>
> 1. In the backlog processing path (shown in the syzbot stack trace), the
> socket IS owned because __release_sock() runs while the lock_sock
> ownership bit is still set:
>
> tcp_sock_set_cork() -> lock_sock() -> release_sock() -> __release_sock()
> -> sk_backlog_rcv() -> tcp_v6_do_rcv() -> tcp_rcv_established()
> -> tcp_data_ready() -> sk->sk_data_ready() -> rds_tcp_data_ready()
> -> rds_tcp_read_sock()
>
> 2. However, rds_tcp_data_ready() can also be called directly from the
> normal softirq receive path via tcp_data_ready() -> sk->sk_data_ready(),
> where the socket lock is NOT held. In this path, rds_tcp_read_sock()
> calls tcp_read_sock() without lock_sock.
>
> The fix is still correct in both cases because inet_dport is a stable
> atomic-width field, but the claim "the lock is already held" is not
> universally true for all receiving paths.
>
> Should the commit message be more precise about when the lock is held?
>
While I think that is right, the relevant part is the atomicity. The
operation is safe but it requires a READ_ONCE() annotation probably.
* Re: [net,v2] net/rds: fix recursive lock in rds_tcp_conn_slots_available
2026-02-18 17:13 ` Fernando Fernandez Mancera
@ 2026-02-18 20:08 ` Allison Henderson
2026-02-19 8:44 ` Simon Horman
1 sibling, 0 replies; 6+ messages in thread
From: Allison Henderson @ 2026-02-18 20:08 UTC (permalink / raw)
To: fmancera@suse.de, horms@kernel.org
Cc: linux-rdma@vger.kernel.org, davem@davemloft.net,
rds-devel@oss.oracle.com,
syzbot+5efae91f60932839f0a5@syzkaller.appspotmail.com,
pabeni@redhat.com, kuba@kernel.org, edumazet@google.com,
Gerd Rausch, netdev@vger.kernel.org
On Wed, 2026-02-18 at 18:13 +0100, Fernando Fernandez Mancera wrote:
> On 2/18/26 11:02 AM, Simon Horman wrote:
> > This is an AI-generated review of your patch. The human sending this
> > email says "I'm passing this on, not because of the first comment
> > regarding fan-out, which I think seems reasonable in its current
> > form. But because of the second comment, at the end, regarding the
> > receive path. I'm not sure about that one either way, but it seems
> > worth passing on to you."
> >
> > For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> > ---
> > net/rds: fix recursive lock in rds_tcp_conn_slots_available
> >
> > This commit addresses a recursive lock warning reported by syzbot in
> > rds_tcp_get_peer_sport() by replacing kernel_getpeername() calls with
> > direct socket struct access to read the peer port number.
> >
> > Closes: https://syzkaller.appspot.com/bug?extid=5efae91f60932839f0a5
> >
> > > diff --git a/net/rds/connection.c b/net/rds/connection.c
> > > index 185f73b01694..a542f94c0214 100644
> > > --- a/net/rds/connection.c
> > > +++ b/net/rds/connection.c
> > > @@ -455,6 +455,9 @@ void rds_conn_shutdown(struct rds_conn_path *cp)
> > > rcu_read_unlock();
> > > }
> > >
> > > + /* we do not hold the socket lock here but it is safe because
> > > + * fan-out is disabled when calling conn_slots_available()
> > > + */
> >
> > Is this comment potentially misleading about the safety reasoning?
> >
> > The comment says "we do not hold the socket lock here but it is safe
> > because fan-out is disabled." This is factually accurate - with fan_out
> > set to false, the code path that reads from the socket (via
> > rds_tcp_get_peer_sport) is never reached.
> >
> > However, the phrasing could confuse readers about what happens if fan_out
> > were true but the lock weren't held. In that scenario, the new code
> > (reading inet_dport directly) would still be safe because inet_dport is an
> > atomic-width read of a stable field.
> >
> > The comment conflates two independent safety properties: (1) the socket
> > isn't accessed at all when fan_out is false, and (2) if it were accessed,
> > the new implementation would still be safe without the lock.
> >
> > Could the comment more precisely explain the safety reasoning?
> >
> > > if (conn->c_trans->conn_slots_available)
> > > conn->c_trans->conn_slots_available(conn, false);
> > > }
> >
> > [ ... ]
> >
> > > @@ -59,30 +59,12 @@ void rds_tcp_keepalive(struct socket *sock)
> > > static int
> > > rds_tcp_get_peer_sport(struct socket *sock)
> > > {
> > > - union {
> > > - struct sockaddr_storage storage;
> > > - struct sockaddr addr;
> > > - struct sockaddr_in sin;
> > > - struct sockaddr_in6 sin6;
> > > - } saddr;
> > > - int sport;
> > > -
> > > - if (kernel_getpeername(sock, &saddr.addr) >= 0) {
> > > - switch (saddr.addr.sa_family) {
> > > - case AF_INET:
> > > - sport = ntohs(saddr.sin.sin_port);
> > > - break;
> > > - case AF_INET6:
> > > - sport = ntohs(saddr.sin6.sin6_port);
> > > - break;
> > > - default:
> > > - sport = -1;
> > > - }
> > > - } else {
> > > - sport = -1;
> > > - }
> > > + struct sock *sk = sock->sk;
> > > +
> > > + if (!sk)
> > > + return -1;
> > >
> > > - return sport;
> > > + return ntohs(inet_sk(sk)->inet_dport);
> > > }
> >
> > The commit message states "For rds_tcp_conn_slots_available() the lock is
> > already held because we are in the receiving path." Is this claim accurate
> > across all receive paths?
> >
> > Looking at the two receive sub-paths:
> >
> > 1. In the backlog processing path (shown in the syzbot stack trace), the
> > socket IS owned because __release_sock() runs while the lock_sock
> > ownership bit is still set:
> >
> > tcp_sock_set_cork() -> lock_sock() -> release_sock() -> __release_sock()
> > -> sk_backlog_rcv() -> tcp_v6_do_rcv() -> tcp_rcv_established()
> > -> tcp_data_ready() -> sk->sk_data_ready() -> rds_tcp_data_ready()
> > -> rds_tcp_read_sock()
> >
> > 2. However, rds_tcp_data_ready() can also be called directly from the
> > normal softirq receive path via tcp_data_ready() -> sk->sk_data_ready(),
> > where the socket lock is NOT held. In this path, rds_tcp_read_sock()
> > calls tcp_read_sock() without lock_sock.
> >
> > The fix is still correct in both cases because inet_dport is a stable
> > atomic-width field, but the claim "the lock is already held" is not
> > universally true for all receiving paths.
> >
> > Should the commit message be more precise about when the lock is held?
> >
>
> While I think that is right, the relevant part is the atomicity. The
> operation is safe but it requires a READ_ONCE() annotation probably.
>
That makes sense. Can you send a v3 with the READ_ONCE around "inet_sk(sk)->inet_dport" and update the commit message?
Thank you.
Allison
* Re: [net,v2] net/rds: fix recursive lock in rds_tcp_conn_slots_available
2026-02-18 17:13 ` Fernando Fernandez Mancera
2026-02-18 20:08 ` Allison Henderson
@ 2026-02-19 8:44 ` Simon Horman
1 sibling, 0 replies; 6+ messages in thread
From: Simon Horman @ 2026-02-19 8:44 UTC (permalink / raw)
To: Fernando Fernandez Mancera
Cc: pabeni, davem, allison.henderson, syzbot+5efae91f60932839f0a5,
kuba, rds-devel, gerd.rausch, netdev, edumazet, linux-rdma
On Wed, Feb 18, 2026 at 06:13:56PM +0100, Fernando Fernandez Mancera wrote:
> On 2/18/26 11:02 AM, Simon Horman wrote:
...
> > The commit message states "For rds_tcp_conn_slots_available() the lock is
> > already held because we are in the receiving path." Is this claim accurate
> > across all receive paths?
> >
> > Looking at the two receive sub-paths:
> >
> > 1. In the backlog processing path (shown in the syzbot stack trace), the
> > socket IS owned because __release_sock() runs while the lock_sock
> > ownership bit is still set:
> >
> > tcp_sock_set_cork() -> lock_sock() -> release_sock() -> __release_sock()
> > -> sk_backlog_rcv() -> tcp_v6_do_rcv() -> tcp_rcv_established()
> > -> tcp_data_ready() -> sk->sk_data_ready() -> rds_tcp_data_ready()
> > -> rds_tcp_read_sock()
> >
> > 2. However, rds_tcp_data_ready() can also be called directly from the
> > normal softirq receive path via tcp_data_ready() -> sk->sk_data_ready(),
> > where the socket lock is NOT held. In this path, rds_tcp_read_sock()
> > calls tcp_read_sock() without lock_sock.
> >
> > The fix is still correct in both cases because inet_dport is a stable
> > atomic-width field, but the claim "the lock is already held" is not
> > universally true for all receiving paths.
> >
> > Should the commit message be more precise about when the lock is held?
> >
>
> While I think that is right, the relevant part is the atomicity. The
> operation is safe but it requires a READ_ONCE() annotation probably.
Thanks. I don't have a deep understanding of this.
But I agree that seems correct.