public inbox for netdev@vger.kernel.org
From: Allison Henderson <allison.henderson@oracle.com>
To: "fmancera@suse.de" <fmancera@suse.de>,
	"horms@kernel.org" <horms@kernel.org>
Cc: "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"rds-devel@oss.oracle.com" <rds-devel@oss.oracle.com>,
	"syzbot+5efae91f60932839f0a5@syzkaller.appspotmail.com"
	<syzbot+5efae91f60932839f0a5@syzkaller.appspotmail.com>,
	"pabeni@redhat.com" <pabeni@redhat.com>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"edumazet@google.com" <edumazet@google.com>,
	Gerd Rausch <gerd.rausch@oracle.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [net,v2] net/rds: fix recursive lock in rds_tcp_conn_slots_available
Date: Wed, 18 Feb 2026 20:08:04 +0000	[thread overview]
Message-ID: <d06205dc4c9983eaf2cc79ed7e8283ee50be8eb2.camel@oracle.com> (raw)
In-Reply-To: <59c133d4-9e5c-4eee-95c2-4a8877b052be@suse.de>

On Wed, 2026-02-18 at 18:13 +0100, Fernando Fernandez Mancera wrote:
> On 2/18/26 11:02 AM, Simon Horman wrote:
> > This is an AI-generated review of your patch. The human sending this
> > email says "I'm passing this on, not because of the first comment
> > regarding fan-out, which I think seems reasonable in its current
> > form. But because of the second comment, at the end, regarding the
> > receive path. I'm not sure about that one either way, but it seems
> > worth passing on to you."
> > 
> > For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> > ---
> > net/rds: fix recursive lock in rds_tcp_conn_slots_available
> > 
> > This commit addresses a recursive lock warning reported by syzbot in
> > rds_tcp_get_peer_sport() by replacing kernel_getpeername() calls with
> > direct socket struct access to read the peer port number.
> > 
> > Closes: https://syzkaller.appspot.com/bug?extid=5efae91f60932839f0a5
> > 
> > > diff --git a/net/rds/connection.c b/net/rds/connection.c
> > > index 185f73b01694..a542f94c0214 100644
> > > --- a/net/rds/connection.c
> > > +++ b/net/rds/connection.c
> > > @@ -455,6 +455,9 @@ void rds_conn_shutdown(struct rds_conn_path *cp)
> > >   		rcu_read_unlock();
> > >   	}
> > > 
> > > +	/* we do not hold the socket lock here but it is safe because
> > > +	 * fan-out is disabled when calling conn_slots_available()
> > > +	 */
> > 
> > Is this comment potentially misleading about the safety reasoning?
> > 
> > The comment says "we do not hold the socket lock here but it is safe
> > because fan-out is disabled." This is factually accurate - with fan_out
> > set to false, the code path that reads from the socket (via
> > rds_tcp_get_peer_sport) is never reached.
> > 
> > However, the phrasing could confuse readers about what happens if fan_out
> > were true but the lock weren't held. In that scenario, the new code
> > (reading inet_dport directly) would still be safe because inet_dport is an
> > atomic-width read of a stable field.
> > 
> > The comment conflates two independent safety properties: (1) the socket
> > isn't accessed at all when fan_out is false, and (2) if it were accessed,
> > the new implementation would still be safe without the lock.
> > 
> > Could the comment more precisely explain the safety reasoning?
> > 
> > >   	if (conn->c_trans->conn_slots_available)
> > >   		conn->c_trans->conn_slots_available(conn, false);
> > >   }
> > 
> > [ ... ]
> > 
> > > @@ -59,30 +59,12 @@ void rds_tcp_keepalive(struct socket *sock)
> > >   static int
> > >   rds_tcp_get_peer_sport(struct socket *sock)
> > >   {
> > > -	union {
> > > -		struct sockaddr_storage storage;
> > > -		struct sockaddr addr;
> > > -		struct sockaddr_in sin;
> > > -		struct sockaddr_in6 sin6;
> > > -	} saddr;
> > > -	int sport;
> > > -
> > > -	if (kernel_getpeername(sock, &saddr.addr) >= 0) {
> > > -		switch (saddr.addr.sa_family) {
> > > -		case AF_INET:
> > > -			sport = ntohs(saddr.sin.sin_port);
> > > -			break;
> > > -		case AF_INET6:
> > > -			sport = ntohs(saddr.sin6.sin6_port);
> > > -			break;
> > > -		default:
> > > -			sport = -1;
> > > -		}
> > > -	} else {
> > > -		sport = -1;
> > > -	}
> > > +	struct sock *sk = sock->sk;
> > > +
> > > +	if (!sk)
> > > +		return -1;
> > > 
> > > -	return sport;
> > > +	return ntohs(inet_sk(sk)->inet_dport);
> > >   }
> > 
> > The commit message states "For rds_tcp_conn_slots_available() the lock is
> > already held because we are in the receiving path." Is this claim accurate
> > across all receive paths?
> > 
> > Looking at the two receive sub-paths:
> > 
> > 1. In the backlog processing path (shown in the syzbot stack trace), the
> > socket IS owned because __release_sock() runs while the lock_sock
> > ownership bit is still set:
> > 
> > tcp_sock_set_cork() -> lock_sock() -> release_sock() -> __release_sock()
> >    -> sk_backlog_rcv() -> tcp_v6_do_rcv() -> tcp_rcv_established()
> >    -> tcp_data_ready() -> sk->sk_data_ready() -> rds_tcp_data_ready()
> >    -> rds_tcp_read_sock()
> > 
> > 2. However, rds_tcp_data_ready() can also be called directly from the
> > normal softirq receive path via tcp_data_ready() -> sk->sk_data_ready(),
> > where the socket lock is NOT held. In this path, rds_tcp_read_sock()
> > calls tcp_read_sock() without lock_sock.
> > 
> > The fix is still correct in both cases because inet_dport is a stable
> > atomic-width field, but the claim "the lock is already held" is not
> > universally true for all receiving paths.
> > 
> > Should the commit message be more precise about when the lock is held?
> > 
> 
> While I think that is right, the relevant part is the atomicity. The
> operation is safe, but it probably needs a READ_ONCE() annotation.
> 

That makes sense.  Can you send a v3 with READ_ONCE() around "inet_sk(sk)->inet_dport" and update the commit message?
Thank you.

Allison



Thread overview: 6+ messages
2026-02-17 22:38 [PATCH net v2] net/rds: fix recursive lock in rds_tcp_conn_slots_available Fernando Fernandez Mancera
2026-02-17 22:47 ` Gerd Rausch
2026-02-18 10:02 ` [net,v2] " Simon Horman
2026-02-18 17:13   ` Fernando Fernandez Mancera
2026-02-18 20:08     ` Allison Henderson [this message]
2026-02-19  8:44     ` Simon Horman
