From: Simon Horman <horms@kernel.org>
To: Maoyi Xie <maoyixie.tju@gmail.com>
Cc: achender@kernel.org, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org,
linux-rdma@vger.kernel.org, rds-devel@oss.oracle.com,
linux-kernel@vger.kernel.org, maoyi.xie@ntu.edu.sg,
praveen.kakkolangara@aumovio.com
Subject: Re: [PATCH net v2] rds: filter RDS_INFO_* getsockopt by caller's netns
Date: Tue, 12 May 2026 11:37:47 +0100 [thread overview]
Message-ID: <20260512103747.GJ27589@horms.kernel.org> (raw)
In-Reply-To: <CAHPEe=FDQUTZn5QVfYiqf_p1OwiUOehe49WLXHGWzB+hgnnWrw@mail.gmail.com>
On Mon, May 11, 2026 at 02:41:58PM +0800, Maoyi Xie wrote:
> Hi Simon,
>
> Thanks for the review.
>
> > Does this early-out check using the global rds_sock_count break the
> > namespace isolation and force callers to over-allocate memory?
>
> Both effects are present. The size returned via lens to a probing
> caller is still the global count. A caller in an isolated ns that
> sizes its buffer to that value can also see ENOSPC on the second
> call. The precheck compares against the global count. The data
> written only covers entries in the caller's ns.
>
> v3 addresses this. Each handler now does a first pass to count
> entries in the caller's ns. The precheck uses that count. A second
> pass fills the buffer. The change applies to four handlers:
> rds_sock_info and rds6_sock_info in net/rds/af_rds.c, plus
> rds_tcp_tc_info and rds6_tcp_tc_info in net/rds/tcp.c. lens->nr
> now reflects the ns scoped count on both probe and full reads.
>
> Re-verified on a KASAN VM. One AF_RDS socket is bound in init_net
> to 127.0.0.1:4242. The attacker is the same process after
> unshare(CLONE_NEWUSER | CLONE_NEWNET) and uid_map "0 0 1".
>
> [init] count-probe rc=-1 errno=ENOSPC optlen_after=28 entries=1
> [init] full-read rc=0 len=28 entries=1 (127.0.0.1:4242)
> [attacker] count-probe rc=0 optlen_after=0 entries=0
> [attacker] full-read rc=0 len=0 entries=0
>
> Pre-v3 the precheck returned the global count of 1 to the attacker
> via lens->nr on the zero-length probe. v3 returns 0.
Thanks. I will look over the updated code.
> > Can concurrent getsockopt calls trigger a NULL pointer dereference
> > here?
>
> Yes, the window looks reachable.
>
> The writer takes rds_tcp_tc_list_lock and calls list_add_tail. It
> releases the lock. Only after that it assigns tc->t_sock = sock.
> The reader takes the same lock, walks the list, and dereferences
> inet_sk(tc->t_sock->sk). There is no NULL check on the read side.
> A reader that enters between the writer's spin_unlock and the
> t_sock store observes a list entry whose t_sock is still NULL.
>
> The companion restore_callbacks path is safe. list_del_init is
> inside the lock. A reader holding the lock cannot observe the
> unlinked entry. The matching tc->t_sock = NULL outside the lock
> is then harmless. Another reader in the same file at line 676
> already checks !tc->t_sock before use.
>
> The smallest fix is to move tc->t_sock = sock into the
> rds_tcp_tc_list_lock critical section in rds_tcp_set_callbacks,
> just before list_add_tail. The list insertion and the t_sock
> store then become atomic from the reader's view. The diff is
> one line moved. It does not affect the callback_lock side
> effects below.
>
> This is independent of the netns filter. I have not built a
> runtime PoC. The window is short. Does the code analysis above
> match your reading? If yes, I can send this as a separate patch
> with a Fixes tag.
Likewise, Thanks.
I agree that should be sufficient to address this problem.
And that is is appropriate to post it as a separate patch for with a Fixes tag.
prev parent reply other threads:[~2026-05-12 10:37 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-07 8:13 [PATCH net v2] rds: filter RDS_INFO_* getsockopt by caller's netns Maoyi Xie
2026-05-10 14:54 ` Simon Horman
2026-05-11 6:41 ` Maoyi Xie
2026-05-12 10:37 ` Simon Horman [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260512103747.GJ27589@horms.kernel.org \
--to=horms@kernel.org \
--cc=achender@kernel.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=maoyi.xie@ntu.edu.sg \
--cc=maoyixie.tju@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=praveen.kakkolangara@aumovio.com \
--cc=rds-devel@oss.oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox