netdev.vger.kernel.org archive mirror
From: Kuniyuki Iwashima <kuniyu@amazon.com>
To: <edumazet@google.com>
Cc: <allison.henderson@oracle.com>, <davem@davemloft.net>,
	<kuba@kernel.org>, <kuni1840@gmail.com>, <kuniyu@amazon.com>,
	<linux-rdma@vger.kernel.org>, <netdev@vger.kernel.org>,
	<pabeni@redhat.com>, <rds-devel@oss.oracle.com>,
	<syzkaller@googlegroups.com>
Subject: Re: [PATCH v2 net 4/5] rds: tcp: Fix use-after-free of net in reqsk_timer_handler().
Date: Tue, 27 Feb 2024 18:24:20 -0800
Message-ID: <20240228022420.27327-1-kuniyu@amazon.com>
In-Reply-To: <CANn89iJErUHpaAqs=qzuD_WxqtBC1rqSh3n9sJ_zJKwHyPORmg@mail.gmail.com>

From: Eric Dumazet <edumazet@google.com>
Date: Tue, 27 Feb 2024 13:06:07 +0100
> On Tue, Feb 27, 2024 at 2:12 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > syzkaller reported a warning of netns tracker [0] followed by KASAN
> > splat [1] and another ref tracker warning [2].
> >
> > syzkaller could not find a repro, but in the log, the only suspicious
> > sequence was as follows:
> >
> >   18:26:22 executing program 1:
> >   r0 = socket$inet6_mptcp(0xa, 0x1, 0x106)
> >   ...
> >   connect$inet6(r0, &(0x7f0000000080)={0xa, 0x4001, 0x0, @loopback}, 0x1c) (async)
> >
> > The notable thing here is 0x4001 in connect(), which is RDS_TCP_PORT.
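
As a quick check of the port arithmetic, here is a minimal userspace sketch
that rebuilds the address passed to connect() above. The helper name
toy_rds_loopback_addr() and the TOY_RDS_TCP_PORT macro are made up for
illustration; the sketch assumes that 0x4001 in the syzkaller log is the port
number itself, so the helper converts it to network byte order.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical local name; 0x4001 == 16385, the value the mail calls RDS_TCP_PORT. */
#define TOY_RDS_TCP_PORT 0x4001

/* Build the IPv6 loopback address used in the connect() call from the log. */
static struct sockaddr_in6 toy_rds_loopback_addr(void)
{
	struct sockaddr_in6 addr;

	/* The log passes 0x1c (28) as the address length. */
	assert(sizeof(addr) == 0x1c);

	memset(&addr, 0, sizeof(addr));
	addr.sin6_family = AF_INET6;			/* 0xa on Linux */
	addr.sin6_port = htons(TOY_RDS_TCP_PORT);	/* port in network byte order */
	addr.sin6_addr = in6addr_loopback;		/* @loopback, i.e. ::1 */

	return addr;
}
```

Passing the returned address to connect() on a TCP socket would target the
same port as the logged call; whether an RDS listener is there depends on the
rds_tcp module being loaded in that netns.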
> >
> > So, the scenario would be:
> >
> >   1. unshare(CLONE_NEWNET) creates a per netns tcp listener in
> >       rds_tcp_listen_init().
> >   2. syz-executor connect()s to it and creates a reqsk.
> >   3. syz-executor exit()s immediately.
> >   4. netns is dismantled.  [0]
> >   5. reqsk timer is fired, and UAF happens while freeing reqsk.  [1]
> >   6. listener is freed after RCU grace period.  [2]
> >
> > Basically, reqsk assumes that the listener guarantees netns safety
> > until all reqsk timers are expired by holding the listener's refcount.
> > However, this was not the case for kernel sockets.
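
The lifetime assumption described above can be sketched with a toy userspace
model. Everything here (struct toy_net, struct toy_sock, the toy_* helpers) is
hypothetical and only mimics the refcounting; it is not kernel code, and
"freed" is just a flag so that nothing is actually used after free.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, NOT the kernel's struct net / struct sock. */
struct toy_net {
	int refcnt;
	bool freed;		/* set when the last reference is dropped */
};

struct toy_sock {
	int refcnt;
	struct toy_net *net;
	bool net_refcnt;	/* true: the socket holds a netns reference */
};

static void toy_put_net(struct toy_net *net)
{
	assert(net->refcnt > 0);
	if (--net->refcnt == 0)
		net->freed = true;
}

static void toy_sock_put(struct toy_sock *sk)
{
	assert(sk->refcnt > 0);
	/* A refcounted socket drops its netns reference when it dies. */
	if (--sk->refcnt == 0 && sk->net_refcnt)
		toy_put_net(sk->net);
}

/* Returns true if the reqsk timer would run against a freed netns. */
static bool toy_timer_sees_freed_net(bool listener_holds_net_ref)
{
	struct toy_net net = { .refcnt = 1 };	/* held by the process */
	struct toy_sock listener = {
		.refcnt = 1, .net = &net, .net_refcnt = listener_holds_net_ref,
	};
	bool uaf;

	if (listener_holds_net_ref)
		net.refcnt++;
	listener.refcnt++;		/* the reqsk pins the listener */

	toy_put_net(&net);		/* the process exits */
	toy_sock_put(&listener);	/* the listener is closed */

	uaf = net.freed;		/* the reqsk timer fires here */
	toy_sock_put(&listener);	/* the timer drops the reqsk's reference */
	return uaf;
}
```

In this model, toy_timer_sees_freed_net(true) behaves like a user socket: the
pinned listener keeps the netns alive until the timer is done. With false, as
for a kernel socket, the netns is already gone when the timer runs.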
> >
> > Commit 740ea3c4a0b2 ("tcp: Clean up kernel listener's reqsk in
> > inet_twsk_purge()") fixed this issue only for per-netns ehash, but
> > the issue still exists for the global ehash.
> >
> > We can apply the same fix, but this issue is specific to RDS.
> >
> > Instead of iterating ehash and purging reqsk during netns dismantle,
> > let's hold netns refcount for the kernel listener.
> >
> > Reported-by: syzkaller <syzkaller@googlegroups.com>
> > Suggested-by: Eric Dumazet <edumazet@google.com>
> > Fixes: 467fa15356ac ("RDS-TCP: Support multiple RDS-TCP listen endpoints, one per netns.")
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > ---
> >  net/rds/tcp_listen.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
> > index 05008ce5c421..2d40e523322c 100644
> > --- a/net/rds/tcp_listen.c
> > +++ b/net/rds/tcp_listen.c
> > @@ -274,8 +274,8 @@ struct socket *rds_tcp_listen_init(struct net *net, bool isv6)
> >         int addr_len;
> >         int ret;
> >
> > -       ret = sock_create_kern(net, isv6 ? PF_INET6 : PF_INET, SOCK_STREAM,
> > -                              IPPROTO_TCP, &sock);
> > +       ret = __sock_create(net, isv6 ? PF_INET6 : PF_INET, SOCK_STREAM,
> > +                           IPPROTO_TCP, &sock, SOCKET_KERN_NET_REF);
> >         if (ret < 0) {
> >                 rdsdebug("could not create %s listener socket: %d\n",
> >                          isv6 ? "IPv6" : "IPv4", ret);
> 
> If RDS module keeps a listener alive, not attached to a user process,
> netns dismantle will never occur.
> 
> I think we have to cleanup SYN_RECV sockets in inet_twsk_purge()

Ah, yes. The __init_net ops hook must not take a net ref; otherwise the
netns would never be dismantled.
I'll go that way in v3.
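
To spell out the problem, here is a toy userspace model of why a listener that
is created by a per-netns init hook cannot hold a netns reference. All names
(struct toy_net, toy_listener_init(), and so on) are hypothetical, and the
model assumes the listener is only closed from the netns exit path, which runs
once the refcount reaches zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in, NOT the kernel's struct net. */
struct toy_net {
	int refcnt;		/* references from processes and refcounted sockets */
	bool listener_alive;
	bool dismantled;	/* set once the exit path has run */
};

static void toy_net_put(struct toy_net *net)
{
	assert(net->refcnt > 0);
	if (--net->refcnt == 0) {
		/* Exit path: the only place the module closes its listener. */
		net->listener_alive = false;
		net->dismantled = true;
	}
}

/* Models the per-netns init hook creating the kernel listener. */
static void toy_listener_init(struct toy_net *net, bool hold_net_ref)
{
	net->listener_alive = true;
	if (hold_net_ref)
		net->refcnt++;
}

/* Returns true if the netns goes away once its last user process exits. */
static bool toy_dismantles_after_last_user_exit(bool hold_net_ref)
{
	struct toy_net net = { .refcnt = 1 };	/* one user process */

	toy_listener_init(&net, hold_net_ref);
	toy_net_put(&net);			/* the process exits */
	return net.dismantled;
}
```

With hold_net_ref set, the listener's own reference keeps the count above
zero, the exit path never runs, and so the listener is never closed: the
reference can only be dropped by the code it prevents from running.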


> 
> Yes, it removes one optimization you did.
> 
> Perhaps add a counter of all kernel sockets that were ever attached to
> a netns in order to decide to apply the optimization.
> (keeping a precise count of SYN_RECV would be too expensive)

I'll work on the follow-up for net-next after the right fix is merged.
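
For reference, a rough userspace sketch of the counter idea as I read it,
assuming the optimization in question is skipping the reqsk purge for a
netns. The struct and function names are hypothetical; the counter only ever
grows, so it costs one increment per kernel socket rather than tracking every
SYN_RECV socket.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in, NOT the kernel's struct net. */
struct toy_net {
	/* Kernel sockets ever attached to this netns; never decremented. */
	unsigned long kern_socks_ever;
};

/* Would be called whenever a kernel socket is attached to the netns. */
static void toy_kern_sock_attach(struct toy_net *net)
{
	net->kern_socks_ever++;
}

/*
 * If no kernel socket was ever attached, no kernel listener can have
 * left a reqsk behind, so the purge could be skipped (assumption).
 */
static bool toy_can_skip_reqsk_purge(const struct toy_net *net)
{
	return net->kern_socks_ever == 0;
}
```

The check is conservative: a netns that once had a kernel socket is always
purged, even if that socket is long gone.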

Thanks!

Thread overview: 9+ messages
2024-02-27  1:10 [PATCH v2 net 0/5] tcp/rds: Fix use-after-free around kernel TCP reqsk Kuniyuki Iwashima
2024-02-27  1:10 ` [PATCH v2 net 1/5] tcp: Restart iteration after removing reqsk in inet_twsk_purge() Kuniyuki Iwashima
2024-02-27  1:10 ` [PATCH v2 net 2/5] Revert "tcp: Clean up kernel listener's reqsk in inet_twsk_purge()" Kuniyuki Iwashima
2024-02-27  1:10 ` [PATCH v2 net 3/5] net: Convert @kern of __sock_create() to enum Kuniyuki Iwashima
2024-02-29 21:51   ` David Laight
2024-02-27  1:10 ` [PATCH v2 net 4/5] rds: tcp: Fix use-after-free of net in reqsk_timer_handler() Kuniyuki Iwashima
2024-02-27 12:06   ` Eric Dumazet
2024-02-28  2:24     ` Kuniyuki Iwashima [this message]
2024-02-27  1:10 ` [PATCH v2 net 5/5] tcp: Add assertion for reqsk->rsk_listener->sk_net_refcnt Kuniyuki Iwashima
