From: Arjan van de Ven <arjan@linux.intel.com>
To: netdev@vger.kernel.org
Cc: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com,
	davem@davemloft.net, dsahern@kernel.org, edumazet@google.com,
	horms@kernel.org, kuba@kernel.org, linux-kernel@vger.kernel.org,
	pabeni@redhat.com, syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] [net?] general protection fault in kernel_sock_shutdown (4)
Date: Fri, 24 Apr 2026 09:47:17 -0700	[thread overview]
Message-ID: <20260424164733.356003-1-arjan@linux.intel.com> (raw)
In-Reply-To: <69ea344f.a00a0220.17a17.0040.GAE@google.com>

This report was analysed with the help of an automated kernel crash
analysis assistant. The analysis below is tentative and should be
reviewed by a human before any action is taken.

Decoded Backtrace
-----------------

1. kernel_sock_shutdown -- crash site (net/socket.c:3785)

  3783  int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how)
  3784  {
  3785      return READ_ONCE(sock->ops)->shutdown(sock, how);
              /* CRASH: sock->ops is NULL (R12 = 0x0); KASAN traps
                 null-ptr-deref at offset 0x68 = offsetof(proto_ops, shutdown) */
  3786  }

  Register context at crash:
    RBX = 0xffff888058587240  (struct socket *sock)
    R12 = 0x0000000000000000  (sock->ops, loaded from RBX+0x20 -- NULL)
    RDI = 0x0000000000000068  (= NULL + 0x68, address of shutdown fn ptr)
    RBP = 0x0000000000000002  (how = SHUT_RDWR)

2. udp_tunnel_sock_release (net/ipv4/udp_tunnel_core.c:196-202)

  196  void udp_tunnel_sock_release(struct socket *sock)
  197  {
  198      rcu_assign_sk_user_data(sock->sk, NULL);
  199      synchronize_rcu();
  200      kernel_sock_shutdown(sock, SHUT_RDWR);  /* <- calls crash site */
  201      sock_release(sock);
  202  }

3. rxe_release_udp_tunnel inlined (drivers/infiniband/sw/rxe/rxe_net.c:290-294)

  290  static void rxe_release_udp_tunnel(struct socket *sk)
  291  {
  292      if (sk)
  293          udp_tunnel_sock_release(sk);
  294  }

4. rxe_sock_put (drivers/infiniband/sw/rxe/rxe_net.c:632-643)

  632  static void rxe_sock_put(struct sock *sk,
  633                            void (*set_sk)(struct net *, struct sock *),
  634                            struct net *net)
  635  {
  636      if (refcount_read(&sk->sk_refcnt) > SK_REF_FOR_TUNNEL) {
  637          __sock_put(sk);
  638      } else {
  639          rxe_release_udp_tunnel(sk->sk_socket);  /* <- release BEFORE clear */
  640          sk = NULL;
  641          set_sk(net, sk);                         /* <- clear AFTER (too late) */
  642      }
  643  }

  Caller: rxe_net_del (rxe_net.c:644-666), triggered via:
    nldev_dellink -> rxe_dellink -> rxe_net_del -> rxe_sock_put

Tentative Analysis
------------------

sock->ops is set to NULL by sock_release() (net/socket.c:726) after
calling ops->release(sock). The crash in kernel_sock_shutdown() means
the socket was already passed to sock_release() before this call.

Two independent code paths can release the same UDP tunnel socket stored
in the per-network-namespace rxe_ns_sock structure:

 Path 1 -- namespace teardown (rxe_ns.c, rxe_ns_exit()):
   rcu_assign_pointer(ns_sk->rxe_sk4, NULL);   /* clears pointer FIRST */
   udp_tunnel_sock_release(sk->sk_socket);      /* then releases */

 Path 2 -- RDMA link delete (rxe_net.c, rxe_net_del() -> rxe_sock_put()):
   sk = rxe_ns_pernet_sk4(net);                 /* reads pointer (no ownership) */
   rxe_release_udp_tunnel(sk->sk_socket);       /* releases FIRST */
   set_sk(net, NULL);                           /* clears AFTER */

The following TOCTOU (time-of-check to time-of-use) race is possible when
namespace teardown and RDMA link deletion run concurrently:

  Thread A (rxe_net_del):
    rxe_ns_pernet_sk4() -> sk = X  (non-NULL)

  Thread B (rxe_ns_exit):
    rcu_assign_pointer(sk4, NULL)
    udp_tunnel_sock_release(X->sk_socket)
      sock_release(X->sk_socket)
        X->sk_socket->ops = NULL       <- clears ops

  Thread A (rxe_net_del) continues:
    rxe_sock_put(sk=X, ...)
      rxe_release_udp_tunnel(X->sk_socket)
        kernel_sock_shutdown(X->sk_socket, SHUT_RDWR)
          READ_ONCE(sock->ops)->shutdown(...)
                                       <- CRASH: sock->ops == NULL

The bug was introduced by two commits in March 2026 that added
per-network-namespace support to the Soft RoCE (RXE) driver:

  13f2a53c2a71e  RDMA/rxe: Add net namespace support for IPv4/IPv6 sockets
  f1327abd6abed  RDMA/rxe: Support RDMA link creation and destruction per
                 net namespace

Neither commit provides synchronisation between the two teardown paths.

Potential Solution
------------------

Replace rxe_ns_pernet_sk4() calls in rxe_net_del() (and rxe_notify())
with an atomic exchange that simultaneously reads and clears the pernet
pointer, so only one of the two teardown paths can ever obtain a
non-NULL socket pointer:

  struct sock *rxe_ns_pernet_take_sk4(struct net *net)
  {
      struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
      return unrcu_pointer(xchg(&ns_sk->rxe_sk4, RCU_INITIALIZER(NULL)));
  }

Whichever path (rxe_ns_exit or rxe_net_del) wins the xchg gets the
socket and releases it; the loser gets NULL and skips the release.

More information
----------------

Oops-Analysis: https://lore.kernel.org/r/69ea344f.a00a0220.17a17.0040.GAE@google.com
Assisted-by: linux-kernel-oops-x86 skill (Claude Sonnet 4.6)

