public inbox for netdev@vger.kernel.org
From: Kuniyuki Iwashima <kuniyu@google.com>
To: Josef Bacik <josef@toxicpanda.com>, Jens Axboe <axboe@kernel.dk>,
	 "David S . Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	 Jakub Kicinski <kuba@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>
Cc: Simon Horman <horms@kernel.org>,
	Kuniyuki Iwashima <kuniyu@google.com>,
	 Kuniyuki Iwashima <kuni1840@gmail.com>,
	linux-block@vger.kernel.org, nbd@other.debian.org,
	 netdev@vger.kernel.org
Subject: [PATCH 3/5] net: Introduce lock_sock_try().
Date: Wed, 25 Mar 2026 06:38:24 +0000
Message-ID: <20260325063843.1790782-4-kuniyu@google.com>
In-Reply-To: <20260325063843.1790782-1-kuniyu@google.com>

syzbot has reported 100+ possible deadlock splats involving NBD,
typically following this pattern:

  lock_sock(sk)
  -> GFP_KERNEL memory allocation
     -> fs reclaim
       -> lock_sock(sk) at NBD

Before calling sock_sendmsg() or sock_recvmsg(), NBD sets
sk->sk_allocation to GFP_NOIO to prevent fs reclaim from being
triggered during memory allocation for the backend socket.

However, even after a socket is passed to NBD, it remains
exposed to userspace, which can still exercise various slow
paths under lock_sock() where GFP_KERNEL is used directly
instead of sk->sk_allocation, leading to the deadlock.

Some of those paths do not currently have a reference to struct
sock, and plumbing the sk pointer through the call chain just to
fix the allocation flags would be extremely cumbersome.

Even if we did that, lockdep would still complain, because such
a path could be exercised before the socket is passed to NBD, and
lockdep would then learn that the path can trigger fs reclaim.

Additionally, since the socket is exposed to userspace, we
cannot change the lockdep key (even for sk->sk_lock.dep_map,
due to lock_sock_fast()).

We could spread memalloc_noio_{save,restore} over the networking
code, but we want to avoid that and solve it in the NBD layer,
which requires the trylock variant of lock_sock().

Let's introduce lock_sock_try() for that purpose.
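
As an illustration only (not part of this patch), the intended
semantics can be modelled in userspace roughly as below. The
model_* names and the struct are hypothetical; an atomic_flag
stands in for sk->sk_lock.slock, and the backlog processing,
waiter wakeups and lockdep annotations of the real
lock_sock()/release_sock() are deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for sk->sk_lock; not the kernel type. */
struct model_sock_lock {
	atomic_flag slock;	/* models sk_lock.slock */
	int owned;		/* models sk_lock.owned */
};

#define MODEL_SOCK_LOCK_INIT { ATOMIC_FLAG_INIT, 0 }

/* Take ownership if that is possible without waiting; never blocks. */
static bool model_lock_sock_try(struct model_sock_lock *l)
{
	/* Models spin_trylock_bh(): give up instead of spinning. */
	if (atomic_flag_test_and_set_explicit(&l->slock, memory_order_acquire))
		return false;

	/* Someone already owns the socket: drop the spinlock and fail. */
	if (l->owned) {
		atomic_flag_clear_explicit(&l->slock, memory_order_release);
		return false;
	}

	l->owned = 1;
	atomic_flag_clear_explicit(&l->slock, memory_order_release);
	return true;
}

/* Simplified release: only clears ownership under the spinlock. */
static void model_release_sock(struct model_sock_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->slock, memory_order_acquire))
		;	/* spin until the short critical section is free */
	assert(l->owned);
	l->owned = 0;
	atomic_flag_clear_explicit(&l->slock, memory_order_release);
}
```

In this model, a second model_lock_sock_try() on an already-owned
lock returns false immediately rather than sleeping, which is the
property a caller running under fs reclaim would rely on.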

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 include/net/sock.h | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index 6c9a83016e95..69e4b8d17afb 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1710,6 +1710,37 @@ static inline void lock_sock(struct sock *sk)
 }
 
 void __lock_sock(struct sock *sk);
+
+/**
+ * lock_sock_try - trylock version of lock_sock
+ * @sk: socket
+ *
+ * Use of this function is strongly discouraged.
+ *
+ * It is primarily intended for NBD, where the driver must avoid
+ * deadlock during fs reclaim caused by the backend socket remaining
+ * exposed to userspace even after being handed over to NBD,
+ * which _is_ bad but too late to change.
+ *
+ * Return: true if the lock was acquired, false otherwise.
+ */
+static inline bool lock_sock_try(struct sock *sk)
+{
+	if (!spin_trylock_bh(&sk->sk_lock.slock))
+		return false;
+
+	if (sk->sk_lock.owned) {
+		spin_unlock_bh(&sk->sk_lock.slock);
+		return false;
+	}
+
+	sk->sk_lock.owned = 1;
+	spin_unlock_bh(&sk->sk_lock.slock);
+
+	mutex_acquire(&sk->sk_lock.dep_map, 0, 1, _RET_IP_);
+	return true;
+}
+
 void __release_sock(struct sock *sk);
 void release_sock(struct sock *sk);
 
-- 
2.53.0.1018.g2bb0e51243-goog


Thread overview: 8+ messages
2026-03-25  6:38 [PATCH 0/5] nbd: Fix deadlock during fs reclaim under lock_sock() Kuniyuki Iwashima
2026-03-25  6:38 ` [PATCH 1/5] nbd: Remove redundant sock->ops->shutdown() check in nbd_get_socket() Kuniyuki Iwashima
2026-03-25  6:38 ` [PATCH 2/5] nbd: Reject unconnected sockets " Kuniyuki Iwashima
2026-03-25  6:38 ` Kuniyuki Iwashima [this message]
2026-03-25  6:38 ` [PATCH 4/5] inet: Add inet_shutdown_locked() Kuniyuki Iwashima
2026-03-25  6:38 ` [PATCH 5/5] nbd: Use lock_sock_try() for TCP sendmsg() and shutdown() Kuniyuki Iwashima
2026-03-26 19:13   ` kernel test robot
2026-03-26 19:38     ` Kuniyuki Iwashima
