From: David Howells <dhowells@redhat.com>
To: Christian Brauner <christian@brauner.io>,
Steve French <sfrench@samba.org>
Cc: David Howells <dhowells@redhat.com>,
Paulo Alcantara <pc@manguebit.com>,
netfs@lists.linux.dev, linux-afs@lists.infradead.org,
linux-cifs@vger.kernel.org, linux-nfs@vger.kernel.org,
ceph-devel@vger.kernel.org, v9fs@lists.linux.dev,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
Paulo Alcantara <pc@manguebit.org>
Subject: [PATCH v2 14/16] smb: client: fix potential deadlock when reconnecting channels
Date: Wed, 25 Jun 2025 17:42:09 +0100 [thread overview]
Message-ID: <20250625164213.1408754-15-dhowells@redhat.com> (raw)
In-Reply-To: <20250625164213.1408754-1-dhowells@redhat.com>
From: Paulo Alcantara <pc@manguebit.org>
Fix cifs_signal_cifsd_for_reconnect() to take the correct lock order
and prevent the following deadlock from happening
======================================================
WARNING: possible circular locking dependency detected
6.16.0-rc3-build2+ #1301 Tainted: G S W
------------------------------------------------------
cifsd/6055 is trying to acquire lock:
ffff88810ad56038 (&tcp_ses->srv_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x134/0x200
but task is already holding lock:
ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&ret_buf->chan_lock){+.+.}-{3:3}:
validate_chain+0x1cf/0x270
__lock_acquire+0x60e/0x780
lock_acquire.part.0+0xb4/0x1f0
_raw_spin_lock+0x2f/0x40
cifs_setup_session+0x81/0x4b0
cifs_get_smb_ses+0x771/0x900
cifs_mount_get_session+0x7e/0x170
cifs_mount+0x92/0x2d0
cifs_smb3_do_mount+0x161/0x460
smb3_get_tree+0x55/0x90
vfs_get_tree+0x46/0x180
do_new_mount+0x1b0/0x2e0
path_mount+0x6ee/0x740
do_mount+0x98/0xe0
__do_sys_mount+0x148/0x180
do_syscall_64+0xa4/0x260
entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #1 (&ret_buf->ses_lock){+.+.}-{3:3}:
validate_chain+0x1cf/0x270
__lock_acquire+0x60e/0x780
lock_acquire.part.0+0xb4/0x1f0
_raw_spin_lock+0x2f/0x40
cifs_match_super+0x101/0x320
sget+0xab/0x270
cifs_smb3_do_mount+0x1e0/0x460
smb3_get_tree+0x55/0x90
vfs_get_tree+0x46/0x180
do_new_mount+0x1b0/0x2e0
path_mount+0x6ee/0x740
do_mount+0x98/0xe0
__do_sys_mount+0x148/0x180
do_syscall_64+0xa4/0x260
entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #0 (&tcp_ses->srv_lock){+.+.}-{3:3}:
check_noncircular+0x95/0xc0
check_prev_add+0x115/0x2f0
validate_chain+0x1cf/0x270
__lock_acquire+0x60e/0x780
lock_acquire.part.0+0xb4/0x1f0
_raw_spin_lock+0x2f/0x40
cifs_signal_cifsd_for_reconnect+0x134/0x200
__cifs_reconnect+0x8f/0x500
cifs_handle_standard+0x112/0x280
cifs_demultiplex_thread+0x64d/0xbc0
kthread+0x2f7/0x310
ret_from_fork+0x2a/0x230
ret_from_fork_asm+0x1a/0x30
other info that might help us debug this:
Chain exists of:
&tcp_ses->srv_lock --> &ret_buf->ses_lock --> &ret_buf->chan_lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&ret_buf->chan_lock);
lock(&ret_buf->ses_lock);
lock(&ret_buf->chan_lock);
lock(&tcp_ses->srv_lock);
*** DEADLOCK ***
3 locks held by cifsd/6055:
#0: ffffffff857de398 (&cifs_tcp_ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x7b/0x200
#1: ffff888119c64060 (&ret_buf->ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x9c/0x200
#2: ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200
Cc: linux-cifs@vger.kernel.org
Reported-by: David Howells <dhowells@redhat.com>
Fixes: d7d7a66aacd6 ("cifs: avoid use of global locks for high contention data")
Reviewed-by: David Howells <dhowells@redhat.com>
Tested-by: David Howells <dhowells@redhat.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: David Howells <dhowells@redhat.com>
---
fs/smb/client/cifsglob.h | 1 +
fs/smb/client/connect.c | 53 +++++++++++++++++++++++-----------------
2 files changed, 32 insertions(+), 22 deletions(-)
diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h
index da963294a573..fdd95e5100cd 100644
--- a/fs/smb/client/cifsglob.h
+++ b/fs/smb/client/cifsglob.h
@@ -709,6 +709,7 @@ inc_rfc1001_len(void *buf, int count)
struct TCP_Server_Info {
struct list_head tcp_ses_list;
struct list_head smb_ses_list;
+ struct list_head rlist; /* reconnect list */
spinlock_t srv_lock; /* protect anything here that is not protected */
__u64 conn_id; /* connection identifier (useful for debugging) */
int srv_count; /* reference counter */
diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index c48869c29e15..864de24e23d5 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -124,6 +124,13 @@ static void smb2_query_server_interfaces(struct work_struct *work)
(SMB_INTERFACE_POLL_INTERVAL * HZ));
}
+#define set_need_reco(server) \
+do { \
+ guard(spinlock)(&server->srv_lock); \
+ if (server->tcpStatus != CifsExiting) \
+ server->tcpStatus = CifsNeedReconnect; \
+} while (0)
+
/*
* Update the tcpStatus for the server.
* This is used to signal the cifsd thread to call cifs_reconnect
@@ -137,39 +144,41 @@ void
cifs_signal_cifsd_for_reconnect(struct TCP_Server_Info *server,
bool all_channels)
{
- struct TCP_Server_Info *pserver;
+ struct TCP_Server_Info *nserver;
struct cifs_ses *ses;
+ LIST_HEAD(reco);
int i;
- /* If server is a channel, select the primary channel */
- pserver = SERVER_IS_CHAN(server) ? server->primary_server : server;
-
/* if we need to signal just this channel */
if (!all_channels) {
- spin_lock(&server->srv_lock);
- if (server->tcpStatus != CifsExiting)
- server->tcpStatus = CifsNeedReconnect;
- spin_unlock(&server->srv_lock);
+ set_need_reco(server);
return;
}
- spin_lock(&cifs_tcp_ses_lock);
- list_for_each_entry(ses, &pserver->smb_ses_list, smb_ses_list) {
- if (cifs_ses_exiting(ses))
- continue;
- spin_lock(&ses->chan_lock);
- for (i = 0; i < ses->chan_count; i++) {
- if (!ses->chans[i].server)
+ if (SERVER_IS_CHAN(server))
+ server = server->primary_server;
+ scoped_guard(spinlock, &cifs_tcp_ses_lock) {
+ set_need_reco(server);
+ list_for_each_entry(ses, &server->smb_ses_list, smb_ses_list) {
+ guard(spinlock)(&ses->ses_lock);
+ if (ses->ses_status == SES_EXITING)
continue;
-
- spin_lock(&ses->chans[i].server->srv_lock);
- if (ses->chans[i].server->tcpStatus != CifsExiting)
- ses->chans[i].server->tcpStatus = CifsNeedReconnect;
- spin_unlock(&ses->chans[i].server->srv_lock);
+ guard(spinlock)(&ses->chan_lock);
+ for (i = 1; i < ses->chan_count; i++) {
+ nserver = ses->chans[i].server;
+ if (!nserver)
+ continue;
+ nserver->srv_count++;
+ list_add(&nserver->rlist, &reco);
+ }
}
- spin_unlock(&ses->chan_lock);
}
- spin_unlock(&cifs_tcp_ses_lock);
+
+ list_for_each_entry_safe(server, nserver, &reco, rlist) {
+ list_del_init(&server->rlist);
+ set_need_reco(server);
+ cifs_put_tcp_session(server, 0);
+ }
}
/*
next prev parent reply other threads:[~2025-06-25 16:43 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-25 16:41 [PATCH v2 00/16] netfs, cifs: Fixes to retry-related code and RDMA support David Howells
2025-06-25 16:41 ` [PATCH v2 01/16] netfs: Fix hang due to missing case in final DIO read result collection David Howells
2025-06-28 5:16 ` Steve French
2025-06-25 16:41 ` [PATCH v2 02/16] netfs: Put double put of request David Howells
2025-06-25 16:41 ` [PATCH v2 03/16] netfs: Provide helpers to perform NETFS_RREQ_IN_PROGRESS flag wangling David Howells
2025-06-25 16:41 ` [PATCH v2 04/16] netfs: Fix looping in wait functions David Howells
2025-06-25 16:42 ` [PATCH v2 05/16] netfs: Fix ref leak on inserted extra subreq in write retry David Howells
2025-06-25 16:42 ` [PATCH v2 06/16] smb: client: set missing retry flag in smb2_writev_callback() David Howells
2025-06-25 16:42 ` [PATCH v2 07/16] smb: client: set missing retry flag in cifs_readv_callback() David Howells
2025-06-25 16:42 ` [PATCH v2 08/16] smb: client: set missing retry flag in cifs_writev_callback() David Howells
2025-06-25 16:42 ` [PATCH v2 09/16] smb: client: fix regression with native SMB symlinks David Howells
2025-06-25 16:42 ` [PATCH v2 10/16] smb: client: fix warning when reconnecting channel David Howells
2025-06-25 16:42 ` [PATCH v2 11/16] smb: client: let smbd_post_send_iter() respect the peers max_send_size and transmit all data David Howells
2025-06-25 16:42 ` [PATCH v2 12/16] cifs: Fix reading into an ITER_FOLIOQ from the smbdirect code David Howells
2025-06-25 17:23 ` Stefan Metzmacher
2025-06-25 17:55 ` [PATCH v3 " David Howells
2025-06-25 18:07 ` Stefan Metzmacher
2025-06-25 18:24 ` Tom Talpey
2025-06-25 18:53 ` David Howells
2025-06-25 16:42 ` [PATCH v2 13/16] cifs: Fix the smbd_reponse slab to allow usercopy David Howells
2025-06-25 16:42 ` David Howells [this message]
2025-06-25 16:42 ` [PATCH v2 15/16] netfs: Renumber the NETFS_RREQ_* flags to make traces easier to read David Howells
2025-06-25 16:42 ` [PATCH v2 16/16] netfs: Update tracepoints in a number of ways David Howells
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250625164213.1408754-15-dhowells@redhat.com \
--to=dhowells@redhat.com \
--cc=ceph-devel@vger.kernel.org \
--cc=christian@brauner.io \
--cc=linux-afs@lists.infradead.org \
--cc=linux-cifs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=netfs@lists.linux.dev \
--cc=pc@manguebit.com \
--cc=pc@manguebit.org \
--cc=sfrench@samba.org \
--cc=v9fs@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).