linux-cifs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V3] smb: client: Fix mount deadlock by avoiding super block iteration in DFS reconnect
@ 2025-08-13  1:32 Wang Zhaolong
  2025-08-13 16:53 ` Steve French
  2025-08-14 13:47 ` Paulo Alcantara
  0 siblings, 2 replies; 4+ messages in thread
From: Wang Zhaolong @ 2025-08-13  1:32 UTC (permalink / raw)
  To: sfrench, pc
  Cc: linux-cifs, samba-technical, linux-kernel, chengzhihao1, yi.zhang,
	yangerkun

An AA deadlock occurs when network interruption during mount triggers
DFS reconnection logic that calls iterate_supers_type().

The detailed call process is as follows:

      mount.cifs
-------------------------
path_mount
  do_new_mount
    vfs_get_tree
      smb3_get_tree
        cifs_smb3_do_mount
          sget
            alloc_super
              down_write_nested(&s->s_umount, ..);  // Hold lock
          cifs_root_iget
            cifs_get_inode_info
              smb2_query_path_info
                smb2_compound_op
                  SMB2_open_init
                    smb2_plain_req_init
                      smb2_reconnect           // Trigger reconnection
                        cifs_tree_connect
                          cifs_get_dfs_tcon_super
                            __cifs_get_super
                              iterate_supers_type
                                super_lock_shared
                                  super_lock
                                    __super_lock
                                      down_read(&sb->s_umount); // Deadlock
    do_new_mount_fc
      up_write(&sb->s_umount);  // Release lock

During mount phase, if reconnection is triggered, the foreground mount
process may enter smb2_reconnect prior to the reconnect worker being
scheduled, leading to a deadlock when subsequent DFS tree connect
attempts reacquire the s_umount lock.

The essential condition for triggering the issue is that the API
iterate_supers_type() reacquires the s_umount lock. Therefore, one
possible solution is to avoid using iterate_supers_type() and instead
directly access the superblock through internal data structures.

This patch fixes the problem by:
- Add vfs_sb back-pointer to cifs_sb_info for direct access
- Protect list traversal with existing tcon->sb_list_lock
- Use atomic operations to safely manage super block references
- Remove complex callback-based iteration in favor of simple loop
- Rename cifs_put_tcp_super() to cifs_put_super() to avoid confusion

Fixes: 3ae872de4107 ("smb: client: fix shared DFS root mounts with different prefixes")
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
---

V3:
 - Adjust the trace diagram for the super_lock_shared() section to align with
   the latest mainline call flow. 
V2:
 - Adjust the trace diagram in the commit message to indicate when the lock
   is released

 fs/smb/client/cifs_fs_sb.h |  1 +
 fs/smb/client/cifsfs.c     |  1 +
 fs/smb/client/cifsproto.h  |  2 +-
 fs/smb/client/dfs.c        |  2 +-
 fs/smb/client/misc.c       | 84 ++++++++++++++------------------------
 5 files changed, 34 insertions(+), 56 deletions(-)

diff --git a/fs/smb/client/cifs_fs_sb.h b/fs/smb/client/cifs_fs_sb.h
index 5e8d163cb5f8..8c513e4c0efe 100644
--- a/fs/smb/client/cifs_fs_sb.h
+++ b/fs/smb/client/cifs_fs_sb.h
@@ -49,10 +49,11 @@
 
 struct cifs_sb_info {
 	struct rb_root tlink_tree;
 	struct list_head tcon_sb_link;
 	spinlock_t tlink_tree_lock;
+	struct super_block *vfs_sb;
 	struct tcon_link *master_tlink;
 	struct nls_table *local_nls;
 	struct smb3_fs_context *ctx;
 	atomic_t active;
 	unsigned int mnt_cifs_flags;
diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
index 3bd85ab2deb1..383f651eb43f 100644
--- a/fs/smb/client/cifsfs.c
+++ b/fs/smb/client/cifsfs.c
@@ -939,10 +939,11 @@ cifs_get_root(struct smb3_fs_context *ctx, struct super_block *sb)
 
 static int cifs_set_super(struct super_block *sb, void *data)
 {
 	struct cifs_mnt_data *mnt_data = data;
 	sb->s_fs_info = mnt_data->cifs_sb;
+	mnt_data->cifs_sb->vfs_sb = sb;
 	return set_anon_super(sb, NULL);
 }
 
 struct dentry *
 cifs_smb3_do_mount(struct file_system_type *fs_type,
diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h
index c34c533b2efa..6415bb961c1e 100644
--- a/fs/smb/client/cifsproto.h
+++ b/fs/smb/client/cifsproto.h
@@ -678,11 +678,11 @@ int copy_path_name(char *dst, const char *src);
 int smb2_parse_query_directory(struct cifs_tcon *tcon, struct kvec *rsp_iov,
 			       int resp_buftype,
 			       struct cifs_search_info *srch_inf);
 
 struct super_block *cifs_get_dfs_tcon_super(struct cifs_tcon *tcon);
-void cifs_put_tcp_super(struct super_block *sb);
+void cifs_put_super(struct super_block *sb);
 int cifs_update_super_prepath(struct cifs_sb_info *cifs_sb, char *prefix);
 char *extract_hostname(const char *unc);
 char *extract_sharename(const char *unc);
 int parse_reparse_point(struct reparse_data_buffer *buf,
 			u32 plen, struct cifs_sb_info *cifs_sb,
diff --git a/fs/smb/client/dfs.c b/fs/smb/client/dfs.c
index f65a8a90ba27..55bcdde4fe26 100644
--- a/fs/smb/client/dfs.c
+++ b/fs/smb/client/dfs.c
@@ -446,11 +446,11 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon)
 				     &tl);
 	free_dfs_info_param(&ref);
 
 out:
 	kfree(tree);
-	cifs_put_tcp_super(sb);
+	cifs_put_super(sb);
 
 	if (rc) {
 		spin_lock(&tcon->tc_lock);
 		if (tcon->status == TID_IN_TCON)
 			tcon->status = TID_NEED_TCON;
diff --git a/fs/smb/client/misc.c b/fs/smb/client/misc.c
index da23cc12a52c..3b6920a52daa 100644
--- a/fs/smb/client/misc.c
+++ b/fs/smb/client/misc.c
@@ -1108,84 +1108,60 @@ int copy_path_name(char *dst, const char *src)
 	/* we count the trailing nul */
 	name_len++;
 	return name_len;
 }
 
-struct super_cb_data {
-	void *data;
-	struct super_block *sb;
-};
-
-static void tcon_super_cb(struct super_block *sb, void *arg)
+static struct super_block *cifs_get_tcon_super(struct cifs_tcon *tcon)
 {
-	struct super_cb_data *sd = arg;
+	struct super_block *sb;
 	struct cifs_sb_info *cifs_sb;
-	struct cifs_tcon *t1 = sd->data, *t2;
 
-	if (sd->sb)
-		return;
+	if (!tcon)
+		return NULL;
 
-	cifs_sb = CIFS_SB(sb);
-	t2 = cifs_sb_master_tcon(cifs_sb);
-
-	spin_lock(&t2->tc_lock);
-	if ((t1->ses == t2->ses ||
-	     t1->ses->dfs_root_ses == t2->ses->dfs_root_ses) &&
-	    t1->ses->server == t2->ses->server &&
-	    t2->origin_fullpath &&
-	    dfs_src_pathname_equal(t2->origin_fullpath, t1->origin_fullpath))
-		sd->sb = sb;
-	spin_unlock(&t2->tc_lock);
-}
+	spin_lock(&tcon->sb_list_lock);
+	list_for_each_entry(cifs_sb, &tcon->cifs_sb_list, tcon_sb_link) {
 
-static struct super_block *__cifs_get_super(void (*f)(struct super_block *, void *),
-					    void *data)
-{
-	struct super_cb_data sd = {
-		.data = data,
-		.sb = NULL,
-	};
-	struct file_system_type **fs_type = (struct file_system_type *[]) {
-		&cifs_fs_type, &smb3_fs_type, NULL,
-	};
-
-	for (; *fs_type; fs_type++) {
-		iterate_supers_type(*fs_type, f, &sd);
-		if (sd.sb) {
-			/*
-			 * Grab an active reference in order to prevent automounts (DFS links)
-			 * of expiring and then freeing up our cifs superblock pointer while
-			 * we're doing failover.
-			 */
-			cifs_sb_active(sd.sb);
-			return sd.sb;
-		}
+		if (!cifs_sb->vfs_sb)
+			continue;
+
+		sb = cifs_sb->vfs_sb;
+
+		/* Safely increment s_active only if it's not zero.
+		 *
+		 * When s_active == 0, the super block is being deactivated
+		 * and should not be used. This prevents UAF scenarios
+		 * where we might grab a reference to a super block that's
+		 * in the middle of destruction.
+		 */
+		if (!atomic_add_unless(&sb->s_active, 1, 0))
+			continue;
+
+		spin_unlock(&tcon->sb_list_lock);
+		return sb;
 	}
-	pr_warn_once("%s: could not find dfs superblock\n", __func__);
-	return ERR_PTR(-EINVAL);
-}
+	spin_unlock(&tcon->sb_list_lock);
 
-static void __cifs_put_super(struct super_block *sb)
-{
-	if (!IS_ERR_OR_NULL(sb))
-		cifs_sb_deactive(sb);
+	return NULL;
 }
 
 struct super_block *cifs_get_dfs_tcon_super(struct cifs_tcon *tcon)
 {
 	spin_lock(&tcon->tc_lock);
 	if (!tcon->origin_fullpath) {
 		spin_unlock(&tcon->tc_lock);
 		return ERR_PTR(-ENOENT);
 	}
 	spin_unlock(&tcon->tc_lock);
-	return __cifs_get_super(tcon_super_cb, tcon);
+
+	return cifs_get_tcon_super(tcon);
 }
 
-void cifs_put_tcp_super(struct super_block *sb)
+void cifs_put_super(struct super_block *sb)
 {
-	__cifs_put_super(sb);
+	if (!IS_ERR_OR_NULL(sb))
+		deactivate_super(sb);
 }
 
 #ifdef CONFIG_CIFS_DFS_UPCALL
 int match_target_ip(struct TCP_Server_Info *server,
 		    const char *host, size_t hostlen,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH V3] smb: client: Fix mount deadlock by avoiding super block iteration in DFS reconnect
  2025-08-13  1:32 [PATCH V3] smb: client: Fix mount deadlock by avoiding super block iteration in DFS reconnect Wang Zhaolong
@ 2025-08-13 16:53 ` Steve French
  2025-08-14 13:47 ` Paulo Alcantara
  1 sibling, 0 replies; 4+ messages in thread
From: Steve French @ 2025-08-13 16:53 UTC (permalink / raw)
  To: Wang Zhaolong
  Cc: pc, linux-cifs, samba-technical, linux-kernel, chengzhihao1,
	yi.zhang, yangerkun

Tentatively merged into cifs-2.6.git for-next but still pending more
testing and reviews

On Tue, Aug 12, 2025 at 8:39 PM Wang Zhaolong
<wangzhaolong@huaweicloud.com> wrote:
>
> An AA deadlock occurs when network interruption during mount triggers
> DFS reconnection logic that calls iterate_supers_type().
>
> The detailed call process is as follows:
>
>       mount.cifs
> -------------------------
> path_mount
>   do_new_mount
>     vfs_get_tree
>       smb3_get_tree
>         cifs_smb3_do_mount
>           sget
>             alloc_super
>               down_write_nested(&s->s_umount, ..);  // Hold lock
>           cifs_root_iget
>             cifs_get_inode_info
>               smb2_query_path_info
>                 smb2_compound_op
>                   SMB2_open_init
>                     smb2_plain_req_init
>                       smb2_reconnect           // Trigger reconnection
>                         cifs_tree_connect
>                           cifs_get_dfs_tcon_super
>                             __cifs_get_super
>                               iterate_supers_type
>                                 super_lock_shared
>                                   super_lock
>                                     __super_lock
>                                       down_read(&sb->s_umount); // Deadlock
>     do_new_mount_fc
>       up_write(&sb->s_umount);  // Release lock
>
> During mount phase, if reconnection is triggered, the foreground mount
> process may enter smb2_reconnect prior to the reconnect worker being
> scheduled, leading to a deadlock when subsequent DFS tree connect
> attempts reacquire the s_umount lock.
>
> The essential condition for triggering the issue is that the API
> iterate_supers_type() reacquires the s_umount lock. Therefore, one
> possible solution is to avoid using iterate_supers_type() and instead
> directly access the superblock through internal data structures.
>
> This patch fixes the problem by:
> - Add vfs_sb back-pointer to cifs_sb_info for direct access
> - Protect list traversal with existing tcon->sb_list_lock
> - Use atomic operations to safely manage super block references
> - Remove complex callback-based iteration in favor of simple loop
> - Rename cifs_put_tcp_super() to cifs_put_super() to avoid confusion
>
> Fixes: 3ae872de4107 ("smb: client: fix shared DFS root mounts with different prefixes")
> Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
> ---
>
> V3:
>  - Adjust the trace diagram for the super_lock_shared() section to align with
>    the latest mainline call flow.
> V2:
>  - Adjust the trace diagram in the commit message to indicate when the lock
>    is released
>
>  fs/smb/client/cifs_fs_sb.h |  1 +
>  fs/smb/client/cifsfs.c     |  1 +
>  fs/smb/client/cifsproto.h  |  2 +-
>  fs/smb/client/dfs.c        |  2 +-
>  fs/smb/client/misc.c       | 84 ++++++++++++++------------------------
>  5 files changed, 34 insertions(+), 56 deletions(-)
>
> diff --git a/fs/smb/client/cifs_fs_sb.h b/fs/smb/client/cifs_fs_sb.h
> index 5e8d163cb5f8..8c513e4c0efe 100644
> --- a/fs/smb/client/cifs_fs_sb.h
> +++ b/fs/smb/client/cifs_fs_sb.h
> @@ -49,10 +49,11 @@
>
>  struct cifs_sb_info {
>         struct rb_root tlink_tree;
>         struct list_head tcon_sb_link;
>         spinlock_t tlink_tree_lock;
> +       struct super_block *vfs_sb;
>         struct tcon_link *master_tlink;
>         struct nls_table *local_nls;
>         struct smb3_fs_context *ctx;
>         atomic_t active;
>         unsigned int mnt_cifs_flags;
> diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
> index 3bd85ab2deb1..383f651eb43f 100644
> --- a/fs/smb/client/cifsfs.c
> +++ b/fs/smb/client/cifsfs.c
> @@ -939,10 +939,11 @@ cifs_get_root(struct smb3_fs_context *ctx, struct super_block *sb)
>
>  static int cifs_set_super(struct super_block *sb, void *data)
>  {
>         struct cifs_mnt_data *mnt_data = data;
>         sb->s_fs_info = mnt_data->cifs_sb;
> +       mnt_data->cifs_sb->vfs_sb = sb;
>         return set_anon_super(sb, NULL);
>  }
>
>  struct dentry *
>  cifs_smb3_do_mount(struct file_system_type *fs_type,
> diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h
> index c34c533b2efa..6415bb961c1e 100644
> --- a/fs/smb/client/cifsproto.h
> +++ b/fs/smb/client/cifsproto.h
> @@ -678,11 +678,11 @@ int copy_path_name(char *dst, const char *src);
>  int smb2_parse_query_directory(struct cifs_tcon *tcon, struct kvec *rsp_iov,
>                                int resp_buftype,
>                                struct cifs_search_info *srch_inf);
>
>  struct super_block *cifs_get_dfs_tcon_super(struct cifs_tcon *tcon);
> -void cifs_put_tcp_super(struct super_block *sb);
> +void cifs_put_super(struct super_block *sb);
>  int cifs_update_super_prepath(struct cifs_sb_info *cifs_sb, char *prefix);
>  char *extract_hostname(const char *unc);
>  char *extract_sharename(const char *unc);
>  int parse_reparse_point(struct reparse_data_buffer *buf,
>                         u32 plen, struct cifs_sb_info *cifs_sb,
> diff --git a/fs/smb/client/dfs.c b/fs/smb/client/dfs.c
> index f65a8a90ba27..55bcdde4fe26 100644
> --- a/fs/smb/client/dfs.c
> +++ b/fs/smb/client/dfs.c
> @@ -446,11 +446,11 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon)
>                                      &tl);
>         free_dfs_info_param(&ref);
>
>  out:
>         kfree(tree);
> -       cifs_put_tcp_super(sb);
> +       cifs_put_super(sb);
>
>         if (rc) {
>                 spin_lock(&tcon->tc_lock);
>                 if (tcon->status == TID_IN_TCON)
>                         tcon->status = TID_NEED_TCON;
> diff --git a/fs/smb/client/misc.c b/fs/smb/client/misc.c
> index da23cc12a52c..3b6920a52daa 100644
> --- a/fs/smb/client/misc.c
> +++ b/fs/smb/client/misc.c
> @@ -1108,84 +1108,60 @@ int copy_path_name(char *dst, const char *src)
>         /* we count the trailing nul */
>         name_len++;
>         return name_len;
>  }
>
> -struct super_cb_data {
> -       void *data;
> -       struct super_block *sb;
> -};
> -
> -static void tcon_super_cb(struct super_block *sb, void *arg)
> +static struct super_block *cifs_get_tcon_super(struct cifs_tcon *tcon)
>  {
> -       struct super_cb_data *sd = arg;
> +       struct super_block *sb;
>         struct cifs_sb_info *cifs_sb;
> -       struct cifs_tcon *t1 = sd->data, *t2;
>
> -       if (sd->sb)
> -               return;
> +       if (!tcon)
> +               return NULL;
>
> -       cifs_sb = CIFS_SB(sb);
> -       t2 = cifs_sb_master_tcon(cifs_sb);
> -
> -       spin_lock(&t2->tc_lock);
> -       if ((t1->ses == t2->ses ||
> -            t1->ses->dfs_root_ses == t2->ses->dfs_root_ses) &&
> -           t1->ses->server == t2->ses->server &&
> -           t2->origin_fullpath &&
> -           dfs_src_pathname_equal(t2->origin_fullpath, t1->origin_fullpath))
> -               sd->sb = sb;
> -       spin_unlock(&t2->tc_lock);
> -}
> +       spin_lock(&tcon->sb_list_lock);
> +       list_for_each_entry(cifs_sb, &tcon->cifs_sb_list, tcon_sb_link) {
>
> -static struct super_block *__cifs_get_super(void (*f)(struct super_block *, void *),
> -                                           void *data)
> -{
> -       struct super_cb_data sd = {
> -               .data = data,
> -               .sb = NULL,
> -       };
> -       struct file_system_type **fs_type = (struct file_system_type *[]) {
> -               &cifs_fs_type, &smb3_fs_type, NULL,
> -       };
> -
> -       for (; *fs_type; fs_type++) {
> -               iterate_supers_type(*fs_type, f, &sd);
> -               if (sd.sb) {
> -                       /*
> -                        * Grab an active reference in order to prevent automounts (DFS links)
> -                        * of expiring and then freeing up our cifs superblock pointer while
> -                        * we're doing failover.
> -                        */
> -                       cifs_sb_active(sd.sb);
> -                       return sd.sb;
> -               }
> +               if (!cifs_sb->vfs_sb)
> +                       continue;
> +
> +               sb = cifs_sb->vfs_sb;
> +
> +               /* Safely increment s_active only if it's not zero.
> +                *
> +                * When s_active == 0, the super block is being deactivated
> +                * and should not be used. This prevents UAF scenarios
> +                * where we might grab a reference to a super block that's
> +                * in the middle of destruction.
> +                */
> +               if (!atomic_add_unless(&sb->s_active, 1, 0))
> +                       continue;
> +
> +               spin_unlock(&tcon->sb_list_lock);
> +               return sb;
>         }
> -       pr_warn_once("%s: could not find dfs superblock\n", __func__);
> -       return ERR_PTR(-EINVAL);
> -}
> +       spin_unlock(&tcon->sb_list_lock);
>
> -static void __cifs_put_super(struct super_block *sb)
> -{
> -       if (!IS_ERR_OR_NULL(sb))
> -               cifs_sb_deactive(sb);
> +       return NULL;
>  }
>
>  struct super_block *cifs_get_dfs_tcon_super(struct cifs_tcon *tcon)
>  {
>         spin_lock(&tcon->tc_lock);
>         if (!tcon->origin_fullpath) {
>                 spin_unlock(&tcon->tc_lock);
>                 return ERR_PTR(-ENOENT);
>         }
>         spin_unlock(&tcon->tc_lock);
> -       return __cifs_get_super(tcon_super_cb, tcon);
> +
> +       return cifs_get_tcon_super(tcon);
>  }
>
> -void cifs_put_tcp_super(struct super_block *sb)
> +void cifs_put_super(struct super_block *sb)
>  {
> -       __cifs_put_super(sb);
> +       if (!IS_ERR_OR_NULL(sb))
> +               deactivate_super(sb);
>  }
>
>  #ifdef CONFIG_CIFS_DFS_UPCALL
>  int match_target_ip(struct TCP_Server_Info *server,
>                     const char *host, size_t hostlen,
> --
> 2.39.2
>
>


-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH V3] smb: client: Fix mount deadlock by avoiding super block iteration in DFS reconnect
  2025-08-13  1:32 [PATCH V3] smb: client: Fix mount deadlock by avoiding super block iteration in DFS reconnect Wang Zhaolong
  2025-08-13 16:53 ` Steve French
@ 2025-08-14 13:47 ` Paulo Alcantara
  2025-08-15  2:55   ` Wang Zhaolong
  1 sibling, 1 reply; 4+ messages in thread
From: Paulo Alcantara @ 2025-08-14 13:47 UTC (permalink / raw)
  To: Wang Zhaolong, sfrench
  Cc: linux-cifs, samba-technical, linux-kernel, chengzhihao1, yi.zhang,
	yangerkun

Wang Zhaolong <wangzhaolong@huaweicloud.com> writes:

> An AA deadlock occurs when network interruption during mount triggers
> DFS reconnection logic that calls iterate_supers_type().
>
> The detailed call process is as follows:
>
>       mount.cifs
> -------------------------
> path_mount
>   do_new_mount
>     vfs_get_tree
>       smb3_get_tree
>         cifs_smb3_do_mount
>           sget
>             alloc_super
>               down_write_nested(&s->s_umount, ..);  // Hold lock
>           cifs_root_iget
>             cifs_get_inode_info
>               smb2_query_path_info
>                 smb2_compound_op
>                   SMB2_open_init
>                     smb2_plain_req_init
>                       smb2_reconnect           // Trigger reconnection
>                         cifs_tree_connect
>                           cifs_get_dfs_tcon_super
>                             __cifs_get_super
>                               iterate_supers_type
>                                 super_lock_shared
>                                   super_lock
>                                     __super_lock
>                                       down_read(&sb->s_umount); // Deadlock
>     do_new_mount_fc
>       up_write(&sb->s_umount);  // Release lock
>
> During mount phase, if reconnection is triggered, the foreground mount
> process may enter smb2_reconnect prior to the reconnect worker being
> scheduled, leading to a deadlock when subsequent DFS tree connect
> attempts reacquire the s_umount lock.
>
> The essential condition for triggering the issue is that the API
> iterate_supers_type() reacquires the s_umount lock. Therefore, one
> possible solution is to avoid using iterate_supers_type() and instead
> directly access the superblock through internal data structures.
>
> This patch fixes the problem by:
> - Add vfs_sb back-pointer to cifs_sb_info for direct access
> - Protect list traversal with existing tcon->sb_list_lock
> - Use atomic operations to safely manage super block references
> - Remove complex callback-based iteration in favor of simple loop
> - Rename cifs_put_tcp_super() to cifs_put_super() to avoid confusion
>
> Fixes: 3ae872de4107 ("smb: client: fix shared DFS root mounts with different prefixes")
> Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
> ---
>
> V3:
>  - Adjust the trace diagram for the super_lock_shared() section to align with
>    the latest mainline call flow. 
> V2:
>  - Adjust the trace diagram in the commit message to indicate when the lock
>    is released
>
>  fs/smb/client/cifs_fs_sb.h |  1 +
>  fs/smb/client/cifsfs.c     |  1 +
>  fs/smb/client/cifsproto.h  |  2 +-
>  fs/smb/client/dfs.c        |  2 +-
>  fs/smb/client/misc.c       | 84 ++++++++++++++------------------------
>  5 files changed, 34 insertions(+), 56 deletions(-)

Thanks for the patch.  Unfortunately this introduces a NULL ptr
dereference with DFS multiuser mounts:

[  725.383434] ==================================================================                                                  
[  725.383647] BUG: KASAN: null-ptr-deref in cifs_tree_connect+0x23c/0xc10 [cifs]                                                  
[  725.383983] Read of size 8 at addr 0000000000000680 by task kworker/2:1/82                                                      
[  725.384189]                                                                                                                     
[  725.384241] CPU: 2 UID: 0 PID: 82 Comm: kworker/2:1 Not tainted 6.17.0-rc1 #2 PREEMPT(voluntary)                                
[  725.384245] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-5.fc42 04/01/2014                                   
[  725.384246] Workqueue: cifsiod smb2_reconnect_server [cifs]                                                                     
[  725.384360] Call Trace:                                                                                                         
[  725.384361]  <TASK>                                                                                                             
[  725.384362]  dump_stack_lvl+0x5d/0x80                                                                                           
[  725.384368]  kasan_report+0xdf/0x1a0                                                                                            
[  725.384371]  ? cifs_tree_connect+0x23c/0xc10 [cifs]                                                                             
[  725.384543]  cifs_tree_connect+0x23c/0xc10 [cifs]                                                                               
[  725.384650]  ? __lock_acquire+0xc46/0x19d0                                                                                      
[  725.384653]  ? rcu_lockdep_current_cpu_online+0x62/0xb0                                                                         
[  725.384656]  ? __pfx_cifs_tree_connect+0x10/0x10 [cifs]                                                                         
[  725.384762]  ? lock_acquire+0x157/0x2e0                                                                                         
[  725.384765]  ? lock_acquire+0x157/0x2e0                                                                                         
[  725.384767]  ? __pfx_do_raw_spin_trylock+0x10/0x10                                                                              
[  725.384769]  ? find_held_lock+0x32/0x90                                                                                         
[  725.384772]  ? cifs_mark_open_files_invalid+0xe5/0x130 [cifs]                                                                   
[  725.384887]  ? local_clock_noinstr+0xd/0xd0                                                                                     
[  725.384889]  ? smb2_reconnect+0x31f/0xe10 [cifs]                                                                                
[  725.385004]  ? local_clock+0x15/0x30                                                                                            
[  725.385006]  ? lock_release+0x29b/0x390                                                                                         
[  725.385009]  smb2_reconnect+0x347/0xe10 [cifs]                                                                                  
[  725.385116]  smb2_reconnect_server+0x413/0xd30 [cifs]                                                                           
[  725.385225]  ? __pfx_smb2_reconnect_server+0x10/0x10 [cifs]                                                                     
[  725.385331]  ? local_clock_noinstr+0xd/0xd0                                                                                     
[  725.385333]  ? local_clock+0x15/0x30                                                                                            
[  725.385334]  ? lock_release+0x29b/0x390                                                                                         
[  725.385337]  process_one_work+0x4c5/0xa10                                                                                       
[  725.385341]  ? __pfx_process_one_work+0x10/0x10                                                                                 
[  725.385343]  ? __list_add_valid_or_report+0x37/0x120                                                                            
[  725.385349]  worker_thread+0x2f1/0x5a0                                                                                          
[  725.385351]  ? __kthread_parkme+0xde/0x100                                                                                      
[  725.385354]  ? __pfx_worker_thread+0x10/0x10                                                                                    
[  725.385356]  kthread+0x1fe/0x380                                                                                                
[  725.385358]  ? kthread+0x10f/0x380                                                                                              
[  725.385360]  ? __pfx_kthread+0x10/0x10                                                                                          
[  725.385363]  ? local_clock_noinstr+0x3e/0xd0                                                                                    
[  725.385365]  ? ret_from_fork+0x1b/0x1f0                                                                                         
[  725.385367]  ? local_clock+0x15/0x30                                                                                            
[  725.385369]  ? lock_release+0x29b/0x390                                                                                         
[  725.385370]  ? rcu_is_watching+0x20/0x50                                                                                        
[  725.385373]  ? __pfx_kthread+0x10/0x10                                                                                          
[  725.385375]  ret_from_fork+0x15b/0x1f0                                                                                          
[  725.385377]  ? __pfx_kthread+0x10/0x10                                                                                          
[  725.385380]  ret_from_fork_asm+0x1a/0x30
[  725.385384]  </TASK>

$ ./scripts/faddr2line --list fs/smb/client/cifs.o cifs_tree_connect+0x23c
cifs_tree_connect+0x23c/0xc10:

CIFS_SB at /home/pc/g/linux/fs/smb/client/cifsglob.h:1624
 1619   }
 1620 
 1621   static inline struct cifs_sb_info *
 1622   CIFS_SB(struct super_block *sb)
 1623   {
>1624<          return sb->s_fs_info;
 1625   }
 1626 
 1627   static inline struct cifs_sb_info *
 1628   CIFS_FILE_SB(struct file *file)
 1629   {

(inlined by) cifs_tree_connect at /home/pc/g/linux/fs/smb/client/dfs.c:435
 430                    goto out;
 431            }
 432 
 433            sb = cifs_get_dfs_tcon_super(tcon);
 434            if (!IS_ERR(sb))
>435<                   cifs_sb = CIFS_SB(sb);
 436 
 437            /* Tree connect to last share in @tcon->tree_name if no DFS referral */
 438            if (!server->leaf_fullpath ||
 439                dfs_cache_noreq_find(server->leaf_fullpath + 1, &ref, &tl)) {
 440                    rc = ops->tree_connect(xid, tcon->ses, tcon->tree_name,

You can reproduce with the following:

$ kinit administrator@ZELDA.TEST
$ su testuser -c 'kinit administrator@ZELDA.TEST'
$ mount.cifs //w22-dc1.zelda.test/dfstest/link2 /mnt/1 -o sec=krb5,multiuser,echo_interval=10
$ mount -t cifs
//w22-dc1.zelda.test/dfstest/link2 on /mnt/1 type cifs (rw,relatime,vers=3.1.1,sec=krb5,cruid=0,cache=strict,upcall_target=app,multiuser,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.124.33,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,noperm,reparse=nfs,nativesocket,symlink=native,rsize=4194304,wsize=4194304,bsize=1048576,retrans=1,echo_interval=10,actimeo=1,closetimeo=1)
$ su testuser -c 'ls /mnt/1'
dir1  dir10  dir3  dir5  dir6  dir8  dir9  target1_file.txt  tsub
# disconnect target server 192.168.124.33 and then wait for oops...

The problem seems related to the tcon created for testuser not having a
link to the CIFS superblock, hence the oops when reconnecting the tcon.

Could you please verify?  Thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH V3] smb: client: Fix mount deadlock by avoiding super block iteration in DFS reconnect
  2025-08-14 13:47 ` Paulo Alcantara
@ 2025-08-15  2:55   ` Wang Zhaolong
  0 siblings, 0 replies; 4+ messages in thread
From: Wang Zhaolong @ 2025-08-15  2:55 UTC (permalink / raw)
  To: Paulo Alcantara, sfrench
  Cc: linux-cifs, samba-technical, linux-kernel, chengzhihao1, yi.zhang,
	yangerkun


> 
> $ ./scripts/faddr2line --list fs/smb/client/cifs.o cifs_tree_connect+0x23c
> cifs_tree_connect+0x23c/0xc10:
> 
> CIFS_SB at /home/pc/g/linux/fs/smb/client/cifsglob.h:1624
>   1619   }
>   1620
>   1621   static inline struct cifs_sb_info *
>   1622   CIFS_SB(struct super_block *sb)
>   1623   {
>> 1624<          return sb->s_fs_info;
>   1625   }
>   1626
>   1627   static inline struct cifs_sb_info *
>   1628   CIFS_FILE_SB(struct file *file)
>   1629   {
> 
> (inlined by) cifs_tree_connect at /home/pc/g/linux/fs/smb/client/dfs.c:435
>   430                    goto out;
>   431            }
>   432
>   433            sb = cifs_get_dfs_tcon_super(tcon);
>   434            if (!IS_ERR(sb))
>> 435<                   cifs_sb = CIFS_SB(sb);
>   436
>   437            /* Tree connect to last share in @tcon->tree_name if no DFS referral */
>   438            if (!server->leaf_fullpath ||
>   439                dfs_cache_noreq_find(server->leaf_fullpath + 1, &ref, &tl)) {
>   440                    rc = ops->tree_connect(xid, tcon->ses, tcon->tree_name,
> 
> You can reproduce with the following:
> 
> $ kinit administrator@ZELDA.TEST
> $ su testuser -c 'kinit administrator@ZELDA.TEST'
> $ mount.cifs //w22-dc1.zelda.test/dfstest/link2 /mnt/1 -o sec=krb5,multiuser,echo_interval=10
> $ mount -t cifs
> //w22-dc1.zelda.test/dfstest/link2 on /mnt/1 type cifs (rw,relatime,vers=3.1.1,sec=krb5,cruid=0,cache=strict,upcall_target=app,multiuser,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.124.33,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,noperm,reparse=nfs,nativesocket,symlink=native,rsize=4194304,wsize=4194304,bsize=1048576,retrans=1,echo_interval=10,actimeo=1,closetimeo=1)
> $ su testuser -c 'ls /mnt/1'
> dir1  dir10  dir3  dir5  dir6  dir8  dir9  target1_file.txt  tsub
> # disconnect target server 192.168.124.33 and then wait for oops...
> 
> The problem seems related to the tcon created for testuser not having a
> link to the CIFS superblock, hence the oops when reconnecting the tcon.
> 
> Could you please verify?  Thanks.


Thanks for your detailed feedback and for pointing out the issue

In my current implementation, I assumed that sb->s_fs_info was already
initialized during this process, which led to the issue. I will carefully
address this problem and ensure the link to the CIFS superblock is properly
established. I'll revise the patch accordingly and send out a V4 version as
soon as possible.

Thanks again for your review and support!

Best regards,
Wang Zhaolong


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-08-15  2:55 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-13  1:32 [PATCH V3] smb: client: Fix mount deadlock by avoiding super block iteration in DFS reconnect Wang Zhaolong
2025-08-13 16:53 ` Steve French
2025-08-14 13:47 ` Paulo Alcantara
2025-08-15  2:55   ` Wang Zhaolong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).