public inbox for linux-cifs@vger.kernel.org
From: Paulo Alcantara <pc@manguebit.com>
To: nspmangalore@gmail.com, smfrench@gmail.com,
	bharathsm.hsk@gmail.com, linux-cifs@vger.kernel.org
Cc: Shyam Prasad N <sprasad@microsoft.com>
Subject: Re: [PATCH 13/14] cifs: display the endpoint IP details in DebugData
Date: Wed, 01 Nov 2023 11:12:41 -0300
Message-ID: <d1c99946663662e7160bf1ed0a6b2dc6.pc@manguebit.com>
In-Reply-To: <notmuch-sha1-260ef7fe7af7face0e1486229c0fda5149fe14e2>

Paulo Alcantara <pc@manguebit.com> writes:

>> @@ -515,7 +573,18 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v)
>>  				seq_printf(m, "\n\n\tExtra Channels: %zu ",
>>  					   ses->chan_count-1);
>>  				for (j = 1; j < ses->chan_count; j++) {
>> +					/*
>> +					 * kernel_getsockname can block inside
>> +					 * cifs_dump_channel. so drop the lock first
>> +					 */
>> +					server->srv_count++;
>> +					spin_unlock(&cifs_tcp_ses_lock);
>> +
>>  					cifs_dump_channel(m, j, &ses->chans[j]);
>> +
>> +					cifs_put_tcp_session(server, 0);
>> +					spin_lock(&cifs_tcp_ses_lock);
>
> Here you are re-acquiring @cifs_tcp_ses_lock spinlock under
> @ses->chan_lock, which will introduce deadlocks in threads calling
> cifs_match_super(), cifs_signal_cifsd_for_reconnect(),
> cifs_mark_tcp_ses_conns_for_reconnect(), cifs_find_smb_ses(), ...

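A simple reproducer, which triggers both a sleeping-in-atomic-context
splat and a lockdep circular locking dependency warning: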

  $ mount.cifs //srv/share /mnt -o ...,multichannel
  $ cat /proc/fs/cifs/DebugData
  
  [ 1293.512572] BUG: sleeping function called from invalid context at net/core/sock.c:3507
  [ 1293.513915] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1068, name: cat
  [ 1293.515381] preempt_count: 1, expected: 0
  [ 1293.516321] RCU nest depth: 0, expected: 0
  [ 1293.517294] 3 locks held by cat/1068:
  [ 1293.518165]  #0: ffff88800818fc48 (&p->lock){+.+.}-{3:3}, at: seq_read_iter+0x59/0x470
  [ 1293.519383]  #1: ffff88800aed2b28 (&ret_buf->chan_lock){+.+.}-{2:2}, at: cifs_debug_data_proc_show+0x555/0xee0 [cifs]
  [ 1293.520865]  #2: ffff888011c9a540 (sk_lock-AF_INET-CIFS){+.+.}-{0:0}, at: inet_getname+0x29/0xa0
  [ 1293.522098] CPU: 3 PID: 1068 Comm: cat Not tainted 6.6.0-rc7 #2
  [ 1293.522901] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  [ 1293.524368] Call Trace:
  [ 1293.524711]  <TASK>
  [ 1293.525015]  dump_stack_lvl+0x64/0x80
  [ 1293.525519]  __might_resched+0x173/0x280
  [ 1293.526059]  lock_sock_nested+0x43/0x80
  [ 1293.526578]  ? inet_getname+0x29/0xa0
  [ 1293.527097]  inet_getname+0x29/0xa0
  [ 1293.527584]  cifs_debug_data_proc_show+0xcf9/0xee0 [cifs]
  [ 1293.528360]  seq_read_iter+0x118/0x470
  [ 1293.528877]  proc_reg_read_iter+0x53/0x90
  [ 1293.529419]  ? srso_alias_return_thunk+0x5/0x7f
  [ 1293.530037]  vfs_read+0x201/0x350
  [ 1293.530507]  ksys_read+0x75/0x100
  [ 1293.530968]  do_syscall_64+0x3f/0x90
  [ 1293.531461]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
  [ 1293.532138] RIP: 0033:0x7f71d767e381
  [ 1293.532630] Code: ff ff eb c3 e8 0e ea 01 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 90 90 80 3d a5 f6 0e 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 48 83 ec 28 48 89 54
  [ 1293.535095] RSP: 002b:00007ffc312d65a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
  [ 1293.536106] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f71d767e381
  [ 1293.537056] RDX: 0000000000020000 RSI: 00007f71d74f8000 RDI: 0000000000000003
  [ 1293.538003] RBP: 0000000000020000 R08: 00000000ffffffff R09: 0000000000000000
  [ 1293.538957] R10: 0000000000000022 R11: 0000000000000246 R12: 00007f71d74f8000
  [ 1293.539908] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000020000
  [ 1293.540877]  </TASK>
  [ 1293.541233] 
  [ 1293.541449] ======================================================
  [ 1293.542270] WARNING: possible circular locking dependency detected
  [ 1293.543098] 6.6.0-rc7 #2 Tainted: G        W         
  [ 1293.543782] ------------------------------------------------------
  [ 1293.544606] cat/1068 is trying to acquire lock:
  [ 1293.545214] ffffffffc015b5f8 (&cifs_tcp_ses_lock){+.+.}-{2:2}, at: cifs_put_tcp_session+0x1c/0x180 [cifs]
  [ 1293.546516] 
  [ 1293.546516] but task is already holding lock:
  [ 1293.547292] ffff88800aed2b28 (&ret_buf->chan_lock){+.+.}-{2:2}, at: cifs_debug_data_proc_show+0x555/0xee0 [cifs]
  [ 1293.548454] 
  [ 1293.548454] which lock already depends on the new lock.
  [ 1293.548454] 
  [ 1293.549350] 
  [ 1293.549350] the existing dependency chain (in reverse order) is:
  [ 1293.550183] 
  [ 1293.550183] -> #1 (&ret_buf->chan_lock){+.+.}-{2:2}:
  [ 1293.550899]        _raw_spin_lock+0x34/0x80
  [ 1293.551401]        cifs_debug_data_proc_show+0x555/0xee0 [cifs]
  [ 1293.552082]        seq_read_iter+0x118/0x470
  [ 1293.552556]        proc_reg_read_iter+0x53/0x90
  [ 1293.553054]        vfs_read+0x201/0x350
  [ 1293.553490]        ksys_read+0x75/0x100
  [ 1293.553925]        do_syscall_64+0x3f/0x90
  [ 1293.554389]        entry_SYSCALL_64_after_hwframe+0x6e/0xd8
  [ 1293.555004] 
  [ 1293.555004] -> #0 (&cifs_tcp_ses_lock){+.+.}-{2:2}:
  [ 1293.555709]        __lock_acquire+0x1521/0x2660
  [ 1293.556218]        lock_acquire+0xbf/0x2b0
  [ 1293.556680]        _raw_spin_lock+0x34/0x80
  [ 1293.557148]        cifs_put_tcp_session+0x1c/0x180 [cifs]
  [ 1293.557773]        cifs_debug_data_proc_show+0xd15/0xee0 [cifs]
  [ 1293.558463]        seq_read_iter+0x118/0x470
  [ 1293.558945]        proc_reg_read_iter+0x53/0x90
  [ 1293.559450]        vfs_read+0x201/0x350
  [ 1293.559882]        ksys_read+0x75/0x100
  [ 1293.560317]        do_syscall_64+0x3f/0x90
  [ 1293.560773]        entry_SYSCALL_64_after_hwframe+0x6e/0xd8
  [ 1293.561390] 
  [ 1293.561390] other info that might help us debug this:
  [ 1293.561390] 
  [ 1293.562267]  Possible unsafe locking scenario:
  [ 1293.562267] 
  [ 1293.562927]        CPU0                    CPU1
  [ 1293.563394]        ----                    ----
  [ 1293.563754]   lock(&ret_buf->chan_lock);
  [ 1293.564068]                                lock(&cifs_tcp_ses_lock);
  [ 1293.564573]                                lock(&ret_buf->chan_lock);
  [ 1293.565077]   lock(&cifs_tcp_ses_lock);
  [ 1293.565387] 
  [ 1293.565387]  *** DEADLOCK ***
  [ 1293.565387] 
  [ 1293.565852] 2 locks held by cat/1068:
  [ 1293.566147]  #0: ffff88800818fc48 (&p->lock){+.+.}-{3:3}, at: seq_read_iter+0x59/0x470
  [ 1293.566767]  #1: ffff88800aed2b28 (&ret_buf->chan_lock){+.+.}-{2:2}, at: cifs_debug_data_proc_show+0x555/0xee0 [cifs]
  [ 1293.567611] 
  [ 1293.567611] stack backtrace:
  [ 1293.567954] CPU: 3 PID: 1068 Comm: cat Tainted: G        W          6.6.0-rc7 #2
  [ 1293.568536] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  [ 1293.569387] Call Trace:
  [ 1293.569585]  <TASK>
  [ 1293.569755]  dump_stack_lvl+0x4a/0x80
  [ 1293.570047]  check_noncircular+0x14e/0x170
  [ 1293.570373]  ? save_trace+0x3e/0x390
  [ 1293.570659]  __lock_acquire+0x1521/0x2660
  [ 1293.570982]  lock_acquire+0xbf/0x2b0
  [ 1293.571268]  ? cifs_put_tcp_session+0x1c/0x180 [cifs]
  [ 1293.571687]  _raw_spin_lock+0x34/0x80
  [ 1293.571977]  ? cifs_put_tcp_session+0x1c/0x180 [cifs]
  [ 1293.572394]  cifs_put_tcp_session+0x1c/0x180 [cifs]
  [ 1293.572795]  cifs_debug_data_proc_show+0xd15/0xee0 [cifs]
  [ 1293.573241]  seq_read_iter+0x118/0x470
  [ 1293.573546]  proc_reg_read_iter+0x53/0x90
  [ 1293.573861]  ? srso_alias_return_thunk+0x5/0x7f
  [ 1293.574218]  vfs_read+0x201/0x350
  [ 1293.574489]  ksys_read+0x75/0x100
  [ 1293.574752]  do_syscall_64+0x3f/0x90
  [ 1293.575030]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
  [ 1293.575428] RIP: 0033:0x7f71d767e381
  [ 1293.575716] Code: ff ff eb c3 e8 0e ea 01 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 90 90 80 3d a5 f6 0e 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 48 83 ec 28 48 89 54
  [ 1293.577151] RSP: 002b:00007ffc312d65a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
  [ 1293.577736] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f71d767e381
  [ 1293.578286] RDX: 0000000000020000 RSI: 00007f71d74f8000 RDI: 0000000000000003
  [ 1293.578839] RBP: 0000000000020000 R08: 00000000ffffffff R09: 0000000000000000
  [ 1293.579391] R10: 0000000000000022 R11: 0000000000000246 R12: 00007f71d74f8000
  [ 1293.579951] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000020000
  [ 1293.580511]  </TASK>
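
To make the two reports above easier to follow, here is a rough userspace
sketch of the rule they enforce. It is only a model, not kernel code and not
a proposed fix: the names (lock_rank, held_locks, model_lock, model_unlock,
model_may_sleep) are hypothetical, and the only thing taken from the traces is
the ordering, i.e. cifs_tcp_ses_lock is taken before ses->chan_lock, and a
sleeping call is not allowed while either spinlock is held.

```c
#include <stdbool.h>

/*
 * Hypothetical lock ranks for illustration only. A lower rank must be
 * taken before a higher one, mirroring the order seen in the lockdep
 * report: cifs_tcp_ses_lock first, then ses->chan_lock.
 */
enum lock_rank {
	RANK_TCP_SES = 0,	/* models cifs_tcp_ses_lock */
	RANK_CHAN    = 1,	/* models ses->chan_lock */
};

/* Set of spinlocks the current (modelled) task holds, one bit per rank. */
struct held_locks {
	unsigned int mask;
};

/*
 * Try to take the lock with the given rank. Returns false when a lock of
 * equal or higher rank is already held, i.e. when taking it would invert
 * the documented order (or recurse on the same lock).
 */
bool model_lock(struct held_locks *h, enum lock_rank rank)
{
	unsigned int bit = 1u << rank;

	if (h->mask >= bit)
		return false;
	h->mask |= bit;
	return true;
}

/*
 * Release a held lock. Spinlocks may be released in any order, so only
 * check that the lock is actually held.
 */
bool model_unlock(struct held_locks *h, enum lock_rank rank)
{
	unsigned int bit = 1u << rank;

	if (!(h->mask & bit))
		return false;
	h->mask &= ~bit;
	return true;
}

/* A call that may sleep is only legal when no spinlock is held. */
bool model_may_sleep(const struct held_locks *h)
{
	return h->mask == 0;
}
```

Walking the patched loop through this model: after taking both locks and
dropping only RANK_TCP_SES, model_may_sleep() still returns false (the first
splat), and model_lock() for RANK_TCP_SES fails because RANK_CHAN is still
held (the second splat). Both checks pass only once RANK_CHAN is dropped as
well.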
