Linux CIFS filesystem development
* [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting
@ 2024-11-08 22:29 Paul Aurich
  2024-11-08 22:29 ` [PATCH 1/5] smb: cached directories can be more than root file handle Paul Aurich
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Paul Aurich @ 2024-11-08 22:29 UTC (permalink / raw)
  To: linux-cifs, Steve French
  Cc: paul, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
	Tom Talpey, Bharath SM

The SMB client cached directory functionality has a few problems around
flaky/lost server connections, which manifest as a pair of BUGs when
eventually unmounting the server connection:

    [18645.013550] BUG: Dentry ffff888140590ba0{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
    [18645.789274] VFS: Busy inodes after unmount of cifs (cifs)

Based on bisection, these issues started with the lease directory cache
handling introduced in commit ebe98f1447bb ("cifs: enable caching of
directories for which a lease is held"), and go away if I mount with
'nohandlecache'.  I started seeing these on Debian Bookworm stable kernel
(v6.1.x), but the issues persist even in current git versions.  I think the
situation was improved (occurrence frequency went down) with
commit 5c86919455c1 ("smb: client: fix use-after-free in
smb2_query_info_compound()").


I'm able to reproduce the "Dentry still in use" errors by connecting to an
actively-used SMB share (the server organically generates lease breaks) and
leaving these running for 'a while':

- while true; do cd ~; sleep 1; for i in {1..3}; do cd /mnt/test/subdir; echo $PWD; sleep 1; cd ..; echo $PWD; sleep 1; done; echo ...; done
- while true; do iptables -F OUTPUT; mount -t cifs -a; for _ in {0..2}; do ls /mnt/test/subdir/ | wc -l; done; iptables -I OUTPUT -p tcp --dport 445 -j DROP; sleep 10; echo "unmounting"; umount -l -t cifs -a; echo "done unmounting"; sleep 20; echo "recovering"; iptables -F OUTPUT; sleep 10; done

('a while' is anywhere from 10 minutes to overnight. Also, it's not the
cleanest reproducer, but I stopped iterating once I had something that was
even remotely reliable for me...)

This series attempts to fix these, as well as a use-after-free that could
occur because open_cached_dir() explicitly frees the cached_fid, rather than
relying on reference counting.
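The refcounting discipline the series moves toward can be modeled in plain
userspace C (a sketch with illustrative names, not the kernel's actual kref
API):

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal userspace model of kref-style lifetime management
 * (illustrative only; field and function names are made up). */
struct cfid_model {
	int refcount;
	int freed;	/* stand-in for the memory actually being kfree()d */
};

static struct cfid_model *cfid_new(void)
{
	struct cfid_model *c = calloc(1, sizeof(*c));

	c->refcount = 1;	/* caller's reference */
	return c;
}

static void cfid_get(struct cfid_model *c)
{
	c->refcount++;
}

/* Correct teardown: only the final put releases the object. */
static int cfid_put(struct cfid_model *c)
{
	if (--c->refcount == 0) {
		c->freed = 1;
		return 1;
	}
	return 0;
}
```

With two holders, the caller and a queued lease-break work item, the object
survives the first put; freeing it directly, as the old error path did, would
leave the work item with a dangling pointer.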


The last patch in this series (smb: During umount, flush any pending lease
breaks for cached dirs) is a work-in-progress, and should not be taken as-is.
The issue there:

Unmounting an SMB share (cifs_kill_sb) can race with a queued lease break from
the server for a cached directory.  When this happens, the cfid is removed
from the linked list, so close_all_cached_dirs() cannot drop the dentry.  If
cifs_kill_sb continues on before the queued work puts the dentry, we trigger
the "Dentry still in use" BUG splat.  Flushing the cifsiod_wq seems to help,
but it raises some questions:

1. cifsiod_wq is a global workqueue, rather than per-mount.  Flushing the
   entire workqueue is potentially doing more work than necessary.  Should
   there be a workqueue that's more appropriately scoped?
2. With an unresponsive server, this causes umount (even umount -l) to hang
   (waiting for SMB2_close calls), and when I test with backports on a 6.1
   kernel, appears to cause a deadlock between kill_sb and some cifs
   reconnection code calling iterate_supers_type.  (Pretty sure the deadlock
   was addressed by changes to fs/super.c, so not really an SMB problem, but
   just an indication that flush_workqueue() isn't the right solution).
3. Should cached_dir_lease_break() drop the dentry before queueing work
   (and if so, is it OK to do this under the spinlock, or should the spinlock
   be dropped first)?
4. Related to #3 -- shouldn't close_all_cached_dirs() be holding the spinlock
   while looping?


Lastly, patches 2, 3, and 5 (in its final form) are beneficial going back to
v6.1 for stable, but it's not a clean backport because some other important
fixes (e.g. commit 5c86919455c1 ("smb: client: fix use-after-free in
smb2_query_info_compound()")) weren't picked up.

Paul Aurich (5):
  smb: cached directories can be more than root file handle
  smb: Don't leak cfid when reconnect races with open_cached_dir
  smb: prevent use-after-free due to open_cached_dir error paths
  smb: No need to wait for work when cleaning up cached directories
  smb: During umount, flush any pending lease breaks for cached dirs

 fs/smb/client/cached_dir.c | 106 ++++++++++++++++---------------------
 1 file changed, 47 insertions(+), 59 deletions(-)

--
2.45.2



* [PATCH 1/5] smb: cached directories can be more than root file handle
  2024-11-08 22:29 [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting Paul Aurich
@ 2024-11-08 22:29 ` Paul Aurich
  2024-11-08 22:29 ` [PATCH 2/5] smb: Don't leak cfid when reconnect races with open_cached_dir Paul Aurich
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Paul Aurich @ 2024-11-08 22:29 UTC (permalink / raw)
  To: linux-cifs, Steve French
  Cc: paul, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
	Tom Talpey, Bharath SM

Update this log message since cached fids may represent things other
than the root of a mount.

Fixes: e4029e072673 ("cifs: find and use the dentry for cached non-root directories also")
Signed-off-by: Paul Aurich <paul@darkrain42.org>
---
 fs/smb/client/cached_dir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
index 0ff2491c311d..adcba1335204 100644
--- a/fs/smb/client/cached_dir.c
+++ b/fs/smb/client/cached_dir.c
@@ -399,11 +399,11 @@ int open_cached_dir_by_dentry(struct cifs_tcon *tcon,
 		return -ENOENT;
 
 	spin_lock(&cfids->cfid_list_lock);
 	list_for_each_entry(cfid, &cfids->entries, entry) {
 		if (dentry && cfid->dentry == dentry) {
-			cifs_dbg(FYI, "found a cached root file handle by dentry\n");
+			cifs_dbg(FYI, "found a cached file handle by dentry\n");
 			kref_get(&cfid->refcount);
 			*ret_cfid = cfid;
 			spin_unlock(&cfids->cfid_list_lock);
 			return 0;
 		}
-- 
2.45.2



* [PATCH 2/5] smb: Don't leak cfid when reconnect races with open_cached_dir
  2024-11-08 22:29 [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting Paul Aurich
  2024-11-08 22:29 ` [PATCH 1/5] smb: cached directories can be more than root file handle Paul Aurich
@ 2024-11-08 22:29 ` Paul Aurich
  2024-11-08 22:29 ` [PATCH 3/5] smb: prevent use-after-free due to open_cached_dir error paths Paul Aurich
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Paul Aurich @ 2024-11-08 22:29 UTC (permalink / raw)
  To: linux-cifs, Steve French
  Cc: paul, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
	Tom Talpey, Bharath SM

open_cached_dir() may either race with the tcon reconnection even before
compound_send_recv() or directly trigger a reconnection via
SMB2_open_init() or SMB2_query_info_init().

The reconnection process invokes invalidate_all_cached_dirs() via
cifs_mark_open_files_invalid(), which removes all cfids from the
cfids->entries list but doesn't drop a ref if has_lease isn't true.  This
leaves the currently-being-constructed cfid off the list while it still
holds a refcount of 2, so it is leaked if open_cached_dir() returns it.

Fix this by setting cfid->has_lease when the ref is actually taken; the
cfid will not be used by other threads until it has a valid time.
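The invariant behind the fix can be sketched in userspace C (illustrative
names and a simplified model, not the kernel code): the lease reference and
has_lease are set together during construction, and lookups only hand out
entries that also have a nonzero time.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the cfid validity check (not kernel code). */
struct cfid_model {
	int  refcount;
	bool has_lease;	/* the lease reference is held */
	long time;	/* nonzero once fully constructed */
};

/* Construction: take the lease ref and set has_lease together, so any
 * teardown path (reconnect, lease break) knows exactly one extra
 * reference exists and may drop it. */
static void cfid_construct(struct cfid_model *c)
{
	c->refcount = 1;	/* caller's reference */
	c->refcount++;		/* lease reference */
	c->has_lease = true;
	c->time = 0;		/* not yet usable by other threads */
}

/* Lookup: only hand out entries that are fully valid. */
static bool cfid_usable(const struct cfid_model *c)
{
	return c->has_lease && c->time != 0;
}
```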

Addresses these kmemleaks:

unreferenced object 0xffff8881090c4000 (size 1024):
  comm "bash", pid 1860, jiffies 4295126592
  hex dump (first 32 bytes):
    00 01 00 00 00 00 ad de 22 01 00 00 00 00 ad de  ........".......
    00 ca 45 22 81 88 ff ff f8 dc 4f 04 81 88 ff ff  ..E"......O.....
  backtrace (crc 6f58c20f):
    [<ffffffff8b895a1e>] __kmalloc_cache_noprof+0x2be/0x350
    [<ffffffff8bda06e3>] open_cached_dir+0x993/0x1fb0
    [<ffffffff8bdaa750>] cifs_readdir+0x15a0/0x1d50
    [<ffffffff8b9a853f>] iterate_dir+0x28f/0x4b0
    [<ffffffff8b9a9aed>] __x64_sys_getdents64+0xfd/0x200
    [<ffffffff8cf6da05>] do_syscall_64+0x95/0x1a0
    [<ffffffff8d00012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
unreferenced object 0xffff8881044fdcf8 (size 8):
  comm "bash", pid 1860, jiffies 4295126592
  hex dump (first 8 bytes):
    00 cc cc cc cc cc cc cc                          ........
  backtrace (crc 10c106a9):
    [<ffffffff8b89a3d3>] __kmalloc_node_track_caller_noprof+0x363/0x480
    [<ffffffff8b7d7256>] kstrdup+0x36/0x60
    [<ffffffff8bda0700>] open_cached_dir+0x9b0/0x1fb0
    [<ffffffff8bdaa750>] cifs_readdir+0x15a0/0x1d50
    [<ffffffff8b9a853f>] iterate_dir+0x28f/0x4b0
    [<ffffffff8b9a9aed>] __x64_sys_getdents64+0xfd/0x200
    [<ffffffff8cf6da05>] do_syscall_64+0x95/0x1a0
    [<ffffffff8d00012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

And addresses these BUG splats when unmounting the SMB filesystem:

BUG: Dentry ffff888140590ba0{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
WARNING: CPU: 3 PID: 3433 at fs/dcache.c:1536 umount_check+0xd0/0x100
Modules linked in:
CPU: 3 UID: 0 PID: 3433 Comm: bash Not tainted 6.12.0-rc4-g850925a8133c-dirty #49
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
RIP: 0010:umount_check+0xd0/0x100
Code: 8d 7c 24 40 e8 31 5a f4 ff 49 8b 54 24 40 41 56 49 89 e9 45 89 e8 48 89 d9 41 57 48 89 de 48 c7 c7 80 e7 db ac e8 f0 72 9a ff <0f> 0b 58 31 c0 5a 5b 5d 41 5c 41 5d 41 5e 41 5f e9 2b e5 5d 01 41
RSP: 0018:ffff88811cc27978 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff888140590ba0 RCX: ffffffffaaf20bae
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff8881f6fb6f40
RBP: ffff8881462ec000 R08: 0000000000000001 R09: ffffed1023984ee3
R10: ffff88811cc2771f R11: 00000000016cfcc0 R12: ffff888134383e08
R13: 0000000000000002 R14: ffff8881462ec668 R15: ffffffffaceab4c0
FS:  00007f23bfa98740(0000) GS:ffff8881f6f80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000556de4a6f808 CR3: 0000000123c80000 CR4: 0000000000350ef0
Call Trace:
 <TASK>
 d_walk+0x6a/0x530
 shrink_dcache_for_umount+0x6a/0x200
 generic_shutdown_super+0x52/0x2a0
 kill_anon_super+0x22/0x40
 cifs_kill_sb+0x159/0x1e0
 deactivate_locked_super+0x66/0xe0
 cleanup_mnt+0x140/0x210
 task_work_run+0xfb/0x170
 syscall_exit_to_user_mode+0x29f/0x2b0
 do_syscall_64+0xa1/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f23bfb93ae7
Code: ff ff ff ff c3 66 0f 1f 44 00 00 48 8b 0d 11 93 0d 00 f7 d8 64 89 01 b8 ff ff ff ff eb bf 0f 1f 44 00 00 b8 50 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e9 92 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffee9138598 EFLAGS: 00000246 ORIG_RAX: 0000000000000050
RAX: 0000000000000000 RBX: 0000558f1803e9a0 RCX: 00007f23bfb93ae7
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000558f1803e9a0
RBP: 0000558f1803e600 R08: 0000000000000007 R09: 0000558f17fab610
R10: d91d5ec34ab757b0 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000015 R15: 0000000000000000
 </TASK>
irq event stamp: 1163486
hardirqs last  enabled at (1163485): [<ffffffffac98d344>] _raw_spin_unlock_irqrestore+0x34/0x60
hardirqs last disabled at (1163486): [<ffffffffac97dcfc>] __schedule+0xc7c/0x19a0
softirqs last  enabled at (1163482): [<ffffffffab79a3ee>] __smb_send_rqst+0x3de/0x990
softirqs last disabled at (1163480): [<ffffffffac2314f1>] release_sock+0x21/0xf0
---[ end trace 0000000000000000 ]---

VFS: Busy inodes after unmount of cifs (cifs)
------------[ cut here ]------------
kernel BUG at fs/super.c:661!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 1 UID: 0 PID: 3433 Comm: bash Tainted: G        W          6.12.0-rc4-g850925a8133c-dirty #49
Tainted: [W]=WARN
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
RIP: 0010:generic_shutdown_super+0x290/0x2a0
Code: e8 15 7c f7 ff 48 8b 5d 28 48 89 df e8 09 7c f7 ff 48 8b 0b 48 89 ee 48 8d 95 68 06 00 00 48 c7 c7 80 7f db ac e8 00 69 af ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90 90 90 90 90 90
RSP: 0018:ffff88811cc27a50 EFLAGS: 00010246
RAX: 000000000000003e RBX: ffffffffae994420 RCX: 0000000000000027
RDX: 0000000000000000 RSI: ffffffffab06180e RDI: ffff8881f6eb18c8
RBP: ffff8881462ec000 R08: 0000000000000001 R09: ffffed103edd6319
R10: ffff8881f6eb18cb R11: 00000000016d3158 R12: ffff8881462ec9c0
R13: ffff8881462ec050 R14: 0000000000000001 R15: 0000000000000000
FS:  00007f23bfa98740(0000) GS:ffff8881f6e80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8364005d68 CR3: 0000000123c80000 CR4: 0000000000350ef0
Call Trace:
 <TASK>
 kill_anon_super+0x22/0x40
 cifs_kill_sb+0x159/0x1e0
 deactivate_locked_super+0x66/0xe0
 cleanup_mnt+0x140/0x210
 task_work_run+0xfb/0x170
 syscall_exit_to_user_mode+0x29f/0x2b0
 do_syscall_64+0xa1/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f23bfb93ae7
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:generic_shutdown_super+0x290/0x2a0
Code: e8 15 7c f7 ff 48 8b 5d 28 48 89 df e8 09 7c f7 ff 48 8b 0b 48 89 ee 48 8d 95 68 06 00 00 48 c7 c7 80 7f db ac e8 00 69 af ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90 90 90 90 90 90
RSP: 0018:ffff88811cc27a50 EFLAGS: 00010246
RAX: 000000000000003e RBX: ffffffffae994420 RCX: 0000000000000027
RDX: 0000000000000000 RSI: ffffffffab06180e RDI: ffff8881f6eb18c8
RBP: ffff8881462ec000 R08: 0000000000000001 R09: ffffed103edd6319
R10: ffff8881f6eb18cb R11: 00000000016d3158 R12: ffff8881462ec9c0
R13: ffff8881462ec050 R14: 0000000000000001 R15: 0000000000000000
FS:  00007f23bfa98740(0000) GS:ffff8881f6e80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8364005d68 CR3: 0000000123c80000 CR4: 0000000000350ef0

This reproduces eventually with an SMB mount and two shells running
these loops concurrently:

- while true; do
      cd ~; sleep 1;
      for i in {1..3}; do cd /mnt/test/subdir;
          echo $PWD; sleep 1; cd ..; echo $PWD; sleep 1;
      done;
      echo ...;
  done
- while true; do
      iptables -F OUTPUT; mount -t cifs -a;
      for _ in {0..2}; do ls /mnt/test/subdir/ | wc -l; done;
      iptables -I OUTPUT -p tcp --dport 445 -j DROP;
      sleep 10
      echo "unmounting"; umount -l -t cifs -a; echo "done unmounting";
      sleep 20
      echo "recovering"; iptables -F OUTPUT;
      sleep 10;
  done

Fixes: ebe98f1447bb ("cifs: enable caching of directories for which a lease is held")
Fixes: 5c86919455c1 ("smb: client: fix use-after-free in smb2_query_info_compound()")
Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
---
 fs/smb/client/cached_dir.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
index adcba1335204..bb9d4c284ce5 100644
--- a/fs/smb/client/cached_dir.c
+++ b/fs/smb/client/cached_dir.c
@@ -57,10 +57,20 @@ static struct cached_fid *find_or_create_cached_dir(struct cached_fids *cfids,
 	cfid->cfids = cfids;
 	cfids->num_entries++;
 	list_add(&cfid->entry, &cfids->entries);
 	cfid->on_list = true;
 	kref_get(&cfid->refcount);
+	/*
+	 * Set @cfid->has_lease to true during construction so that the lease
+	 * reference can be put in cached_dir_lease_break() due to a potential
+	 * lease break right after the request is sent or while @cfid is still
+	 * being cached, or if a reconnection is triggered during construction.
+	 * Concurrent processes won't be able to use it yet due to @cfid->time being
+	 * zero.
+	 */
+	cfid->has_lease = true;
+
 	spin_unlock(&cfids->cfid_list_lock);
 	return cfid;
 }
 
 static struct dentry *
@@ -174,16 +184,16 @@ int open_cached_dir(unsigned int xid, struct cifs_tcon *tcon,
 	if (cfid == NULL) {
 		kfree(utf16_path);
 		return -ENOENT;
 	}
 	/*
-	 * Return cached fid if it has a lease.  Otherwise, it is either a new
-	 * entry or laundromat worker removed it from @cfids->entries.  Caller
-	 * will put last reference if the latter.
+	 * Return cached fid if it is valid (has a lease and has a time).
+	 * Otherwise, it is either a new entry or laundromat worker removed it
+	 * from @cfids->entries.  Caller will put last reference if the latter.
 	 */
 	spin_lock(&cfids->cfid_list_lock);
-	if (cfid->has_lease) {
+	if (cfid->has_lease && cfid->time) {
 		spin_unlock(&cfids->cfid_list_lock);
 		*ret_cfid = cfid;
 		kfree(utf16_path);
 		return 0;
 	}
@@ -265,19 +275,10 @@ int open_cached_dir(unsigned int xid, struct cifs_tcon *tcon,
 	if (rc)
 		goto oshr_free;
 
 	smb2_set_related(&rqst[1]);
 
-	/*
-	 * Set @cfid->has_lease to true before sending out compounded request so
-	 * its lease reference can be put in cached_dir_lease_break() due to a
-	 * potential lease break right after the request is sent or while @cfid
-	 * is still being cached.  Concurrent processes won't be to use it yet
-	 * due to @cfid->time being zero.
-	 */
-	cfid->has_lease = true;
-
 	if (retries) {
 		smb2_set_replay(server, &rqst[0]);
 		smb2_set_replay(server, &rqst[1]);
 	}
 
-- 
2.45.2



* [PATCH 3/5] smb: prevent use-after-free due to open_cached_dir error paths
  2024-11-08 22:29 [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting Paul Aurich
  2024-11-08 22:29 ` [PATCH 1/5] smb: cached directories can be more than root file handle Paul Aurich
  2024-11-08 22:29 ` [PATCH 2/5] smb: Don't leak cfid when reconnect races with open_cached_dir Paul Aurich
@ 2024-11-08 22:29 ` Paul Aurich
  2024-11-08 22:29 ` [PATCH 4/5] smb: No need to wait for work when cleaning up cached directories Paul Aurich
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Paul Aurich @ 2024-11-08 22:29 UTC (permalink / raw)
  To: linux-cifs, Steve French
  Cc: paul, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
	Tom Talpey, Bharath SM

If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().
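The "steal the lease reference" pattern described above can be sketched like
this (illustrative userspace C; the real code does this while holding
cfids->cfid_list_lock):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model (not kernel code). */
struct cfid_model {
	int  refcount;
	bool has_lease;
};

/* When tearing an entry off the list, reuse the lease reference if one
 * is held (clearing has_lease claims it); otherwise take a fresh
 * reference.  Either way the caller ends up owning exactly one ref to
 * drop after leaving the lock. */
static void claim_teardown_ref(struct cfid_model *c)
{
	if (c->has_lease)
		c->has_lease = false;	/* steal the lease's reference */
	else
		c->refcount++;		/* models kref_get() */
}
```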

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
---
 fs/smb/client/cached_dir.c | 70 ++++++++++++++++----------------------
 1 file changed, 29 insertions(+), 41 deletions(-)

diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
index bb9d4c284ce5..06eb19dabb0e 100644
--- a/fs/smb/client/cached_dir.c
+++ b/fs/smb/client/cached_dir.c
@@ -346,10 +346,11 @@ int open_cached_dir(unsigned int xid, struct cifs_tcon *tcon,
 oshr_free:
 	SMB2_open_free(&rqst[0]);
 	SMB2_query_info_free(&rqst[1]);
 	free_rsp_buf(resp_buftype[0], rsp_iov[0].iov_base);
 	free_rsp_buf(resp_buftype[1], rsp_iov[1].iov_base);
+out:
 	if (rc) {
 		spin_lock(&cfids->cfid_list_lock);
 		if (cfid->on_list) {
 			list_del(&cfid->entry);
 			cfid->on_list = false;
@@ -357,27 +358,18 @@ int open_cached_dir(unsigned int xid, struct cifs_tcon *tcon,
 		}
 		if (cfid->has_lease) {
 			/*
 			 * We are guaranteed to have two references at this
 			 * point. One for the caller and one for a potential
-			 * lease. Release the Lease-ref so that the directory
-			 * will be closed when the caller closes the cached
-			 * handle.
+			 * lease. Release one here, and the second below.
 			 */
 			cfid->has_lease = false;
-			spin_unlock(&cfids->cfid_list_lock);
 			kref_put(&cfid->refcount, smb2_close_cached_fid);
-			goto out;
 		}
 		spin_unlock(&cfids->cfid_list_lock);
-	}
-out:
-	if (rc) {
-		if (cfid->is_open)
-			SMB2_close(0, cfid->tcon, cfid->fid.persistent_fid,
-				   cfid->fid.volatile_fid);
-		free_cached_dir(cfid);
+
+		kref_put(&cfid->refcount, smb2_close_cached_fid);
 	} else {
 		*ret_cfid = cfid;
 		atomic_inc(&tcon->num_remote_opens);
 	}
 	kfree(utf16_path);
@@ -511,42 +503,38 @@ void invalidate_all_cached_dirs(struct cifs_tcon *tcon)
 	list_for_each_entry_safe(cfid, q, &cfids->entries, entry) {
 		list_move(&cfid->entry, &entry);
 		cfids->num_entries--;
 		cfid->is_open = false;
 		cfid->on_list = false;
-		/* To prevent race with smb2_cached_lease_break() */
-		kref_get(&cfid->refcount);
+		if (cfid->has_lease) {
+			/*
+			 * The lease was never cancelled from the server,
+			 * so steal that reference.
+			 */
+			cfid->has_lease = false;
+		} else
+			kref_get(&cfid->refcount);
 	}
 	spin_unlock(&cfids->cfid_list_lock);
 
 	list_for_each_entry_safe(cfid, q, &entry, entry) {
 		list_del(&cfid->entry);
 		cancel_work_sync(&cfid->lease_break);
-		if (cfid->has_lease) {
-			/*
-			 * We lease was never cancelled from the server so we
-			 * need to drop the reference.
-			 */
-			spin_lock(&cfids->cfid_list_lock);
-			cfid->has_lease = false;
-			spin_unlock(&cfids->cfid_list_lock);
-			kref_put(&cfid->refcount, smb2_close_cached_fid);
-		}
-		/* Drop the extra reference opened above*/
+		/*
+		 * Drop the ref-count from above, either the lease-ref (if there
+		 * was one) or the extra one acquired.
+		 */
 		kref_put(&cfid->refcount, smb2_close_cached_fid);
 	}
 }
 
 static void
 smb2_cached_lease_break(struct work_struct *work)
 {
 	struct cached_fid *cfid = container_of(work,
 				struct cached_fid, lease_break);
 
-	spin_lock(&cfid->cfids->cfid_list_lock);
-	cfid->has_lease = false;
-	spin_unlock(&cfid->cfids->cfid_list_lock);
 	kref_put(&cfid->refcount, smb2_close_cached_fid);
 }
 
 int cached_dir_lease_break(struct cifs_tcon *tcon, __u8 lease_key[16])
 {
@@ -560,10 +548,11 @@ int cached_dir_lease_break(struct cifs_tcon *tcon, __u8 lease_key[16])
 	list_for_each_entry(cfid, &cfids->entries, entry) {
 		if (cfid->has_lease &&
 		    !memcmp(lease_key,
 			    cfid->fid.lease_key,
 			    SMB2_LEASE_KEY_SIZE)) {
+			cfid->has_lease = false;
 			cfid->time = 0;
 			/*
 			 * We found a lease remove it from the list
 			 * so no threads can access it.
 			 */
@@ -637,12 +626,18 @@ static void cfids_laundromat_worker(struct work_struct *work)
 		if (cfid->time &&
 		    time_after(jiffies, cfid->time + HZ * dir_cache_timeout)) {
 			cfid->on_list = false;
 			list_move(&cfid->entry, &entry);
 			cfids->num_entries--;
-			/* To prevent race with smb2_cached_lease_break() */
-			kref_get(&cfid->refcount);
+			if (cfid->has_lease) {
+				/*
+				 * Our lease has not yet been cancelled from the
+				 * server. Steal that reference.
+				 */
+				cfid->has_lease = false;
+			} else
+				kref_get(&cfid->refcount);
 		}
 	}
 	spin_unlock(&cfids->cfid_list_lock);
 
 	list_for_each_entry_safe(cfid, q, &entry, entry) {
@@ -650,21 +645,14 @@ static void cfids_laundromat_worker(struct work_struct *work)
 		/*
 		 * Cancel and wait for the work to finish in case we are racing
 		 * with it.
 		 */
 		cancel_work_sync(&cfid->lease_break);
-		if (cfid->has_lease) {
-			/*
-			 * Our lease has not yet been cancelled from the server
-			 * so we need to drop the reference.
-			 */
-			spin_lock(&cfids->cfid_list_lock);
-			cfid->has_lease = false;
-			spin_unlock(&cfids->cfid_list_lock);
-			kref_put(&cfid->refcount, smb2_close_cached_fid);
-		}
-		/* Drop the extra reference opened above */
+		/*
+		 * Drop the ref-count from above, either the lease-ref (if there
+		 * was one) or the extra one acquired.
+		 */
 		kref_put(&cfid->refcount, smb2_close_cached_fid);
 	}
 	queue_delayed_work(cifsiod_wq, &cfids->laundromat_work,
 			   dir_cache_timeout * HZ);
 }
-- 
2.45.2



* [PATCH 4/5] smb: No need to wait for work when cleaning up cached directories
  2024-11-08 22:29 [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting Paul Aurich
                   ` (2 preceding siblings ...)
  2024-11-08 22:29 ` [PATCH 3/5] smb: prevent use-after-free due to open_cached_dir error paths Paul Aurich
@ 2024-11-08 22:29 ` Paul Aurich
  2024-11-08 22:29 ` [PATCH 5/5] smb: During umount, flush any pending lease breaks for cached dirs Paul Aurich
  2024-11-08 22:39 ` [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting Steve French
  5 siblings, 0 replies; 13+ messages in thread
From: Paul Aurich @ 2024-11-08 22:29 UTC (permalink / raw)
  To: linux-cifs, Steve French
  Cc: paul, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
	Tom Talpey, Bharath SM

cfids_laundromat_worker(), invalidate_all_cached_dirs(), and
cached_dir_lease_break() cannot race with one another: each holds the
spinlock while walking the list of cfids and removes entries from the
list before dropping it.  As a result, cfids_laundromat_worker() and
invalidate_all_cached_dirs() will never see a cfid that has pending
lease-break work, so there is nothing to cancel_work_sync().
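The argument can be modeled in userspace C (an illustrative sketch, not the
kernel code): a lease break removes the entry from the list and queues work
within one critical section, so a list walker can never encounter an entry
that has pending work.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model (not kernel code): an entry is either on the list
 * or has lease-break work queued, never both, because
 * cached_dir_lease_break() removes it from the list and queues the
 * work inside a single spinlock-protected critical section. */
struct cfid_model {
	bool on_list;
	bool work_queued;
};

static void lease_break(struct cfid_model *c)
{
	/* done atomically under cfid_list_lock in the real code */
	c->on_list = false;
	c->work_queued = true;
}

/* A list walker (laundromat / invalidate) only sees on-list entries,
 * so it never finds one with pending work to cancel. */
static bool walker_sees(const struct cfid_model *c)
{
	return c->on_list;
}
```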

Signed-off-by: Paul Aurich <paul@darkrain42.org>
---
 fs/smb/client/cached_dir.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
index 06eb19dabb0e..de1e41abdaf2 100644
--- a/fs/smb/client/cached_dir.c
+++ b/fs/smb/client/cached_dir.c
@@ -516,11 +516,10 @@ void invalidate_all_cached_dirs(struct cifs_tcon *tcon)
 	}
 	spin_unlock(&cfids->cfid_list_lock);
 
 	list_for_each_entry_safe(cfid, q, &entry, entry) {
 		list_del(&cfid->entry);
-		cancel_work_sync(&cfid->lease_break);
 		/*
 		 * Drop the ref-count from above, either the lease-ref (if there
 		 * was one) or the extra one acquired.
 		 */
 		kref_put(&cfid->refcount, smb2_close_cached_fid);
@@ -594,10 +593,12 @@ static struct cached_fid *init_cached_dir(const char *path)
 
 static void free_cached_dir(struct cached_fid *cfid)
 {
 	struct cached_dirent *dirent, *q;
 
+	WARN_ON(work_pending(&cfid->lease_break));
+
 	dput(cfid->dentry);
 	cfid->dentry = NULL;
 
 	/*
 	 * Delete all cached dirent names
@@ -640,15 +641,10 @@ static void cfids_laundromat_worker(struct work_struct *work)
 	}
 	spin_unlock(&cfids->cfid_list_lock);
 
 	list_for_each_entry_safe(cfid, q, &entry, entry) {
 		list_del(&cfid->entry);
-		/*
-		 * Cancel and wait for the work to finish in case we are racing
-		 * with it.
-		 */
-		cancel_work_sync(&cfid->lease_break);
 		/*
 		 * Drop the ref-count from above, either the lease-ref (if there
 		 * was one) or the extra one acquired.
 		 */
 		kref_put(&cfid->refcount, smb2_close_cached_fid);
-- 
2.45.2



* [PATCH 5/5] smb: During umount, flush any pending lease breaks for cached dirs
  2024-11-08 22:29 [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting Paul Aurich
                   ` (3 preceding siblings ...)
  2024-11-08 22:29 ` [PATCH 4/5] smb: No need to wait for work when cleaning up cached directories Paul Aurich
@ 2024-11-08 22:29 ` Paul Aurich
  2024-11-08 22:33   ` Steve French
  2024-11-08 22:39 ` [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting Steve French
  5 siblings, 1 reply; 13+ messages in thread
From: Paul Aurich @ 2024-11-08 22:29 UTC (permalink / raw)
  To: linux-cifs, Steve French
  Cc: paul, Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N,
	Tom Talpey, Bharath SM

*** WORK IN PROGRESS ***

If a lease break is received right as a filesystem is being unmounted,
it's possible for cifs_kill_sb() to race with a queued
smb2_cached_lease_break(). Since the cfid is no longer on the
cfids->entries list, close_all_cached_dirs() cannot drop the dentry,
leading the unmount to report these BUGs:

BUG: Dentry ffff88814f37e358{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
VFS: Busy inodes after unmount of cifs (cifs)
------------[ cut here ]------------
kernel BUG at fs/super.c:661!

Flush anything pending on the cifsiod_wq workqueue in close_all_cached_dirs().

Fixes: ebe98f1447bb ("cifs: enable caching of directories for which a lease is held")
Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
---
 fs/smb/client/cached_dir.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
index de1e41abdaf2..931108b3bb4a 100644
--- a/fs/smb/client/cached_dir.c
+++ b/fs/smb/client/cached_dir.c
@@ -482,10 +482,13 @@ void close_all_cached_dirs(struct cifs_sb_info *cifs_sb)
 		list_for_each_entry(cfid, &cfids->entries, entry) {
 			dput(cfid->dentry);
 			cfid->dentry = NULL;
 		}
 	}
+
+	/* Flush any pending lease breaks */
+	flush_workqueue(cifsiod_wq);
 }
 
 /*
  * Invalidate all cached dirs when a TCON has been reset
  * due to a session loss.
-- 
2.45.2



* Re: [PATCH 5/5] smb: During umount, flush any pending lease breaks for cached dirs
  2024-11-08 22:29 ` [PATCH 5/5] smb: During umount, flush any pending lease breaks for cached dirs Paul Aurich
@ 2024-11-08 22:33   ` Steve French
  0 siblings, 0 replies; 13+ messages in thread
From: Steve French @ 2024-11-08 22:33 UTC (permalink / raw)
  To: Paul Aurich
  Cc: linux-cifs, Steve French, Paulo Alcantara, Ronnie Sahlberg,
	Shyam Prasad N, Tom Talpey, Bharath SM

Aah - thanks for this, I will try to test it out as well, and it could be
very important if it also addresses a problem with the laundromat
cleanup race (with directory lease freeing) that I have also seen on
umount to a Windows server.

Are there other patches in the series you want us to look at and/or
test (your email said patch 5/5)?


On Fri, Nov 8, 2024 at 4:29 PM Paul Aurich <paul@darkrain42.org> wrote:
>
> *** WORK IN PROGRESS ***
>
> If a lease break is received right as a filesystem is being unmounted,
> it's possible for cifs_kill_sb() to race with a queued
> smb2_cached_lease_break(). Since the cfid is no longer on the
> cfids->entries list, close_all_cached_dirs() cannot drop the dentry,
> leading the unmount to report these BUGs:
>
> BUG: Dentry ffff88814f37e358{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
> VFS: Busy inodes after unmount of cifs (cifs)
> ------------[ cut here ]------------
> kernel BUG at fs/super.c:661!
>
> Flush anything in the cifsiod_wq workqueue in close_all_cached_dirs.
>
> Fixes: ebe98f1447bb ("cifs: enable caching of directories for which a lease is held")
> Cc: stable@vger.kernel.org
> Signed-off-by: Paul Aurich <paul@darkrain42.org>
> ---
>  fs/smb/client/cached_dir.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
> index de1e41abdaf2..931108b3bb4a 100644
> --- a/fs/smb/client/cached_dir.c
> +++ b/fs/smb/client/cached_dir.c
> @@ -482,10 +482,13 @@ void close_all_cached_dirs(struct cifs_sb_info *cifs_sb)
>                 list_for_each_entry(cfid, &cfids->entries, entry) {
>                         dput(cfid->dentry);
>                         cfid->dentry = NULL;
>                 }
>         }
> +
> +       /* Flush any pending lease breaks */
> +       flush_workqueue(cifsiod_wq);
>  }
>
>  /*
>   * Invalidate all cached dirs when a TCON has been reset
>   * due to a session loss.
> --
> 2.45.2
>
>


--
Thanks,

Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting
  2024-11-08 22:29 [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting Paul Aurich
                   ` (4 preceding siblings ...)
  2024-11-08 22:29 ` [PATCH 5/5] smb: During umount, flush any pending lease breaks for cached dirs Paul Aurich
@ 2024-11-08 22:39 ` Steve French
  2024-11-08 22:44   ` Paul Aurich
  5 siblings, 1 reply; 13+ messages in thread
From: Steve French @ 2024-11-08 22:39 UTC (permalink / raw)
  To: Paul Aurich
  Cc: linux-cifs, Steve French, Paulo Alcantara, Ronnie Sahlberg,
	Shyam Prasad N, Tom Talpey, Bharath SM, Ralph Böhme

The perf improvement from allowing caching of directories (helping "ls"
then "ls" again for the same dir), not just the perf improvement with
"ls" then "stat", could be very important.

Did this series also address Ralph's point about missing requesting
"H" (handle caching) flag when requesting directory leases from the
server?

On Fri, Nov 8, 2024 at 4:35 PM Paul Aurich <paul@darkrain42.org> wrote:
>
> The SMB client cached directory functionality has a few problems around
> flaky/lost server connections, which manifest as a pair of BUGs when
> eventually unmounting the server connection:
>
>     [18645.013550] BUG: Dentry ffff888140590ba0{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
>     [18645.789274] VFS: Busy inodes after unmount of cifs (cifs)
>
> Based on bisection, these issues started with the lease directory cache
> handling introduced in commit ebe98f1447bb ("cifs: enable caching of
> directories for which a lease is held"), and go away if I mount with
> 'nohandlecache'.  I started seeing these on Debian Bookworm stable kernel
> (v6.1.x), but the issues persist even in current git versions.  I think the
> situation was improved (occurrence frequency went down) with
> commit 5c86919455c1 ("smb: client: fix use-after-free in
> smb2_query_info_compound()").
>
>
> I'm able to reproduce the "Dentry still in use" errors by connecting to an
> actively-used SMB share (the server organically generates lease breaks) and
> leaving these running for 'a while':
>
> - while true; do cd ~; sleep 1; for i in {1..3}; do cd /mnt/test/subdir; echo $PWD; sleep 1; cd ..; echo $PWD; sleep 1; done; echo ...; done
> - while true; do iptables -F OUTPUT; mount -t cifs -a; for _ in {0..2}; do ls /mnt/test/subdir/ | wc -l; done; iptables -I OUTPUT -p tcp --dport 445 -j DROP; sleep 10; echo "unmounting"; umount -l -t cifs -a; echo "done unmounting"; sleep 20; echo "recovering"; iptables -F OUTPUT; sleep 10; done
>
> ('a while' is anywhere from 10 minutes to overnight. Also, it's not the
> cleanest reproducer, but I stopped iterating once I had something that was
> even remotely reliable for me...)
>
> This series attempts to fix these, as well as a use-after-free that could
> occur because open_cached_dir() explicitly frees the cached_fid, rather than
> relying on reference counting.
>
>
> The last patch in this series (smb: During umount, flush any pending lease
> breaks for cached dirs) is a work-in-progress, and should not be taken as-is.
> The issue there:
>
> Unmounting an SMB share (cifs_kill_sb) can race with a queued lease break from
> the server for a cached directory.  When this happens, the cfid is removed
> from the linked list, so close_all_cached_dirs() cannot drop the dentry.  If
> cifs_kill_sb continues on before the queued work puts the dentry, we trigger
> the "Dentry still in use" BUG splat.  Flushing the cifsiod_wq seems to help
> this, but some thoughts:
>
> 1. cifsiod_wq is a global workqueue, rather than per-mount.  Flushing the
>    entire workqueue is potentially doing more work than necessary.  Should
>    there be a workqueue that's more appropriately scoped?
> 2. With an unresponsive server, this causes umount (even umount -l) to hang
>    (waiting for SMB2_close calls), and when I test with backports on a 6.1
>    kernel, appears to cause a deadlock between kill_sb and some cifs
>    reconnection code calling iterate_supers_type.  (Pretty sure the deadlock
>    was addressed by changes to fs/super.c, so not really an SMB problem, but
>    just an indication that flush_workqueue isn't the right solution).
> 3. Should cached_dir_lease_break() drop the dentry before queueing work
>    (and if so, is it OK to do this under the spinlock, or should the spinlock
>    be dropped first)?
> 4. Related to #3 -- shouldn't close_all_cached_dirs() be holding the spinlock
>    while looping?
>
>
> Lastly, patches 2, 3, and 5 (in its final form) are beneficial going back to
> v6.1 for stable, but it's not a clean backport because some other important
> fixes (commit 5c86919455c1 ("smb: client: fix use-after-free in
> smb2_query_info_compound()")) weren't picked up.
>
> Paul Aurich (5):
>   smb: cached directories can be more than root file handle
>   smb: Don't leak cfid when reconnect races with open_cached_dir
>   smb: prevent use-after-free due to open_cached_dir error paths
>   smb: No need to wait for work when cleaning up cached directories
>   smb: During umount, flush any pending lease breaks for cached dirs
>
>  fs/smb/client/cached_dir.c | 106 ++++++++++++++++---------------------
>  1 file changed, 47 insertions(+), 59 deletions(-)
>
> --
> 2.45.2
>
>


-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting
  2024-11-08 22:39 ` [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting Steve French
@ 2024-11-08 22:44   ` Paul Aurich
  2024-11-08 23:09     ` Enzo Matsumiya
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Aurich @ 2024-11-08 22:44 UTC (permalink / raw)
  To: Steve French
  Cc: linux-cifs, Steve French, Paulo Alcantara, Ronnie Sahlberg,
	Shyam Prasad N, Tom Talpey, Bharath SM, Ralph Böhme

No, this series doesn't try to address that. I was just focused on the
issues around the lifecycle of the cfids and the dentries.

~Paul

On 2024-11-08 16:39:03 -0600, Steve French wrote:
>The perf improvement from allowing caching of directories (helping "ls"
>then "ls" again for the same dir), not just the perf improvement with
>"ls" then "stat", could be very important.
>
>Did this series also address Ralph's point about missing requesting
>"H" (handle caching) flag when requesting directory leases from the
>server?
>
>On Fri, Nov 8, 2024 at 4:35 PM Paul Aurich <paul@darkrain42.org> wrote:
>>
>> The SMB client cached directory functionality has a few problems around
>> flaky/lost server connections, which manifest as a pair of BUGs when
>> eventually unmounting the server connection:
>>
>>     [18645.013550] BUG: Dentry ffff888140590ba0{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
>>     [18645.789274] VFS: Busy inodes after unmount of cifs (cifs)
>>
>> Based on bisection, these issues started with the lease directory cache
>> handling introduced in commit ebe98f1447bb ("cifs: enable caching of
>> directories for which a lease is held"), and go away if I mount with
>> 'nohandlecache'.  I started seeing these on Debian Bookworm stable kernel
>> (v6.1.x), but the issues persist even in current git versions.  I think the
>> situation was improved (occurrence frequency went down) with
>> commit 5c86919455c1 ("smb: client: fix use-after-free in
>> smb2_query_info_compound()").
>>
>>
>> I'm able to reproduce the "Dentry still in use" errors by connecting to an
>> actively-used SMB share (the server organically generates lease breaks) and
>> leaving these running for 'a while':
>>
>> - while true; do cd ~; sleep 1; for i in {1..3}; do cd /mnt/test/subdir; echo $PWD; sleep 1; cd ..; echo $PWD; sleep 1; done; echo ...; done
>> - while true; do iptables -F OUTPUT; mount -t cifs -a; for _ in {0..2}; do ls /mnt/test/subdir/ | wc -l; done; iptables -I OUTPUT -p tcp --dport 445 -j DROP; sleep 10; echo "unmounting"; umount -l -t cifs -a; echo "done unmounting"; sleep 20; echo "recovering"; iptables -F OUTPUT; sleep 10; done
>>
>> ('a while' is anywhere from 10 minutes to overnight. Also, it's not the
>> cleanest reproducer, but I stopped iterating once I had something that was
>> even remotely reliable for me...)
>>
>> This series attempts to fix these, as well as a use-after-free that could
>> occur because open_cached_dir() explicitly frees the cached_fid, rather than
>> relying on reference counting.
>>
>>
>> The last patch in this series (smb: During umount, flush any pending lease
>> breaks for cached dirs) is a work-in-progress, and should not be taken as-is.
>> The issue there:
>>
>> Unmounting an SMB share (cifs_kill_sb) can race with a queued lease break from
>> the server for a cached directory.  When this happens, the cfid is removed
>> from the linked list, so close_all_cached_dirs() cannot drop the dentry.  If
>> cifs_kill_sb continues on before the queued work puts the dentry, we trigger
>> the "Dentry still in use" BUG splat.  Flushing the cifsiod_wq seems to help
>> this, but some thoughts:
>>
>> 1. cifsiod_wq is a global workqueue, rather than per-mount.  Flushing the
>>    entire workqueue is potentially doing more work than necessary.  Should
>>    there be a workqueue that's more appropriately scoped?
>> 2. With an unresponsive server, this causes umount (even umount -l) to hang
>>    (waiting for SMB2_close calls), and when I test with backports on a 6.1
>>    kernel, appears to cause a deadlock between kill_sb and some cifs
>>    reconnection code calling iterate_supers_type.  (Pretty sure the deadlock
>>    was addressed by changes to fs/super.c, so not really an SMB problem, but
>>    just an indication that flush_workqueue isn't the right solution).
>> 3. Should cached_dir_lease_break() drop the dentry before queueing work
>>    (and if so, is it OK to do this under the spinlock, or should the spinlock
>>    be dropped first)?
>> 4. Related to #3 -- shouldn't close_all_cached_dirs() be holding the spinlock
>>    while looping?
>>
>>
>> Lastly, patches 2, 3, and 5 (in its final form) are beneficial going back to
>> v6.1 for stable, but it's not a clean backport because some other important
>> fixes (commit 5c86919455c1 ("smb: client: fix use-after-free in
>> smb2_query_info_compound()")) weren't picked up.
>>
>> Paul Aurich (5):
>>   smb: cached directories can be more than root file handle
>>   smb: Don't leak cfid when reconnect races with open_cached_dir
>>   smb: prevent use-after-free due to open_cached_dir error paths
>>   smb: No need to wait for work when cleaning up cached directories
>>   smb: During umount, flush any pending lease breaks for cached dirs
>>
>>  fs/smb/client/cached_dir.c | 106 ++++++++++++++++---------------------
>>  1 file changed, 47 insertions(+), 59 deletions(-)
>>
>> --
>> 2.45.2
>>
>>
>
>
>-- 
>Thanks,
>
>Steve


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting
  2024-11-08 22:44   ` Paul Aurich
@ 2024-11-08 23:09     ` Enzo Matsumiya
  2024-11-10  0:49       ` Steve French
  0 siblings, 1 reply; 13+ messages in thread
From: Enzo Matsumiya @ 2024-11-08 23:09 UTC (permalink / raw)
  To: Steve French, linux-cifs, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Bharath SM,
	Ralph Böhme

On 11/08, Paul Aurich wrote:
>No, this series doesn't try to address that. I was just focused on the 
>issues around the lifecycle of the cfids and the dentries.

I'll be reviewing the patches next week, but for now I can say +1.

We've been debugging this issue for the past month or so, and it's been
quite hard to understand/debug who was racing whom.

The 'dentry still in use' bug seems to appear only for the root dentry,
whereas cached_fids for subdirectories can sometimes get duplicated,
leaving one dangling; in the end, that dangling cfid hits a refcount
underflow + double list_del(), e.g.:

(this is coming from cifs_readdir())
crash> cached_fid.entry,cfids,path,has_lease,is_open,on_list,time,tcon,refcount 0xffff892f4b5b3e00
   entry = {
     next = 0xffff892e053ed400,
     prev = 0xffff892d8e3ecb88
   },
   cfids = 0xffff892d8e3ecb80,
   path = 0xffff892f48b7b3b0 "\\dir1\\dir2\\dir3",
   has_lease = 0x0,
   is_open = 0x0,
   on_list = 0x1,
   time = 0x0,
   tcon = 0x0,
   refcount = { ... counter = 0x2 ... }

(this is at the crashing frame on smb2_close_cached_fid())
crash> cached_fid.entry,cfids,path,has_lease,is_open,on_list,time,tcon,refcount ffff892e053ee000
   entry = {
     next = 0xdead000000000100,
     prev = 0xdead000000000122
   },
   cfids = 0xffff892d8e3ecb80,
   path = 0xffff892f825ced30 "\\dir1\\dir2\\dir3",
   has_lease = 0x0,
   is_open = 0x1,
   on_list = 0x1,
   time = 0x1040998fc,
   tcon = 0xffff892fe0b4d000,
   refcount = { ... counter = 0x0 ... }

You can see that the second one had already been deleted (refcount=0,
->entry is poisoned), while the first one hasn't been filled in by
open_cached_dir() yet but already has 2 references to it.

A reliable reproducer I found for this was running xfstests '-g
generic/dir' in one terminal, and generic/072 some seconds later.

(in the beginning I thought that a reconnect was required to trigger
this bug, but any kind of interruption (I could trigger it with ctrl-c
mid-xfstests a few times) should trigger it)

actimeo >= 0 seems to play a role as well, because things can get quite
complicated (unsure if problematic though) with a callchain such as:

open_cached_dir
   -> path_to_dentry
     -> lookup_positive_unlocked
       -> lookup_dcache
         -> cifs_d_revalidate (dentry->d_op->d_revalidate)
           -> cifs_revalidate_dentry
             -> cifs_revalidate_dentry_attr
               -> cifs_dentry_needs_reval
                 -> open_cached_dir_by_dentry


Anyway, thanks a lot for your patches, Paul.  Like I said, I'll be
testing + reviewing them soon.


Cheers,

Enzo

>On 2024-11-08 16:39:03 -0600, Steve French wrote:
>>The perf improvement from allowing caching of directories (helping "ls"
>>then "ls" again for the same dir), not just the perf improvement with
>>"ls" then "stat", could be very important.
>>
>>Did this series also address Ralph's point about missing requesting
>>"H" (handle caching) flag when requesting directory leases from the
>>server?
>>
>>On Fri, Nov 8, 2024 at 4:35 PM Paul Aurich <paul@darkrain42.org> wrote:
>>>
>>>The SMB client cached directory functionality has a few problems around
>>>flaky/lost server connections, which manifest as a pair of BUGs when
>>>eventually unmounting the server connection:
>>>
>>>    [18645.013550] BUG: Dentry ffff888140590ba0{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
>>>    [18645.789274] VFS: Busy inodes after unmount of cifs (cifs)
>>>
>>>Based on bisection, these issues started with the lease directory cache
>>>handling introduced in commit ebe98f1447bb ("cifs: enable caching of
>>>directories for which a lease is held"), and go away if I mount with
>>>'nohandlecache'.  I started seeing these on Debian Bookworm stable kernel
>>>(v6.1.x), but the issues persist even in current git versions.  I think the
>>>situation was improved (occurrence frequency went down) with
>>>commit 5c86919455c1 ("smb: client: fix use-after-free in
>>>smb2_query_info_compound()").
>>>
>>>
>>>I'm able to reproduce the "Dentry still in use" errors by connecting to an
>>>actively-used SMB share (the server organically generates lease breaks) and
>>>leaving these running for 'a while':
>>>
>>>- while true; do cd ~; sleep 1; for i in {1..3}; do cd /mnt/test/subdir; echo $PWD; sleep 1; cd ..; echo $PWD; sleep 1; done; echo ...; done
>>>- while true; do iptables -F OUTPUT; mount -t cifs -a; for _ in {0..2}; do ls /mnt/test/subdir/ | wc -l; done; iptables -I OUTPUT -p tcp --dport 445 -j DROP; sleep 10; echo "unmounting"; umount -l -t cifs -a; echo "done unmounting"; sleep 20; echo "recovering"; iptables -F OUTPUT; sleep 10; done
>>>
>>>('a while' is anywhere from 10 minutes to overnight. Also, it's not the
>>>cleanest reproducer, but I stopped iterating once I had something that was
>>>even remotely reliable for me...)
>>>
>>>This series attempts to fix these, as well as a use-after-free that could
>>>occur because open_cached_dir() explicitly frees the cached_fid, rather than
>>>relying on reference counting.
>>>
>>>
>>>The last patch in this series (smb: During umount, flush any pending lease
>>>breaks for cached dirs) is a work-in-progress, and should not be taken as-is.
>>>The issue there:
>>>
>>>Unmounting an SMB share (cifs_kill_sb) can race with a queued lease break from
>>>the server for a cached directory.  When this happens, the cfid is removed
>>>from the linked list, so close_all_cached_dirs() cannot drop the dentry.  If
>>>cifs_kill_sb continues on before the queued work puts the dentry, we trigger
>>>the "Dentry still in use" BUG splat.  Flushing the cifsiod_wq seems to help
>>>this, but some thoughts:
>>>
>>>1. cifsiod_wq is a global workqueue, rather than per-mount.  Flushing the
>>>   entire workqueue is potentially doing more work than necessary.  Should
>>>   there be a workqueue that's more appropriately scoped?
>>>2. With an unresponsive server, this causes umount (even umount -l) to hang
>>>   (waiting for SMB2_close calls), and when I test with backports on a 6.1
>>>   kernel, appears to cause a deadlock between kill_sb and some cifs
>>>   reconnection code calling iterate_supers_type.  (Pretty sure the deadlock
>>>   was addressed by changes to fs/super.c, so not really an SMB problem, but
>>>   just an indication that flush_workqueue isn't the right solution).
>>>3. Should cached_dir_lease_break() drop the dentry before queueing work
>>>   (and if so, is it OK to do this under the spinlock, or should the spinlock
>>>   be dropped first)?
>>>4. Related to #3 -- shouldn't close_all_cached_dirs() be holding the spinlock
>>>   while looping?
>>>
>>>
>>>Lastly, patches 2, 3, and 5 (in its final form) are beneficial going back to
>>>v6.1 for stable, but it's not a clean backport because some other important
>>>fixes (commit 5c86919455c1 ("smb: client: fix use-after-free in
>>>smb2_query_info_compound()")) weren't picked up.
>>>
>>>Paul Aurich (5):
>>>  smb: cached directories can be more than root file handle
>>>  smb: Don't leak cfid when reconnect races with open_cached_dir
>>>  smb: prevent use-after-free due to open_cached_dir error paths
>>>  smb: No need to wait for work when cleaning up cached directories
>>>  smb: During umount, flush any pending lease breaks for cached dirs
>>>
>>> fs/smb/client/cached_dir.c | 106 ++++++++++++++++---------------------
>>> 1 file changed, 47 insertions(+), 59 deletions(-)
>>>
>>>--
>>>2.45.2
>>>
>>>
>>
>>
>>-- 
>>Thanks,
>>
>>Steve
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting
  2024-11-08 23:09     ` Enzo Matsumiya
@ 2024-11-10  0:49       ` Steve French
  2024-11-13 19:08         ` Paul Aurich
  0 siblings, 1 reply; 13+ messages in thread
From: Steve French @ 2024-11-10  0:49 UTC (permalink / raw)
  To: Enzo Matsumiya
  Cc: linux-cifs, Steve French, Paulo Alcantara, Ronnie Sahlberg,
	Shyam Prasad N, Tom Talpey, Bharath SM, Ralph Böhme

Running the buildbot on these against Windows with directory leases
enabled (ie not using "nohandlecache") I see a crash in generic/246 in
cifs_mount (see call trace at end of the email for more details). The
series does improve things: it fixes the oops that would cause e.g.
generic/048 to crash, such as this one
[ 2039.224037] BUG: Dentry 00000000114e5018{i=6000000019c6c,n=/}
still in use (2) [unmount of cifs cifs]
[ 2039.224086] WARNING: CPU: 1 PID: 123637 at fs/dcache.c:1536
umount_check+0xc3/0xf0
[ 2039.224112] Modules linked in: cifs ccm cmac nls_utf8 cifs_arc4
nls_ucs2_utils cifs_md4 rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver
nfs lockd grace netfs nf_conntrack_netbios_ns nf_conntrack_broadcast
nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat
nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle
ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw
iptable_security ip_set ebtable_filter ebtables ip6table_filter
ip6_tables iptable_filter ip_tables sunrpc kvm_intel kvm virtio_net
net_failover failover virtio_balloon loop fuse dm_multipath nfnetlink
zram xfs bochs drm_vram_helper drm_ttm_helper ttm drm_kms_helper
crct10dif_pclmul crc32_pclmul crc32c_intel drm ghash_clmulni_intel
sha512_ssse3 sha256_ssse3 floppy virtio_blk sha1_ssse3 qemu_fw_cfg
virtio_console [last unloaded: cifs]
[ 2039.224382] CPU: 1 UID: 0 PID: 123637 Comm: rm Not tainted 6.12.0-rc6 #1
[ 2039.224390] Hardware name: Red Hat KVM, BIOS 1.16.3-2.el9 04/01/2014
[ 2039.224395] RIP: 0010:umount_check+0xc3/0xf0
[ 2039.224401] Code: db 74 0d 48 8d 7b 40 e8 9b e3 f5 ff 48 8b 53 40
41 55 4d 89 f1 45 89 e0 48 89 e9 48 89 ee 48 c7 c7 80 74 ba a2 e8 dd
e8 a2 ff <0f> 0b 58 31 c0 5b 5d 41 5c 41 5d 41 5e c3 cc cc cc cc 41 83
fc 01
[ 2039.224408] RSP: 0018:ff11000128bcf968 EFLAGS: 00010282
[ 2039.224416] RAX: dffffc0000000000 RBX: ff11000110ba3d80 RCX: 0000000000000027
[ 2039.224422] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ff110004cb0b1a08
[ 2039.224427] RBP: ff11000112ebb8f8 R08: ffffffffa13f759e R09: ffe21c0099616341
[ 2039.224432] R10: ff110004cb0b1a0b R11: 0000000000000001 R12: 0000000000000002
[ 2039.224437] R13: ff1100013fe0c668 R14: ffffffffc1dee340 R15: ffffffffc2094200
[ 2039.224442] FS:  00007fdc92c66740(0000) GS:ff110004cb080000(0000)
knlGS:0000000000000000
[ 2039.224447] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2039.224452] CR2: 0000556c6a58d488 CR3: 00000001229ec006 CR4: 0000000000373ef0
[ 2039.224465] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2039.224470] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2039.224475] Call Trace:
[ 2039.224479]  <TASK>
[ 2039.224485]  ? __warn+0xa9/0x220
[ 2039.224497]  ? umount_check+0xc3/0xf0
[ 2039.224506]  ? report_bug+0x1d4/0x1e0
[ 2039.224521]  ? handle_bug+0x5b/0xa0
[ 2039.224529]  ? exc_invalid_op+0x18/0x50
[ 2039.224537]  ? asm_exc_invalid_op+0x1a/0x20
[ 2039.224556]  ? irq_work_claim+0x1e/0x40
[ 2039.224568]  ? umount_check+0xc3/0xf0
[ 2039.224577]  ? umount_check+0xc3/0xf0
[ 2039.224586]  ? __pfx_umount_check+0x10/0x10
[ 2039.224594]  d_walk+0x71/0x4e0
[ 2039.224604]  ? d_walk+0x4b/0x4e0
[ 2039.224616]  shrink_dcache_for_umount+0x6d/0x220
[ 2039.224629]  generic_shutdown_super+0x4a/0x1c0
[ 2039.224640]  kill_anon_super+0x22/0x40
[ 2039.224649]  cifs_kill_sb+0x78/0x90 [cifs]
[ 2039.224848]  deactivate_locked_super+0x69/0xf0
[ 2039.224858]  cifsFileInfo_put_final+0x25a/0x290 [cifs]
[ 2039.225019]  _cifsFileInfo_put+0x5f9/0x770 [cifs]
[ 2039.225172]  ? __pfx__cifsFileInfo_put+0x10/0x10 [cifs]
[ 2039.225319]  ? mark_held_locks+0x65/0x90
[ 2039.225339]  ? kfree+0x1e5/0x490
[ 2039.225349]  ? cifs_close_deferred_file_under_dentry+0x2e5/0x350 [cifs]
[ 2039.225494]  cifs_close_deferred_file_under_dentry+0x26d/0x350 [cifs]
[ 2039.225639]  ? __pfx_cifs_close_deferred_file_under_dentry+0x10/0x10 [cifs]
[ 2039.225821]  ? cifs_sb_master_tcon+0x23/0x30 [cifs]
[ 2039.225970]  cifs_unlink+0x21d/0x7b0 [cifs]
[ 2039.226121]  vfs_unlink+0x1bf/0x4a0
[ 2039.226131]  ? lookup_one_qstr_excl+0x24/0xd0
[ 2039.226143]  do_unlinkat+0x380/0x450
[ 2039.226154]  ? __pfx_do_unlinkat+0x10/0x10
[ 2039.226161]  ? kasan_save_track+0x14/0x30
[ 2039.226171]  ? rcu_is_watching+0x20/0x50
[ 2039.226180]  ? strncpy_from_user+0x126/0x180
[ 2039.226201]  __x64_sys_unlinkat+0x5e/0xa0
[ 2039.226211]  do_syscall_64+0x75/0x180
[ 2039.226222]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 2039.226230] RIP: 0033:0x7fdc92d7784b
[ 2039.226237] Code: 77 05 c3 0f 1f 40 00 48 8b 15 c9 85 0d 00 f7 d8
64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 07 01 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 9d 85 0d 00 f7 d8 64 89
01 48
[ 2039.226244] RSP: 002b:00007ffd51debe58 EFLAGS: 00000206 ORIG_RAX:
0000000000000107
[ 2039.226253] RAX: ffffffffffffffda RBX: 0000556c6a58d5a0 RCX: 00007fdc92d7784b
[ 2039.226258] RDX: 0000000000000000 RSI: 0000556c6a58c380 RDI: 00000000ffffff9c
[ 2039.226263] RBP: 00007ffd51debf20 R08: 0000556c6a58c380 R09: 00007ffd51debf7c
[ 2039.226268] R10: 0000556c6a58d7d0 R11: 0000000000000206 R12: 0000556c6a58c2f0
[ 2039.226273] R13: 0000000000000000 R14: 00007ffd51debf80 R15: 0000000000000000
[ 2039.226297]  </TASK>
[ 2039.226301] irq event stamp: 5933
[ 2039.226305] hardirqs last  enabled at (5939): [<ffffffffa122e2de>]
__up_console_sem+0x5e/0x70
[ 2039.226314] hardirqs last disabled at (5944): [<ffffffffa122e2c3>]
__up_console_sem+0x43/0x70
[ 2039.226320] softirqs last  enabled at (3096): [<ffffffffa11340ce>]
__irq_exit_rcu+0xfe/0x120
[ 2039.226327] softirqs last disabled at (2941): [<ffffffffa11340ce>]
__irq_exit_rcu+0xfe/0x120
[ 2039.226333] ---[ end trace 0000000000000000 ]---
[ 2040.931627] VFS: Busy inodes after unmount of cifs (cifs)
[ 2040.931693] ------------[ cut here ]------------
[ 2040.931699] kernel BUG at fs/super.c:650!
[ 2040.932770] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 2040.932846] CPU: 1 UID: 0 PID: 123637 Comm: rm Tainted: G        W
        6.12.0-rc6 #1
[ 2040.932936] Tainted: [W]=WARN
[ 2040.932969] Hardware name: Red Hat KVM, BIOS 1.16.3-2.el9 04/01/2014
[ 2040.933028] RIP: 0010:generic_shutdown_super+0x1b7/0x1c0
[ 2040.933086] Code: 7b 28 e8 cc ca f8 ff 48 8b 6b 28 48 89 ef e8 c0
ca f8 ff 48 8b 55 00 48 8d b3 68 06 00 00 48 c7 c7 60 14 ba a2 e8 d9
6c b6 ff <0f> 0b 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90
90 90
[ 2040.933252] RSP: 0018:ff11000128bcfa38 EFLAGS: 00010282
[ 2040.933305] RAX: 000000000000002d RBX: ff1100013fe0c000 RCX: ffffffffa12c915e
[ 2040.933370] RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ff110004cb0b7080
[ 2040.933436] RBP: ffffffffc20529a0 R08: 0000000000000001 R09: ffe21c0025179f0f
[ 2040.933500] R10: ff11000128bcf87f R11: 0000000000000001 R12: ff1100013fe0c9c0
[ 2040.933565] R13: ff1100013fe0c780 R14: ff11000168113378 R15: ffffffffc2094200
[ 2040.933630] FS:  00007fdc92c66740(0000) GS:ff110004cb080000(0000)
knlGS:0000000000000000
[ 2040.933703] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2040.933757] CR2: 0000556c6a58d488 CR3: 00000001229ec006 CR4: 0000000000373ef0
[ 2040.933828] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2040.933905] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2040.933972] Call Trace:
[ 2040.934000]  <TASK>
[ 2040.934024]  ? die+0x37/0x90
[ 2040.934058]  ? do_trap+0x133/0x230
[ 2040.934094]  ? generic_shutdown_super+0x1b7/0x1c0
[ 2040.934144]  ? do_error_trap+0x94/0x130
[ 2040.934182]  ? generic_shutdown_super+0x1b7/0x1c0
[ 2040.934229]  ? generic_shutdown_super+0x1b7/0x1c0
[ 2040.934277]  ? handle_invalid_op+0x2c/0x40
[ 2040.934318]  ? generic_shutdown_super+0x1b7/0x1c0
[ 2040.934364]  ? exc_invalid_op+0x2f/0x50
[ 2040.934405]  ? asm_exc_invalid_op+0x1a/0x20
[ 2040.934453]  ? tick_nohz_tick_stopped+0x1e/0x40
[ 2040.934502]  ? generic_shutdown_super+0x1b7/0x1c0
[ 2040.934551]  ? generic_shutdown_super+0x1b7/0x1c0
[ 2040.934600]  kill_anon_super+0x22/0x40
[ 2040.934641]  cifs_kill_sb+0x78/0x90 [cifs]
[ 2040.934851]  deactivate_locked_super+0x69/0xf0
[ 2040.934915]  cifsFileInfo_put_final+0x25a/0x290 [cifs]
[ 2040.935125]  _cifsFileInfo_put+0x5f9/0x770 [cifs]
[ 2040.935331]  ? __pfx__cifsFileInfo_put+0x10/0x10 [cifs]
[ 2040.935537]  ? mark_held_locks+0x65/0x90
[ 2040.935581]  ? kfree+0x1e5/0x490
[ 2040.937096]  ? cifs_close_deferred_file_under_dentry+0x2e5/0x350 [cifs]
[ 2040.938778]  cifs_close_deferred_file_under_dentry+0x26d/0x350 [cifs]
[ 2040.940478]  ? __pfx_cifs_close_deferred_file_under_dentry+0x10/0x10 [cifs]
[ 2040.942193]  ? cifs_sb_master_tcon+0x23/0x30 [cifs]
[ 2040.943865]  cifs_unlink+0x21d/0x7b0 [cifs]
[ 2040.945527]  vfs_unlink+0x1bf/0x4a0
[ 2040.947008]  ? lookup_one_qstr_excl+0x24/0xd0
[ 2040.948451]  do_unlinkat+0x380/0x450
[ 2040.949889]  ? __pfx_do_unlinkat+0x10/0x10
[ 2040.951306]  ? kasan_save_track+0x14/0x30
[ 2040.952708]  ? rcu_is_watching+0x20/0x50
[ 2040.954135]  ? strncpy_from_user+0x126/0x180
[ 2040.955494]  __x64_sys_unlinkat+0x5e/0xa0
[ 2040.956779]  do_syscall_64+0x75/0x180
[ 2040.958053]  entry_SYSCALL_64_after_hwframe+0x76/0x7e



but I did see a crash in generic/246 with current mainline + the five
directory lease patches (see below).  Any ideas?

[ 4315.917270] CIFS: Attempting to mount //win16.vm.test/Share
[ 4315.928088] ==================================================================
[ 4315.930402] BUG: KASAN: slab-use-after-free in
rwsem_down_write_slowpath+0xa34/0xaf0
[ 4315.932655] Read of size 4 at addr ff1100012ad38034 by task mount.cifs/340968

[ 4315.937119] CPU: 2 UID: 0 PID: 340968 Comm: mount.cifs Tainted: G
   D W          6.12.0-rc6 #1
[ 4315.939454] Tainted: [D]=DIE, [W]=WARN
[ 4315.941747] Hardware name: Red Hat KVM, BIOS 1.16.3-2.el9 04/01/2014
[ 4315.944094] Call Trace:
[ 4315.946425]  <TASK>
[ 4315.948724]  dump_stack_lvl+0x79/0xb0
[ 4315.951046]  print_report+0xcb/0x620
[ 4315.953359]  ? __virt_addr_valid+0x19a/0x300
[ 4315.955682]  ? rwsem_down_write_slowpath+0xa34/0xaf0
[ 4315.957994]  kasan_report+0xbd/0xf0
[ 4315.960317]  ? rwsem_down_write_slowpath+0xa34/0xaf0
[ 4315.962610]  rwsem_down_write_slowpath+0xa34/0xaf0
[ 4315.964835]  ? kasan_save_stack+0x34/0x50
[ 4315.967030]  ? __pfx_rwsem_down_write_slowpath+0x10/0x10
[ 4315.969221]  ? cifs_mount+0xfb/0x3b0 [cifs]
[ 4315.971586]  ? cifs_smb3_do_mount+0x1a5/0xc10 [cifs]
[ 4315.974108]  ? smb3_get_tree+0x1f0/0x430 [cifs]
[ 4315.976511]  ? rcu_is_watching+0x20/0x50
[ 4315.978712]  ? trace_lock_acquire+0x116/0x150
[ 4315.980884]  ? lock_acquire+0x40/0x90
[ 4315.982989]  ? super_lock+0xea/0x1d0
[ 4315.985036]  ? super_lock+0xea/0x1d0
[ 4315.986997]  down_write+0x15b/0x160
[ 4315.988893]  ? __pfx_down_write+0x10/0x10
[ 4315.990751]  ? __mod_timer+0x407/0x590
[ 4315.992531]  super_lock+0xea/0x1d0
[ 4315.994231]  ? __pfx_super_lock+0x10/0x10
[ 4315.995891]  ? __pfx_lock_release+0x10/0x10
[ 4315.997528]  ? rcu_is_watching+0x20/0x50
[ 4315.999137]  ? lock_release+0xa5/0x3d0
[ 4316.000705]  ? cifs_match_super+0x177/0x650 [cifs]
[ 4316.002385]  grab_super+0x80/0x1e0
[ 4316.003884]  ? __pfx_grab_super+0x10/0x10
[ 4316.005351]  ? cifs_put_tlink+0xa1/0xc0 [cifs]
[ 4316.006990]  ? cifs_match_super+0x17f/0x650 [cifs]
[ 4316.008620]  ? __pfx_cifs_match_super+0x10/0x10 [cifs]
[ 4316.010236]  sget+0x121/0x350
[ 4316.011694]  ? __pfx_cifs_set_super+0x10/0x10 [cifs]
[ 4316.013293]  cifs_smb3_do_mount+0x293/0xc10 [cifs]
[ 4316.014933]  ? __pfx___mutex_lock+0x10/0x10
[ 4316.016333]  ? cred_has_capability.isra.0+0xd4/0x1a0
[ 4316.017752]  ? __pfx_cifs_smb3_do_mount+0x10/0x10 [cifs]
[ 4316.019338]  smb3_get_tree+0x1f0/0x430 [cifs]
[ 4316.020910]  vfs_get_tree+0x50/0x180
[ 4316.022292]  path_mount+0x5d6/0xf20
[ 4316.023668]  ? __pfx_path_mount+0x10/0x10
[ 4316.024983]  ? user_path_at+0x45/0x60
[ 4316.026275]  __x64_sys_mount+0x174/0x1b0
[ 4316.027552]  ? __pfx___x64_sys_mount+0x10/0x10
[ 4316.028794]  ? rcu_is_watching+0x20/0x50
[ 4316.030021]  do_syscall_64+0x75/0x180
[ 4316.031226]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 4316.032476] RIP: 0033:0x7f979fa7b8fe
[ 4316.033742] Code: 48 8b 0d 1d a5 0c 00 f7 d8 64 89 01 48 83 c8 ff
c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ea a4 0c 00 f7 d8 64 89
01 48
[ 4316.036462] RSP: 002b:00007fffbaa4f598 EFLAGS: 00000246 ORIG_RAX:
00000000000000a5
[ 4316.037874] RAX: ffffffffffffffda RBX: 0000557684547471 RCX: 00007f979fa7b8fe
[ 4316.039284] RDX: 0000557684547471 RSI: 00005576845474d7 RDI: 00007fffbaa50caf
[ 4316.040701] RBP: 000000000000000a R08: 00005576847deeb0 R09: 0000000000000000
[ 4316.042106] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffbaa50caf
[ 4316.043522] R13: 00005576847dff20 R14: 00005576847deeb0 R15: 00007f979fb75000
[ 4316.044928]  </TASK>

[ 4316.047670] Allocated by task 340820:
[ 4316.049036]  kasan_save_stack+0x24/0x50
[ 4316.050414]  kasan_save_track+0x14/0x30
[ 4316.051771]  __kasan_slab_alloc+0x59/0x70
[ 4316.053139]  kmem_cache_alloc_node_noprof+0x116/0x330
[ 4316.054529]  copy_process+0x299/0x45e0
[ 4316.055915]  kernel_clone+0xf2/0x4b0
[ 4316.057299]  __do_sys_clone+0x90/0xb0
[ 4316.058708]  do_syscall_64+0x75/0x180
[ 4316.060095]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

[ 4316.062893] Freed by task 0:
[ 4316.064283]  kasan_save_stack+0x24/0x50
[ 4316.065695]  kasan_save_track+0x14/0x30
[ 4316.067076]  kasan_save_free_info+0x3b/0x60
[ 4316.068462]  __kasan_slab_free+0x38/0x50
[ 4316.069841]  kmem_cache_free+0x239/0x5a0
[ 4316.071213]  delayed_put_task_struct+0x149/0x1b0
[ 4316.072608]  rcu_do_batch+0x2f4/0x880
[ 4316.073979]  rcu_core+0x3d6/0x510
[ 4316.075340]  handle_softirqs+0x10f/0x580
[ 4316.076708]  __irq_exit_rcu+0xfe/0x120
[ 4316.078052]  irq_exit_rcu+0xe/0x20
[ 4316.079400]  sysvec_apic_timer_interrupt+0x76/0x90
[ 4316.080747]  asm_sysvec_apic_timer_interrupt+0x1a/0x20

[ 4316.083488] Last potentially related work creation:
[ 4316.084845]  kasan_save_stack+0x24/0x50
[ 4316.086188]  __kasan_record_aux_stack+0x8e/0xa0
[ 4316.087564]  __call_rcu_common.constprop.0+0x87/0x920
[ 4316.088932]  release_task+0x836/0xc60
[ 4316.090296]  wait_consider_task+0x9db/0x19c0
[ 4316.091682]  __do_wait+0xe9/0x390
[ 4316.093046]  do_wait+0xcb/0x230
[ 4316.094412]  kernel_wait4+0xe4/0x1a0
[ 4316.095767]  __do_sys_wait4+0xce/0xe0
[ 4316.097110]  do_syscall_64+0x75/0x180
[ 4316.098455]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

[ 4316.100994] The buggy address belongs to the object at ff1100012ad38000
                which belongs to the cache task_struct of size 14656
[ 4316.103497] The buggy address is located 52 bytes inside of
                freed 14656-byte region [ff1100012ad38000, ff1100012ad3b940)

[ 4316.107248] The buggy address belongs to the physical page:
[ 4316.108537] page: refcount:1 mapcount:0 mapping:0000000000000000
index:0x0 pfn:0x12ad38
[ 4316.109846] head: order:3 mapcount:0 entire_mapcount:0
nr_pages_mapped:0 pincount:0
[ 4316.111193] memcg:ff11000100d2c6c1
[ 4316.112543] anon flags:
0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
[ 4316.113945] page_type: f5(slab)
[ 4316.115364] raw: 0017ffffc0000040 ff11000100280a00 0000000000000000
0000000000000001
[ 4316.116852] raw: 0000000000000000 0000000000020002 00000001f5000000
ff11000100d2c6c1
[ 4316.118329] head: 0017ffffc0000040 ff11000100280a00
0000000000000000 0000000000000001
[ 4316.119818] head: 0000000000000000 0000000000020002
00000001f5000000 ff11000100d2c6c1
[ 4316.121302] head: 0017ffffc0000003 ffd4000004ab4e01
ffffffffffffffff 0000000000000000
[ 4316.122818] head: 0000000000000008 0000000000000000
00000000ffffffff 0000000000000000
[ 4316.124323] page dumped because: kasan: bad access detected

[ 4316.127318] Memory state around the buggy address:
[ 4316.128847]  ff1100012ad37f00: ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff
[ 4316.130427]  ff1100012ad37f80: ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff
[ 4316.132006] >ff1100012ad38000: fa fb fb fb fb fb fb fb fb fb fb fb
fb fb fb fb
[ 4316.133594]                                      ^
[ 4316.135170]  ff1100012ad38080: fb fb fb fb fb fb fb fb fb fb fb fb
fb fb fb fb
[ 4316.136796]  ff1100012ad38100: fb fb fb fb fb fb fb fb fb fb fb fb
fb fb fb fb
[ 4316.138399] ==================================================================


On Fri, Nov 8, 2024 at 5:12 PM Enzo Matsumiya <ematsumiya@suse.de> wrote:
>
> On 11/08, Paul Aurich wrote:
> >No, this series doesn't try to address that. I was just focused on the
> >issues around the lifecycle of the cfids and the dentry:s.
>
> I'll be reviewing the patches next week, but for now I can say +1.
>
> We've been debugging this issue for the past month or so, and it's been
> quite hard to understand/debug who was racing who.
>
> The 'dentry still in use' bug seems to appear only for the root dentry,
> whereas cached_fids for subdirectories can sometimes get duplicated,
> leaving one dangling; in the end that one hits a refcount underflow +
> double list_del(), e.g.:
>
> (this is coming from cifs_readdir())
> crash> cached_fid.entry,cfids,path,has_lease,is_open,on_list,time,tcon,refcount 0xffff892f4b5b3e00
>    entry = {
>      next = 0xffff892e053ed400,
>      prev = 0xffff892d8e3ecb88
>    },
>    cfids = 0xffff892d8e3ecb80,
>    path = 0xffff892f48b7b3b0 "\\dir1\\dir2\\dir3",
>    has_lease = 0x0,
>    is_open = 0x0,
>    on_list = 0x1,
>    time = 0x0,
>    tcon = 0x0,
>    refcount = { ... counter = 0x2 ... }
>
> (this is at the crashing frame on smb2_close_cached_fid())
> crash> cached_fid.entry,cfids,path,has_lease,is_open,on_list,time,tcon,refcount ffff892e053ee000
>    entry = {
>      next = 0xdead000000000100,
>      prev = 0xdead000000000122
>    },
>    cfids = 0xffff892d8e3ecb80,
>    path = 0xffff892f825ced30 "\\dir1\\dir2\\dir3",
>    has_lease = 0x0,
>    is_open = 0x1,
>    on_list = 0x1,
>    time = 0x1040998fc,
>    tcon = 0xffff892fe0b4d000,
>    refcount = { ... counter = 0x0 ... }
>
> You can see that the second one had already been deleted (refcount=0,
> ->entry is poisoned), while the first one hasn't been filled in by
> open_cached_dir() yet but already has 2 references to it.
>
> A reliable reproducer I found for this was running xfstests '-g
> generic/dir' in one terminal, and generic/072 some seconds later.
>
> (in the beginning I thought that a reconnect was required to trigger
> this bug, but any kind of interruption (I could trigger it with ctrl-c
> mid-xfstests a few times) should trigger it)
>
> actimeo >= 0 seems to play a role as well, because things can get quite
> complicated (unsure if problematic though) with a callchain such as:
>
> open_cached_dir
>    -> path_to_dentry
>      -> lookup_positive_unlocked
>        -> lookup_dcache
>          -> cifs_d_revalidate (dentry->d_op->d_revalidate)
>            -> cifs_revalidate_dentry
>              -> cifs_revalidate_dentry_attr
>                -> cifs_dentry_needs_reval
>                  -> open_cached_dir_by_dentry
>
>
> Anyway, thanks a lot for your patches, Paul.  Like I said, I'll be
> testing + reviewing them soon.
>
>
> Cheers,
>
> Enzo
>
> >On 2024-11-08 16:39:03 -0600, Steve French wrote:
> >>The perf improvement allowing caching of directories (helping "ls"
> >>then "ls" again for the same dir), not just the perf improvement with
> >>"ls" then "stat", could be very important.
> >>
> >>Did this series also address Ralph's point about missing requesting
> >>"H" (handle caching) flag when requesting directory leases from the
> >>server?
> >>
> >>On Fri, Nov 8, 2024 at 4:35 PM Paul Aurich <paul@darkrain42.org> wrote:
> >>>
> >>>The SMB client cached directory functionality has a few problems around
> >>>flaky/lost server connections, which manifest as a pair of BUGs when
> >>>eventually unmounting the server connection:
> >>>
> >>>    [18645.013550] BUG: Dentry ffff888140590ba0{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
> >>>    [18645.789274] VFS: Busy inodes after unmount of cifs (cifs)
> >>>
> >>>Based on bisection, these issues started with the lease directory cache
> >>>handling introduced in commit ebe98f1447bb ("cifs: enable caching of
> >>>directories for which a lease is held"), and go away if I mount with
> >>>'nohandlecache'.  I started seeing these on Debian Bookworm stable kernel
> >>>(v6.1.x), but the issues persist even in current git versions.  I think the
> >>>situation was improved (occurrence frequency went down) with
> >>>commit 5c86919455c1 ("smb: client: fix use-after-free in
> >>>smb2_query_info_compound()").
> >>>
> >>>
> >>>I'm able to reproduce the "Dentry still in use" errors by connecting to an
> >>>actively-used SMB share (the server organically generates lease breaks) and
> >>>leaving these running for 'a while':
> >>>
> >>>- while true; do cd ~; sleep 1; for i in {1..3}; do cd /mnt/test/subdir; echo $PWD; sleep 1; cd ..; echo $PWD; sleep 1; done; echo ...; done
> >>>- while true; do iptables -F OUTPUT; mount -t cifs -a; for _ in {0..2}; do ls /mnt/test/subdir/ | wc -l; done; iptables -I OUTPUT -p tcp --dport 445 -j DROP; sleep 10; echo "unmounting"; umount -l -t cifs -a; echo "done unmounting"; sleep 20; echo "recovering"; iptables -F OUTPUT; sleep 10; done
> >>>
> >>>('a while' is anywhere from 10 minutes to overnight. Also, it's not the
> >>>cleanest reproducer, but I stopped iterating once I had something that was
> >>>even remotely reliable for me...)
> >>>
> >>>This series attempts to fix these, as well as a use-after-free that could
> >>>occur because open_cached_dir() explicitly frees the cached_fid, rather than
> >>>relying on reference counting.
> >>>
> >>>
> >>>The last patch in this series (smb: During umount, flush any pending lease
> >>>breaks for cached dirs) is a work-in-progress, and should not be taken as-is.
> >>>The issue there:
> >>>
> >>>Unmounting an SMB share (cifs_kill_sb) can race with a queued lease break from
> >>>the server for a cached directory.  When this happens, the cfid is removed
> >>>from the linked list, so close_all_cached_dirs() cannot drop the dentry.  If
> >>>cifs_kill_sb continues on before the queued work puts the dentry, we trigger
> >>>the "Dentry still in use" BUG splat.  Flushing the cifsiod_wq seems to help
> >>>this, but some thoughts:
> >>>
> >>>1. cifsiod_wq is a global workqueue, rather than per-mount.  Flushing the
> >>>   entire workqueue is potentially doing more work than necessary.  Should
> >>>   there be a workqueue that's more appropriately scoped?
> >>>2. With an unresponsive server, this causes umount (even umount -l) to hang
> >>>   (waiting for SMB2_close calls), and when I test with backports on a 6.1
> >>>   kernel, appears to cause a deadlock between kill_sb and some cifs
> >>>   reconnection code calling iterate_supers_type.  (Pretty sure the deadlock
> >>>   was addressed by changes to fs/super.c, so not really an SMB problem, but
> >>>   just an indication that flush_workqueue() isn't the right solution).
> >>>3. Should cached_dir_lease_break() drop the dentry before queueing work
> >>>   (and if so, is it OK to do this under the spinlock, or should the spinlock
> >>>   be dropped first)?
> >>>4. Related to #3 -- shouldn't close_all_cached_dirs() be holding the spinlock
> >>>   while looping?
> >>>
> >>>
> >>>Lastly, patches 2, 3, and 5 (in its final form) are beneficial going back to
> >>>v6.1 for stable, but it's not a clean backport because some other important
> >>>fixes (commit 5c86919455c1 ("smb: client: fix use-after-free in
> >>>smb2_query_info_compound()")) weren't picked up.
> >>>
> >>>Paul Aurich (5):
> >>>  smb: cached directories can be more than root file handle
> >>>  smb: Don't leak cfid when reconnect races with open_cached_dir
> >>>  smb: prevent use-after-free due to open_cached_dir error paths
> >>>  smb: No need to wait for work when cleaning up cached directories
> >>>  smb: During umount, flush any pending lease breaks for cached dirs
> >>>
> >>> fs/smb/client/cached_dir.c | 106 ++++++++++++++++---------------------
> >>> 1 file changed, 47 insertions(+), 59 deletions(-)
> >>>
> >>>--
> >>>2.45.2
> >>>
> >>>
> >>
> >>
> >>--
> >>Thanks,
> >>
> >>Steve
> >
> >



--
Thanks,

Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting
  2024-11-10  0:49       ` Steve French
@ 2024-11-13 19:08         ` Paul Aurich
  2024-11-18 17:40           ` Steve French
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Aurich @ 2024-11-13 19:08 UTC (permalink / raw)
  To: Steve French
  Cc: Enzo Matsumiya, linux-cifs, Steve French, Paulo Alcantara,
	Ronnie Sahlberg, Shyam Prasad N, Tom Talpey, Bharath SM,
	Ralph Böhme

I'm working on a v2 set of patches that I hope will address this (I'm 
guessing it's somehow caused by the flush_workqueue() call, although to be 
honest, I have no idea).

My testing has also turned up another issue that needs fixing (the laundromat 
thread can also race with cifs_kill_sb, resulting in yet another 'Dentry still 
in use' oops).

~Paul

On 2024-11-09 18:49:54 -0600, Steve French wrote:
>Running the buildbot on these against Windows with directory leases
>enabled (ie not using "nohandlecache") I see a crash in generic/246 in
>cifs_mount (see call trace at end of the email for more details). The
>series does improve things, it fixes the oops that would cause e.g.
>generic/048 to crash e.g. it fixes this one
>
>[ 2039.224037] BUG: Dentry 00000000114e5018{i=6000000019c6c,n=/}
>still in use (2) [unmount of cifs cifs]
>[ 2039.224086] WARNING: CPU: 1 PID: 123637 at fs/dcache.c:1536
>umount_check+0xc3/0xf0
>[ 2039.224112] Modules linked in: cifs ccm cmac nls_utf8 cifs_arc4
>nls_ucs2_utils cifs_md4 rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver
>nfs lockd grace netfs nf_conntrack_netbios_ns nf_conntrack_broadcast
>nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
>nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat
>nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle
>ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack
>nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw
>iptable_security ip_set ebtable_filter ebtables ip6table_filter
>ip6_tables iptable_filter ip_tables sunrpc kvm_intel kvm virtio_net
>net_failover failover virtio_balloon loop fuse dm_multipath nfnetlink
>zram xfs bochs drm_vram_helper drm_ttm_helper ttm drm_kms_helper
>crct10dif_pclmul crc32_pclmul crc32c_intel drm ghash_clmulni_intel
>sha512_ssse3 sha256_ssse3 floppy virtio_blk sha1_ssse3 qemu_fw_cfg
>virtio_console [last unloaded: cifs]
>[ 2039.224382] CPU: 1 UID: 0 PID: 123637 Comm: rm Not tainted 6.12.0-rc6 #1
>[ 2039.224390] Hardware name: Red Hat KVM, BIOS 1.16.3-2.el9 04/01/2014
>[ 2039.224395] RIP: 0010:umount_check+0xc3/0xf0
>[ 2039.224401] Code: db 74 0d 48 8d 7b 40 e8 9b e3 f5 ff 48 8b 53 40
>41 55 4d 89 f1 45 89 e0 48 89 e9 48 89 ee 48 c7 c7 80 74 ba a2 e8 dd
>e8 a2 ff <0f> 0b 58 31 c0 5b 5d 41 5c 41 5d 41 5e c3 cc cc cc cc 41 83
>fc 01
>[ 2039.224408] RSP: 0018:ff11000128bcf968 EFLAGS: 00010282
>[ 2039.224416] RAX: dffffc0000000000 RBX: ff11000110ba3d80 RCX: 0000000000000027
>[ 2039.224422] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ff110004cb0b1a08
>[ 2039.224427] RBP: ff11000112ebb8f8 R08: ffffffffa13f759e R09: ffe21c0099616341
>[ 2039.224432] R10: ff110004cb0b1a0b R11: 0000000000000001 R12: 0000000000000002
>[ 2039.224437] R13: ff1100013fe0c668 R14: ffffffffc1dee340 R15: ffffffffc2094200
>[ 2039.224442] FS:  00007fdc92c66740(0000) GS:ff110004cb080000(0000)
>knlGS:0000000000000000
>[ 2039.224447] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>[ 2039.224452] CR2: 0000556c6a58d488 CR3: 00000001229ec006 CR4: 0000000000373ef0
>[ 2039.224465] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>[ 2039.224470] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>[ 2039.224475] Call Trace:
>[ 2039.224479]  <TASK>
>[ 2039.224485]  ? __warn+0xa9/0x220
>[ 2039.224497]  ? umount_check+0xc3/0xf0
>[ 2039.224506]  ? report_bug+0x1d4/0x1e0
>[ 2039.224521]  ? handle_bug+0x5b/0xa0
>[ 2039.224529]  ? exc_invalid_op+0x18/0x50
>[ 2039.224537]  ? asm_exc_invalid_op+0x1a/0x20
>[ 2039.224556]  ? irq_work_claim+0x1e/0x40
>[ 2039.224568]  ? umount_check+0xc3/0xf0
>[ 2039.224577]  ? umount_check+0xc3/0xf0
>[ 2039.224586]  ? __pfx_umount_check+0x10/0x10
>[ 2039.224594]  d_walk+0x71/0x4e0
>[ 2039.224604]  ? d_walk+0x4b/0x4e0
>[ 2039.224616]  shrink_dcache_for_umount+0x6d/0x220
>[ 2039.224629]  generic_shutdown_super+0x4a/0x1c0
>[ 2039.224640]  kill_anon_super+0x22/0x40
>[ 2039.224649]  cifs_kill_sb+0x78/0x90 [cifs]
>[ 2039.224848]  deactivate_locked_super+0x69/0xf0
>[ 2039.224858]  cifsFileInfo_put_final+0x25a/0x290 [cifs]
>[ 2039.225019]  _cifsFileInfo_put+0x5f9/0x770 [cifs]
>[ 2039.225172]  ? __pfx__cifsFileInfo_put+0x10/0x10 [cifs]
>[ 2039.225319]  ? mark_held_locks+0x65/0x90
>[ 2039.225339]  ? kfree+0x1e5/0x490
>[ 2039.225349]  ? cifs_close_deferred_file_under_dentry+0x2e5/0x350 [cifs]
>[ 2039.225494]  cifs_close_deferred_file_under_dentry+0x26d/0x350 [cifs]
>[ 2039.225639]  ? __pfx_cifs_close_deferred_file_under_dentry+0x10/0x10 [cifs]
>[ 2039.225821]  ? cifs_sb_master_tcon+0x23/0x30 [cifs]
>[ 2039.225970]  cifs_unlink+0x21d/0x7b0 [cifs]
>[ 2039.226121]  vfs_unlink+0x1bf/0x4a0
>[ 2039.226131]  ? lookup_one_qstr_excl+0x24/0xd0
>[ 2039.226143]  do_unlinkat+0x380/0x450
>[ 2039.226154]  ? __pfx_do_unlinkat+0x10/0x10
>[ 2039.226161]  ? kasan_save_track+0x14/0x30
>[ 2039.226171]  ? rcu_is_watching+0x20/0x50
>[ 2039.226180]  ? strncpy_from_user+0x126/0x180
>[ 2039.226201]  __x64_sys_unlinkat+0x5e/0xa0
>[ 2039.226211]  do_syscall_64+0x75/0x180
>[ 2039.226222]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>[ 2039.226230] RIP: 0033:0x7fdc92d7784b
>[ 2039.226237] Code: 77 05 c3 0f 1f 40 00 48 8b 15 c9 85 0d 00 f7 d8
>64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 07 01 00
>00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 9d 85 0d 00 f7 d8 64 89
>01 48
>[ 2039.226244] RSP: 002b:00007ffd51debe58 EFLAGS: 00000206 ORIG_RAX:
>0000000000000107
>[ 2039.226253] RAX: ffffffffffffffda RBX: 0000556c6a58d5a0 RCX: 00007fdc92d7784b
>[ 2039.226258] RDX: 0000000000000000 RSI: 0000556c6a58c380 RDI: 00000000ffffff9c
>[ 2039.226263] RBP: 00007ffd51debf20 R08: 0000556c6a58c380 R09: 00007ffd51debf7c
>[ 2039.226268] R10: 0000556c6a58d7d0 R11: 0000000000000206 R12: 0000556c6a58c2f0
>[ 2039.226273] R13: 0000000000000000 R14: 00007ffd51debf80 R15: 0000000000000000
>[ 2039.226297]  </TASK>
>[ 2039.226301] irq event stamp: 5933
>[ 2039.226305] hardirqs last  enabled at (5939): [<ffffffffa122e2de>]
>__up_console_sem+0x5e/0x70
>[ 2039.226314] hardirqs last disabled at (5944): [<ffffffffa122e2c3>]
>__up_console_sem+0x43/0x70
>[ 2039.226320] softirqs last  enabled at (3096): [<ffffffffa11340ce>]
>__irq_exit_rcu+0xfe/0x120
>[ 2039.226327] softirqs last disabled at (2941): [<ffffffffa11340ce>]
>__irq_exit_rcu+0xfe/0x120
>[ 2039.226333] ---[ end trace 0000000000000000 ]---
>[ 2040.931627] VFS: Busy inodes after unmount of cifs (cifs)
>[ 2040.931693] ------------[ cut here ]------------
>[ 2040.931699] kernel BUG at fs/super.c:650!
>[ 2040.932770] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
>[ 2040.932846] CPU: 1 UID: 0 PID: 123637 Comm: rm Tainted: G        W
>        6.12.0-rc6 #1
>[ 2040.932936] Tainted: [W]=WARN
>[ 2040.932969] Hardware name: Red Hat KVM, BIOS 1.16.3-2.el9 04/01/2014
>[ 2040.933028] RIP: 0010:generic_shutdown_super+0x1b7/0x1c0
>[ 2040.933086] Code: 7b 28 e8 cc ca f8 ff 48 8b 6b 28 48 89 ef e8 c0
>ca f8 ff 48 8b 55 00 48 8d b3 68 06 00 00 48 c7 c7 60 14 ba a2 e8 d9
>6c b6 ff <0f> 0b 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90
>90 90
>[ 2040.933252] RSP: 0018:ff11000128bcfa38 EFLAGS: 00010282
>[ 2040.933305] RAX: 000000000000002d RBX: ff1100013fe0c000 RCX: ffffffffa12c915e
>[ 2040.933370] RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ff110004cb0b7080
>[ 2040.933436] RBP: ffffffffc20529a0 R08: 0000000000000001 R09: ffe21c0025179f0f
>[ 2040.933500] R10: ff11000128bcf87f R11: 0000000000000001 R12: ff1100013fe0c9c0
>[ 2040.933565] R13: ff1100013fe0c780 R14: ff11000168113378 R15: ffffffffc2094200
>[ 2040.933630] FS:  00007fdc92c66740(0000) GS:ff110004cb080000(0000)
>knlGS:0000000000000000
>[ 2040.933703] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>[ 2040.933757] CR2: 0000556c6a58d488 CR3: 00000001229ec006 CR4: 0000000000373ef0
>[ 2040.933828] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>[ 2040.933905] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>[ 2040.933972] Call Trace:
>[ 2040.934000]  <TASK>
>[ 2040.934024]  ? die+0x37/0x90
>[ 2040.934058]  ? do_trap+0x133/0x230
>[ 2040.934094]  ? generic_shutdown_super+0x1b7/0x1c0
>[ 2040.934144]  ? do_error_trap+0x94/0x130
>[ 2040.934182]  ? generic_shutdown_super+0x1b7/0x1c0
>[ 2040.934229]  ? generic_shutdown_super+0x1b7/0x1c0
>[ 2040.934277]  ? handle_invalid_op+0x2c/0x40
>[ 2040.934318]  ? generic_shutdown_super+0x1b7/0x1c0
>[ 2040.934364]  ? exc_invalid_op+0x2f/0x50
>[ 2040.934405]  ? asm_exc_invalid_op+0x1a/0x20
>[ 2040.934453]  ? tick_nohz_tick_stopped+0x1e/0x40
>[ 2040.934502]  ? generic_shutdown_super+0x1b7/0x1c0
>[ 2040.934551]  ? generic_shutdown_super+0x1b7/0x1c0
>[ 2040.934600]  kill_anon_super+0x22/0x40
>[ 2040.934641]  cifs_kill_sb+0x78/0x90 [cifs]
>[ 2040.934851]  deactivate_locked_super+0x69/0xf0
>[ 2040.934915]  cifsFileInfo_put_final+0x25a/0x290 [cifs]
>[ 2040.935125]  _cifsFileInfo_put+0x5f9/0x770 [cifs]
>[ 2040.935331]  ? __pfx__cifsFileInfo_put+0x10/0x10 [cifs]
>[ 2040.935537]  ? mark_held_locks+0x65/0x90
>[ 2040.935581]  ? kfree+0x1e5/0x490
>[ 2040.937096]  ? cifs_close_deferred_file_under_dentry+0x2e5/0x350 [cifs]
>[ 2040.938778]  cifs_close_deferred_file_under_dentry+0x26d/0x350 [cifs]
>[ 2040.940478]  ? __pfx_cifs_close_deferred_file_under_dentry+0x10/0x10 [cifs]
>[ 2040.942193]  ? cifs_sb_master_tcon+0x23/0x30 [cifs]
>[ 2040.943865]  cifs_unlink+0x21d/0x7b0 [cifs]
>[ 2040.945527]  vfs_unlink+0x1bf/0x4a0
>[ 2040.947008]  ? lookup_one_qstr_excl+0x24/0xd0
>[ 2040.948451]  do_unlinkat+0x380/0x450
>[ 2040.949889]  ? __pfx_do_unlinkat+0x10/0x10
>[ 2040.951306]  ? kasan_save_track+0x14/0x30
>[ 2040.952708]  ? rcu_is_watching+0x20/0x50
>[ 2040.954135]  ? strncpy_from_user+0x126/0x180
>[ 2040.955494]  __x64_sys_unlinkat+0x5e/0xa0
>[ 2040.956779]  do_syscall_64+0x75/0x180
>[ 2040.958053]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
>
>
>but I did see a crash in generic/246 with current mainline + the five
>directory lease patches (see below).  Any ideas?
>
>[... KASAN slab-use-after-free report snipped; the full trace appears in the original message above ...]
>
>
>On Fri, Nov 8, 2024 at 5:12 PM Enzo Matsumiya <ematsumiya@suse.de> wrote:
>>
>> On 11/08, Paul Aurich wrote:
>> >No, this series doesn't try to address that. I was just focused on the
>> >issues around the lifecycle of the cfids and the dentry:s.
>>
>> I'll be reviewing the patches next week, but for now I can say +1.
>>
>> We've been debugging this issue for the past month or so, and it's been
>> quite hard to understand/debug who was racing who.
>>
>> The 'dentry still in use' bug seems to appear only for the root dentry,
>> whereas cached_fids for subdirectories can sometimes end up duplicated,
>> leaving one dangling; in the end an underflow + double list_del()
>> happens to it, e.g.:
>>
>> (this is coming from cifs_readdir())
>> crash> cached_fid.entry,cfids,path,has_lease,is_open,on_list,time,tcon,refcount 0xffff892f4b5b3e00
>>    entry = {
>>      next = 0xffff892e053ed400,
>>      prev = 0xffff892d8e3ecb88
>>    },
>>    cfids = 0xffff892d8e3ecb80,
>>    path = 0xffff892f48b7b3b0 "\\dir1\\dir2\\dir3",
>>    has_lease = 0x0,
>>    is_open = 0x0,
>>    on_list = 0x1,
>>    time = 0x0,
>>    tcon = 0x0,
>>    refcount = { ... counter = 0x2 ... }
>>
>> (this is at the crashing frame on smb2_close_cached_fid())
>> crash> cached_fid.entry,cfids,path,has_lease,is_open,on_list,time,tcon,refcount ffff892e053ee000
>>    entry = {
>>      next = 0xdead000000000100,
>>      prev = 0xdead000000000122
>>    },
>>    cfids = 0xffff892d8e3ecb80,
>>    path = 0xffff892f825ced30 "\\dir1\\dir2\\dir3",
>>    has_lease = 0x0,
>>    is_open = 0x1,
>>    on_list = 0x1,
>>    time = 0x1040998fc,
>>    tcon = 0xffff892fe0b4d000,
>>    refcount = { ... counter = 0x0 ... }
>>
>> You can see that the second one had already been deleted (refcount=0,
>> ->entry is poisoned), while the first one hasn't been filled in by
>> open_cached_dir() yet but already has 2 references to it.
>>
>> A reliable reproducer I found for this was running xfstests '-g
>> generic/dir' in one terminal, and generic/072 some seconds later.
>>
>> (in the beginning I thought that a reconnect was required to trigger
>> this bug, but any kind of interruption (I could trigger it with ctrl-c
>> mid-xfstests a few times) can trigger it)
>>
>> actimeo >= 0 seems to play a role as well, because things can get quite
>> complicated (unsure if problematic though) with a callchain such as:
>>
>> open_cached_dir
>>    -> path_to_dentry
>>      -> lookup_positive_unlocked
>>        -> lookup_dcache
>>          -> cifs_d_revalidate (dentry->d_op->d_revalidate)
>>            -> cifs_revalidate_dentry
>>              -> cifs_revalidate_dentry_attr
>>                -> cifs_dentry_needs_reval
>>                  -> open_cached_dir_by_dentry
>>
>>
>> Anyway, thanks a lot for your patches, Paul.  Like I said, I'll be
>> testing + reviewing them soon.
>>
>>
>> Cheers,
>>
>> Enzo
>>
>> >On 2024-11-08 16:39:03 -0600, Steve French wrote:
>> >>The perf improvement allowing caching of directories (helping "ls"
>> >>then "ls" again for same dir), not just the perf improvement with
>> >>"ls" then "stat", could be very important.
>> >>
>> >>Did this series also address Ralph's point about missing requesting
>> >>"H" (handle caching) flag when requesting directory leases from the
>> >>server?
>> >>
>> >>On Fri, Nov 8, 2024 at 4:35 PM Paul Aurich <paul@darkrain42.org> wrote:
>> >>>
>> >>>The SMB client cached directory functionality has a few problems around
>> >>>flaky/lost server connections, which manifest as a pair of BUGs when
>> >>>eventually unmounting the server connection:
>> >>>
>> >>>    [18645.013550] BUG: Dentry ffff888140590ba0{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
>> >>>    [18645.789274] VFS: Busy inodes after unmount of cifs (cifs)
>> >>>
>> >>>Based on bisection, these issues started with the lease directory cache
>> >>>handling introduced in commit ebe98f1447bb ("cifs: enable caching of
>> >>>directories for which a lease is held"), and go away if I mount with
>> >>>'nohandlecache'.  I started seeing these on Debian Bookworm stable kernel
>> >>>(v6.1.x), but the issues persist even in current git versions.  I think the
>> >>>situation was improved (occurrence frequency went down) with
>> >>>commit 5c86919455c1 ("smb: client: fix use-after-free in
>> >>>smb2_query_info_compound()").
>> >>>
>> >>>
>> >>>I'm able to reproduce the "Dentry still in use" errors by connecting to an
>> >>>actively-used SMB share (the server organically generates lease breaks) and
>> >>>leaving these running for 'a while':
>> >>>
>> >>>- while true; do cd ~; sleep 1; for i in {1..3}; do cd /mnt/test/subdir; echo $PWD; sleep 1; cd ..; echo $PWD; sleep 1; done; echo ...; done
>> >>>- while true; do iptables -F OUTPUT; mount -t cifs -a; for _ in {0..2}; do ls /mnt/test/subdir/ | wc -l; done; iptables -I OUTPUT -p tcp --dport 445 -j DROP; sleep 10; echo "unmounting"; umount -l -t cifs -a; echo "done unmounting"; sleep 20; echo "recovering"; iptables -F OUTPUT; sleep 10; done
>> >>>
>> >>>('a while' is anywhere from 10 minutes to overnight. Also, it's not the
>> >>>cleanest reproducer, but I stopped iterating once I had something that was
>> >>>even remotely reliable for me...)
>> >>>
>> >>>This series attempts to fix these, as well as a use-after-free that could
>> >>>occur because open_cached_dir() explicitly frees the cached_fid, rather than
>> >>>relying on reference counting.
>> >>>
>> >>>
>> >>>The last patch in this series (smb: During umount, flush any pending lease
>> >>>breaks for cached dirs) is a work-in-progress, and should not be taken as-is.
>> >>>The issue there:
>> >>>
>> >>>Unmounting an SMB share (cifs_kill_sb) can race with a queued lease break from
>> >>>the server for a cached directory.  When this happens, the cfid is removed
>> >>>from the linked list, so close_all_cached_dirs() cannot drop the dentry.  If
>> >>>cifs_kill_sb continues on before the queued work puts the dentry, we trigger
>> >>>the "Dentry still in use" BUG splat.  Flushing the cifsiod_wq seems to help
>> >>>this, but some thoughts:
>> >>>
>> >>>1. cifsiod_wq is a global workqueue, rather than per-mount.  Flushing the
>> >>>   entire workqueue is potentially doing more work than necessary.  Should
>> >>>   there be a workqueue that's more appropriately scoped?
>> >>>2. With an unresponsive server, this causes umount (even umount -l) to hang
>> >>>   (waiting for SMB2_close calls), and when I test with backports on a 6.1
>> >>>   kernel, appears to cause a deadlock between kill_sb and some cifs
>> >>>   reconnection code calling iterate_supers_type.  (Pretty sure the deadlock
>> >>>   was addressed by changes to fs/super.c, so not really an SMB problem, but
>> >>>   just an indication that flush_workqueue isn't the right solution).
>> >>>3. Should cached_dir_lease_break() drop the dentry before queueing work
>> >>>   (and if so, is it OK to do this under the spinlock, or should the spinlock
>> >>>   be dropped first)?
>> >>>4. Related to #3 -- shouldn't close_all_cached_dirs() be holding the spinlock
>> >>>   while looping?
>> >>>
>> >>>
>> >>>Lastly, patches 2, 3, and 5 (in its final form) are beneficial going back to
>> >>>v6.1 for stable, but it's not a clean backport because some other important
>> >>>fixes (commit 5c86919455c1 ("smb: client: fix use-after-free in
>> >>>smb2_query_info_compound()")) weren't picked up.
>> >>>
>> >>>Paul Aurich (5):
>> >>>  smb: cached directories can be more than root file handle
>> >>>  smb: Don't leak cfid when reconnect races with open_cached_dir
>> >>>  smb: prevent use-after-free due to open_cached_dir error paths
>> >>>  smb: No need to wait for work when cleaning up cached directories
>> >>>  smb: During umount, flush any pending lease breaks for cached dirs
>> >>>
>> >>> fs/smb/client/cached_dir.c | 106 ++++++++++++++++---------------------
>> >>> 1 file changed, 47 insertions(+), 59 deletions(-)
>> >>>
>> >>>--
>> >>>2.45.2
>> >>>
>> >>>
>> >>
>> >>
>> >>--
>> >>Thanks,
>> >>
>> >>Steve
>> >
>> >
>
>
>
>--
>Thanks,
>
>Steve
>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting
  2024-11-13 19:08         ` Paul Aurich
@ 2024-11-18 17:40           ` Steve French
  0 siblings, 0 replies; 13+ messages in thread
From: Steve French @ 2024-11-18 17:40 UTC (permalink / raw)
  To: Paul Aurich; +Cc: CIFS, Ralph Böhme, ronnie sahlberg

Let me know if any of the patches are ready for merging, or if there are
any additional patches.  I will add at least one to fix the R vs RH dir
lease issue Ralph pointed out.

On Wed, Nov 13, 2024 at 1:08 PM Paul Aurich <paul@darkrain42.org> wrote:
>
> I'm working on a v2 set of patches that I hope will address this (I'm
> guessing it's somehow caused by the flush_workqueue() call, although to be
> honest, I have no idea).
>
> My testing has also turned up another issue that needs fixing (the laundromat
> thread can also race with cifs_kill_sb, resulting in yet another 'Dentry still
> in use' oops).
>
> ~Paul
>
> On 2024-11-09 18:49:54 -0600, Steve French wrote:
> >Running the buildbot on these against Windows with directory leases
> >enabled (ie not using "nohandlecache") I see a crash in generic/246 in
> >cifs_mount (see call trace at end of the email for more details). The
> >series does improve things, it fixes the oops that would cause e.g.
> >generic/048 to crash e.g. it fixes this one
> >
> >[ 2039.224037] BUG: Dentry 00000000114e5018{i=6000000019c6c,n=/}
> >still in use (2) [unmount of cifs cifs]
> >[ 2039.224086] WARNING: CPU: 1 PID: 123637 at fs/dcache.c:1536
> >umount_check+0xc3/0xf0
> >[ 2039.224112] Modules linked in: cifs ccm cmac nls_utf8 cifs_arc4
> >nls_ucs2_utils cifs_md4 rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver
> >nfs lockd grace netfs nf_conntrack_netbios_ns nf_conntrack_broadcast
> >nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
> >nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat
> >nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle
> >ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack
> >nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw
> >iptable_security ip_set ebtable_filter ebtables ip6table_filter
> >ip6_tables iptable_filter ip_tables sunrpc kvm_intel kvm virtio_net
> >net_failover failover virtio_balloon loop fuse dm_multipath nfnetlink
> >zram xfs bochs drm_vram_helper drm_ttm_helper ttm drm_kms_helper
> >crct10dif_pclmul crc32_pclmul crc32c_intel drm ghash_clmulni_intel
> >sha512_ssse3 sha256_ssse3 floppy virtio_blk sha1_ssse3 qemu_fw_cfg
> >virtio_console [last unloaded: cifs]
> >[ 2039.224382] CPU: 1 UID: 0 PID: 123637 Comm: rm Not tainted 6.12.0-rc6 #1
> >[ 2039.224390] Hardware name: Red Hat KVM, BIOS 1.16.3-2.el9 04/01/2014
> >[ 2039.224395] RIP: 0010:umount_check+0xc3/0xf0
> >[ 2039.224401] Code: db 74 0d 48 8d 7b 40 e8 9b e3 f5 ff 48 8b 53 40
> >41 55 4d 89 f1 45 89 e0 48 89 e9 48 89 ee 48 c7 c7 80 74 ba a2 e8 dd
> >e8 a2 ff <0f> 0b 58 31 c0 5b 5d 41 5c 41 5d 41 5e c3 cc cc cc cc 41 83
> >fc 01
> >[ 2039.224408] RSP: 0018:ff11000128bcf968 EFLAGS: 00010282
> >[ 2039.224416] RAX: dffffc0000000000 RBX: ff11000110ba3d80 RCX: 0000000000000027
> >[ 2039.224422] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ff110004cb0b1a08
> >[ 2039.224427] RBP: ff11000112ebb8f8 R08: ffffffffa13f759e R09: ffe21c0099616341
> >[ 2039.224432] R10: ff110004cb0b1a0b R11: 0000000000000001 R12: 0000000000000002
> >[ 2039.224437] R13: ff1100013fe0c668 R14: ffffffffc1dee340 R15: ffffffffc2094200
> >[ 2039.224442] FS:  00007fdc92c66740(0000) GS:ff110004cb080000(0000)
> >knlGS:0000000000000000
> >[ 2039.224447] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >[ 2039.224452] CR2: 0000556c6a58d488 CR3: 00000001229ec006 CR4: 0000000000373ef0
> >[ 2039.224465] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >[ 2039.224470] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >[ 2039.224475] Call Trace:
> >[ 2039.224479]  <TASK>
> >[ 2039.224485]  ? __warn+0xa9/0x220
> >[ 2039.224497]  ? umount_check+0xc3/0xf0
> >[ 2039.224506]  ? report_bug+0x1d4/0x1e0
> >[ 2039.224521]  ? handle_bug+0x5b/0xa0
> >[ 2039.224529]  ? exc_invalid_op+0x18/0x50
> >[ 2039.224537]  ? asm_exc_invalid_op+0x1a/0x20
> >[ 2039.224556]  ? irq_work_claim+0x1e/0x40
> >[ 2039.224568]  ? umount_check+0xc3/0xf0
> >[ 2039.224577]  ? umount_check+0xc3/0xf0
> >[ 2039.224586]  ? __pfx_umount_check+0x10/0x10
> >[ 2039.224594]  d_walk+0x71/0x4e0
> >[ 2039.224604]  ? d_walk+0x4b/0x4e0
> >[ 2039.224616]  shrink_dcache_for_umount+0x6d/0x220
> >[ 2039.224629]  generic_shutdown_super+0x4a/0x1c0
> >[ 2039.224640]  kill_anon_super+0x22/0x40
> >[ 2039.224649]  cifs_kill_sb+0x78/0x90 [cifs]
> >[ 2039.224848]  deactivate_locked_super+0x69/0xf0
> >[ 2039.224858]  cifsFileInfo_put_final+0x25a/0x290 [cifs]
> >[ 2039.225019]  _cifsFileInfo_put+0x5f9/0x770 [cifs]
> >[ 2039.225172]  ? __pfx__cifsFileInfo_put+0x10/0x10 [cifs]
> >[ 2039.225319]  ? mark_held_locks+0x65/0x90
> >[ 2039.225339]  ? kfree+0x1e5/0x490
> >[ 2039.225349]  ? cifs_close_deferred_file_under_dentry+0x2e5/0x350 [cifs]
> >[ 2039.225494]  cifs_close_deferred_file_under_dentry+0x26d/0x350 [cifs]
> >[ 2039.225639]  ? __pfx_cifs_close_deferred_file_under_dentry+0x10/0x10 [cifs]
> >[ 2039.225821]  ? cifs_sb_master_tcon+0x23/0x30 [cifs]
> >[ 2039.225970]  cifs_unlink+0x21d/0x7b0 [cifs]
> >[ 2039.226121]  vfs_unlink+0x1bf/0x4a0
> >[ 2039.226131]  ? lookup_one_qstr_excl+0x24/0xd0
> >[ 2039.226143]  do_unlinkat+0x380/0x450
> >[ 2039.226154]  ? __pfx_do_unlinkat+0x10/0x10
> >[ 2039.226161]  ? kasan_save_track+0x14/0x30
> >[ 2039.226171]  ? rcu_is_watching+0x20/0x50
> >[ 2039.226180]  ? strncpy_from_user+0x126/0x180
> >[ 2039.226201]  __x64_sys_unlinkat+0x5e/0xa0
> >[ 2039.226211]  do_syscall_64+0x75/0x180
> >[ 2039.226222]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >[ 2039.226230] RIP: 0033:0x7fdc92d7784b
> >[ 2039.226237] Code: 77 05 c3 0f 1f 40 00 48 8b 15 c9 85 0d 00 f7 d8
> >64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 07 01 00
> >00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 9d 85 0d 00 f7 d8 64 89
> >01 48
> >[ 2039.226244] RSP: 002b:00007ffd51debe58 EFLAGS: 00000206 ORIG_RAX:
> >0000000000000107
> >[ 2039.226253] RAX: ffffffffffffffda RBX: 0000556c6a58d5a0 RCX: 00007fdc92d7784b
> >[ 2039.226258] RDX: 0000000000000000 RSI: 0000556c6a58c380 RDI: 00000000ffffff9c
> >[ 2039.226263] RBP: 00007ffd51debf20 R08: 0000556c6a58c380 R09: 00007ffd51debf7c
> >[ 2039.226268] R10: 0000556c6a58d7d0 R11: 0000000000000206 R12: 0000556c6a58c2f0
> >[ 2039.226273] R13: 0000000000000000 R14: 00007ffd51debf80 R15: 0000000000000000
> >[ 2039.226297]  </TASK>
> >[ 2039.226301] irq event stamp: 5933
> >[ 2039.226305] hardirqs last  enabled at (5939): [<ffffffffa122e2de>]
> >__up_console_sem+0x5e/0x70
> >[ 2039.226314] hardirqs last disabled at (5944): [<ffffffffa122e2c3>]
> >__up_console_sem+0x43/0x70
> >[ 2039.226320] softirqs last  enabled at (3096): [<ffffffffa11340ce>]
> >__irq_exit_rcu+0xfe/0x120
> >[ 2039.226327] softirqs last disabled at (2941): [<ffffffffa11340ce>]
> >__irq_exit_rcu+0xfe/0x120
> >[ 2039.226333] ---[ end trace 0000000000000000 ]---
> >[ 2040.931627] VFS: Busy inodes after unmount of cifs (cifs)
> >[ 2040.931693] ------------[ cut here ]------------
> >[ 2040.931699] kernel BUG at fs/super.c:650!
> >[ 2040.932770] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
> >[ 2040.932846] CPU: 1 UID: 0 PID: 123637 Comm: rm Tainted: G        W
> >        6.12.0-rc6 #1
> >[ 2040.932936] Tainted: [W]=WARN
> >[ 2040.932969] Hardware name: Red Hat KVM, BIOS 1.16.3-2.el9 04/01/2014
> >[ 2040.933028] RIP: 0010:generic_shutdown_super+0x1b7/0x1c0
> >[ 2040.933086] Code: 7b 28 e8 cc ca f8 ff 48 8b 6b 28 48 89 ef e8 c0
> >ca f8 ff 48 8b 55 00 48 8d b3 68 06 00 00 48 c7 c7 60 14 ba a2 e8 d9
> >6c b6 ff <0f> 0b 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90
> >90 90
> >[ 2040.933252] RSP: 0018:ff11000128bcfa38 EFLAGS: 00010282
> >[ 2040.933305] RAX: 000000000000002d RBX: ff1100013fe0c000 RCX: ffffffffa12c915e
> >[ 2040.933370] RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ff110004cb0b7080
> >[ 2040.933436] RBP: ffffffffc20529a0 R08: 0000000000000001 R09: ffe21c0025179f0f
> >[ 2040.933500] R10: ff11000128bcf87f R11: 0000000000000001 R12: ff1100013fe0c9c0
> >[ 2040.933565] R13: ff1100013fe0c780 R14: ff11000168113378 R15: ffffffffc2094200
> >[ 2040.933630] FS:  00007fdc92c66740(0000) GS:ff110004cb080000(0000)
> >knlGS:0000000000000000
> >[ 2040.933703] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >[ 2040.933757] CR2: 0000556c6a58d488 CR3: 00000001229ec006 CR4: 0000000000373ef0
> >[ 2040.933828] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >[ 2040.933905] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >[ 2040.933972] Call Trace:
> >[ 2040.934000]  <TASK>
> >[ 2040.934024]  ? die+0x37/0x90
> >[ 2040.934058]  ? do_trap+0x133/0x230
> >[ 2040.934094]  ? generic_shutdown_super+0x1b7/0x1c0
> >[ 2040.934144]  ? do_error_trap+0x94/0x130
> >[ 2040.934182]  ? generic_shutdown_super+0x1b7/0x1c0
> >[ 2040.934229]  ? generic_shutdown_super+0x1b7/0x1c0
> >[ 2040.934277]  ? handle_invalid_op+0x2c/0x40
> >[ 2040.934318]  ? generic_shutdown_super+0x1b7/0x1c0
> >[ 2040.934364]  ? exc_invalid_op+0x2f/0x50
> >[ 2040.934405]  ? asm_exc_invalid_op+0x1a/0x20
> >[ 2040.934453]  ? tick_nohz_tick_stopped+0x1e/0x40
> >[ 2040.934502]  ? generic_shutdown_super+0x1b7/0x1c0
> >[ 2040.934551]  ? generic_shutdown_super+0x1b7/0x1c0
> >[ 2040.934600]  kill_anon_super+0x22/0x40
> >[ 2040.934641]  cifs_kill_sb+0x78/0x90 [cifs]
> >[ 2040.934851]  deactivate_locked_super+0x69/0xf0
> >[ 2040.934915]  cifsFileInfo_put_final+0x25a/0x290 [cifs]
> >[ 2040.935125]  _cifsFileInfo_put+0x5f9/0x770 [cifs]
> >[ 2040.935331]  ? __pfx__cifsFileInfo_put+0x10/0x10 [cifs]
> >[ 2040.935537]  ? mark_held_locks+0x65/0x90
> >[ 2040.935581]  ? kfree+0x1e5/0x490
> >[ 2040.937096]  ? cifs_close_deferred_file_under_dentry+0x2e5/0x350 [cifs]
> >[ 2040.938778]  cifs_close_deferred_file_under_dentry+0x26d/0x350 [cifs]
> >[ 2040.940478]  ? __pfx_cifs_close_deferred_file_under_dentry+0x10/0x10 [cifs]
> >[ 2040.942193]  ? cifs_sb_master_tcon+0x23/0x30 [cifs]
> >[ 2040.943865]  cifs_unlink+0x21d/0x7b0 [cifs]
> >[ 2040.945527]  vfs_unlink+0x1bf/0x4a0
> >[ 2040.947008]  ? lookup_one_qstr_excl+0x24/0xd0
> >[ 2040.948451]  do_unlinkat+0x380/0x450
> >[ 2040.949889]  ? __pfx_do_unlinkat+0x10/0x10
> >[ 2040.951306]  ? kasan_save_track+0x14/0x30
> >[ 2040.952708]  ? rcu_is_watching+0x20/0x50
> >[ 2040.954135]  ? strncpy_from_user+0x126/0x180
> >[ 2040.955494]  __x64_sys_unlinkat+0x5e/0xa0
> >[ 2040.956779]  do_syscall_64+0x75/0x180
> >[ 2040.958053]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> >
> >
> >but I did see a crash in generic/246 with current mainline + the five
> >directory lease patches (see below).  Any ideas?
> >
> >[ 4315.917270] CIFS: Attempting to mount //win16.vm.test/Share
> >[ 4315.928088] ==================================================================
> >[ 4315.930402] BUG: KASAN: slab-use-after-free in
> >rwsem_down_write_slowpath+0xa34/0xaf0
> >[ 4315.932655] Read of size 4 at addr ff1100012ad38034 by task mount.cifs/340968
> >
> >[ 4315.937119] CPU: 2 UID: 0 PID: 340968 Comm: mount.cifs Tainted: G
> >   D W          6.12.0-rc6 #1
> >[ 4315.939454] Tainted: [D]=DIE, [W]=WARN
> >[ 4315.941747] Hardware name: Red Hat KVM, BIOS 1.16.3-2.el9 04/01/2014
> >[ 4315.944094] Call Trace:
> >[ 4315.946425]  <TASK>
> >[ 4315.948724]  dump_stack_lvl+0x79/0xb0
> >[ 4315.951046]  print_report+0xcb/0x620
> >[ 4315.953359]  ? __virt_addr_valid+0x19a/0x300
> >[ 4315.955682]  ? rwsem_down_write_slowpath+0xa34/0xaf0
> >[ 4315.957994]  kasan_report+0xbd/0xf0
> >[ 4315.960317]  ? rwsem_down_write_slowpath+0xa34/0xaf0
> >[ 4315.962610]  rwsem_down_write_slowpath+0xa34/0xaf0
> >[ 4315.964835]  ? kasan_save_stack+0x34/0x50
> >[ 4315.967030]  ? __pfx_rwsem_down_write_slowpath+0x10/0x10
> >[ 4315.969221]  ? cifs_mount+0xfb/0x3b0 [cifs]
> >[ 4315.971586]  ? cifs_smb3_do_mount+0x1a5/0xc10 [cifs]
> >[ 4315.974108]  ? smb3_get_tree+0x1f0/0x430 [cifs]
> >[ 4315.976511]  ? rcu_is_watching+0x20/0x50
> >[ 4315.978712]  ? trace_lock_acquire+0x116/0x150
> >[ 4315.980884]  ? lock_acquire+0x40/0x90
> >[ 4315.982989]  ? super_lock+0xea/0x1d0
> >[ 4315.985036]  ? super_lock+0xea/0x1d0
> >[ 4315.986997]  down_write+0x15b/0x160
> >[ 4315.988893]  ? __pfx_down_write+0x10/0x10
> >[ 4315.990751]  ? __mod_timer+0x407/0x590
> >[ 4315.992531]  super_lock+0xea/0x1d0
> >[ 4315.994231]  ? __pfx_super_lock+0x10/0x10
> >[ 4315.995891]  ? __pfx_lock_release+0x10/0x10
> >[ 4315.997528]  ? rcu_is_watching+0x20/0x50
> >[ 4315.999137]  ? lock_release+0xa5/0x3d0
> >[ 4316.000705]  ? cifs_match_super+0x177/0x650 [cifs]
> >[ 4316.002385]  grab_super+0x80/0x1e0
> >[ 4316.003884]  ? __pfx_grab_super+0x10/0x10
> >[ 4316.005351]  ? cifs_put_tlink+0xa1/0xc0 [cifs]
> >[ 4316.006990]  ? cifs_match_super+0x17f/0x650 [cifs]
> >[ 4316.008620]  ? __pfx_cifs_match_super+0x10/0x10 [cifs]
> >[ 4316.010236]  sget+0x121/0x350
> >[ 4316.011694]  ? __pfx_cifs_set_super+0x10/0x10 [cifs]
> >[ 4316.013293]  cifs_smb3_do_mount+0x293/0xc10 [cifs]
> >[ 4316.014933]  ? __pfx___mutex_lock+0x10/0x10
> >[ 4316.016333]  ? cred_has_capability.isra.0+0xd4/0x1a0
> >[ 4316.017752]  ? __pfx_cifs_smb3_do_mount+0x10/0x10 [cifs]
> >[ 4316.019338]  smb3_get_tree+0x1f0/0x430 [cifs]
> >[ 4316.020910]  vfs_get_tree+0x50/0x180
> >[ 4316.022292]  path_mount+0x5d6/0xf20
> >[ 4316.023668]  ? __pfx_path_mount+0x10/0x10
> >[ 4316.024983]  ? user_path_at+0x45/0x60
> >[ 4316.026275]  __x64_sys_mount+0x174/0x1b0
> >[ 4316.027552]  ? __pfx___x64_sys_mount+0x10/0x10
> >[ 4316.028794]  ? rcu_is_watching+0x20/0x50
> >[ 4316.030021]  do_syscall_64+0x75/0x180
> >[ 4316.031226]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >[ 4316.032476] RIP: 0033:0x7f979fa7b8fe
> >[ 4316.033742] Code: 48 8b 0d 1d a5 0c 00 f7 d8 64 89 01 48 83 c8 ff
> >c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00
> >00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ea a4 0c 00 f7 d8 64 89
> >01 48
> >[ 4316.036462] RSP: 002b:00007fffbaa4f598 EFLAGS: 00000246 ORIG_RAX:
> >00000000000000a5
> >[ 4316.037874] RAX: ffffffffffffffda RBX: 0000557684547471 RCX: 00007f979fa7b8fe
> >[ 4316.039284] RDX: 0000557684547471 RSI: 00005576845474d7 RDI: 00007fffbaa50caf
> >[ 4316.040701] RBP: 000000000000000a R08: 00005576847deeb0 R09: 0000000000000000
> >[ 4316.042106] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffbaa50caf
> >[ 4316.043522] R13: 00005576847dff20 R14: 00005576847deeb0 R15: 00007f979fb75000
> >[ 4316.044928]  </TASK>
> >
> >[ 4316.047670] Allocated by task 340820:
> >[ 4316.049036]  kasan_save_stack+0x24/0x50
> >[ 4316.050414]  kasan_save_track+0x14/0x30
> >[ 4316.051771]  __kasan_slab_alloc+0x59/0x70
> >[ 4316.053139]  kmem_cache_alloc_node_noprof+0x116/0x330
> >[ 4316.054529]  copy_process+0x299/0x45e0
> >[ 4316.055915]  kernel_clone+0xf2/0x4b0
> >[ 4316.057299]  __do_sys_clone+0x90/0xb0
> >[ 4316.058708]  do_syscall_64+0x75/0x180
> >[ 4316.060095]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> >[ 4316.062893] Freed by task 0:
> >[ 4316.064283]  kasan_save_stack+0x24/0x50
> >[ 4316.065695]  kasan_save_track+0x14/0x30
> >[ 4316.067076]  kasan_save_free_info+0x3b/0x60
> >[ 4316.068462]  __kasan_slab_free+0x38/0x50
> >[ 4316.069841]  kmem_cache_free+0x239/0x5a0
> >[ 4316.071213]  delayed_put_task_struct+0x149/0x1b0
> >[ 4316.072608]  rcu_do_batch+0x2f4/0x880
> >[ 4316.073979]  rcu_core+0x3d6/0x510
> >[ 4316.075340]  handle_softirqs+0x10f/0x580
> >[ 4316.076708]  __irq_exit_rcu+0xfe/0x120
> >[ 4316.078052]  irq_exit_rcu+0xe/0x20
> >[ 4316.079400]  sysvec_apic_timer_interrupt+0x76/0x90
> >[ 4316.080747]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
> >
> >[ 4316.083488] Last potentially related work creation:
> >[ 4316.084845]  kasan_save_stack+0x24/0x50
> >[ 4316.086188]  __kasan_record_aux_stack+0x8e/0xa0
> >[ 4316.087564]  __call_rcu_common.constprop.0+0x87/0x920
> >[ 4316.088932]  release_task+0x836/0xc60
> >[ 4316.090296]  wait_consider_task+0x9db/0x19c0
> >[ 4316.091682]  __do_wait+0xe9/0x390
> >[ 4316.093046]  do_wait+0xcb/0x230
> >[ 4316.094412]  kernel_wait4+0xe4/0x1a0
> >[ 4316.095767]  __do_sys_wait4+0xce/0xe0
> >[ 4316.097110]  do_syscall_64+0x75/0x180
> >[ 4316.098455]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> >[ 4316.100994] The buggy address belongs to the object at ff1100012ad38000
> >                which belongs to the cache task_struct of size 14656
> >[ 4316.103497] The buggy address is located 52 bytes inside of
> >                freed 14656-byte region [ff1100012ad38000, ff1100012ad3b940)
> >
> >[ 4316.107248] The buggy address belongs to the physical page:
> >[ 4316.108537] page: refcount:1 mapcount:0 mapping:0000000000000000
> >index:0x0 pfn:0x12ad38
> >[ 4316.109846] head: order:3 mapcount:0 entire_mapcount:0
> >nr_pages_mapped:0 pincount:0
> >[ 4316.111193] memcg:ff11000100d2c6c1
> >[ 4316.112543] anon flags:
> >0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
> >[ 4316.113945] page_type: f5(slab)
> >[ 4316.115364] raw: 0017ffffc0000040 ff11000100280a00 0000000000000000
> >0000000000000001
> >[ 4316.116852] raw: 0000000000000000 0000000000020002 00000001f5000000
> >ff11000100d2c6c1
> >[ 4316.118329] head: 0017ffffc0000040 ff11000100280a00
> >0000000000000000 0000000000000001
> >[ 4316.119818] head: 0000000000000000 0000000000020002
> >00000001f5000000 ff11000100d2c6c1
> >[ 4316.121302] head: 0017ffffc0000003 ffd4000004ab4e01
> >ffffffffffffffff 0000000000000000
> >[ 4316.122818] head: 0000000000000008 0000000000000000
> >00000000ffffffff 0000000000000000
> >[ 4316.124323] page dumped because: kasan: bad access detected
> >
> >[ 4316.127318] Memory state around the buggy address:
> >[ 4316.128847]  ff1100012ad37f00: ff ff ff ff ff ff ff ff ff ff ff ff
> >ff ff ff ff
> >[ 4316.130427]  ff1100012ad37f80: ff ff ff ff ff ff ff ff ff ff ff ff
> >ff ff ff ff
> >[ 4316.132006] >ff1100012ad38000: fa fb fb fb fb fb fb fb fb fb fb fb
> >fb fb fb fb
> >[ 4316.133594]                                      ^
> >[ 4316.135170]  ff1100012ad38080: fb fb fb fb fb fb fb fb fb fb fb fb
> >fb fb fb fb
> >[ 4316.136796]  ff1100012ad38100: fb fb fb fb fb fb fb fb fb fb fb fb
> >fb fb fb fb
> >[ 4316.138399] ==================================================================
> >
> >
> >On Fri, Nov 8, 2024 at 5:12 PM Enzo Matsumiya <ematsumiya@suse.de> wrote:
> >>
> >> On 11/08, Paul Aurich wrote:
> >> >No, this series doesn't try to address that. I was just focused on the
> >> >issues around the lifecycle of the cfids and the dentry:s.
> >>
> >> I'll be reviewing the patches next week, but for now I can say +1.
> >>
> >> We've been debugging this issue for the past month or so, and it's been
> >> quite hard to understand/debug who was racing whom.
> >>
> >> The 'dentry still in use' bug seems to appear only for the root dentry,
> >> whereas cached_fids for subdirectories can sometimes end up duplicated,
> >> leaving one dangling; in the end an underflow + double list_del()
> >> happens to it, e.g.:
> >>
> >> (this is coming from cifs_readdir())
> >> crash> cached_fid.entry,cfids,path,has_lease,is_open,on_list,time,tcon,refcount 0xffff892f4b5b3e00
> >>    entry = {
> >>      next = 0xffff892e053ed400,
> >>      prev = 0xffff892d8e3ecb88
> >>    },
> >>    cfids = 0xffff892d8e3ecb80,
> >>    path = 0xffff892f48b7b3b0 "\\dir1\\dir2\\dir3",
> >>    has_lease = 0x0,
> >>    is_open = 0x0,
> >>    on_list = 0x1,
> >>    time = 0x0,
> >>    tcon = 0x0,
> >>    refcount = { ... counter = 0x2 ... }
> >>
> >> (this is at the crashing frame on smb2_close_cached_fid())
> >> crash> cached_fid.entry,cfids,path,has_lease,is_open,on_list,time,tcon,refcount ffff892e053ee000
> >>    entry = {
> >>      next = 0xdead000000000100,
> >>      prev = 0xdead000000000122
> >>    },
> >>    cfids = 0xffff892d8e3ecb80,
> >>    path = 0xffff892f825ced30 "\\dir1\\dir2\\dir3",
> >>    has_lease = 0x0,
> >>    is_open = 0x1,
> >>    on_list = 0x1,
> >>    time = 0x1040998fc,
> >>    tcon = 0xffff892fe0b4d000,
> >>    refcount = { ... counter = 0x0 ... }
> >>
> >> You can see that the second one had already been deleted (refcount=0,
> >> ->entry is poisoned), while the first one hasn't been filled in by
> >> open_cached_dir() yet but already has 2 references to it.
> >>
> >> A reliable reproducer I found for this was running xfstests '-g
> >> generic/dir' in one terminal, and generic/072 some seconds later.
> >>
> >> (in the beginning I thought that a reconnect was required to trigger
> >> this bug, but any kind of interruption (I could trigger it with ctrl-c
> >> mid-xfstests a few times) can trigger it)
> >>
> >> actimeo >= 0 seems to play a role as well, because things can get quite
> >> complicated (unsure if problematic though) with a callchain such as:
> >>
> >> open_cached_dir
> >>    -> path_to_dentry
> >>      -> lookup_positive_unlocked
> >>        -> lookup_dcache
> >>          -> cifs_d_revalidate (dentry->d_op->d_revalidate)
> >>            -> cifs_revalidate_dentry
> >>              -> cifs_revalidate_dentry_attr
> >>                -> cifs_dentry_needs_reval
> >>                  -> open_cached_dir_by_dentry
> >>
> >>
> >> Anyway, thanks a lot for your patches, Paul.  Like I said, I'll be
> >> testing + reviewing them soon.
> >>
> >>
> >> Cheers,
> >>
> >> Enzo
> >>
> >> >On 2024-11-08 16:39:03 -0600, Steve French wrote:
> >> >>The perf improvement allowing caching of directories (helping "ls"
> >> >>then "ls" again for same dir), not just the perf improvement with
> >> >>"ls" then "stat", could be very important.
> >> >>
> >> >>Did this series also address Ralph's point about missing requesting
> >> >>"H" (handle caching) flag when requesting directory leases from the
> >> >>server?
> >> >>
> >> >>On Fri, Nov 8, 2024 at 4:35 PM Paul Aurich <paul@darkrain42.org> wrote:
> >> >>>
> >> >>>The SMB client cached directory functionality has a few problems around
> >> >>>flaky/lost server connections, which manifest as a pair of BUGs when
> >> >>>eventually unmounting the server connection:
> >> >>>
> >> >>>    [18645.013550] BUG: Dentry ffff888140590ba0{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
> >> >>>    [18645.789274] VFS: Busy inodes after unmount of cifs (cifs)
> >> >>>
> >> >>>Based on bisection, these issues started with the lease directory cache
> >> >>>handling introduced in commit ebe98f1447bb ("cifs: enable caching of
> >> >>>directories for which a lease is held"), and go away if I mount with
> >> >>>'nohandlecache'.  I started seeing these on Debian Bookworm stable kernel
> >> >>>(v6.1.x), but the issues persist even in current git versions.  I think the
> >> >>>situation was improved (occurrence frequency went down) with
> >> >>>commit 5c86919455c1 ("smb: client: fix use-after-free in
> >> >>>smb2_query_info_compound()").
> >> >>>
> >> >>>
> >> >>>I'm able to reproduce the "Dentry still in use" errors by connecting to an
> >> >>>actively-used SMB share (the server organically generates lease breaks) and
> >> >>>leaving these running for 'a while':
> >> >>>
> >> >>>- while true; do cd ~; sleep 1; for i in {1..3}; do cd /mnt/test/subdir; echo $PWD; sleep 1; cd ..; echo $PWD; sleep 1; done; echo ...; done
> >> >>>- while true; do iptables -F OUTPUT; mount -t cifs -a; for _ in {0..2}; do ls /mnt/test/subdir/ | wc -l; done; iptables -I OUTPUT -p tcp --dport 445 -j DROP; sleep 10; echo "unmounting"; umount -l -t cifs -a; echo "done unmounting"; sleep 20; echo "recovering"; iptables -F OUTPUT; sleep 10; done
> >> >>>
> >> >>>('a while' is anywhere from 10 minutes to overnight. Also, it's not the
> >> >>>cleanest reproducer, but I stopped iterating once I had something that was
> >> >>>even remotely reliable for me...)
> >> >>>
> >> >>>This series attempts to fix these, as well as a use-after-free that could
> >> >>>occur because open_cached_dir() explicitly frees the cached_fid, rather than
> >> >>>relying on reference counting.
> >> >>>
> >> >>>
> >> >>>The last patch in this series (smb: During umount, flush any pending lease
> >> >>>breaks for cached dirs) is a work-in-progress, and should not be taken as-is.
> >> >>>The issue there:
> >> >>>
> >> >>>Unmounting an SMB share (cifs_kill_sb) can race with a queued lease break from
> >> >>>the server for a cached directory.  When this happens, the cfid is removed
> >> >>>from the linked list, so close_all_cached_dirs() cannot drop the dentry.  If
> >> >>>cifs_kill_sb continues on before the queued work puts the dentry, we trigger
> >> >>>the "Dentry still in use" BUG splat.  Flushing the cifsiod_wq seems to help
> >> >>>this, but some thoughts:
> >> >>>
> >> >>>1. cifsiod_wq is a global workqueue, rather than per-mount.  Flushing the
> >> >>>   entire workqueue is potentially doing more work than necessary.  Should
> >> >>>   there be a workqueue that's more appropriately scoped?
> >> >>>2. With an unresponsive server, this causes umount (even umount -l) to hang
> >> >>>   (waiting for SMB2_close calls), and when I test with backports on a 6.1
> >> >>>   kernel, appears to cause a deadlock between kill_sb and some cifs
> >> >>>   reconnection code calling iterate_supers_type.  (Pretty sure the deadlock
> >> >>>   was addressed by changes to fs/super.c, so not really an SMB problem, but
> >> >>>   just an indication that flush_workqueue isn't the right solution).
> >> >>>3. Should cached_dir_lease_break() drop the dentry before queueing work
> >> >>>   (and if so, is it OK to do this under the spinlock, or should the spinlock
> >> >>>   be dropped first)?
> >> >>>4. Related to #3 -- shouldn't close_all_cached_dirs() be holding the spinlock
> >> >>>   while looping?
> >> >>>
> >> >>>
> >> >>>Lastly, patches 2, 3, and 5 (the latter in its final form) are beneficial
> >> >>>going back to v6.1 for stable, but it's not a clean backport because some
> >> >>>other important fixes (e.g. commit 5c86919455c1 ("smb: client: fix
> >> >>>use-after-free in smb2_query_info_compound()")) weren't picked up.
> >> >>>
> >> >>>Paul Aurich (5):
> >> >>>  smb: cached directories can be more than root file handle
> >> >>>  smb: Don't leak cfid when reconnect races with open_cached_dir
> >> >>>  smb: prevent use-after-free due to open_cached_dir error paths
> >> >>>  smb: No need to wait for work when cleaning up cached directories
> >> >>>  smb: During umount, flush any pending lease breaks for cached dirs
> >> >>>
> >> >>> fs/smb/client/cached_dir.c | 106 ++++++++++++++++---------------------
> >> >>> 1 file changed, 47 insertions(+), 59 deletions(-)
> >> >>>
> >> >>>--
> >> >>>2.45.2
> >> >>>
> >> >>>


--
Thanks,

Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-11-18 17:41 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-11-08 22:29 [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting Paul Aurich
2024-11-08 22:29 ` [PATCH 1/5] smb: cached directories can be more than root file handle Paul Aurich
2024-11-08 22:29 ` [PATCH 2/5] smb: Don't leak cfid when reconnect races with open_cached_dir Paul Aurich
2024-11-08 22:29 ` [PATCH 3/5] smb: prevent use-after-free due to open_cached_dir error paths Paul Aurich
2024-11-08 22:29 ` [PATCH 4/5] smb: No need to wait for work when cleaning up cached directories Paul Aurich
2024-11-08 22:29 ` [PATCH 5/5] smb: During umount, flush any pending lease breaks for cached dirs Paul Aurich
2024-11-08 22:33   ` Steve French
2024-11-08 22:39 ` [PATCH 0/5] SMB cached directory fixes around reconnection/unmounting Steve French
2024-11-08 22:44   ` Paul Aurich
2024-11-08 23:09     ` Enzo Matsumiya
2024-11-10  0:49       ` Steve French
2024-11-13 19:08         ` Paul Aurich
2024-11-18 17:40           ` Steve French
