* [PATCH v2] cifs: ignore cached share root handle closing errors
@ 2020-04-01 12:50 Aurelien Aptel
2020-04-03 18:04 ` Pavel Shilovsky
0 siblings, 1 reply; 3+ messages in thread
From: Aurelien Aptel @ 2020-04-01 12:50 UTC (permalink / raw)
To: linux-cifs; +Cc: smfrench, piastryyy, Aurelien Aptel
Fix tcon use-after-free and NULL ptr deref.
Customer system crashes with the following kernel log:
[462233.169868] CIFS VFS: Cancelling wait for mid 4894753 cmd: 14 => a QUERY DIR
[462233.228045] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
[462233.305922] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
[462233.306205] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
[462233.347060] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
[462233.347107] CIFS VFS: Close unmatched open
[462233.347113] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
...
[exception RIP: cifs_put_tcon+0xa0] (this is doing tcon->ses->server)
#6 [...] smb2_cancelled_close_fid at ... [cifs]
#7 [...] process_one_work at ...
#8 [...] worker_thread at ...
#9 [...] kthread at ...
The most likely explanation we have is:
* When we put the last reference of a tcon (refcount=0), we close the
cached share root handle.
* If closing a handle is interupted, SMB2_close() will
queue a SMB2_close() in a work thread.
* The queued object keeps a tcon ref so we bump the tcon
refcount, jumping from 0 to 1.
* We reach the end of cifs_put_tcon(), we free the tcon object despite
it now having a refcount of 1.
* The queued work now runs, but the tcon, ses & server was freed in
the meantime resulting in a crash.
THREAD 1
========
cifs_put_tcon => tcon refcount reach 0
SMB2_tdis
close_shroot_lease
close_shroot_lease_locked => if cached root has lease && refcount reach 0
smb2_close_cached_fid => if cached root valid
SMB2_close => retry close in a worker thread if interrupted
smb2_handle_cancelled_close
__smb2_handle_cancelled_close => !! tcon refcount bump 0 => 1 !!
INIT_WORK(&cancelled->work, smb2_cancelled_close_fid);
queue_work(cifsiod_wq, &cancelled->work) => queue work
tconInfoFree(tcon); ==> freed!
cifs_put_smb_ses(ses); ==> freed!
THREAD 2 (workqueue)
========
smb2_cancelled_close_fid
SMB2_close(0, cancelled->tcon, ...); => use-after-free of tcon
cifs_put_tcon(cancelled->tcon); => tcon refcount reach 0 second time
*CRASH*
Fixes: d9191319358d ("CIFS: Close cached root handle only if it has a lease")
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
---
fs/cifs/smb2misc.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/fs/cifs/smb2misc.c b/fs/cifs/smb2misc.c
index 0511aaf451d4..965276aeffcf 100644
--- a/fs/cifs/smb2misc.c
+++ b/fs/cifs/smb2misc.c
@@ -766,6 +766,12 @@ smb2_handle_cancelled_close(struct cifs_tcon *tcon, __u64 persistent_fid,
cifs_dbg(FYI, "%s: tc_count=%d\n", __func__, tcon->tc_count);
spin_lock(&cifs_tcp_ses_lock);
+ if (tcon->tc_count <= 0) {
+ spin_unlock(&cifs_tcp_ses_lock);
+ cifs_dbg(VFS, "tcon is closing, skipping async close retry of fid %llu %llu\n",
+ persistent_fid, volatile_fid);
+ return 0;
+ }
tcon->tc_count++;
spin_unlock(&cifs_tcp_ses_lock);
--
2.16.4
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH v2] cifs: ignore cached share root handle closing errors
2020-04-01 12:50 [PATCH v2] cifs: ignore cached share root handle closing errors Aurelien Aptel
@ 2020-04-03 18:04 ` Pavel Shilovsky
2020-04-06 10:01 ` Aurélien Aptel
0 siblings, 1 reply; 3+ messages in thread
From: Pavel Shilovsky @ 2020-04-03 18:04 UTC (permalink / raw)
To: Aurelien Aptel; +Cc: linux-cifs, Steve French
ср, 1 апр. 2020 г. в 05:50, Aurelien Aptel <aaptel@suse.com>:
>
> Fix tcon use-after-free and NULL ptr deref.
>
> Customer system crashes with the following kernel log:
>
> [462233.169868] CIFS VFS: Cancelling wait for mid 4894753 cmd: 14 => a QUERY DIR
> [462233.228045] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
> [462233.305922] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
> [462233.306205] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
> [462233.347060] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
> [462233.347107] CIFS VFS: Close unmatched open
> [462233.347113] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
> ...
> [exception RIP: cifs_put_tcon+0xa0] (this is doing tcon->ses->server)
> #6 [...] smb2_cancelled_close_fid at ... [cifs]
> #7 [...] process_one_work at ...
> #8 [...] worker_thread at ...
> #9 [...] kthread at ...
>
> The most likely explanation we have is:
>
> * When we put the last reference of a tcon (refcount=0), we close the
> cached share root handle.
> * If closing a handle is interupted, SMB2_close() will
> queue a SMB2_close() in a work thread.
> * The queued object keeps a tcon ref so we bump the tcon
> refcount, jumping from 0 to 1.
> * We reach the end of cifs_put_tcon(), we free the tcon object despite
> it now having a refcount of 1.
> * The queued work now runs, but the tcon, ses & server was freed in
> the meantime resulting in a crash.
>
> THREAD 1
> ========
> cifs_put_tcon => tcon refcount reach 0
> SMB2_tdis
> close_shroot_lease
> close_shroot_lease_locked => if cached root has lease && refcount reach 0
> smb2_close_cached_fid => if cached root valid
> SMB2_close => retry close in a worker thread if interrupted
> smb2_handle_cancelled_close
> __smb2_handle_cancelled_close => !! tcon refcount bump 0 => 1 !!
> INIT_WORK(&cancelled->work, smb2_cancelled_close_fid);
> queue_work(cifsiod_wq, &cancelled->work) => queue work
> tconInfoFree(tcon); ==> freed!
> cifs_put_smb_ses(ses); ==> freed!
>
> THREAD 2 (workqueue)
> ========
> smb2_cancelled_close_fid
> SMB2_close(0, cancelled->tcon, ...); => use-after-free of tcon
> cifs_put_tcon(cancelled->tcon); => tcon refcount reach 0 second time
> *CRASH*
>
> Fixes: d9191319358d ("CIFS: Close cached root handle only if it has a lease")
> Signed-off-by: Aurelien Aptel <aaptel@suse.com>
> ---
> fs/cifs/smb2misc.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/fs/cifs/smb2misc.c b/fs/cifs/smb2misc.c
> index 0511aaf451d4..965276aeffcf 100644
> --- a/fs/cifs/smb2misc.c
> +++ b/fs/cifs/smb2misc.c
> @@ -766,6 +766,12 @@ smb2_handle_cancelled_close(struct cifs_tcon *tcon, __u64 persistent_fid,
>
> cifs_dbg(FYI, "%s: tc_count=%d\n", __func__, tcon->tc_count);
> spin_lock(&cifs_tcp_ses_lock);
> + if (tcon->tc_count <= 0) {
> + spin_unlock(&cifs_tcp_ses_lock);
> + cifs_dbg(VFS, "tcon is closing, skipping async close retry of fid %llu %llu\n",
Thanks for a good catch! Some suggestions below:
- Why VFS? Share is being disconnected anyway, so all associated
handles should be closed by the server eventually. I think FYI is
sufficient here.
- cifs_dbg_server may be better than cifs_dbg and please include tcon
ID here otherwise FIDs may be meaningless if there are several shares
mounted on a host.
- May be WARN_ONCE if tc_count is negative since it is a possible bug?
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
--
Best regards,
Pavel Shilovsky
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH v2] cifs: ignore cached share root handle closing errors
2020-04-03 18:04 ` Pavel Shilovsky
@ 2020-04-06 10:01 ` Aurélien Aptel
0 siblings, 0 replies; 3+ messages in thread
From: Aurélien Aptel @ 2020-04-06 10:01 UTC (permalink / raw)
To: Pavel Shilovsky; +Cc: linux-cifs, Steve French
Hi Pavel,
Thanks for the review, will send v3 :)
I thought this one would be of interest to you as it's related to your
patch in the Fixes tag.
I've noticed the shared root handle is causing us lots of issues
lately. Paulo is working on a DFS bug related to it too (when switching
servers the handle becomes invalid).
Cheers,
--
Aurélien Aptel / SUSE Labs Samba Team
GPG: 1839 CB5F 9F5B FB9B AA97 8C99 03C8 A49B 521B D5D3
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, DE
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 247165 (AG München)
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2020-04-06 10:01 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-04-01 12:50 [PATCH v2] cifs: ignore cached share root handle closing errors Aurelien Aptel
2020-04-03 18:04 ` Pavel Shilovsky
2020-04-06 10:01 ` Aurélien Aptel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).