* [PATCH v2 0/3] ksmbd: fix connection and durable handle teardown races
@ 2026-04-28 14:08 DaeMyung Kang
2026-04-28 14:08 ` [PATCH v2 1/3] ksmbd: centralize ksmbd_conn final release to plug transport leak DaeMyung Kang
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: DaeMyung Kang @ 2026-04-28 14:08 UTC (permalink / raw)
To: Namjae Jeon, Steve French
Cc: Sergey Senozhatsky, Tom Talpey, Hyunchul Lee, Ronnie Sahlberg,
linux-cifs, linux-kernel, DaeMyung Kang
This series fixes lifetime bugs around ksmbd connection shutdown,
session file-table teardown, and durable handle scavenging.
Patch 1 centralizes the final struct ksmbd_conn release so every last
putter runs ida_destroy() and transport cleanup. The release is queued
to a dedicated workqueue because transport teardown can sleep, while
one known last-putter is an RCU callback.
Patch 2 hardens __close_file_table_ids() by taking a transient
ksmbd_file reference, unpublishing from the session idr under ft->lock,
and doing sleepable preserve/close work outside ft->lock. It also
makes the FP_NEW window visible to the opener through
ksmbd_update_fstate().
Patch 2 is scoped to file-table teardown and the FP_NEW publication
window. It does not try to fix durable reconnect rollback on later
smb2_open() error paths (and the related post-FP_INITED reference
window in fresh smb2_open). Both already exist before this series
and need either an explicit unpublish-on-error step or an extra
session-owned reference. That is left as follow-up work. The
FP_NEW -> FP_INITED failure reuses smb2_open()'s existing -ENOENT to
STATUS_OBJECT_NAME_INVALID mapping to avoid changing wire behavior in
this lifetime fix.
Patch 3 closes two related races in the durable scavenger against any
walker that iterates f_ci->m_fp_list (ksmbd_lookup_fd_inode() and the
share-mode checks). The scavenger no longer reuses fp->node as a
scavenger-private collect-list node, takes an explicit transient
reference under global_ft.lock, and drops both the durable lifetime
and transient refs with atomic_sub_and_test(2, ...) after the
m_fp_list unlink so an in-flight m_fp_list walker that snatched fp
owns the final close cleanly. fp->persistent_id is cleared inside
__ksmbd_remove_durable_fd() so a delayed final close cannot re-issue
idr_remove() on a slot that idr_alloc_cyclic() may have already
re-handed to a new durable handle. __put_fd_final() bypasses the
per-conn open_files_count decrement when fp is detached from any
session table (fp->conn cleared by session_fd_check() at durable
preserve, paired with the volatile_id clear at unpublish), since the
walker that owns the final close in that case runs from an unrelated
work->conn whose counter never tracked this durable fp.
The series is intentionally scoped to lifetime/race fixes and the
walker-final-putter regression that the durable scavenger handoff in
patch 3 newly exposes. Pre-existing reconnect rollback and per-conn
open_files_count accounting gaps are left as follow-up work so this
series does not have to claim a full durable-reconnect or accounting
cleanup.
Validation:
* abrupt-disconnect kmemleak/kprobe A/B for connection release
* same-session two-tcon DEBUG_LIST/DEBUG_OBJECTS_WORK stress
* forced durable-preserve sleep-path harness for session teardown
* KASAN-enabled direct SMB2 coverage for session/tree teardown and
durable-preserve paths
* KASAN-enabled direct SMB2 coverage for durable scavenger expiry
racing with m_fp_list lookups
* checkpatch --strict for all patches
* make -j$(nproc) M=fs/smb/server
v1 -> v2:
* Split the original change into bisectable patches: connection final
release, session file-table teardown, and durable scavenger races.
* Keep sleepable session preserve/close work out of ft->lock and make
the FP_NEW publication race visible through a cleared volatile id.
* Document that durable reconnect rollback on later smb2_open()
error paths is a pre-existing follow-up item, and keep the
existing -ENOENT to STATUS_OBJECT_NAME_INVALID wire mapping for
this lifetime fix.
* Avoid reusing fp->node as a temporary durable scavenger list node
and take a transient reference in the durable scavenger so
concurrent ksmbd_lookup_fd_inode() walkers cannot UAF on freed fp.
These two fixes are folded into a single patch because the
list-head-reuse fix alone leaves a deterministic UAF window for
m_fp_list walkers; bisecting onto an intermediate state would land
on a use-after-free that pre-patch chaos merely made less
reproducible.
* Clear fp->persistent_id in __ksmbd_remove_durable_fd() so a holder
that owns the final close after a scavenger removal does not
re-issue idr_remove() on a persistent id that may have already been
handed out to a new durable handle.
* Bypass the per-conn open_files_count decrement in __put_fd_final()
when fp is detached from any session table, so an m_fp_list walker
that owns the final close of a scavenged durable fp does not
underflow an unrelated conn's stats counter.
* Document the ksmbd_conn_wq lifetime invariant in ksmbd_conn_put()
instead of guarding with WARN_ON_ONCE, so a violation surfaces as
a NULL deref rather than a silent leak of the final release.
DaeMyung Kang (3):
ksmbd: centralize ksmbd_conn final release to plug transport leak
ksmbd: harden file lifetime during session teardown
ksmbd: close durable scavenger races against m_fp_list lookups
fs/smb/server/connection.c | 101 +++++++++--
fs/smb/server/connection.h | 6 +
fs/smb/server/oplock.c | 7 +-
fs/smb/server/server.c | 12 ++
fs/smb/server/smb2pdu.c | 6 +-
fs/smb/server/vfs_cache.c | 341 ++++++++++++++++++++++++++++++-------
fs/smb/server/vfs_cache.h | 4 +-
7 files changed, 396 insertions(+), 81 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v2 1/3] ksmbd: centralize ksmbd_conn final release to plug transport leak
2026-04-28 14:08 [PATCH v2 0/3] ksmbd: fix connection and durable handle teardown races DaeMyung Kang
@ 2026-04-28 14:08 ` DaeMyung Kang
2026-04-28 14:08 ` [PATCH v2 2/3] ksmbd: harden file lifetime during session teardown DaeMyung Kang
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: DaeMyung Kang @ 2026-04-28 14:08 UTC (permalink / raw)
To: Namjae Jeon, Steve French
Cc: Sergey Senozhatsky, Tom Talpey, Hyunchul Lee, Ronnie Sahlberg,
linux-cifs, linux-kernel, DaeMyung Kang
ksmbd_conn_free() is one of four sites that can observe the last
refcount drop of a struct ksmbd_conn. The other three
fs/smb/server/connection.c ksmbd_conn_r_count_dec()
fs/smb/server/oplock.c __free_opinfo()
fs/smb/server/vfs_cache.c session_fd_check()
end the conn with a bare kfree(), skipping
ida_destroy(&conn->async_ida) and
conn->transport->ops->free_transport(conn->transport). Whenever one
of them is the last putter, the embedded async_ida and the entire
transport struct leak -- for TCP, that is also the struct socket and
the kvec iov.
__free_opinfo() being a final putter is not theoretical. opinfo_put()
queues the callback via call_rcu(&opinfo->rcu, free_opinfo_rcu), so
ksmbd_server_terminate_conn() can deposit N opinfo releases in RCU and
have ksmbd_conn_free() run in the handler thread before any of them
fire. ksmbd_conn_free() then observes refcnt > 0 and short-circuits;
the last RCU-delivered __free_opinfo() falls onto its bare kfree(conn)
branch and the transport is lost.
A/B validation in a QEMU/virtme guest, mounting //127.0.0.1/testshare:
each iteration holds 8 files open via sleep processes, force-closes
TCP with "ss -K sport = :445", kills the holders, lazy-umounts;
repeated 10 times, then ksmbd shutdown and kmemleak scan.
state conn_alloc conn_free tcp_free opi_rcu kmemleak
---------- ---------- --------- -------- ------- --------
pre-patch 20 20 10 160 7
with patch 20 20 20 160 0
Pre-patch conn_free=20 with tcp_free=10 directly demonstrates the
bare-kfree paths skipping transport cleanup; kmemleak backtraces point
into struct tcp_transport / iov. With this patch tcp_free matches
conn_free at 20/20 and kmemleak is clean.
Move the per-struct final release into __ksmbd_conn_release_work() and
route the three bare-kfree final-put sites through a new
ksmbd_conn_put(). Those sites now pair ida_destroy() and
free_transport() with kfree(conn) regardless of which holder happens
to release the last reference. stop_sessions() only triggers the
transport shutdown and does not itself drop the last conn reference,
so it is unaffected.
The centralized release reaches sock_release() -> tcp_close() ->
lock_sock_nested() (might_sleep) from every final putter, including
__free_opinfo() invoked from an RCU softirq callback, which trips
CONFIG_DEBUG_ATOMIC_SLEEP. Defer the release to a dedicated
ksmbd_conn_wq workqueue so ksmbd_conn_put() is safe from any
non-sleeping context.
Make ksmbd_file own a strong connection reference while fp->conn is
non-NULL so durable-preserve and final-close paths cannot dereference
a stale connection. ksmbd_open_fd() and ksmbd_reopen_durable_fd()
take the reference via ksmbd_conn_get() (the latter also reorders the
fp->conn / fp->tcon assignments before __open_id() so the published fp
is never observed with fp->conn == NULL); session_fd_check() and
__ksmbd_close_fd() drop it via ksmbd_conn_put(). With that invariant,
session_fd_check() can take a local conn pointer once and use it
across the m_op_list and lock_list iterations even though op->conn
puts may otherwise drop the last reference.
At module exit the workqueue is flushed and destroyed after
rcu_barrier(), so any release queued by a trailing RCU callback is
drained before the inode hash and module text go away.
Fixes: ee426bfb9d09 ("ksmbd: add refcnt to ksmbd_conn struct")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
---
fs/smb/server/connection.c | 101 +++++++++++++++++++++++++++++++------
fs/smb/server/connection.h | 6 +++
fs/smb/server/oplock.c | 7 +--
fs/smb/server/server.c | 12 +++++
fs/smb/server/vfs_cache.c | 60 ++++++++++++++++++----
5 files changed, 156 insertions(+), 30 deletions(-)
diff --git a/fs/smb/server/connection.c b/fs/smb/server/connection.c
index fbbc0529743f..fc9b35d41185 100644
--- a/fs/smb/server/connection.c
+++ b/fs/smb/server/connection.c
@@ -79,6 +79,81 @@ static int create_proc_clients(void) { return 0; }
static void delete_proc_clients(void) {}
#endif
+static struct workqueue_struct *ksmbd_conn_wq;
+
+int ksmbd_conn_wq_init(void)
+{
+ ksmbd_conn_wq = alloc_workqueue("ksmbd-conn-release",
+ WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
+ if (!ksmbd_conn_wq)
+ return -ENOMEM;
+ return 0;
+}
+
+void ksmbd_conn_wq_destroy(void)
+{
+ if (ksmbd_conn_wq) {
+ destroy_workqueue(ksmbd_conn_wq);
+ ksmbd_conn_wq = NULL;
+ }
+}
+
+/*
+ * __ksmbd_conn_release_work() - perform the final, once-per-struct cleanup
+ * of a ksmbd_conn whose refcount has just dropped to zero.
+ *
+ * This is the common release path used by ksmbd_conn_put() for the embedded
+ * state that outlives the connection thread: async_ida and the attached
+ * transport (which owns the socket and iov for TCP). Called from a workqueue
+ * so that sleep-allowed teardown (sock_release -> tcp_close ->
+ * lock_sock_nested) never runs from an RCU softirq callback (free_opinfo_rcu)
+ * or any other non-sleeping putter context.
+ */
+static void __ksmbd_conn_release_work(struct work_struct *work)
+{
+ struct ksmbd_conn *conn =
+ container_of(work, struct ksmbd_conn, release_work);
+
+ ida_destroy(&conn->async_ida);
+ conn->transport->ops->free_transport(conn->transport);
+ kfree(conn);
+}
+
+/**
+ * ksmbd_conn_get() - take a reference on @conn and return it.
+ *
+ * Returns @conn unchanged so callers can write
+ * "fp->conn = ksmbd_conn_get(work->conn);" in one expression. Returns NULL
+ * if @conn is NULL.
+ */
+struct ksmbd_conn *ksmbd_conn_get(struct ksmbd_conn *conn)
+{
+ if (!conn)
+ return NULL;
+
+ atomic_inc(&conn->refcnt);
+ return conn;
+}
+
+/**
+ * ksmbd_conn_put() - drop a reference and, if it was the last, queue the
+ * release onto ksmbd_conn_wq so it runs from process context.
+ *
+ * Callable from any context including RCU softirq callbacks and non-sleeping
+ * locks; the actual release is deferred to the workqueue. ksmbd_conn_wq is
+ * created in ksmbd_server_init() before any conn can be allocated and is
+ * destroyed in ksmbd_server_exit() after rcu_barrier(), so it is always
+ * non-NULL while a conn reference is held.
+ */
+void ksmbd_conn_put(struct ksmbd_conn *conn)
+{
+ if (!conn)
+ return;
+
+ if (atomic_dec_and_test(&conn->refcnt))
+ queue_work(ksmbd_conn_wq, &conn->release_work);
+}
+
/**
* ksmbd_conn_free() - free resources of the connection instance
*
@@ -93,23 +168,19 @@ void ksmbd_conn_free(struct ksmbd_conn *conn)
hash_del(&conn->hlist);
up_write(&conn_list_lock);
+ /*
+ * request_buf / preauth_info / mechToken are only ever accessed by the
+ * connection handler thread that owns @conn. ksmbd_conn_free() is
+ * called from the transport free_transport() path when that thread is
+ * exiting, so it is safe to release them unconditionally even when
+ * ksmbd_conn_put() below is not the final putter (oplock / ksmbd_file
+ * holders only retain the conn pointer, not these per-thread buffers).
+ */
xa_destroy(&conn->sessions);
kvfree(conn->request_buf);
kfree(conn->preauth_info);
kfree(conn->mechToken);
- if (atomic_dec_and_test(&conn->refcnt)) {
- /*
- * async_ida is embedded in struct ksmbd_conn, so pair
- * ida_destroy() with the final kfree() rather than with
- * the unconditional field teardown above. This keeps
- * the IDA valid for the entire lifetime of the struct,
- * even while other refcount holders (oplock / vfs
- * durable handles) still reference the connection.
- */
- ida_destroy(&conn->async_ida);
- conn->transport->ops->free_transport(conn->transport);
- kfree(conn);
- }
+ ksmbd_conn_put(conn);
}
/**
@@ -136,6 +207,7 @@ struct ksmbd_conn *ksmbd_conn_alloc(void)
conn->um = ERR_PTR(-EOPNOTSUPP);
if (IS_ERR(conn->um))
conn->um = NULL;
+ INIT_WORK(&conn->release_work, __ksmbd_conn_release_work);
atomic_set(&conn->req_running, 0);
atomic_set(&conn->r_count, 0);
atomic_set(&conn->refcnt, 1);
@@ -512,8 +584,7 @@ void ksmbd_conn_r_count_dec(struct ksmbd_conn *conn)
if (!atomic_dec_return(&conn->r_count) && waitqueue_active(&conn->r_count_q))
wake_up(&conn->r_count_q);
- if (atomic_dec_and_test(&conn->refcnt))
- kfree(conn);
+ ksmbd_conn_put(conn);
}
int ksmbd_conn_transport_init(void)
diff --git a/fs/smb/server/connection.h b/fs/smb/server/connection.h
index ae21a1bd4c70..e95cb4ed0b3f 100644
--- a/fs/smb/server/connection.h
+++ b/fs/smb/server/connection.h
@@ -16,6 +16,7 @@
#include <linux/kthread.h>
#include <linux/nls.h>
#include <linux/unicode.h>
+#include <linux/workqueue.h>
#include "smb_common.h"
#include "ksmbd_work.h"
@@ -119,6 +120,7 @@ struct ksmbd_conn {
bool binding;
atomic_t refcnt;
bool is_aapl;
+ struct work_struct release_work;
};
struct ksmbd_conn_ops {
@@ -163,6 +165,10 @@ void ksmbd_conn_wait_idle(struct ksmbd_conn *conn);
int ksmbd_conn_wait_idle_sess_id(struct ksmbd_conn *curr_conn, u64 sess_id);
struct ksmbd_conn *ksmbd_conn_alloc(void);
void ksmbd_conn_free(struct ksmbd_conn *conn);
+struct ksmbd_conn *ksmbd_conn_get(struct ksmbd_conn *conn);
+void ksmbd_conn_put(struct ksmbd_conn *conn);
+int ksmbd_conn_wq_init(void);
+void ksmbd_conn_wq_destroy(void);
bool ksmbd_conn_lookup_dialect(struct ksmbd_conn *c);
int ksmbd_conn_write(struct ksmbd_work *work);
int ksmbd_conn_rdma_read(struct ksmbd_conn *conn,
diff --git a/fs/smb/server/oplock.c b/fs/smb/server/oplock.c
index cd3f28b0e7cb..8feca02ddbf2 100644
--- a/fs/smb/server/oplock.c
+++ b/fs/smb/server/oplock.c
@@ -30,7 +30,6 @@ static DEFINE_RWLOCK(lease_list_lock);
static struct oplock_info *alloc_opinfo(struct ksmbd_work *work,
u64 id, __u16 Tid)
{
- struct ksmbd_conn *conn = work->conn;
struct ksmbd_session *sess = work->sess;
struct oplock_info *opinfo;
@@ -39,7 +38,7 @@ static struct oplock_info *alloc_opinfo(struct ksmbd_work *work,
return NULL;
opinfo->sess = sess;
- opinfo->conn = conn;
+ opinfo->conn = ksmbd_conn_get(work->conn);
opinfo->level = SMB2_OPLOCK_LEVEL_NONE;
opinfo->op_state = OPLOCK_STATE_NONE;
opinfo->pending_break = 0;
@@ -50,7 +49,6 @@ static struct oplock_info *alloc_opinfo(struct ksmbd_work *work,
init_waitqueue_head(&opinfo->oplock_brk);
atomic_set(&opinfo->refcount, 1);
atomic_set(&opinfo->breaking_cnt, 0);
- atomic_inc(&opinfo->conn->refcnt);
return opinfo;
}
@@ -132,8 +130,7 @@ static void __free_opinfo(struct oplock_info *opinfo)
{
if (opinfo->is_lease)
free_lease(opinfo);
- if (opinfo->conn && atomic_dec_and_test(&opinfo->conn->refcnt))
- kfree(opinfo->conn);
+ ksmbd_conn_put(opinfo->conn);
kfree(opinfo);
}
diff --git a/fs/smb/server/server.c b/fs/smb/server/server.c
index 58ef02c423fc..5d799b2d4c62 100644
--- a/fs/smb/server/server.c
+++ b/fs/smb/server/server.c
@@ -596,8 +596,14 @@ static int __init ksmbd_server_init(void)
if (ret)
goto err_crypto_destroy;
+ ret = ksmbd_conn_wq_init();
+ if (ret)
+ goto err_workqueue_destroy;
+
return 0;
+err_workqueue_destroy:
+ ksmbd_workqueue_destroy();
err_crypto_destroy:
ksmbd_crypto_destroy();
err_release_inode_hash:
@@ -623,6 +629,12 @@ static void __exit ksmbd_server_exit(void)
{
ksmbd_server_shutdown();
rcu_barrier();
+ /*
+ * ksmbd_conn_put() defers the final release onto ksmbd_conn_wq,
+ * so drain it after rcu_barrier() has fired any pending RCU
+ * callbacks that may have queued a release.
+ */
+ ksmbd_conn_wq_destroy();
ksmbd_release_inode_hash();
}
diff --git a/fs/smb/server/vfs_cache.c b/fs/smb/server/vfs_cache.c
index 3551f01a3fa0..4a18107937cc 100644
--- a/fs/smb/server/vfs_cache.c
+++ b/fs/smb/server/vfs_cache.c
@@ -475,6 +475,17 @@ static void __ksmbd_close_fd(struct ksmbd_file_table *ft, struct ksmbd_file *fp)
kfree(smb_lock);
}
+ /*
+ * Drop fp's strong reference on conn (taken in ksmbd_open_fd() /
+ * ksmbd_reopen_durable_fd()). Durable fps that reached the
+ * scavenger have already had fp->conn cleared by session_fd_check(),
+ * in which case there is nothing to drop here.
+ */
+ if (fp->conn) {
+ ksmbd_conn_put(fp->conn);
+ fp->conn = NULL;
+ }
+
if (ksmbd_stream_fd(fp))
kfree(fp->stream.name);
kfree(fp->owner.name);
@@ -752,7 +763,14 @@ struct ksmbd_file *ksmbd_open_fd(struct ksmbd_work *work, struct file *filp)
atomic_set(&fp->refcount, 1);
fp->filp = filp;
- fp->conn = work->conn;
+ /*
+ * fp owns a strong reference on fp->conn for as long as fp->conn is
+ * non-NULL, so session_fd_check() and __ksmbd_close_fd() never
+ * dereference a dangling pointer. Paired with ksmbd_conn_put() in
+ * session_fd_check() (durable preserve), in __ksmbd_close_fd()
+ * (final close), and on the error paths below.
+ */
+ fp->conn = ksmbd_conn_get(work->conn);
fp->tcon = work->tcon;
fp->volatile_id = KSMBD_NO_FID;
fp->persistent_id = KSMBD_NO_FID;
@@ -774,6 +792,8 @@ struct ksmbd_file *ksmbd_open_fd(struct ksmbd_work *work, struct file *filp)
return fp;
err_out:
+ /* fp->conn was set and refcounted before every branch here. */
+ ksmbd_conn_put(fp->conn);
kmem_cache_free(filp_cache, fp);
return ERR_PTR(ret);
}
@@ -1062,25 +1082,32 @@ static bool session_fd_check(struct ksmbd_tree_connect *tcon,
if (!is_reconnectable(fp))
return false;
+ if (WARN_ON_ONCE(!fp->conn))
+ return false;
+
if (ksmbd_vfs_copy_durable_owner(fp, user))
return false;
+ /*
+ * fp owns a strong reference on fp->conn (taken in ksmbd_open_fd()
+ * / ksmbd_reopen_durable_fd()), so conn stays valid for the whole
+ * body of this function regardless of any op->conn puts below.
+ */
conn = fp->conn;
ci = fp->f_ci;
down_write(&ci->m_lock);
list_for_each_entry_rcu(op, &ci->m_op_list, op_entry) {
if (op->conn != conn)
continue;
- if (op->conn && atomic_dec_and_test(&op->conn->refcnt))
- kfree(op->conn);
+ ksmbd_conn_put(op->conn);
op->conn = NULL;
}
up_write(&ci->m_lock);
list_for_each_entry_safe(smb_lock, tmp_lock, &fp->lock_list, flist) {
- spin_lock(&fp->conn->llist_lock);
+ spin_lock(&conn->llist_lock);
list_del_init(&smb_lock->clist);
- spin_unlock(&fp->conn->llist_lock);
+ spin_unlock(&conn->llist_lock);
}
fp->conn = NULL;
@@ -1091,6 +1118,8 @@ static bool session_fd_check(struct ksmbd_tree_connect *tcon,
fp->durable_scavenger_timeout =
jiffies_to_msecs(jiffies) + fp->durable_timeout;
+ /* Drop fp's own reference on conn. */
+ ksmbd_conn_put(conn);
return true;
}
@@ -1178,15 +1207,27 @@ int ksmbd_reopen_durable_fd(struct ksmbd_work *work, struct ksmbd_file *fp)
old_f_state = fp->f_state;
fp->f_state = FP_NEW;
+
+ /*
+ * Initialize fp's connection binding before publishing fp into the
+ * session's file table. If __open_id() is ordered first, a
+ * concurrent teardown that iterates the table can observe a valid
+ * volatile_id with fp->conn == NULL and preserve a
+ * partially-initialized fp. fp owns a strong reference on the new
+ * conn (see ksmbd_open_fd()); undo it on __open_id() failure.
+ */
+ fp->conn = ksmbd_conn_get(conn);
+ fp->tcon = work->tcon;
+
__open_id(&work->sess->file_table, fp, OPEN_ID_TYPE_VOLATILE_ID);
if (!has_file_id(fp->volatile_id)) {
+ fp->conn = NULL;
+ fp->tcon = NULL;
+ ksmbd_conn_put(conn);
fp->f_state = old_f_state;
return -EBADF;
}
- fp->conn = conn;
- fp->tcon = work->tcon;
-
list_for_each_entry(smb_lock, &fp->lock_list, flist) {
spin_lock(&conn->llist_lock);
list_add_tail(&smb_lock->clist, &conn->lock_list);
@@ -1198,8 +1239,7 @@ int ksmbd_reopen_durable_fd(struct ksmbd_work *work, struct ksmbd_file *fp)
list_for_each_entry_rcu(op, &ci->m_op_list, op_entry) {
if (op->conn)
continue;
- op->conn = fp->conn;
- atomic_inc(&op->conn->refcnt);
+ op->conn = ksmbd_conn_get(fp->conn);
}
up_write(&ci->m_lock);
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v2 2/3] ksmbd: harden file lifetime during session teardown
2026-04-28 14:08 [PATCH v2 0/3] ksmbd: fix connection and durable handle teardown races DaeMyung Kang
2026-04-28 14:08 ` [PATCH v2 1/3] ksmbd: centralize ksmbd_conn final release to plug transport leak DaeMyung Kang
@ 2026-04-28 14:08 ` DaeMyung Kang
2026-04-28 14:08 ` [PATCH v2 3/3] ksmbd: close durable scavenger races against m_fp_list lookups DaeMyung Kang
2026-04-29 3:21 ` [PATCH v2 0/3] ksmbd: fix connection and durable handle teardown races Namjae Jeon
3 siblings, 0 replies; 5+ messages in thread
From: DaeMyung Kang @ 2026-04-28 14:08 UTC (permalink / raw)
To: Namjae Jeon, Steve French
Cc: Sergey Senozhatsky, Tom Talpey, Hyunchul Lee, Ronnie Sahlberg,
linux-cifs, linux-kernel, DaeMyung Kang
__close_file_table_ids() is the per-session teardown that closes every
fp belonging to a session (or to one tree connect on that session) by
walking the session's volatile-id idr. The current loop has three
related problems on busy or racing workloads:
* Sleeping under ft->lock. The session-teardown skip callback,
session_fd_check(), already sleeps in ksmbd_vfs_copy_durable_owner()
-> kstrdup(GFP_KERNEL) and down_write(&fp->f_ci->m_lock) (a
rw_semaphore). Running the callback inside write_lock(&ft->lock)
trips CONFIG_DEBUG_ATOMIC_SLEEP / CONFIG_PROVE_LOCKING on a
durable-fd workload.
* Refcount accounting blind to f_state. The unconditional
atomic_dec_and_test(&fp->refcount) does not distinguish
FP_INITED (idr-owned reference still intact) from FP_CLOSED (an
earlier ksmbd_close_fd() already consumed the idr-owned reference
while leaving fp in the idr because a holder kept refcount
non-zero). When the latter races with teardown the same path
over-decrements into a holder reference and ksmbd_fd_put() later
UAFs that holder.
* FP_NEW window. Between __open_id() publishing fp into the
session idr and ksmbd_update_fstate(..., FP_INITED) committing the
transition at the end of smb2_open(), an fp is in FP_NEW and an
intervening teardown that takes a transient reference and
unpublishes the volatile id leaves the original idr-owned
reference orphaned -- the opener is unaware that fp has been
unpublished, returns success to the client, and the fp leaks at
refcount = 1.
Refactor __close_file_table_ids() to take a transient reference on fp
and unpublish fp from the session idr *under ft->lock* before calling
skip() outside the lock. A transient ref protects lifetime but not
concurrent field mutation, so the idr_remove() is what keeps
__ksmbd_lookup_fd() through this session's idr from granting a new
ksmbd_fp_get() reference to an fp whose fp->conn / fp->tcon /
fp->volatile_id / op->conn / lock_list links are about to be rewritten
by session_fd_check(). Durable reconnect is unaffected because it
reaches fp through the global durable table (ksmbd_lookup_durable_fd
-> global_ft).
Decide n_to_drop together with any FP_INITED -> FP_CLOSED transition
under ft->lock so teardown and ksmbd_close_fd() never both consume the
idr-owned reference. See ksmbd_mark_fp_closed() for the per-state
accounting. For the FP_NEW path to be safe, the opener has to learn
that fp was unpublished: ksmbd_update_fstate() now returns -ENOENT
when an FP_NEW -> FP_INITED transition finds f_state already advanced
or the volatile id cleared (both committed by teardown under
ft->lock); smb2_open() propagates that as STATUS_OBJECT_NAME_INVALID
and drops the original reference via ksmbd_fd_put().
The list removal cannot be left for a deferred final putter because
fp->volatile_id has already been cleared and __ksmbd_remove_fd() will
intentionally skip both idr_remove() and list_del_init(). Move the
m_fp_list unlink in __ksmbd_remove_fd() above the volatile-id check so
that an FP_NEW fp that happened to be added to m_fp_list (smb2_open()
adds fp->node before ksmbd_update_fstate() runs) is still cleaned up
on the deferred putter path; list_del_init() on an empty node is a
no-op and remains safe for fps that were never added.
Add a defensive guard in session_fd_check() that refuses non-FP_INITED
fps so that even if a teardown reaches an FP_NEW fp it falls into the
close branch (where the n_to_drop = 1 accounting keeps the opener's
reference alive) instead of the durable-preserve branch (which mutates
fp->conn / fp->tcon).
Validation on a debug kernel additionally built with CONFIG_DEBUG_LIST
and CONFIG_DEBUG_OBJECTS_WORK used a same-session two-tcon workload
(open/write storm on one tcon, 50 tree disconnects on the other) and
reported no list-corruption, work_struct ODEBUG, sleep-in-atomic,
lockdep or kmemleak reports. Reverting only the
__close_file_table_ids() hunk while keeping a forced-is_reconnectable()
harness produced the expected sleep-in-atomic at vfs_cache.c:1095,
confirming the ft->lock-out-of-sleepable-skip discipline.
KASAN-enabled direct SMB2 coverage with durable handles enabled
exercised ksmbd_close_tree_conn_fds(), ksmbd_close_session_fds(),
the FP_NEW failure path, tree_conn_fd_check(), and a non-zero
session_fd_check() durable-preserve return. This produced no KASAN,
DEBUG_LIST, ODEBUG, or WARNING reports.
Fixes: f44158485826 ("cifsd: add file operations")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
---
fs/smb/server/smb2pdu.c | 6 +-
fs/smb/server/vfs_cache.c | 179 +++++++++++++++++++++++++++++++++-----
fs/smb/server/vfs_cache.h | 4 +-
3 files changed, 164 insertions(+), 25 deletions(-)
diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
index 21825a69c29a..c1240ca14253 100644
--- a/fs/smb/server/smb2pdu.c
+++ b/fs/smb/server/smb2pdu.c
@@ -3767,8 +3767,10 @@ int smb2_open(struct ksmbd_work *work)
err_out2:
if (!rc) {
- ksmbd_update_fstate(&work->sess->file_table, fp, FP_INITED);
- rc = ksmbd_iov_pin_rsp(work, (void *)rsp, iov_len);
+ rc = ksmbd_update_fstate(&work->sess->file_table, fp,
+ FP_INITED);
+ if (!rc)
+ rc = ksmbd_iov_pin_rsp(work, (void *)rsp, iov_len);
}
if (rc) {
if (rc == -EINVAL)
diff --git a/fs/smb/server/vfs_cache.c b/fs/smb/server/vfs_cache.c
index 4a18107937cc..dc4037ef1834 100644
--- a/fs/smb/server/vfs_cache.c
+++ b/fs/smb/server/vfs_cache.c
@@ -431,13 +431,13 @@ static void ksmbd_remove_durable_fd(struct ksmbd_file *fp)
static void __ksmbd_remove_fd(struct ksmbd_file_table *ft, struct ksmbd_file *fp)
{
- if (!has_file_id(fp->volatile_id))
- return;
-
down_write(&fp->f_ci->m_lock);
list_del_init(&fp->node);
up_write(&fp->f_ci->m_lock);
+ if (!has_file_id(fp->volatile_id))
+ return;
+
write_lock(&ft->lock);
idr_remove(ft->idr, fp->volatile_id);
write_unlock(&ft->lock);
@@ -798,15 +798,58 @@ struct ksmbd_file *ksmbd_open_fd(struct ksmbd_work *work, struct file *filp)
return ERR_PTR(ret);
}
-void ksmbd_update_fstate(struct ksmbd_file_table *ft, struct ksmbd_file *fp,
- unsigned int state)
+/**
+ * ksmbd_update_fstate() - update an fp state under the file-table lock
+ * @ft: file table that publishes @fp's volatile id
+ * @fp: file pointer to update
+ * @state: new state
+ *
+ * Return: 0 on success. The FP_NEW -> FP_INITED transition is special:
+ * -ENOENT if teardown already unpublished @fp by advancing the state or
+ * clearing the volatile id. Other state updates preserve the historical
+ * fire-and-forget behavior.
+ */
+int ksmbd_update_fstate(struct ksmbd_file_table *ft, struct ksmbd_file *fp,
+ unsigned int state)
{
+ int ret;
+
if (!fp)
- return;
+ return -ENOENT;
write_lock(&ft->lock);
- fp->f_state = state;
+ if (state == FP_INITED &&
+ (fp->f_state != FP_NEW || !has_file_id(fp->volatile_id))) {
+ ret = -ENOENT;
+ } else {
+ fp->f_state = state;
+ ret = 0;
+ }
write_unlock(&ft->lock);
+
+ return ret;
+}
+
+/*
+ * ksmbd_mark_fp_closed() - mark fp closed under ft->lock and return how many
+ * refs the teardown path owns.
+ *
+ * FP_INITED has a normal idr-owned reference, so teardown owns both that
+ * reference and the transient lookup reference. FP_NEW is still owned by the
+ * in-flight opener/reopener, which will drop the original reference after
+ * ksmbd_update_fstate(..., FP_INITED) observes the cleared volatile id.
+ * FP_CLOSED on entry means an earlier ksmbd_close_fd() already consumed the
+ * idr-owned ref.
+ */
+static int ksmbd_mark_fp_closed(struct ksmbd_file *fp)
+{
+ if (fp->f_state == FP_INITED) {
+ set_close_state_blocked_works(fp);
+ fp->f_state = FP_CLOSED;
+ return 2;
+ }
+
+ return 1;
}
static int
@@ -814,7 +857,8 @@ __close_file_table_ids(struct ksmbd_session *sess,
struct ksmbd_tree_connect *tcon,
bool (*skip)(struct ksmbd_tree_connect *tcon,
struct ksmbd_file *fp,
- struct ksmbd_user *user))
+ struct ksmbd_user *user),
+ bool skip_preserves_fp)
{
struct ksmbd_file_table *ft = &sess->file_table;
struct ksmbd_file *fp;
@@ -822,32 +866,120 @@ __close_file_table_ids(struct ksmbd_session *sess,
int num = 0;
while (1) {
+ int n_to_drop;
+
write_lock(&ft->lock);
fp = idr_get_next(ft->idr, &id);
if (!fp) {
write_unlock(&ft->lock);
break;
}
-
- if (skip(tcon, fp, sess->user) ||
- !atomic_dec_and_test(&fp->refcount)) {
+ if (!atomic_inc_not_zero(&fp->refcount)) {
id++;
write_unlock(&ft->lock);
continue;
}
- set_close_state_blocked_works(fp);
- idr_remove(ft->idr, fp->volatile_id);
- fp->volatile_id = KSMBD_NO_FID;
- write_unlock(&ft->lock);
+ if (skip_preserves_fp) {
+ /*
+ * Session teardown: skip() is session_fd_check(),
+ * which may sleep and mutates fp->conn / fp->tcon /
+ * fp->volatile_id when it chooses to preserve fp
+ * for durable reconnect. Unpublish fp from the
+ * session idr here, under ft->lock, so that
+ * __ksmbd_lookup_fd() through this session cannot
+ * grant a new ksmbd_fp_get() reference to an fp
+ * whose fields are about to be rewritten outside
+ * the lock. Durable reconnect still reaches fp via
+ * global_ft.
+ */
+ idr_remove(ft->idr, id);
+ fp->volatile_id = KSMBD_NO_FID;
+ write_unlock(&ft->lock);
+
+ if (skip(tcon, fp, sess->user)) {
+ /*
+ * session_fd_check() has converted fp to
+ * durable-preserve state and cleared its
+ * per-conn fields. fp is already unpublished
+ * above; the original idr-owned ref keeps it
+ * alive for the durable scavenger. Drop only
+ * the transient ref. atomic_dec() is safe --
+ * atomic_inc_not_zero() succeeded on a
+ * positive value and we added one more, so
+ * refcount cannot be zero here.
+ */
+ atomic_dec(&fp->refcount);
+ id++;
+ continue;
+ }
+
+ /*
+ * Keep the close-state decision under the same lock
+ * observed by ksmbd_update_fstate(), which is how an
+ * in-flight FP_NEW opener learns that teardown has
+ * cleared its volatile id.
+ */
+ write_lock(&ft->lock);
+ n_to_drop = ksmbd_mark_fp_closed(fp);
+ write_unlock(&ft->lock);
+ } else {
+ /*
+ * Tree teardown: skip() is tree_conn_fd_check(), a
+ * cheap pointer compare that doesn't sleep and has
+ * no side effects, so keep the skip decision plus
+ * the unpublish-and-mark-closed sequence atomic
+ * under ft->lock. fps belonging to other tree
+ * connects (skip() == true) stay fully published in
+ * the session idr with no lock window.
+ */
+ if (skip(tcon, fp, sess->user)) {
+ atomic_dec(&fp->refcount);
+ write_unlock(&ft->lock);
+ id++;
+ continue;
+ }
+ idr_remove(ft->idr, id);
+ fp->volatile_id = KSMBD_NO_FID;
+ n_to_drop = ksmbd_mark_fp_closed(fp);
+ write_unlock(&ft->lock);
+ }
+ /*
+ * fp->volatile_id is already cleared to prevent stale idr
+ * removal from a deferred final close. Remove fp from
+ * m_fp_list here because __ksmbd_remove_fd() will skip the
+ * list unlink when volatile_id is KSMBD_NO_FID.
+ */
down_write(&fp->f_ci->m_lock);
list_del_init(&fp->node);
up_write(&fp->f_ci->m_lock);
- __ksmbd_close_fd(ft, fp);
-
- num++;
+ /*
+ * Drop the references this iteration owns:
+ *
+ * n_to_drop == 2: we observed FP_INITED and committed
+ * the FP_CLOSED transition ourselves, so we own the
+ * transient (+1) and the still-intact idr-owned ref.
+ *
+ * n_to_drop == 1: either a prior ksmbd_close_fd()
+ * already consumed the idr-owned ref, or fp was still
+ * FP_NEW and the in-flight opener/reopener must keep
+ * the original reference until ksmbd_update_fstate()
+ * observes the cleared volatile id.
+ *
+ * If we end up as the final putter, finalize fp and
+ * account the open_files_count decrement via the caller's
+ * atomic_sub(num, ...). Otherwise the remaining user's
+ * ksmbd_fd_put() reaches __put_fd_final(), which does its
+ * own atomic_dec(&open_files_count), so we must not count
+ * this fp here -- doing so would double-decrement the
+ * connection-wide counter.
+ */
+ if (atomic_sub_and_test(n_to_drop, &fp->refcount)) {
+ __ksmbd_close_fd(NULL, fp);
+ num++;
+ }
id++;
}
@@ -1082,6 +1214,9 @@ static bool session_fd_check(struct ksmbd_tree_connect *tcon,
if (!is_reconnectable(fp))
return false;
+ if (fp->f_state != FP_INITED)
+ return false;
+
if (WARN_ON_ONCE(!fp->conn))
return false;
@@ -1127,7 +1262,8 @@ void ksmbd_close_tree_conn_fds(struct ksmbd_work *work)
{
int num = __close_file_table_ids(work->sess,
work->tcon,
- tree_conn_fd_check);
+ tree_conn_fd_check,
+ false);
atomic_sub(num, &work->conn->stats.open_files_count);
}
@@ -1136,7 +1272,8 @@ void ksmbd_close_session_fds(struct ksmbd_work *work)
{
int num = __close_file_table_ids(work->sess,
work->tcon,
- session_fd_check);
+ session_fd_check,
+ true);
atomic_sub(num, &work->conn->stats.open_files_count);
}
@@ -1268,7 +1405,7 @@ void ksmbd_destroy_file_table(struct ksmbd_session *sess)
if (!ft->idr)
return;
- __close_file_table_ids(sess, NULL, session_fd_check);
+ __close_file_table_ids(sess, NULL, session_fd_check, true);
idr_destroy(ft->idr);
kfree(ft->idr);
ft->idr = NULL;
diff --git a/fs/smb/server/vfs_cache.h b/fs/smb/server/vfs_cache.h
index 866f32c10d4d..e6871266a94b 100644
--- a/fs/smb/server/vfs_cache.h
+++ b/fs/smb/server/vfs_cache.h
@@ -172,8 +172,8 @@ int ksmbd_close_inode_fds(struct ksmbd_work *work, struct inode *inode);
int ksmbd_init_global_file_table(void);
void ksmbd_free_global_file_table(void);
void ksmbd_set_fd_limit(unsigned long limit);
-void ksmbd_update_fstate(struct ksmbd_file_table *ft, struct ksmbd_file *fp,
- unsigned int state);
+int ksmbd_update_fstate(struct ksmbd_file_table *ft, struct ksmbd_file *fp,
+ unsigned int state);
bool ksmbd_vfs_compare_durable_owner(struct ksmbd_file *fp,
struct ksmbd_user *user);
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v2 3/3] ksmbd: close durable scavenger races against m_fp_list lookups
2026-04-28 14:08 [PATCH v2 0/3] ksmbd: fix connection and durable handle teardown races DaeMyung Kang
2026-04-28 14:08 ` [PATCH v2 1/3] ksmbd: centralize ksmbd_conn final release to plug transport leak DaeMyung Kang
2026-04-28 14:08 ` [PATCH v2 2/3] ksmbd: harden file lifetime during session teardown DaeMyung Kang
@ 2026-04-28 14:08 ` DaeMyung Kang
2026-04-29 3:21 ` [PATCH v2 0/3] ksmbd: fix connection and durable handle teardown races Namjae Jeon
3 siblings, 0 replies; 5+ messages in thread
From: DaeMyung Kang @ 2026-04-28 14:08 UTC (permalink / raw)
To: Namjae Jeon, Steve French
Cc: Sergey Senozhatsky, Tom Talpey, Hyunchul Lee, Ronnie Sahlberg,
linux-cifs, linux-kernel, DaeMyung Kang
ksmbd_durable_scavenger() has two related races against any walker
that iterates f_ci->m_fp_list, including ksmbd_lookup_fd_inode()
(used by ksmbd_vfs_rename) and the share-mode checks in
fs/smb/server/smb_common.c.
(1) fp->node list-head reuse. Durable-preserved handles can remain
linked on f_ci->m_fp_list after session teardown so share-mode checks
still see them while the handle is reconnectable. The scavenger
collected expired handles by adding fp->node to a local
scavenger_list after removing them from the global durable idr.
Because fp->node is the same list_head used by m_fp_list,
list_add(&fp->node, &scavenger_list) overwrites the m_fp_list links
and corrupts both lists. CONFIG_DEBUG_LIST can report this on the
share-mode walk path.
(2) Refcount race against m_fp_list walkers. The scavenger qualifies
an expired durable handle with atomic_read(&fp->refcount) > 1 and
fp->conn under global_ft.lock, removes fp from global_ft, then drops
global_ft.lock before unlinking fp from m_fp_list and freeing it.
During that gap fp is still linked on m_fp_list with f_state ==
FP_INITED. ksmbd_lookup_fd_inode() under m_lock read calls
ksmbd_fp_get() (atomic_inc_not_zero on refcount that is still 1) and
takes a live reference; the scavenger then unlinks and frees fp
while the holder owns a reference, leading to UAF on the holder's
subsequent ksmbd_fd_put() and on any field reads performed by a
concurrent share-mode walker that iterates m_fp_list without taking
ksmbd_fp_get() (smb_check_perm_dleases-like paths).
Fix both:
* Stop reusing fp->node as a scavenger-private list node. Remove
one expired handle from global_ft under global_ft.lock, take an
explicit transient reference, drop the lock, unlink fp->node
from m_fp_list under f_ci->m_lock, then drop both the durable
lifetime and transient references with atomic_sub_and_test(2,
&fp->refcount). If the scavenger is the last putter the close
runs there; otherwise an in-flight holder that already raced
through the m_fp_list lookup owns the final close via its
ksmbd_fd_put() path. The one-at-a-time disposal can rescan the
durable idr when multiple handles expire in the same pass, but
durable scavenging is a background expiration path and the final
full scan recomputes min_timeout before the next wait.
* Clear fp->persistent_id inside __ksmbd_remove_durable_fd() right
after idr_remove(), so a delayed final close from a holder that
snatched fp does not re-issue idr_remove() on a persistent id
that idr_alloc_cyclic() in ksmbd_open_durable_fd() may have
already handed out to a brand-new durable handle.
* Bypass the per-conn open_files_count decrement in
__put_fd_final() when fp is detached from any session table
(fp->conn cleared by session_fd_check() at durable preserve --
paired with the volatile_id clear at unpublish, so checking
fp->conn alone is sufficient). The walker that owns the final
close runs from an unrelated work->conn whose
stats.open_files_count never tracked this durable fp; without
this guard the holder would underflow that unrelated counter.
The two races are folded into one patch because patch (1) alone
cleans up the corrupted list but leaves a deterministic UAF window
for m_fp_list walkers that the transient-reference and
persistent_id discipline in (2) close; bisecting onto an
intermediate state would land on a UAF that pre-patch chaos merely
made less reproducible.
Validation:
* CONFIG_DEBUG_LIST coverage for the list_head reuse path.
* KASAN-enabled direct SMB2 durable-handle coverage that exercised
ksmbd_durable_scavenger() and non-NULL ksmbd_lookup_fd_inode()
returns while durable handles expired under concurrent rename
lookups, with no KASAN, UAF, list-corruption, ODEBUG, or WARNING
reports.
* checkpatch --strict
* make -j$(nproc) M=fs/smb/server
Fixes: d484d621d40f ("ksmbd: add durable scavenger timer")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
---
fs/smb/server/vfs_cache.c | 102 ++++++++++++++++++++++++++++----------
1 file changed, 76 insertions(+), 26 deletions(-)
diff --git a/fs/smb/server/vfs_cache.c b/fs/smb/server/vfs_cache.c
index dc4037ef1834..354c4d8a1cfb 100644
--- a/fs/smb/server/vfs_cache.c
+++ b/fs/smb/server/vfs_cache.c
@@ -418,6 +418,14 @@ static void __ksmbd_remove_durable_fd(struct ksmbd_file *fp)
return;
idr_remove(global_ft.idr, fp->persistent_id);
+ /*
+ * Clear persistent_id so a later __ksmbd_close_fd() that runs from a
+ * delayed putter (e.g. when a concurrent ksmbd_lookup_fd_inode()
+ * walker held the final reference) does not re-issue idr_remove() on
+ * an id that idr_alloc_cyclic() may have already handed out to a new
+ * durable handle.
+ */
+ fp->persistent_id = KSMBD_NO_FID;
}
static void ksmbd_remove_durable_fd(struct ksmbd_file *fp)
@@ -521,6 +529,20 @@ static struct ksmbd_file *__ksmbd_lookup_fd(struct ksmbd_file_table *ft,
static void __put_fd_final(struct ksmbd_work *work, struct ksmbd_file *fp)
{
+ /*
+ * Detached durable fp -- session_fd_check() cleared fp->conn at
+ * preserve, so this fp is no longer tracked by any conn's
+ * stats.open_files_count. This happens when
+ * ksmbd_scavenger_dispose_dh() hands the final close off to an
+ * m_fp_list walker (e.g. ksmbd_lookup_fd_inode()) whose work->conn
+ * is unrelated to the conn that originally opened the handle; close
+ * via the NULL-ft path so we do not underflow that unrelated
+ * counter.
+ */
+ if (!fp->conn) {
+ __ksmbd_close_fd(NULL, fp);
+ return;
+ }
__ksmbd_close_fd(&work->sess->file_table, fp);
atomic_dec(&work->conn->stats.open_files_count);
}
@@ -1033,24 +1055,37 @@ static bool ksmbd_durable_scavenger_alive(void)
return true;
}
-static void ksmbd_scavenger_dispose_dh(struct list_head *head)
+static void ksmbd_scavenger_dispose_dh(struct ksmbd_file *fp)
{
- while (!list_empty(head)) {
- struct ksmbd_file *fp;
+ /*
+ * Durable-preserved fp can remain linked on f_ci->m_fp_list for
+ * share-mode checks. Unlink it before final close; fp->node is not
+ * available as a scavenger-private list node because re-adding it to
+ * another list corrupts m_fp_list.
+ */
+ down_write(&fp->f_ci->m_lock);
+ list_del_init(&fp->node);
+ up_write(&fp->f_ci->m_lock);
- fp = list_first_entry(head, struct ksmbd_file, node);
- list_del_init(&fp->node);
+ /*
+ * Drop both the durable lifetime reference and the transient reference
+ * taken by the scavenger under global_ft.lock. If a concurrent
+ * ksmbd_lookup_fd_inode() (or any other m_fp_list walker) snatched fp
+ * before the unlink above, that holder owns the final close via
+ * ksmbd_fd_put() -> __ksmbd_close_fd(). Otherwise the scavenger is
+ * the last putter and finalises fp here.
+ */
+ if (atomic_sub_and_test(2, &fp->refcount))
__ksmbd_close_fd(NULL, fp);
- }
}
static int ksmbd_durable_scavenger(void *dummy)
{
struct ksmbd_file *fp = NULL;
+ struct ksmbd_file *expired_fp;
unsigned int id;
unsigned int min_timeout = 1;
bool found_fp_timeout;
- LIST_HEAD(scavenger_list);
unsigned long remaining_jiffies;
__module_get(THIS_MODULE);
@@ -1060,8 +1095,6 @@ static int ksmbd_durable_scavenger(void *dummy)
if (try_to_freeze())
continue;
- found_fp_timeout = false;
-
remaining_jiffies = wait_event_timeout(dh_wq,
ksmbd_durable_scavenger_alive() == false,
__msecs_to_jiffies(min_timeout));
@@ -1070,23 +1103,39 @@ static int ksmbd_durable_scavenger(void *dummy)
else
min_timeout = DURABLE_HANDLE_MAX_TIMEOUT;
- write_lock(&global_ft.lock);
- idr_for_each_entry(global_ft.idr, fp, id) {
- if (!fp->durable_timeout)
- continue;
+ do {
+ expired_fp = NULL;
+ found_fp_timeout = false;
- if (atomic_read(&fp->refcount) > 1 ||
- fp->conn)
- continue;
-
- found_fp_timeout = true;
- if (fp->durable_scavenger_timeout <=
- jiffies_to_msecs(jiffies)) {
- __ksmbd_remove_durable_fd(fp);
- list_add(&fp->node, &scavenger_list);
- } else {
+ write_lock(&global_ft.lock);
+ idr_for_each_entry(global_ft.idr, fp, id) {
unsigned long durable_timeout;
+ if (!fp->durable_timeout)
+ continue;
+
+ if (atomic_read(&fp->refcount) > 1 ||
+ fp->conn)
+ continue;
+
+ found_fp_timeout = true;
+ if (fp->durable_scavenger_timeout <=
+ jiffies_to_msecs(jiffies)) {
+ __ksmbd_remove_durable_fd(fp);
+ /*
+ * Take a transient reference so fp
+ * cannot be freed by an in-flight
+ * ksmbd_lookup_fd_inode() that found
+ * it through f_ci->m_fp_list while we
+ * drop global_ft.lock and reach the
+ * m_fp_list unlink in
+ * ksmbd_scavenger_dispose_dh().
+ */
+ atomic_inc(&fp->refcount);
+ expired_fp = fp;
+ break;
+ }
+
durable_timeout =
fp->durable_scavenger_timeout -
jiffies_to_msecs(jiffies);
@@ -1094,10 +1143,11 @@ static int ksmbd_durable_scavenger(void *dummy)
if (min_timeout > durable_timeout)
min_timeout = durable_timeout;
}
- }
- write_unlock(&global_ft.lock);
+ write_unlock(&global_ft.lock);
- ksmbd_scavenger_dispose_dh(&scavenger_list);
+ if (expired_fp)
+ ksmbd_scavenger_dispose_dh(expired_fp);
+ } while (expired_fp);
if (found_fp_timeout == false)
break;
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2 0/3] ksmbd: fix connection and durable handle teardown races
2026-04-28 14:08 [PATCH v2 0/3] ksmbd: fix connection and durable handle teardown races DaeMyung Kang
` (2 preceding siblings ...)
2026-04-28 14:08 ` [PATCH v2 3/3] ksmbd: close durable scavenger races against m_fp_list lookups DaeMyung Kang
@ 2026-04-29 3:21 ` Namjae Jeon
3 siblings, 0 replies; 5+ messages in thread
From: Namjae Jeon @ 2026-04-29 3:21 UTC (permalink / raw)
To: DaeMyung Kang
Cc: Steve French, Sergey Senozhatsky, Tom Talpey, Hyunchul Lee,
Ronnie Sahlberg, linux-cifs, linux-kernel
On Tue, Apr 28, 2026 at 11:09 PM DaeMyung Kang <charsyam@gmail.com> wrote:
>
> This series fixes lifetime bugs around ksmbd connection shutdown,
> session file-table teardown, and durable handle scavenging.
>
> Patch 1 centralizes the final struct ksmbd_conn release so every last
> putter runs ida_destroy() and transport cleanup. The release is queued
> to a dedicated workqueue because transport teardown can sleep, while
> one known last-putter is an RCU callback.
>
> Patch 2 hardens __close_file_table_ids() by taking a transient
> ksmbd_file reference, unpublishing from the session idr under ft->lock,
> and doing sleepable preserve/close work outside ft->lock. It also
> makes the FP_NEW window visible to the opener through
> ksmbd_update_fstate().
>
> Patch 2 is scoped to file-table teardown and the FP_NEW publication
> window. It does not try to fix durable reconnect rollback on later
> smb2_open() error paths (and the related post-FP_INITED reference
> window in fresh smb2_open). Both already exist before this series
> and need either an explicit unpublish-on-error step or an extra
> session-owned reference. That is left as follow-up work. The
> FP_NEW -> FP_INITED failure reuses smb2_open()'s existing -ENOENT to
> STATUS_OBJECT_NAME_INVALID mapping to avoid changing wire behavior in
> this lifetime fix.
>
> Patch 3 closes two related races in the durable scavenger against any
> walker that iterates f_ci->m_fp_list (ksmbd_lookup_fd_inode() and the
> share-mode checks). The scavenger no longer reuses fp->node as a
> scavenger-private collect-list node, takes an explicit transient
> reference under global_ft.lock, and drops both the durable lifetime
> and transient refs with atomic_sub_and_test(2, ...) after the
> m_fp_list unlink so an in-flight m_fp_list walker that snatched fp
> owns the final close cleanly. fp->persistent_id is cleared inside
> __ksmbd_remove_durable_fd() so a delayed final close cannot re-issue
> idr_remove() on a slot that idr_alloc_cyclic() may have already
> re-handed to a new durable handle. __put_fd_final() bypasses the
> per-conn open_files_count decrement when fp is detached from any
> session table (fp->conn cleared by session_fd_check() at durable
> preserve, paired with the volatile_id clear at unpublish), since the
> walker that owns the final close in that case runs from an unrelated
> work->conn whose counter never tracked this durable fp.
>
> The series is intentionally scoped to lifetime/race fixes and the
> walker-final-putter regression that the durable scavenger handoff in
> patch 3 newly exposes. Pre-existing reconnect rollback and per-conn
> open_files_count accounting gaps are left as follow-up work so this
> series does not have to claim a full durable-reconnect or accounting
> cleanup.
>
> Validation:
> * abrupt-disconnect kmemleak/kprobe A/B for connection release
> * same-session two-tcon DEBUG_LIST/DEBUG_OBJECTS_WORK stress
> * forced durable-preserve sleep-path harness for session teardown
> * KASAN-enabled direct SMB2 coverage for session/tree teardown and
> durable-preserve paths
> * KASAN-enabled direct SMB2 coverage for durable scavenger expiry
> racing with m_fp_list lookups
> * checkpatch --strict for all patches
> * make -j$(nproc) M=fs/smb/server
>
> v1 -> v2:
> * Split the original change into bisectable patches: connection final
> release, session file-table teardown, and durable scavenger races.
> * Keep sleepable session preserve/close work out of ft->lock and make
> the FP_NEW publication race visible through a cleared volatile id.
> * Document that durable reconnect rollback on later smb2_open()
> error paths is a pre-existing follow-up item, and keep the
> existing -ENOENT to STATUS_OBJECT_NAME_INVALID wire mapping for
> this lifetime fix.
> * Avoid reusing fp->node as a temporary durable scavenger list node
> and take a transient reference in the durable scavenger so
> concurrent ksmbd_lookup_fd_inode() walkers cannot UAF on freed fp.
> These two fixes are folded into a single patch because the
> list-head-reuse fix alone leaves a deterministic UAF window for
> m_fp_list walkers; bisecting onto an intermediate state would land
> on a use-after-free that pre-patch chaos merely made less
> reproducible.
> * Clear fp->persistent_id in __ksmbd_remove_durable_fd() so a holder
> that owns the final close after a scavenger removal does not
> re-issue idr_remove() on a persistent id that may have already been
> handed out to a new durable handle.
> * Bypass the per-conn open_files_count decrement in __put_fd_final()
> when fp is detached from any session table, so an m_fp_list walker
> that owns the final close of a scavenged durable fp does not
> underflow an unrelated conn's stats counter.
> * Document the ksmbd_conn_wq lifetime invariant in ksmbd_conn_put()
> instead of guarding with WARN_ON_ONCE, so a violation surfaces as
> a NULL deref rather than a silent leak of the final release.
>
> DaeMyung Kang (3):
> ksmbd: centralize ksmbd_conn final release to plug transport leak
> ksmbd: harden file lifetime during session teardown
> ksmbd: close durable scavenger races against m_fp_list lookups
Applied them to #ksmbd-for-next-next.
Thanks!
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-04-29 3:21 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-28 14:08 [PATCH v2 0/3] ksmbd: fix connection and durable handle teardown races DaeMyung Kang
2026-04-28 14:08 ` [PATCH v2 1/3] ksmbd: centralize ksmbd_conn final release to plug transport leak DaeMyung Kang
2026-04-28 14:08 ` [PATCH v2 2/3] ksmbd: harden file lifetime during session teardown DaeMyung Kang
2026-04-28 14:08 ` [PATCH v2 3/3] ksmbd: close durable scavenger races against m_fp_list lookups DaeMyung Kang
2026-04-29 3:21 ` [PATCH v2 0/3] ksmbd: fix connection and durable handle teardown races Namjae Jeon
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox