From: DaeMyung Kang <charsyam@gmail.com>
To: Namjae Jeon <linkinjeon@kernel.org>, Steve French <smfrench@gmail.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
Tom Talpey <tom@talpey.com>, Hyunchul Lee <hyc.lee@gmail.com>,
Ronnie Sahlberg <lsahlber@redhat.com>,
linux-cifs@vger.kernel.org, linux-kernel@vger.kernel.org,
stable@vger.kernel.org, DaeMyung Kang <charsyam@gmail.com>
Subject: [PATCH 0/2] ksmbd: fix stop_sessions() iteration and centralize ksmbd_conn release
Date: Sat, 25 Apr 2026 18:38:27 +0900 [thread overview]
Message-ID: <20260425093829.4004785-1-charsyam@gmail.com> (raw)
Two independent fixes around ksmbd connection teardown.
Patch 1 ("ksmbd: rewrite stop_sessions() with restartable iteration")
fixes a stale-iterator bug in stop_sessions(). The current loop drops
conn_list_lock around transport->shutdown() and re-acquires it to
continue the same hash_for_each() walk; a concurrent ksmbd_conn_free()
in the unlocked window can make the iterator revisit or skip
connections. The rewrite pins one connection at a time, marks it via
a new conn->stop_called flag, runs shutdown outside the lock, and
restarts from the top. The "is the hash drained?" check is moved
into the locked walk so it no longer races hash_del() from a handler
thread. Patch 1 carries Cc: stable@vger.kernel.org.
Teardown wall time, ksmbd.control --shutdown + rmmod ksmbd with
N concurrent nosharesock cifs connections (3-run mean):
N before after
10 4.93s 5.34s
30 7.34s 7.03s
50 7.25s 7.04s
100 6.98s 6.78s
200 6.77s 6.89s
Performance is unchanged; teardown is dominated by the msleep(100)
outer retry, not by the iteration. The number of ->shutdown()
calls equals the number of live connections on both paths when the
race is not artificially widened.
Patch 2 ("ksmbd: centralize ksmbd_conn final release to plug transport
leak") routes the three bare-kfree final-put sites
(ksmbd_conn_r_count_dec, __free_opinfo, session_fd_check) through a
new ksmbd_conn_put() that defers the once-per-struct cleanup
(ida_destroy, free_transport, kfree) onto a dedicated workqueue.
Those sites previously skipped transport cleanup whenever they
happened to be the last putter, leaking struct tcp_transport and the
iov kvec (TCP) and the embedded async_ida. The workqueue bounce is
needed because the centralized release reaches lock_sock_nested()
through tcp_close(), which sleeps; __free_opinfo() can be a final
putter from an RCU softirq callback (free_opinfo_rcu).
A/B leak validation, QEMU/virtme guest with ksmbd server and CIFS
client in the same guest, debug kernel (CONFIG_DEBUG_KMEMLEAK +
CONFIG_PROVE_LOCKING + CONFIG_DEBUG_ATOMIC_SLEEP +
CONFIG_DEBUG_OBJECTS + CONFIG_FAILSLAB):
Reproducer per iteration: hold 8 fds open via sleep processes,
force-close TCP with `ss -K sport = :445`, kill holders,
lazy-umount; 10 iterations, then ksmbd shutdown + kmemleak scan.
Pre-patch is HEAD with only patch 1 of this series applied:
state conn_alloc conn_free tcp_free opi_rcu kmemleak
---------- ---------- --------- -------- ------- --------
pre-patch 20 20 10 160 7
with patch 20 20 20 160 0
Pre-patch conn_free=20 with tcp_free=10 directly demonstrates the
bare-kfree paths skipping transport cleanup; kmemleak reports 7
unreferenced struct tcp_transport / t->iov objects. With patch 2
tcp_free matches conn_free at 20/20 and kmemleak is clean across
two independent post-patch runs. opi_rcu=160 confirms the RCU
opinfo release path that motivates the fix is exercised.
Patch 2 also addresses two adjacent issues exposed by the new
debugging context:
* __close_file_table_ids() refactor. session_fd_check() already
sleeps in kstrdup(GFP_KERNEL) and down_write(m_lock) before this
patch, but the existing code calls it under write_lock(&ft->lock)
(an rwlock_t). Refactor so skip() runs outside ft->lock with a
transient reference on fp; idr_remove(), fp->volatile_id clear
and the m_fp_list unlink happen unconditionally so a deferred
final putter (atomic_sub_and_test(2) returning false) cannot be
left to do them via __ksmbd_remove_fd() with a stale
volatile_id.
Validated on an additional debug kernel built with
CONFIG_DEBUG_LIST + CONFIG_DEBUG_OBJECTS_WORK using a
same-session two-tcon storm: one tcon drives an open/write
storm while the other tcon repeats 50 tree disconnects on the
same session. Trace counts: 52 __close_file_table_ids
invocations, 4793 __ksmbd_close_fd, 30337 __put_fd_final, 9578
ksmbd_conn_put, 1 __ksmbd_conn_release_work. No
list-corruption, work_struct ODEBUG, sleep-in-atomic, lockdep
or kmemleak reports observed. This stress validates the
file-table/id/list rewrite under DEBUG_LIST/DEBUG_OBJECTS_WORK,
not the transport leak (which the A/B above already covered).
* fp owns a strong reference on fp->conn. fp used to hold a
borrowed pointer to its connection, leaving a window where the
conn could be freed while a still-live fp held a stale
fp->conn. Make fp own a refcount on fp->conn from
ksmbd_open_fd() / ksmbd_reopen_durable_fd() until __ksmbd_close_fd()
or session_fd_check(). ksmbd_reopen_durable_fd() is also
reordered so fp->conn / fp->tcon are set before __open_id()
publishes fp into the session's file table, so a concurrent
teardown cannot observe a valid volatile_id with fp->conn ==
NULL.
The two patches are independent. Either can be reviewed, applied,
or reverted without pulling the other. Patch 2 references patch 1
only to explain why stop_sessions() is left on its open-coded local
cleanup and not converted to ksmbd_conn_put() in this series; if
patch 2 is applied alone, stop_sessions() retains its bare-kfree
leak that patch 1 separately addresses.
Based on ksmbd-for-next-next.
DaeMyung Kang (2):
ksmbd: rewrite stop_sessions() with restartable iteration
ksmbd: centralize ksmbd_conn final release to plug transport leak
fs/smb/server/connection.c | 120 ++++++++++++++++++-----
fs/smb/server/connection.h | 6 ++
fs/smb/server/oplock.c | 4 +-
fs/smb/server/server.c | 12 +++
fs/smb/server/vfs_cache.c | 189 ++++++++++++++++++++++++++++++++-----
5 files changed, 285 insertions(+), 46 deletions(-)
--
2.43.0
next reply other threads:[~2026-04-25 9:38 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-25 9:38 DaeMyung Kang [this message]
2026-04-25 9:38 ` [PATCH 1/2] ksmbd: rewrite stop_sessions() with restartable iteration DaeMyung Kang
2026-04-26 23:20 ` Namjae Jeon
2026-04-25 9:38 ` [PATCH 2/2] ksmbd: centralize ksmbd_conn final release to plug transport leak DaeMyung Kang
2026-04-27 1:06 ` Namjae Jeon
2026-04-27 1:32 ` CharSyam
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260425093829.4004785-1-charsyam@gmail.com \
--to=charsyam@gmail.com \
--cc=hyc.lee@gmail.com \
--cc=linkinjeon@kernel.org \
--cc=linux-cifs@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lsahlber@redhat.com \
--cc=senozhatsky@chromium.org \
--cc=smfrench@gmail.com \
--cc=stable@vger.kernel.org \
--cc=tom@talpey.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox