[PATCH 0/2] ksmbd: fix stop_sessions() iteration and centralize ksmbd_conn release

public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed

From: DaeMyung Kang <charsyam@gmail.com>
To: Namjae Jeon <linkinjeon@kernel.org>, Steve French <smfrench@gmail.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
	Tom Talpey <tom@talpey.com>, Hyunchul Lee <hyc.lee@gmail.com>,
	Ronnie Sahlberg <lsahlber@redhat.com>,
	linux-cifs@vger.kernel.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org, DaeMyung Kang <charsyam@gmail.com>
Subject: [PATCH 0/2] ksmbd: fix stop_sessions() iteration and centralize ksmbd_conn release
Date: Sat, 25 Apr 2026 18:38:27 +0900	[thread overview]
Message-ID: <20260425093829.4004785-1-charsyam@gmail.com> (raw)

Two independent fixes around ksmbd connection teardown.

Patch 1 ("ksmbd: rewrite stop_sessions() with restartable iteration")
fixes a stale-iterator bug in stop_sessions().  The current loop drops
conn_list_lock around transport->shutdown() and re-acquires it to
continue the same hash_for_each() walk; a concurrent ksmbd_conn_free()
in the unlocked window can make the iterator revisit or skip
connections.  The rewrite pins one connection at a time, marks it via
a new conn->stop_called flag, runs shutdown outside the lock, and
restarts from the top.  The "is the hash drained?" check is moved
into the locked walk so it no longer races hash_del() from a handler
thread.  Patch 1 carries Cc: stable@vger.kernel.org.

  Teardown wall time, ksmbd.control --shutdown + rmmod ksmbd with
  N concurrent nosharesock cifs connections (3-run mean):

      N        before        after
      10       4.93s         5.34s
      30       7.34s         7.03s
      50       7.25s         7.04s
      100      6.98s         6.78s
      200      6.77s         6.89s

  Performance is unchanged; teardown is dominated by the msleep(100)
  outer retry, not by the iteration.  The number of ->shutdown()
  calls equals the number of live connections on both paths when the
  race is not artificially widened.

Patch 2 ("ksmbd: centralize ksmbd_conn final release to plug transport
leak") routes the three bare-kfree final-put sites
(ksmbd_conn_r_count_dec, __free_opinfo, session_fd_check) through a
new ksmbd_conn_put() that defers the once-per-struct cleanup
(ida_destroy, free_transport, kfree) onto a dedicated workqueue.
Those sites previously skipped transport cleanup whenever they
happened to be the last putter, leaking struct tcp_transport and the
iov kvec (TCP) and the embedded async_ida.  The workqueue bounce is
needed because the centralized release reaches lock_sock_nested()
through tcp_close(), which sleeps; __free_opinfo() can be a final
putter from an RCU softirq callback (free_opinfo_rcu).

  A/B leak validation, QEMU/virtme guest with ksmbd server and CIFS
  client in the same guest, debug kernel (CONFIG_DEBUG_KMEMLEAK +
  CONFIG_PROVE_LOCKING + CONFIG_DEBUG_ATOMIC_SLEEP +
  CONFIG_DEBUG_OBJECTS + CONFIG_FAILSLAB):

  Reproducer per iteration: hold 8 fds open via sleep processes,
  force-close TCP with `ss -K sport = :445`, kill holders,
  lazy-umount; 10 iterations, then ksmbd shutdown + kmemleak scan.
  Pre-patch is HEAD with only patch 1 of this series applied:

      state         conn_alloc  conn_free  tcp_free  opi_rcu  kmemleak
      ----------    ----------  ---------  --------  -------  --------
      pre-patch         20          20        10       160        7
      with patch        20          20        20       160        0

  Pre-patch conn_free=20 with tcp_free=10 directly demonstrates the
  bare-kfree paths skipping transport cleanup; kmemleak reports 7
  unreferenced struct tcp_transport / t->iov objects.  With patch 2
  tcp_free matches conn_free at 20/20 and kmemleak is clean across
  two independent post-patch runs.  opi_rcu=160 confirms the RCU
  opinfo release path that motivates the fix is exercised.

Patch 2 also addresses two adjacent issues exposed by the new
debugging context:

  * __close_file_table_ids() refactor.  session_fd_check() already
    sleeps in kstrdup(GFP_KERNEL) and down_write(m_lock) before this
    patch, but the existing code calls it under write_lock(&ft->lock)
    (an rwlock_t).  Refactor so skip() runs outside ft->lock with a
    transient reference on fp; idr_remove(), fp->volatile_id clear
    and the m_fp_list unlink happen unconditionally so a deferred
    final putter (atomic_sub_and_test(2) returning false) cannot be
    left to do them via __ksmbd_remove_fd() with a stale
    volatile_id.

    Validated on an additional debug kernel built with
    CONFIG_DEBUG_LIST + CONFIG_DEBUG_OBJECTS_WORK using a
    same-session two-tcon storm: one tcon drives an open/write
    storm while the other tcon repeats 50 tree disconnects on the
    same session.  Trace counts: 52 __close_file_table_ids
    invocations, 4793 __ksmbd_close_fd, 30337 __put_fd_final, 9578
    ksmbd_conn_put, 1 __ksmbd_conn_release_work.  No
    list-corruption, work_struct ODEBUG, sleep-in-atomic, lockdep
    or kmemleak reports observed.  This stress validates the
    file-table/id/list rewrite under DEBUG_LIST/DEBUG_OBJECTS_WORK,
    not the transport leak (which the A/B above already covered).

  * fp owns a strong reference on fp->conn.  fp used to hold a
    borrowed pointer to its connection, leaving a window where the
    conn could be freed while a still-live fp held a stale
    fp->conn.  Make fp own a refcount on fp->conn from
    ksmbd_open_fd() / ksmbd_reopen_durable_fd() until __ksmbd_close_fd()
    or session_fd_check().  ksmbd_reopen_durable_fd() is also
    reordered so fp->conn / fp->tcon are set before __open_id()
    publishes fp into the session's file table, so a concurrent
    teardown cannot observe a valid volatile_id with fp->conn ==
    NULL.

The two patches are independent.  Either can be reviewed, applied,
or reverted without pulling the other.  Patch 2 references patch 1
only to explain why stop_sessions() is left on its open-coded local
cleanup and not converted to ksmbd_conn_put() in this series; if
patch 2 is applied alone, stop_sessions() retains its bare-kfree
leak that patch 1 separately addresses.

Based on ksmbd-for-next-next.

DaeMyung Kang (2):
  ksmbd: rewrite stop_sessions() with restartable iteration
  ksmbd: centralize ksmbd_conn final release to plug transport leak

 fs/smb/server/connection.c | 120 ++++++++++++++++++-----
 fs/smb/server/connection.h |   6 ++
 fs/smb/server/oplock.c     |   4 +-
 fs/smb/server/server.c     |  12 +++
 fs/smb/server/vfs_cache.c  | 189 ++++++++++++++++++++++++++++++++-----
 5 files changed, 285 insertions(+), 46 deletions(-)

--
2.43.0

next             reply	other threads:[~2026-04-25  9:38 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-25  9:38 DaeMyung Kang [this message]
2026-04-25  9:38 ` [PATCH 1/2] ksmbd: rewrite stop_sessions() with restartable iteration DaeMyung Kang
2026-04-26 23:20   ` Namjae Jeon
2026-04-25  9:38 ` [PATCH 2/2] ksmbd: centralize ksmbd_conn final release to plug transport leak DaeMyung Kang
2026-04-27  1:06   ` Namjae Jeon
2026-04-27  1:32     ` CharSyam

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260425093829.4004785-1-charsyam@gmail.com \
    --to=charsyam@gmail.com \
    --cc=hyc.lee@gmail.com \
    --cc=linkinjeon@kernel.org \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lsahlber@redhat.com \
    --cc=senozhatsky@chromium.org \
    --cc=smfrench@gmail.com \
    --cc=stable@vger.kernel.org \
    --cc=tom@talpey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox