From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pf1-f182.google.com (mail-pf1-f182.google.com [209.85.210.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B5F6A42EEDF for ; Tue, 28 Apr 2026 14:09:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.182 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777385350; cv=none; b=dgHKeCdohA8VBH6HrhDLpaMh3KmGXT79eOQPYlcUnaAQjKUumz++iQk0Ex2Ul3E8PZaVo2XGwYJvg9rrdSt9p7+ba6UPDK0f16p9Cq3eDri7sFxvnQThI+2qt8zktcQrjUUzfaP6m+cqBEqP8qeSsvSyy7kLIWOu51vsXguuJdI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777385350; c=relaxed/simple; bh=7Y75CJc2/d6aHB2cQnB6UNXLLb91D24toWSFpWWyhQc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=WjRwJ3APqLWoF4ymaVGJ1Cm0cPVWH7o7xQ5dPFE04ZLWVO+cRxN0BSFZRoPy6VYnt3EqHJWUHyuua6FPB928RBHbLn8nSNdm52JxsZuyXzn4DVZbi7rFM7j2TrWttGzZXpeFSfqV4U+wr5K5nv/T1327v6+sk0QLsDyC09csII0= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=jFUrSbDg; arc=none smtp.client-ip=209.85.210.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="jFUrSbDg" Received: by mail-pf1-f182.google.com with SMTP id d2e1a72fcca58-826c4c6d95dso672933b3a.0 for ; Tue, 28 Apr 2026 07:09:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1777385347; x=1777990147; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=xlWHHqWJasISzgKRTEELiwHwHgmZ46NRjOQ3vo3IPy4=; b=jFUrSbDg/4oC2PkH4sl2GhXb7QY8/XCc0gAnSxpwC1ofvASx9wclP7c9dbcJ1w1goR zle6Uj5MGCAdyJZgOp+RUjObCE0EcJ6VHzHNhUbjw3nVnsA1UVVhiYlmVdhveF2G5bon cVVj8he/FVRD58nl+xOaGKJg/niRl7gsNpxaxH+g1nY8uxcqJ12l+118jIc/qZQC0sTG 3KBXWZOEl1HAj7akx3uXT8+lxUrfOXP9Uq7KfQ7s9Ou6z1g0n+uGjoi3FynXCTiL1+oF gw8SYTxM5a4BwCtgRPg8FPjiqPQHGYQng79zzJlWyWC+cQ3OmCOpEp7yiTi/wOkrPZEI +iHA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1777385347; x=1777990147; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=xlWHHqWJasISzgKRTEELiwHwHgmZ46NRjOQ3vo3IPy4=; b=lBAerj7ClNtFYs66xKzdUBG3CY7sT8PyZURiZxwtnDi7qVdjTLO37hGILafMDb8yPv +h8IToXkUydh/tgULxC4qP6ol9upVoN1cFIE5dy9802kH70dJ6fkuar/SuVe1B63bI5Q aBeFzS7y5j6aM07sHQzNSwcjBMHY9xn8I7cgeqcrp5tOodjSV07P/W+0dJp2EtOIuczm x4Mpm1kaOLZjMHdnY0GRjnpE71PNZpYYYipx/C2/1vYHhPIirynkZYwjVR6b5HqubGZR urpKEKH9UkbIRgx6T0ejuWJJHlfRFO9FvGOlmkOihTWhKDqKbL5Rc7W0PrzdKcGDFtRI 3RLw== X-Forwarded-Encrypted: i=1; AFNElJ+w87JHFeWai4815UFDDI/YJW1Zy8nR4X+75VElxf6FKFNUaFpJnu8FMckodtocvDKyWqRV9M6WBxsAbLY=@vger.kernel.org X-Gm-Message-State: AOJu0YyeWLyRFuQ8b8vF6hFrSoUNUzaqv6HPo5epfC3kijnHWVW/+LdB +BGPl4vbgE5Fv6sZ9SVnVvFXy2HOmmBCEwfx+kYqk5xROz7/UnC00yj6 X-Gm-Gg: AeBDieuuU9ZBWD5/GRZFd9hWh5s64E22wZOiP9p6ikttUW9L6Gvw9E1A4+4UMuxrP1R ifT3qEj1c07PkN1HCme+fyKpW/edEZvLb0/3u//yPbXNtICFJcC8XsmFQ1ozhksp9xIRvT/RifM 6kDOvkn06sdjdMDtb5z353QOfBhrbK8WUqvJCbKnL+K9RPNo9zVFqeoa8WWoglUeG3dLStpvR4d 9YlcVidfCDft191xRehppH9mGjPS2VK8Z5K6VAKyia2UnrDbpb5sDcg78QcGHROvxjYgI9EuUXa swqhWPuRPZCzqZJfxiVcsrrrEJHT20vcDkq3ClIUgQwYv60NaEc+W0V5rZ2WysS3DNeKOm1T4Ja WokTzAoBF94omGB5V10E4TjzqwYpRXpItZ9e7ZbTk48f75O/d+N6KjQ+7YLA7yeHVefogoQpYec ngGAEFwEZb1Cdag3xBq8KnKtolBW0= X-Received: by 2002:a05:6a21:10e:b0:3a2:c72c:745d with SMTP id adf61e73a8af0-3a398db06e1mr2710162637.4.1777385346798; Tue, 28 Apr 2026 07:09:06 -0700 (PDT) Received: from ser8.. ([221.156.231.192]) by smtp.gmail.com with ESMTPSA id 41be03b00d2f7-c7fcade0f56sm2190526a12.21.2026.04.28.07.09.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 28 Apr 2026 07:09:06 -0700 (PDT) From: DaeMyung Kang To: Namjae Jeon , Steve French Cc: Sergey Senozhatsky , Tom Talpey , Hyunchul Lee , Ronnie Sahlberg , linux-cifs@vger.kernel.org, linux-kernel@vger.kernel.org, DaeMyung Kang Subject: [PATCH v2 2/3] ksmbd: harden file lifetime during session teardown Date: Tue, 28 Apr 2026 23:08:55 +0900 Message-ID: <20260428140856.941847-3-charsyam@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20260428140856.941847-1-charsyam@gmail.com> References: <20260428140856.941847-1-charsyam@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit __close_file_table_ids() is the per-session teardown that closes every fp belonging to a session (or to one tree connect on that session) by walking the session's volatile-id idr. The current loop has three related problems on busy or racing workloads: * Sleeping under ft->lock. The session-teardown skip callback, session_fd_check(), already sleeps in ksmbd_vfs_copy_durable_owner() -> kstrdup(GFP_KERNEL) and down_write(&fp->f_ci->m_lock) (a rw_semaphore). Running the callback inside write_lock(&ft->lock) trips CONFIG_DEBUG_ATOMIC_SLEEP / CONFIG_PROVE_LOCKING on a durable-fd workload. * Refcount accounting blind to f_state. The unconditional atomic_dec_and_test(&fp->refcount) does not distinguish FP_INITED (idr-owned reference still intact) from FP_CLOSED (an earlier ksmbd_close_fd() already consumed the idr-owned reference while leaving fp in the idr because a holder kept refcount non-zero). When the latter races with teardown the same path over-decrements into a holder reference and ksmbd_fd_put() later UAFs that holder. * FP_NEW window. Between __open_id() publishing fp into the session idr and ksmbd_update_fstate(..., FP_INITED) committing the transition at the end of smb2_open(), an fp is in FP_NEW and an intervening teardown that takes a transient reference and unpublishes the volatile id leaves the original idr-owned reference orphaned -- the opener is unaware that fp has been unpublished, returns success to the client, and the fp leaks at refcount = 1. Refactor __close_file_table_ids() to take a transient reference on fp and unpublish fp from the session idr *under ft->lock* before calling skip() outside the lock. A transient ref protects lifetime but not concurrent field mutation, so the idr_remove() is what keeps __ksmbd_lookup_fd() through this session's idr from granting a new ksmbd_fp_get() reference to an fp whose fp->conn / fp->tcon / fp->volatile_id / op->conn / lock_list links are about to be rewritten by session_fd_check(). Durable reconnect is unaffected because it reaches fp through the global durable table (ksmbd_lookup_durable_fd -> global_ft). Decide n_to_drop together with any FP_INITED -> FP_CLOSED transition under ft->lock so teardown and ksmbd_close_fd() never both consume the idr-owned reference. See ksmbd_mark_fp_closed() for the per-state accounting. For the FP_NEW path to be safe, the opener has to learn that fp was unpublished: ksmbd_update_fstate() now returns -ENOENT when an FP_NEW -> FP_INITED transition finds f_state already advanced or the volatile id cleared (both committed by teardown under ft->lock); smb2_open() propagates that as STATUS_OBJECT_NAME_INVALID and drops the original reference via ksmbd_fd_put(). The list removal cannot be left for a deferred final putter because fp->volatile_id has already been cleared and __ksmbd_remove_fd() will intentionally skip both idr_remove() and list_del_init(). Move the m_fp_list unlink in __ksmbd_remove_fd() above the volatile-id check so that an FP_NEW fp that happened to be added to m_fp_list (smb2_open() adds fp->node before ksmbd_update_fstate() runs) is still cleaned up on the deferred putter path; list_del_init() on an empty node is a no-op and remains safe for fps that were never added. Add a defensive guard in session_fd_check() that refuses non-FP_INITED fps so that even if a teardown reaches an FP_NEW fp it falls into the close branch (where the n_to_drop = 1 accounting keeps the opener's reference alive) instead of the durable-preserve branch (which mutates fp->conn / fp->tcon). Validation on a debug kernel additionally built with CONFIG_DEBUG_LIST and CONFIG_DEBUG_OBJECTS_WORK used a same-session two-tcon workload (open/write storm on one tcon, 50 tree disconnects on the other) and reported no list-corruption, work_struct ODEBUG, sleep-in-atomic, lockdep or kmemleak reports. Reverting only the __close_file_table_ids() hunk while keeping a forced-is_reconnectable() harness produced the expected sleep-in-atomic at vfs_cache.c:1095, confirming the ft->lock-out-of-sleepable-skip discipline. KASAN-enabled direct SMB2 coverage with durable handles enabled exercised ksmbd_close_tree_conn_fds(), ksmbd_close_session_fds(), the FP_NEW failure path, tree_conn_fd_check(), and a non-zero session_fd_check() durable-preserve return. This produced no KASAN, DEBUG_LIST, ODEBUG, or WARNING reports. Fixes: f44158485826 ("cifsd: add file operations") Signed-off-by: DaeMyung Kang --- fs/smb/server/smb2pdu.c | 6 +- fs/smb/server/vfs_cache.c | 179 +++++++++++++++++++++++++++++++++----- fs/smb/server/vfs_cache.h | 4 +- 3 files changed, 164 insertions(+), 25 deletions(-) diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index 21825a69c29a..c1240ca14253 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -3767,8 +3767,10 @@ int smb2_open(struct ksmbd_work *work) err_out2: if (!rc) { - ksmbd_update_fstate(&work->sess->file_table, fp, FP_INITED); - rc = ksmbd_iov_pin_rsp(work, (void *)rsp, iov_len); + rc = ksmbd_update_fstate(&work->sess->file_table, fp, + FP_INITED); + if (!rc) + rc = ksmbd_iov_pin_rsp(work, (void *)rsp, iov_len); } if (rc) { if (rc == -EINVAL) diff --git a/fs/smb/server/vfs_cache.c b/fs/smb/server/vfs_cache.c index 4a18107937cc..dc4037ef1834 100644 --- a/fs/smb/server/vfs_cache.c +++ b/fs/smb/server/vfs_cache.c @@ -431,13 +431,13 @@ static void ksmbd_remove_durable_fd(struct ksmbd_file *fp) static void __ksmbd_remove_fd(struct ksmbd_file_table *ft, struct ksmbd_file *fp) { - if (!has_file_id(fp->volatile_id)) - return; - down_write(&fp->f_ci->m_lock); list_del_init(&fp->node); up_write(&fp->f_ci->m_lock); + if (!has_file_id(fp->volatile_id)) + return; + write_lock(&ft->lock); idr_remove(ft->idr, fp->volatile_id); write_unlock(&ft->lock); @@ -798,15 +798,58 @@ struct ksmbd_file *ksmbd_open_fd(struct ksmbd_work *work, struct file *filp) return ERR_PTR(ret); } -void ksmbd_update_fstate(struct ksmbd_file_table *ft, struct ksmbd_file *fp, - unsigned int state) +/** + * ksmbd_update_fstate() - update an fp state under the file-table lock + * @ft: file table that publishes @fp's volatile id + * @fp: file pointer to update + * @state: new state + * + * Return: 0 on success. The FP_NEW -> FP_INITED transition is special: + * -ENOENT if teardown already unpublished @fp by advancing the state or + * clearing the volatile id. Other state updates preserve the historical + * fire-and-forget behavior. + */ +int ksmbd_update_fstate(struct ksmbd_file_table *ft, struct ksmbd_file *fp, + unsigned int state) { + int ret; + if (!fp) - return; + return -ENOENT; write_lock(&ft->lock); - fp->f_state = state; + if (state == FP_INITED && + (fp->f_state != FP_NEW || !has_file_id(fp->volatile_id))) { + ret = -ENOENT; + } else { + fp->f_state = state; + ret = 0; + } write_unlock(&ft->lock); + + return ret; +} + +/* + * ksmbd_mark_fp_closed() - mark fp closed under ft->lock and return how many + * refs the teardown path owns. + * + * FP_INITED has a normal idr-owned reference, so teardown owns both that + * reference and the transient lookup reference. FP_NEW is still owned by the + * in-flight opener/reopener, which will drop the original reference after + * ksmbd_update_fstate(..., FP_INITED) observes the cleared volatile id. + * FP_CLOSED on entry means an earlier ksmbd_close_fd() already consumed the + * idr-owned ref. + */ +static int ksmbd_mark_fp_closed(struct ksmbd_file *fp) +{ + if (fp->f_state == FP_INITED) { + set_close_state_blocked_works(fp); + fp->f_state = FP_CLOSED; + return 2; + } + + return 1; } static int @@ -814,7 +857,8 @@ __close_file_table_ids(struct ksmbd_session *sess, struct ksmbd_tree_connect *tcon, bool (*skip)(struct ksmbd_tree_connect *tcon, struct ksmbd_file *fp, - struct ksmbd_user *user)) + struct ksmbd_user *user), + bool skip_preserves_fp) { struct ksmbd_file_table *ft = &sess->file_table; struct ksmbd_file *fp; @@ -822,32 +866,120 @@ __close_file_table_ids(struct ksmbd_session *sess, int num = 0; while (1) { + int n_to_drop; + write_lock(&ft->lock); fp = idr_get_next(ft->idr, &id); if (!fp) { write_unlock(&ft->lock); break; } - - if (skip(tcon, fp, sess->user) || - !atomic_dec_and_test(&fp->refcount)) { + if (!atomic_inc_not_zero(&fp->refcount)) { id++; write_unlock(&ft->lock); continue; } - set_close_state_blocked_works(fp); - idr_remove(ft->idr, fp->volatile_id); - fp->volatile_id = KSMBD_NO_FID; - write_unlock(&ft->lock); + if (skip_preserves_fp) { + /* + * Session teardown: skip() is session_fd_check(), + * which may sleep and mutates fp->conn / fp->tcon / + * fp->volatile_id when it chooses to preserve fp + * for durable reconnect. Unpublish fp from the + * session idr here, under ft->lock, so that + * __ksmbd_lookup_fd() through this session cannot + * grant a new ksmbd_fp_get() reference to an fp + * whose fields are about to be rewritten outside + * the lock. Durable reconnect still reaches fp via + * global_ft. + */ + idr_remove(ft->idr, id); + fp->volatile_id = KSMBD_NO_FID; + write_unlock(&ft->lock); + + if (skip(tcon, fp, sess->user)) { + /* + * session_fd_check() has converted fp to + * durable-preserve state and cleared its + * per-conn fields. fp is already unpublished + * above; the original idr-owned ref keeps it + * alive for the durable scavenger. Drop only + * the transient ref. atomic_dec() is safe -- + * atomic_inc_not_zero() succeeded on a + * positive value and we added one more, so + * refcount cannot be zero here. + */ + atomic_dec(&fp->refcount); + id++; + continue; + } + + /* + * Keep the close-state decision under the same lock + * observed by ksmbd_update_fstate(), which is how an + * in-flight FP_NEW opener learns that teardown has + * cleared its volatile id. + */ + write_lock(&ft->lock); + n_to_drop = ksmbd_mark_fp_closed(fp); + write_unlock(&ft->lock); + } else { + /* + * Tree teardown: skip() is tree_conn_fd_check(), a + * cheap pointer compare that doesn't sleep and has + * no side effects, so keep the skip decision plus + * the unpublish-and-mark-closed sequence atomic + * under ft->lock. fps belonging to other tree + * connects (skip() == true) stay fully published in + * the session idr with no lock window. + */ + if (skip(tcon, fp, sess->user)) { + atomic_dec(&fp->refcount); + write_unlock(&ft->lock); + id++; + continue; + } + idr_remove(ft->idr, id); + fp->volatile_id = KSMBD_NO_FID; + n_to_drop = ksmbd_mark_fp_closed(fp); + write_unlock(&ft->lock); + } + /* + * fp->volatile_id is already cleared to prevent stale idr + * removal from a deferred final close. Remove fp from + * m_fp_list here because __ksmbd_remove_fd() will skip the + * list unlink when volatile_id is KSMBD_NO_FID. + */ down_write(&fp->f_ci->m_lock); list_del_init(&fp->node); up_write(&fp->f_ci->m_lock); - __ksmbd_close_fd(ft, fp); - - num++; + /* + * Drop the references this iteration owns: + * + * n_to_drop == 2: we observed FP_INITED and committed + * the FP_CLOSED transition ourselves, so we own the + * transient (+1) and the still-intact idr-owned ref. + * + * n_to_drop == 1: either a prior ksmbd_close_fd() + * already consumed the idr-owned ref, or fp was still + * FP_NEW and the in-flight opener/reopener must keep + * the original reference until ksmbd_update_fstate() + * observes the cleared volatile id. + * + * If we end up as the final putter, finalize fp and + * account the open_files_count decrement via the caller's + * atomic_sub(num, ...). Otherwise the remaining user's + * ksmbd_fd_put() reaches __put_fd_final(), which does its + * own atomic_dec(&open_files_count), so we must not count + * this fp here -- doing so would double-decrement the + * connection-wide counter. + */ + if (atomic_sub_and_test(n_to_drop, &fp->refcount)) { + __ksmbd_close_fd(NULL, fp); + num++; + } id++; } @@ -1082,6 +1214,9 @@ static bool session_fd_check(struct ksmbd_tree_connect *tcon, if (!is_reconnectable(fp)) return false; + if (fp->f_state != FP_INITED) + return false; + if (WARN_ON_ONCE(!fp->conn)) return false; @@ -1127,7 +1262,8 @@ void ksmbd_close_tree_conn_fds(struct ksmbd_work *work) { int num = __close_file_table_ids(work->sess, work->tcon, - tree_conn_fd_check); + tree_conn_fd_check, + false); atomic_sub(num, &work->conn->stats.open_files_count); } @@ -1136,7 +1272,8 @@ void ksmbd_close_session_fds(struct ksmbd_work *work) { int num = __close_file_table_ids(work->sess, work->tcon, - session_fd_check); + session_fd_check, + true); atomic_sub(num, &work->conn->stats.open_files_count); } @@ -1268,7 +1405,7 @@ void ksmbd_destroy_file_table(struct ksmbd_session *sess) if (!ft->idr) return; - __close_file_table_ids(sess, NULL, session_fd_check); + __close_file_table_ids(sess, NULL, session_fd_check, true); idr_destroy(ft->idr); kfree(ft->idr); ft->idr = NULL; diff --git a/fs/smb/server/vfs_cache.h b/fs/smb/server/vfs_cache.h index 866f32c10d4d..e6871266a94b 100644 --- a/fs/smb/server/vfs_cache.h +++ b/fs/smb/server/vfs_cache.h @@ -172,8 +172,8 @@ int ksmbd_close_inode_fds(struct ksmbd_work *work, struct inode *inode); int ksmbd_init_global_file_table(void); void ksmbd_free_global_file_table(void); void ksmbd_set_fd_limit(unsigned long limit); -void ksmbd_update_fstate(struct ksmbd_file_table *ft, struct ksmbd_file *fp, - unsigned int state); +int ksmbd_update_fstate(struct ksmbd_file_table *ft, struct ksmbd_file *fp, + unsigned int state); bool ksmbd_vfs_compare_durable_owner(struct ksmbd_file *fp, struct ksmbd_user *user); -- 2.43.0