From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A50DF38837F; Tue, 16 Jun 2026 15:33:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781623990; cv=none; b=ekGzyhvzavMA8J9BZOqKWdLY4iRc5yv9I82/r6civttxeiUQ+YD4ENNqIJe2V9bsMZrA8OSJsEdy25MR3vT9M8b6hBiegVwDa6LI7kq+/g1hQAVA2rKc1ASxTUwg+hprCMnG5AQ5lbkyBn74aT5xsDESuPCXVa5XxAHMoyffVTw= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781623990; c=relaxed/simple; bh=RFcvoIRG1Zw2iKW0aAkWKWwetPauxZgxBhuySCFquL0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ERHbqF4R6bP8EBNMkqTNv0RYzjS4W0nKHdchezpd07E2BPUZlVpcugvAQJ3C5greVfcJvsfN1pJ4CtZe4rvFluV4Yf0GNJiWaOt9oP6WcgAcom86KNKYWQNclJVeUh432kR4NitfVf2AHuvdG9JbZI8LpoTk4qYdtu2svHRc/C4= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=KpNLrB5D; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="KpNLrB5D" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 75EB71F00A3A; Tue, 16 Jun 2026 15:33:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxfoundation.org; s=korg; t=1781623989; bh=kifDaaxUwtG0qR1ht/WMHE+abYumwGWkWxA1CU+J+Jo=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=KpNLrB5D+dUgpO+qKbUd43Rrj/U1Ph5gg+AVjcY5GTqEQcBMZUf7vinHB3Es8/L+q RzAcudG/bmhux6BAzufpcL48/6k7YbySQPuasnr/Z0Xfj4FT/Mn1GepYVkW9uW5ePr i8S7MD9iZpNDk1B2v6yEhK3/NXCtUjRK1RidTZsQ= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Jann Horn , "Christian Brauner (Amutable)" Subject: [PATCH 7.0 252/378] fhandle: fix UAF due to unlocked ->mnt_ns read in may_decode_fh() Date: Tue, 16 Jun 2026 20:28:03 +0530 Message-ID: <20260616145123.412088602@linuxfoundation.org> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260616145109.744539446@linuxfoundation.org> References: <20260616145109.744539446@linuxfoundation.org> User-Agent: quilt/0.69 X-stable: review X-Patchwork-Hint: ignore Precedence: bulk X-Mailing-List: patches@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 7.0-stable review patch. If anyone has any objections, please let me know. ------------------ From: Jann Horn commit 40ab6644b99685755f740b872c00ef40d9aa870e upstream. may_decode_fh() accesses mount::mnt_ns without holding any locks; that means the mount can concurrently be unmounted, and the mnt_namespace can concurrently be freed after an RCU grace period. This race can happens as follows, assuming that the mount point was created by open_tree(..., OPEN_TREE_CLONE): thread 1 thread 2 RCU __do_sys_open_by_handle_at do_handle_open handle_to_path may_decode_fh is_mounted [mount::mnt_ns access] [mount::mnt_ns access] __do_sys_close fput_close_sync __fput dissolve_on_fput umount_tree class_namespace_excl_destructor namespace_unlock free_mnt_ns mnt_ns_tree_remove call_rcu(mnt_ns_release_rcu) mnt_ns_release_rcu mnt_ns_release kfree [mnt_namespace::user_ns access] **UAF** Fix it by taking rcu_read_lock() around the mount::mnt_ns access, like in __prepend_path(). Additionally, document the semantics of mount::mnt_ns, and use WRITE_ONCE() for writers that can race with lockless readers. This bug is unreachable unless one of the following is set: - CONFIG_PREEMPTION - CONFIG_RCU_STRICT_GRACE_PERIOD because it requires an RCU grace period to happen during a syscall without an explicit preemption. This doesn't seem to have interesting security impact; worst-case, it could leak the result of an integer comparison to userspace (from the level check in cap_capable()), cause an endless loop, or crash the kernel by dereferencing an invalid address. Fixes: 620c266f3949 ("fhandle: relax open_by_handle_at() permission checks") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn Link: https://patch.msgid.link/20260603-vfs-fhandle-uaf-fix-v2-1-d05db76a5084@google.com Signed-off-by: Christian Brauner (Amutable) Signed-off-by: Greg Kroah-Hartman --- fs/fhandle.c | 16 ++++++++++++++-- fs/mount.h | 10 +++++++++- fs/namespace.c | 6 +++--- 3 files changed, 26 insertions(+), 6 deletions(-) --- a/fs/fhandle.c +++ b/fs/fhandle.c @@ -285,6 +285,19 @@ static int do_handle_to_path(struct file return 0; } +static bool capable_wrt_mount(struct mount *mount) +{ + struct mnt_namespace *mnt_ns; + + /* + * For ->mnt_ns access. + * The following READ_ONCE() is semantically rcu_dereference(). + */ + guard(rcu)(); + mnt_ns = READ_ONCE(mount->mnt_ns); + return ns_capable(mnt_ns->user_ns, CAP_SYS_ADMIN); +} + static inline int may_decode_fh(struct handle_to_path_ctx *ctx, unsigned int o_flags) { @@ -320,8 +333,7 @@ static inline int may_decode_fh(struct h if (ns_capable(root->mnt->mnt_sb->s_user_ns, CAP_SYS_ADMIN)) ctx->flags = HANDLE_CHECK_PERMS; else if (is_mounted(root->mnt) && - ns_capable(real_mount(root->mnt)->mnt_ns->user_ns, - CAP_SYS_ADMIN) && + capable_wrt_mount(real_mount(root->mnt)) && !has_locked_children(real_mount(root->mnt), root->dentry)) ctx->flags = HANDLE_CHECK_PERMS | HANDLE_CHECK_SUBTREE; else --- a/fs/mount.h +++ b/fs/mount.h @@ -71,7 +71,15 @@ struct mount { struct hlist_head mnt_slave_list;/* list of slave mounts */ struct hlist_node mnt_slave; /* slave list entry */ struct mount *mnt_master; /* slave is on master->mnt_slave_list */ - struct mnt_namespace *mnt_ns; /* containing namespace */ + /* + * Containing namespace (active or deactivating, non-refcounted). + * Normally protected by namespace_sem. + * Can also be accessed locklessly under RCU. RCU readers can't rely on + * the namespace still being active, but implicitly hold a passive + * reference (because an RCU delay happens between a namespace being + * deactivated and the corresponding passive refcount drop). + */ + struct mnt_namespace *mnt_ns; struct mountpoint *mnt_mp; /* where is it mounted */ union { struct hlist_node mnt_mp_list; /* list mounts with the same mountpoint */ --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1079,7 +1079,7 @@ static void mnt_add_to_ns(struct mnt_nam bool mnt_first_node = true, mnt_last_node = true; WARN_ON(mnt_ns_attached(mnt)); - mnt->mnt_ns = ns; + WRITE_ONCE(mnt->mnt_ns, ns); while (*link) { parent = *link; if (mnt->mnt_id_unique < node_to_mount(parent)->mnt_id_unique) { @@ -1434,7 +1434,7 @@ EXPORT_SYMBOL(mntget); void mnt_make_shortterm(struct vfsmount *mnt) { if (mnt) - real_mount(mnt)->mnt_ns = NULL; + WRITE_ONCE(real_mount(mnt)->mnt_ns, NULL); } /** @@ -1806,7 +1806,7 @@ static void umount_tree(struct mount *mn ns->nr_mounts--; __touch_mnt_namespace(ns); } - p->mnt_ns = NULL; + WRITE_ONCE(p->mnt_ns, NULL); if (how & UMOUNT_SYNC) p->mnt.mnt_flags |= MNT_SYNC_UMOUNT;