From: Steven Rostedt <rostedt@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Andrew Morton <akpm@linux-foundation.org>,
stable@vger.kernel.org, David Carlier <devnexen@gmail.com>
Subject: [for-linus][PATCH 2/2] eventfs: Hold eventfs_mutex and SRCU when remount walks events
Date: Mon, 20 Apr 2026 14:11:53 -0400 [thread overview]
Message-ID: <20260420181220.362882530@kernel.org> (raw)
In-Reply-To: 20260420181151.771404678@kernel.org
From: David Carlier <devnexen@gmail.com>
Commit 340f0c7067a9 ("eventfs: Update all the eventfs_inodes from the
events descriptor") had eventfs_set_attrs() recurse through ei->children
on remount. The walk only holds the rcu_read_lock() taken by
tracefs_apply_options() over tracefs_inodes, which is wrong:
- list_for_each_entry over ei->children races with the list_del_rcu()
in eventfs_remove_rec() -- LIST_POISON1 deref, same shape as
d2603279c7d6.
- eventfs_inodes are freed via call_srcu(&eventfs_srcu, ...).
rcu_read_lock() does not extend an SRCU grace period, so ti->private
can be reclaimed under the walk.
- The writes to ei->attr race with eventfs_set_attr(), which holds
eventfs_mutex.
Reproducer:
while :; do mount -o remount,uid=$((RANDOM%1000)) /sys/kernel/tracing; done &
while :; do
echo "p:kp submit_bio" > /sys/kernel/tracing/kprobe_events
echo > /sys/kernel/tracing/kprobe_events
done
Wrap the events portion of tracefs_apply_options() in
eventfs_remount_lock()/_unlock() that take eventfs_mutex and
srcu_read_lock(&eventfs_srcu). eventfs_set_attrs() doesn't sleep so the
nested rcu_read_lock() is fine; lockdep_assert_held() pins the contract.
Comment in tracefs_drop_inode() said "RCU cycle" -- it is SRCU.
Fixes: 340f0c7067a9 ("eventfs: Update all the eventfs_inodes from the events descriptor")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260418191737.10289-1-devnexen@gmail.com
Signed-off-by: David Carlier <devnexen@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
fs/tracefs/event_inode.c | 14 ++++++++++++++
fs/tracefs/inode.c | 5 ++++-
fs/tracefs/internal.h | 3 +++
3 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 8dd554508828..26b6453de30e 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -244,6 +244,8 @@ static void eventfs_set_attrs(struct eventfs_inode *ei, bool update_uid, kuid_t
{
struct eventfs_inode *ei_child;
+ lockdep_assert_held(&eventfs_mutex);
+
/* Update events/<system>/<event> */
if (WARN_ON_ONCE(level > 3))
return;
@@ -886,3 +888,15 @@ void eventfs_remove_events_dir(struct eventfs_inode *ei)
d_invalidate(dentry);
d_make_discardable(dentry);
}
+
+int eventfs_remount_lock(void)
+{
+ mutex_lock(&eventfs_mutex);
+ return srcu_read_lock(&eventfs_srcu);
+}
+
+void eventfs_remount_unlock(int srcu_idx)
+{
+ srcu_read_unlock(&eventfs_srcu, srcu_idx);
+ mutex_unlock(&eventfs_mutex);
+}
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 5602baf980f6..1e8a78c5e996 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -313,6 +313,7 @@ static int tracefs_apply_options(struct super_block *sb, bool remount)
struct inode *inode = d_inode(sb->s_root);
struct tracefs_inode *ti;
bool update_uid, update_gid;
+ int srcu_idx;
umode_t tmp_mode;
/*
@@ -337,6 +338,7 @@ static int tracefs_apply_options(struct super_block *sb, bool remount)
update_uid = fsi->opts & BIT(Opt_uid);
update_gid = fsi->opts & BIT(Opt_gid);
+ srcu_idx = eventfs_remount_lock();
rcu_read_lock();
list_for_each_entry_rcu(ti, &tracefs_inodes, list) {
if (update_uid) {
@@ -358,6 +360,7 @@ static int tracefs_apply_options(struct super_block *sb, bool remount)
eventfs_remount(ti, update_uid, update_gid);
}
rcu_read_unlock();
+ eventfs_remount_unlock(srcu_idx);
}
return 0;
@@ -403,7 +406,7 @@ static int tracefs_drop_inode(struct inode *inode)
* This inode is being freed and cannot be used for
* eventfs. Clear the flag so that it doesn't call into
* eventfs during the remount flag updates. The eventfs_inode
- * gets freed after an RCU cycle, so the content will still
+ * gets freed after an SRCU cycle, so the content will still
* be safe if the iteration is going on now.
*/
ti->flags &= ~TRACEFS_EVENT_INODE;
diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
index d83c2a25f288..a4a7f8431aff 100644
--- a/fs/tracefs/internal.h
+++ b/fs/tracefs/internal.h
@@ -76,4 +76,7 @@ struct inode *tracefs_get_inode(struct super_block *sb);
void eventfs_remount(struct tracefs_inode *ti, bool update_uid, bool update_gid);
void eventfs_d_release(struct dentry *dentry);
+int eventfs_remount_lock(void);
+void eventfs_remount_unlock(int srcu_idx);
+
#endif /* _TRACEFS_INTERNAL_H */
--
2.51.0
prev parent reply other threads:[~2026-04-20 18:10 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260420181151.771404678@kernel.org>
2026-04-20 18:11 ` [for-linus][PATCH 1/2] eventfs: Use list_add_tail_rcu() for SRCU-protected children list Steven Rostedt
2026-04-20 18:11 ` Steven Rostedt [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260420181220.362882530@kernel.org \
--to=rostedt@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=devnexen@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.