[PATCH v1] ext4: fix use-after-free in update_super_work when racing with umount

From: Jiayuan Chen <jiayuan.chen@linux.dev>
To: linux-ext4@vger.kernel.org
Cc: Jiayuan Chen <jiayuan.chen@linux.dev>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Jan Kara <jack@suse.cz>, Ritesh Harjani <riteshh@linux.ibm.com>,
	Ye Bin <yebin10@huawei.com>,
	linux-kernel@vger.kernel.org
Subject: [PATCH v1] ext4: fix use-after-free in update_super_work when racing with umount
Date: Fri, 13 Mar 2026 14:52:04 +0800
Message-ID: <20260313065206.152645-1-jiayuan.chen@linux.dev>

Commit b98535d09179 ("ext4: fix bug_on in start_this_handle during umount filesystem")
moved ext4_unregister_sysfs() before the flush of s_sb_upd_work, so that
reads of /proc/fs/ext4/xx/mb_groups during unmount can no longer queue new
error work. However, this introduced a use-after-free: update_super_work
calls ext4_notify_error_sysfs() -> sysfs_notify(), which accesses the
kobject's kernfs_node after it has been freed:

  update_super_work                ext4_put_super
  -----------------                --------------
                                   ext4_unregister_sysfs(sb)
                                     kobject_del(&sbi->s_kobj)
                                       __kobject_del()
                                         sysfs_remove_dir()
                                           kobj->sd = NULL
                                         sysfs_put(sd)
                                           kernfs_put()  // RCU free
  ext4_notify_error_sysfs(sbi)
    sysfs_notify(&sbi->s_kobj)
      kn = kobj->sd              // stale pointer
      kernfs_get(kn)             // UAF on freed kernfs_node
                                   ext4_journal_destroy()
                                     flush_work(&sbi->s_sb_upd_work)

The blamed commit only needed procfs removal to happen before the work
flush, to prevent /proc/fs/ext4/xx/mb_groups reads from queuing new error
work. But because ext4_unregister_sysfs() bundles procfs removal and
kobject_del() together, the sysfs kobject is now torn down too early as
well.

The correct teardown ordering has three constraints:

  1. procfs removal must happen before flushing s_sb_upd_work, to prevent
     /proc reads from queuing new error work, which would trigger the
     BUG_ON in start_this_handle().
  2. sysfs kobject removal must happen after flushing s_sb_upd_work, since
     the work calls sysfs_notify() which accesses the kernfs_node.
  3. sysfs kobject removal must happen before jbd2_journal_destroy(), since
     userspace could read the journal_task sysfs attribute and dereference
     j_task after the journal thread has been killed.

Fix this by:
  - Adding ext4_sb_release_proc() to remove procfs entries early.
  - Splitting ext4_journal_destroy() into ext4_journal_stop_work() and
    ext4_journal_finish(), so that ext4_unregister_sysfs() can be placed
    between them to satisfy all three ordering constraints.
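
As a rough illustration only (not part of the patch), the three constraints
can be modelled in userspace C. The struct, enum and function names below
(teardown_model, model_step, model_run) are hypothetical; each flag merely
stands in for one piece of kernel state, and each step for one of
ext4_sb_release_proc(), ext4_journal_stop_work(), ext4_unregister_sysfs()
and ext4_journal_finish().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one flag per piece of state the constraints refer to. */
struct teardown_model {
	bool proc_present;	/* /proc/fs/ext4/<dev> entries still exist */
	bool work_flushed;	/* s_sb_upd_work has been flushed */
	bool kobj_present;	/* sysfs kobject is still registered */
	bool journal_alive;	/* jbd2 journal has not been destroyed yet */
};

/* One step per helper named in the commit message. */
enum step {
	RELEASE_PROC,		/* models ext4_sb_release_proc() */
	STOP_WORK,		/* models ext4_journal_stop_work() */
	UNREGISTER_SYSFS,	/* models ext4_unregister_sysfs() */
	JOURNAL_FINISH,		/* models ext4_journal_finish() */
};

/* Apply one step; return false if it would violate a constraint. */
static bool model_step(struct teardown_model *m, enum step s)
{
	switch (s) {
	case RELEASE_PROC:
		m->proc_present = false;
		return true;
	case STOP_WORK:
		/* Constraint 1: procfs must be gone before the flush. */
		if (m->proc_present)
			return false;
		m->work_flushed = true;
		return true;
	case UNREGISTER_SYSFS:
		/* Constraint 2: kobject removal only after the flush. */
		if (!m->work_flushed)
			return false;
		m->kobj_present = false;
		return true;
	case JOURNAL_FINISH:
		/* Constraint 3: kobject must be gone before journal destroy. */
		if (m->kobj_present)
			return false;
		m->journal_alive = false;
		return true;
	}
	return false;
}

/* Run a whole sequence from the initial mounted state. */
static bool model_run(const enum step *steps, size_t n)
{
	struct teardown_model m = {
		.proc_present = true,
		.work_flushed = false,
		.kobj_present = true,
		.journal_alive = true,
	};

	assert(steps != NULL);
	for (size_t i = 0; i < n; i++)
		if (!model_step(&m, steps[i]))
			return false;
	return true;
}
```

Under this model the order used by the patch (RELEASE_PROC, STOP_WORK,
UNREGISTER_SYSFS, JOURNAL_FINISH) is accepted, whereas running
UNREGISTER_SYSFS before STOP_WORK, as the blamed commit effectively did, is
rejected by constraint 2.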

Fixes: b98535d09179 ("ext4: fix bug_on in start_this_handle during umount filesystem")
Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
 fs/ext4/ext4.h      |  1 +
 fs/ext4/ext4_jbd2.h | 45 ++++++++++++++++++++++++++++-----------------
 fs/ext4/super.c     | 34 +++++++++++++++++++++++-----------
 fs/ext4/sysfs.c     | 10 ++++++++++
 4 files changed, 62 insertions(+), 28 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index b76966dc06c0..a693365d224c 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3757,6 +3757,7 @@ extern const struct inode_operations ext4_fast_symlink_inode_operations;
 /* sysfs.c */
 extern void ext4_notify_error_sysfs(struct ext4_sb_info *sbi);
 extern int ext4_register_sysfs(struct super_block *sb);
+extern void ext4_sb_release_proc(struct super_block *sb);
 extern void ext4_unregister_sysfs(struct super_block *sb);
 extern int __init ext4_init_sysfs(void);
 extern void ext4_exit_sysfs(void);
diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index 63d17c5201b5..2b7d68b11578 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -430,32 +430,43 @@ static inline int ext4_should_dioread_nolock(struct inode *inode)
 }
 
 /*
- * Pass journal explicitly as it may not be cached in the sbi->s_journal in some
- * cases
+ * Stop new s_sb_upd_work from being queued and flush any pending work.
+ *
+ * At this point only two things can be operating on the journal:
+ * JBD2 thread performing transaction commit and s_sb_upd_work
+ * issuing sb update through the journal. Once we set
+ * EXT4_MF_JOURNAL_DESTROY, new ext4_handle_error() calls will not
+ * queue s_sb_upd_work and ext4_force_commit() makes sure any
+ * ext4_handle_error() calls from the running transaction commit are
+ * finished. Hence no new s_sb_upd_work can be queued after we
+ * flush it here.
  */
-static inline int ext4_journal_destroy(struct ext4_sb_info *sbi, journal_t *journal)
+static inline void ext4_journal_stop_work(struct ext4_sb_info *sbi)
 {
-	int err = 0;
-
-	/*
-	 * At this point only two things can be operating on the journal.
-	 * JBD2 thread performing transaction commit and s_sb_upd_work
-	 * issuing sb update through the journal. Once we set
-	 * EXT4_JOURNAL_DESTROY, new ext4_handle_error() calls will not
-	 * queue s_sb_upd_work and ext4_force_commit() makes sure any
-	 * ext4_handle_error() calls from the running transaction commit are
-	 * finished. Hence no new s_sb_upd_work can be queued after we
-	 * flush it here.
-	 */
 	ext4_set_mount_flag(sbi->s_sb, EXT4_MF_JOURNAL_DESTROY);
-
 	ext4_force_commit(sbi->s_sb);
 	flush_work(&sbi->s_sb_upd_work);
+}
+
+/*
+ * Destroy the journal. Must be called after ext4_journal_stop_work().
+ * Pass journal explicitly as it may not be cached in the sbi->s_journal
+ * in some cases.
+ */
+static inline int ext4_journal_finish(struct ext4_sb_info *sbi,
+				      journal_t *journal)
+{
+	int err;
 
 	err = jbd2_journal_destroy(journal);
 	sbi->s_journal = NULL;
-
 	return err;
 }
 
+static inline int ext4_journal_destroy(struct ext4_sb_info *sbi, journal_t *journal)
+{
+	ext4_journal_stop_work(sbi);
+	return ext4_journal_finish(sbi, journal);
+}
+
 #endif	/* _EXT4_JBD2_H */
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 752f414aa06b..9bba783f44e1 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1280,16 +1280,12 @@ static void ext4_put_super(struct super_block *sb)
 	int err;
 
 	/*
-	 * Unregister sysfs before destroying jbd2 journal.
-	 * Since we could still access attr_journal_task attribute via sysfs
-	 * path which could have sbi->s_journal->j_task as NULL
-	 * Unregister sysfs before flush sbi->s_sb_upd_work.
-	 * Since user may read /proc/fs/ext4/xx/mb_groups during umount, If
-	 * read metadata verify failed then will queue error work.
-	 * update_super_work will call start_this_handle may trigger
-	 * BUG_ON.
+	 * Remove procfs entries before flushing s_sb_upd_work. A user may
+	 * read /proc/fs/ext4/xx/mb_groups during umount; if metadata
+	 * verification fails on that read, error work is queued, and
+	 * update_super_work calling start_this_handle() may trigger BUG_ON.
 	 */
-	ext4_unregister_sysfs(sb);
+	ext4_sb_release_proc(sb);
 
 	if (___ratelimit(&ext4_mount_msg_ratelimit, "EXT4-fs unmount"))
 		ext4_msg(sb, KERN_INFO, "unmounting filesystem %pU.",
@@ -1301,14 +1297,30 @@ static void ext4_put_super(struct super_block *sb)
 	destroy_workqueue(sbi->rsv_conversion_wq);
 	ext4_release_orphan_info(sb);
 
+	/*
+	 * Flush s_sb_upd_work before unregistering sysfs, since
+	 * update_super_work calls ext4_notify_error_sysfs() which accesses
+	 * the kobject's kernfs_node via sysfs_notify(). Unregistering sysfs
+	 * before the flush could lead to a use-after-free on the
+	 * kernfs_node.
+	 *
+	 * Also unregister sysfs before destroying jbd2 journal, since
+	 * userspace could read the journal_task sysfs attribute while
+	 * jbd2_journal_destroy() is killing the journal thread, leading to
+	 * a NULL pointer dereference of j_task in journal_task_show().
+	 */
 	if (sbi->s_journal) {
 		aborted = is_journal_aborted(sbi->s_journal);
-		err = ext4_journal_destroy(sbi, sbi->s_journal);
+		ext4_journal_stop_work(sbi);
+		ext4_unregister_sysfs(sb);
+		err = ext4_journal_finish(sbi, sbi->s_journal);
 		if ((err < 0) && !aborted) {
 			ext4_abort(sb, -err, "Couldn't clean up the journal");
 		}
-	} else
+	} else {
 		flush_work(&sbi->s_sb_upd_work);
+		ext4_unregister_sysfs(sb);
+	}
 
 	ext4_es_unregister_shrinker(sbi);
 	timer_shutdown_sync(&sbi->s_err_report);
diff --git a/fs/ext4/sysfs.c b/fs/ext4/sysfs.c
index d2ecc1026c0c..f6947416c1e7 100644
--- a/fs/ext4/sysfs.c
+++ b/fs/ext4/sysfs.c
@@ -638,6 +638,16 @@ int ext4_register_sysfs(struct super_block *sb)
 	return 0;
 }
 
+void ext4_sb_release_proc(struct super_block *sb)
+{
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
+
+	if (sbi->s_proc) {
+		remove_proc_subtree(sb->s_id, ext4_proc_root);
+		sbi->s_proc = NULL;
+	}
+}
+
 void ext4_unregister_sysfs(struct super_block *sb)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
-- 
2.43.0
