From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Jiufei Xue <jiufei.xue@samsung.com>, Jan Kara <jack@suse.cz>,
Christian Brauner <brauner@kernel.org>,
Sasha Levin <sashal@kernel.org>,
viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode_dirty()
Date: Mon, 25 Aug 2025 08:14:50 -0400
Message-ID: <20250825121505.2983941-1-sashal@kernel.org>
From: Jiufei Xue <jiufei.xue@samsung.com>
[ Upstream commit d02d2c98d25793902f65803ab853b592c7a96b29 ]
A use-after-free issue occurred when __mark_inode_dirty() got a
bdi_writeback that was in the process of being switched.
CPU: 1 PID: 562 Comm: systemd-random- Not tainted 6.6.56-gb4403bd46a8e #1
......
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __mark_inode_dirty+0x124/0x418
lr : __mark_inode_dirty+0x118/0x418
sp : ffffffc08c9dbbc0
........
Call trace:
__mark_inode_dirty+0x124/0x418
generic_update_time+0x4c/0x60
file_modified+0xcc/0xd0
ext4_buffered_write_iter+0x58/0x124
ext4_file_write_iter+0x54/0x704
vfs_write+0x1c0/0x308
ksys_write+0x74/0x10c
__arm64_sys_write+0x1c/0x28
invoke_syscall+0x48/0x114
el0_svc_common.constprop.0+0xc0/0xe0
do_el0_svc+0x1c/0x28
el0_svc+0x40/0xe4
el0t_64_sync_handler+0x120/0x12c
el0t_64_sync+0x194/0x198
Root cause is:
    systemd-random-seed                         kworker
    ----------------------------------------------------------------------
    ___mark_inode_dirty                     inode_switch_wbs_work_fn

      spin_lock(&inode->i_lock);
      inode_attach_wb
      locked_inode_to_wb_and_lock_list
         get inode->i_wb
         spin_unlock(&inode->i_lock);
         spin_lock(&wb->list_lock)
      spin_lock(&inode->i_lock)
      inode_io_list_move_locked
      spin_unlock(&wb->list_lock)
      spin_unlock(&inode->i_lock)
                                        spin_lock(&old_wb->list_lock)
                                          inode_do_switch_wbs
                                            spin_lock(&inode->i_lock)
                                            inode->i_wb = new_wb
                                            spin_unlock(&inode->i_lock)
                                        spin_unlock(&old_wb->list_lock)
                                        wb_put_many(old_wb, nr_switched)
                                          cgwb_release
                                          old wb released
      wb_wakeup_delayed() accesses wb,
      then trigger the use-after-free
      issue
Fix this race condition by holding the inode spinlock until
wb_wakeup_delayed() has finished.
Signed-off-by: Jiufei Xue <jiufei.xue@samsung.com>
Link: https://lore.kernel.org/20250728100715.3863241-1-jiufei.xue@samsung.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my thorough analysis of the commit and the kernel code, here's
my assessment:
**Backport Status: YES**
## Extensive Explanation:
This commit fixes a **critical use-after-free vulnerability** in the
writeback subsystem that should definitely be backported to stable
kernel trees. Here's why:
### 1. **Real Security Bug with Clear Impact**
The commit fixes a genuine use-after-free vulnerability that occurs in
`__mark_inode_dirty()`. The bug manifests as a kernel crash whose call
trace ends in `__mark_inode_dirty()`. This is not a theoretical issue:
the trace in the commit message was captured on a 6.6.56-based kernel.
### 2. **Race Condition Details**
The race condition occurs between two concurrent operations:
- **Thread A** (`__mark_inode_dirty`): Looks up `inode->i_wb`, releases
the inode lock, then calls `wb_wakeup_delayed(wb)` on a pointer that is
no longer protected by any lock
- **Thread B** (`inode_switch_wbs_work_fn`): Switches the inode's
writeback context, releases the old wb via `wb_put_many()`, which can
trigger `cgwb_release` and free the wb structure
The vulnerability window exists because Thread A accesses the wb
structure (`wb_wakeup_delayed(wb)`) after releasing the inode lock but
before completing its operation, while Thread B can free that same wb
structure in parallel.
### 3. **Minimal and Contained Fix**
The fix is remarkably simple and surgical - it only reorders lock
releases:
```c
- spin_unlock(&wb->list_lock);
- spin_unlock(&inode->i_lock);
- trace_writeback_dirty_inode_enqueue(inode);
-
if (wakeup_bdi && (wb->bdi->capabilities & BDI_CAP_WRITEBACK))
wb_wakeup_delayed(wb);
+
+ spin_unlock(&wb->list_lock);
+ spin_unlock(&inode->i_lock);
+ trace_writeback_dirty_inode_enqueue(inode);
```
The fix ensures that `wb_wakeup_delayed()` is called while still holding
the locks, preventing the wb from being freed during the operation. This
does not change lock ordering; it only extends how long the existing
locks are held, and the code change is minimal (the two unlocks and the
trace call move below the wakeup).
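The pattern behind the fix can be sketched in userspace. The following is only a simplified analogue, not kernel code: `toy_wb`, `toy_inode`, `toy_mark_dirty()` and `toy_switch_wb()` are hypothetical names, a single pthread mutex stands in for `inode->i_lock` (the real code also holds `wb->list_lock`), and `free()` stands in for the final `wb_put_many()`. The point it illustrates is that the wb is only touched while the lock that guards `i_wb` is still held, so the switcher cannot release it in between.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bdi_writeback. */
struct toy_wb {
	int wakeups;		/* bumped by the wb_wakeup_delayed() analogue */
};

/* Hypothetical stand-in for struct inode. */
struct toy_inode {
	pthread_mutex_t i_lock;
	struct toy_wb *i_wb;	/* only stable while i_lock is held */
};

static int toy_wb_released;	/* number of wbs freed so far */

/*
 * Analogue of the fixed tail of __mark_inode_dirty(): the wb is used
 * before i_lock is dropped, never after.
 */
static int toy_mark_dirty(struct toy_inode *inode)
{
	struct toy_wb *wb;
	int wakeups;

	pthread_mutex_lock(&inode->i_lock);
	wb = inode->i_wb;
	assert(wb != NULL);
	wakeups = ++wb->wakeups;	/* stands in for wb_wakeup_delayed(wb) */
	pthread_mutex_unlock(&inode->i_lock);

	return wakeups;
}

/*
 * Analogue of the switcher: swap i_wb under i_lock, then release the
 * old wb once no holder of i_lock can still be using it.
 */
static void toy_switch_wb(struct toy_inode *inode, struct toy_wb *new_wb)
{
	struct toy_wb *old_wb;

	pthread_mutex_lock(&inode->i_lock);
	old_wb = inode->i_wb;
	inode->i_wb = new_wb;
	pthread_mutex_unlock(&inode->i_lock);

	free(old_wb);		/* stands in for dropping the last reference */
	toy_wb_released++;
}
```

In this sketch, moving the `++wb->wakeups` line below `pthread_mutex_unlock()` would reintroduce the window described above: a concurrent `toy_switch_wb()` could free the wb between the unlock and the access.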
### 4. **Critical Subsystem**
The writeback subsystem is fundamental to filesystem operations - it
handles dirty page writeback, inode management, and I/O scheduling. A
use-after-free here can lead to:
- System crashes/panics
- Data corruption
- Security vulnerabilities (potential for exploitation)
### 5. **Matches Stable Criteria**
This fix perfectly aligns with stable kernel rules:
- **Fixes a real bug**: Documented use-after-free with crash trace
- **Already upstream**: Has been reviewed by Jan Kara (filesystem
maintainer) and merged by Christian Brauner
- **Small change**: 5 insertions and 4 deletions in one file, no new logic
- **No new features**: Pure bug fix
- **Low regression risk**: Simply extends existing lock protection
slightly
### 6. **Similar Historical Fixes**
The git history shows multiple similar use-after-free fixes in this
subsystem have been backported:
- `f87904c07551` - writeback: avoid use-after-free after removing device
- `8e00c4e9dd85` - writeback: fix use-after-free in
finish_writeback_work()
- `614a4e377314` - writeback, cgroup: fix premature wb_put()
### 7. **CONFIG_CGROUP_WRITEBACK Dependency**
While this bug primarily affects systems with `CONFIG_CGROUP_WRITEBACK`
enabled (used for cgroup-aware writeback), this is a common
configuration in modern distributions, especially those using cgroups v2
for container workloads.
### 8. **No Side Effects**
The change has no functional side effects beyond fixing the race:
- The trace event still fires at the same logical point
- The wb_wakeup_delayed() still executes under the same conditions
- Only the lock hold duration is slightly extended to cover the wb
access
This is a textbook example of a commit that should be backported to
stable kernels - it fixes a serious bug with minimal, well-understood
changes in a critical subsystem.
fs/fs-writeback.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index cc57367fb641..a07b8cf73ae2 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2608,10 +2608,6 @@ void __mark_inode_dirty(struct inode *inode, int flags)
 			wakeup_bdi = inode_io_list_move_locked(inode, wb,
 							       dirty_list);
 
-			spin_unlock(&wb->list_lock);
-			spin_unlock(&inode->i_lock);
-			trace_writeback_dirty_inode_enqueue(inode);
-
 			/*
 			 * If this is the first dirty inode for this bdi,
 			 * we have to wake-up the corresponding bdi thread
@@ -2621,6 +2617,11 @@ void __mark_inode_dirty(struct inode *inode, int flags)
 			if (wakeup_bdi &&
 			    (wb->bdi->capabilities & BDI_CAP_WRITEBACK))
 				wb_wakeup_delayed(wb);
+
+			spin_unlock(&wb->list_lock);
+			spin_unlock(&inode->i_lock);
+			trace_writeback_dirty_inode_enqueue(inode);
+
 			return;
 		}
 	}
--
2.50.1