* [PATCH] writeback: Fix use after free in inode_switch_wbs_work_fn()
@ 2026-04-13 9:36 Jan Kara
2026-04-13 16:30 ` Tejun Heo
0 siblings, 1 reply; 3+ messages in thread
From: Jan Kara @ 2026-04-13 9:36 UTC (permalink / raw)
To: Christian Brauner; +Cc: Tejun Heo, linux-fsdevel, Jan Kara, stable
inode_switch_wbs_work_fn() has a loop like:
wb_get(new_wb);
while (1) {
list = llist_del_all(&new_wb->switch_wbs_ctxs);
/* Nothing to do? */
if (!list)
break;
... process the items ...
}
Now adding of items to the list looks like:
wb_queue_isw()
if (llist_add(&isw->list, &wb->switch_wbs_ctxs))
queue_work(isw_wq, &wb->switch_work);
Because inode_switch_wbs_work_fn() loops when processing isw items, it
can happen that wb->switch_work is pending while wb->switch_wbs_ctxs is
empty. This is a problem because in that case wb can get freed (no isw
items -> no wb reference) while the work is still pending causing
use-after-free issues.
We cannot just fix this by cancelling work when freeing wb because that
could still trigger problematic 0 -> 1 transitions on wb refcount due to
wb_get() in inode_switch_wbs_work_fn(). It could be all handled with
more careful code but that seems unnecessarily complex so let's avoid
that until it is proven that the looping actually brings practical
benefit. Just remove the loop from inode_switch_wbs_work_fn() instead.
That way when wb_queue_isw() queues work, we are guaranteed we have
added the first item to wb->switch_wbs_ctxs and nobody is going to
remove it (and drop the wb reference it holds) until the queued work
runs.
Fixes: e1b849cfa6b6 ("writeback: Avoid contention on wb->list_lock when switching inodes")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/fs-writeback.c | 36 +++++++++++++++++++-----------------
1 file changed, 19 insertions(+), 17 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 3c75ee025bda..d63baa1b6fec 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -570,28 +570,30 @@ void inode_switch_wbs_work_fn(struct work_struct *work)
struct inode_switch_wbs_context *isw, *next_isw;
struct llist_node *list;
+ list = llist_del_all(&new_wb->switch_wbs_ctxs);
/*
- * Grab out reference to wb so that it cannot get freed under us
+ * Nothing to do? That would be a problem as references held by isw
+ * items protect wb from freeing...
+ */
+ if (WARN_ON_ONCE(!list))
+ return;
+
+ /*
+ * Grab our reference to wb so that it cannot get freed under us
* after we process all the isw items.
*/
wb_get(new_wb);
- while (1) {
- list = llist_del_all(&new_wb->switch_wbs_ctxs);
- /* Nothing to do? */
- if (!list)
- break;
- /*
- * In addition to synchronizing among switchers, I_WB_SWITCH
- * tells the RCU protected stat update paths to grab the i_page
- * lock so that stat transfer can synchronize against them.
- * Let's continue after I_WB_SWITCH is guaranteed to be
- * visible.
- */
- synchronize_rcu();
+ /*
+ * In addition to synchronizing among switchers, I_WB_SWITCH
+ * tells the RCU protected stat update paths to grab the i_page
+ * lock so that stat transfer can synchronize against them.
+ * Let's continue after I_WB_SWITCH is guaranteed to be
+ * visible.
+ */
+ synchronize_rcu();
- llist_for_each_entry_safe(isw, next_isw, list, list)
- process_inode_switch_wbs(new_wb, isw);
- }
+ llist_for_each_entry_safe(isw, next_isw, list, list)
+ process_inode_switch_wbs(new_wb, isw);
wb_put(new_wb);
}
--
2.51.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] writeback: Fix use after free in inode_switch_wbs_work_fn()
2026-04-13 9:36 [PATCH] writeback: Fix use after free in inode_switch_wbs_work_fn() Jan Kara
@ 2026-04-13 16:30 ` Tejun Heo
2026-04-13 17:26 ` Jan Kara
0 siblings, 1 reply; 3+ messages in thread
From: Tejun Heo @ 2026-04-13 16:30 UTC (permalink / raw)
To: Jan Kara; +Cc: Christian Brauner, linux-fsdevel, stable
On Mon, Apr 13, 2026 at 11:36:19AM +0200, Jan Kara wrote:
> inode_switch_wbs_work_fn() has a loop like:
>
> wb_get(new_wb);
> while (1) {
> list = llist_del_all(&new_wb->switch_wbs_ctxs);
> /* Nothing to do? */
> if (!list)
> break;
> ... process the items ...
> }
>
> Now adding of items to the list looks like:
>
> wb_queue_isw()
> if (llist_add(&isw->list, &wb->switch_wbs_ctxs))
> queue_work(isw_wq, &wb->switch_work);
>
> Because inode_switch_wbs_work_fn() loops when processing isw items, it
> can happen that wb->switch_work is pending while wb->switch_wbs_ctxs is
> empty. This is a problem because in that case wb can get freed (no isw
> items -> no wb reference) while the work is still pending causing
> use-after-free issues.
>
> We cannot just fix this by cancelling work when freeing wb because that
> could still trigger problematic 0 -> 1 transitions on wb refcount due to
> wb_get() in inode_switch_wbs_work_fn(). It could be all handled with
> more careful code but that seems unnecessarily complex so let's avoid
> that until it is proven that the looping actually brings practical
> benefit. Just remove the loop from inode_switch_wbs_work_fn() instead.
> That way when wb_queue_isw() queues work, we are guaranteed we have
> added the first item to wb->switch_wbs_ctxs and nobody is going to
> remove it (and drop the wb reference it holds) until the queued work
> runs.
>
> Fixes: e1b849cfa6b6 ("writeback: Avoid contention on wb->list_lock when switching inodes")
> CC: stable@vger.kernel.org
> Signed-off-by: Jan Kara <jack@suse.cz>
Oh man, that's too subtle.
Acked-by: Tejun Heo <tj@kernel.org>
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] writeback: Fix use after free in inode_switch_wbs_work_fn()
2026-04-13 16:30 ` Tejun Heo
@ 2026-04-13 17:26 ` Jan Kara
0 siblings, 0 replies; 3+ messages in thread
From: Jan Kara @ 2026-04-13 17:26 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jan Kara, Christian Brauner, linux-fsdevel, stable
On Mon 13-04-26 06:30:43, Tejun Heo wrote:
> On Mon, Apr 13, 2026 at 11:36:19AM +0200, Jan Kara wrote:
> > inode_switch_wbs_work_fn() has a loop like:
> >
> > wb_get(new_wb);
> > while (1) {
> > list = llist_del_all(&new_wb->switch_wbs_ctxs);
> > /* Nothing to do? */
> > if (!list)
> > break;
> > ... process the items ...
> > }
> >
> > Now adding of items to the list looks like:
> >
> > wb_queue_isw()
> > if (llist_add(&isw->list, &wb->switch_wbs_ctxs))
> > queue_work(isw_wq, &wb->switch_work);
> >
> > Because inode_switch_wbs_work_fn() loops when processing isw items, it
> > can happen that wb->switch_work is pending while wb->switch_wbs_ctxs is
> > empty. This is a problem because in that case wb can get freed (no isw
> > items -> no wb reference) while the work is still pending causing
> > use-after-free issues.
> >
> > We cannot just fix this by cancelling work when freeing wb because that
> > could still trigger problematic 0 -> 1 transitions on wb refcount due to
> > wb_get() in inode_switch_wbs_work_fn(). It could be all handled with
> > more careful code but that seems unnecessarily complex so let's avoid
> > that until it is proven that the looping actually brings practical
> > benefit. Just remove the loop from inode_switch_wbs_work_fn() instead.
> > That way when wb_queue_isw() queues work, we are guaranteed we have
> > added the first item to wb->switch_wbs_ctxs and nobody is going to
> > remove it (and drop the wb reference it holds) until the queued work
> > runs.
> >
> > Fixes: e1b849cfa6b6 ("writeback: Avoid contention on wb->list_lock when switching inodes")
> > CC: stable@vger.kernel.org
> > Signed-off-by: Jan Kara <jack@suse.cz>
>
> Oh man, that's too subtle.
>
> Acked-by: Tejun Heo <tj@kernel.org>
Thanks for review! I forgot to mention in the changelog that these UAF
issues were actually reliably hit in practice by one user on a Kubernetes
cluster. So it is subtle but actually practical issue...
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-04-13 17:26 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-13 9:36 [PATCH] writeback: Fix use after free in inode_switch_wbs_work_fn() Jan Kara
2026-04-13 16:30 ` Tejun Heo
2026-04-13 17:26 ` Jan Kara
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox