[PATCH] loop: add sync_blockdev() in __loop_clr

Linux block layer
 help / color / mirror / Atom feed

* [PATCH] loop: add sync_blockdev() in __loop_clr_fd() to prevent UAF
@ 2026-05-22  2:54 Ming Lei
  2026-05-22  3:28 ` Tetsuo Handa
  0 siblings, 1 reply; 7+ messages in thread
From: Ming Lei @ 2026-05-22  2:54 UTC (permalink / raw)
  To: Jens Axboe, linux-block
  Cc: Ming Lei, syzbot+cd8a9a308e879a4e2c28,
	syzbot+bc273027d5643e48e5b3, Tetsuo Handa, Hongling Zeng,
	Bart Van Assche, Andrew Morton

The syzbot report describes a NULL pointer dereference in lo_rw_aio()
caused by a race between lo_release() and loop_queue_rq().
__loop_clr_fd() clears lo->lo_backing_file while an already-scheduled
asynchronous I/O work (lo_rw_aio) is about to execute on a kworker.
Since the kworker enters lo_rw_aio() after lo->lo_backing_file has been
set to NULL, it dereferences the NULL pointer when initializing the
kiocb, leading to a general protection fault.

Fix this by adding sync_blockdev() in __loop_clr_fd() to flush all
pending writeback I/O before clearing lo->lo_backing_file. Since the
loop disk is already closed at this point, no new I/O can be submitted
— only writeback remains.

Reported-by: syzbot+cd8a9a308e879a4e2c28@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=cd8a9a308e879a4e2c28
Reported-by: syzbot+bc273027d5643e48e5b3@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=bc273027d5643e48e5b3
Analyzed-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Hongling Zeng <zenghongling@kylinos.cn>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/linux-block/9b2032d6-3f36-4d2b-8128-985c08a4fa37@I-love.SAKURA.ne.jp/
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/block/loop.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 0000913f7efc..023e20a88a8a 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1118,6 +1118,9 @@ static void __loop_clr_fd(struct loop_device *lo)
 	struct file *filp;
 	gfp_t gfp = lo->old_gfp_mask;
 
+	/* flush outstanding writeback I/O before clearing lo_backing_file */
+	sync_blockdev(lo->lo_device);
+
 	spin_lock_irq(&lo->lo_lock);
 	filp = lo->lo_backing_file;
 	lo->lo_backing_file = NULL;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] loop: add sync_blockdev() in __loop_clr_fd() to prevent UAF
  2026-05-22  2:54 [PATCH] loop: add sync_blockdev() in __loop_clr_fd() to prevent UAF Ming Lei
@ 2026-05-22  3:28 ` Tetsuo Handa
  2026-05-22  3:45   ` Ming Lei
  0 siblings, 1 reply; 7+ messages in thread
From: Tetsuo Handa @ 2026-05-22  3:28 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block
  Cc: syzbot+cd8a9a308e879a4e2c28, syzbot+bc273027d5643e48e5b3,
	Hongling Zeng, Bart Van Assche, Andrew Morton

On 2026/05/22 11:54, Ming Lei wrote:
> Fix this by adding sync_blockdev() in __loop_clr_fd() to flush all
> pending writeback I/O before clearing lo->lo_backing_file. Since the
> loop disk is already closed at this point, no new I/O can be submitted
> — only writeback remains.

Why can you assume that synchronize_rcu() + drain_workqueue(lo->workqueue)
is unnecessary? Since we don't know exact commit which is causing this
problem, we don't know what has changed.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] loop: add sync_blockdev() in __loop_clr_fd() to prevent UAF
  2026-05-22  3:28 ` Tetsuo Handa
@ 2026-05-22  3:45   ` Ming Lei
  2026-05-22  6:54     ` Tetsuo Handa
  0 siblings, 1 reply; 7+ messages in thread
From: Ming Lei @ 2026-05-22  3:45 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Jens Axboe, linux-block, syzbot+cd8a9a308e879a4e2c28,
	syzbot+bc273027d5643e48e5b3, Hongling Zeng, Bart Van Assche,
	Andrew Morton

On Fri, May 22, 2026 at 12:28:34PM +0900, Tetsuo Handa wrote:
> On 2026/05/22 11:54, Ming Lei wrote:
> > Fix this by adding sync_blockdev() in __loop_clr_fd() to flush all
> > pending writeback I/O before clearing lo->lo_backing_file. Since the
> > loop disk is already closed at this point, no new I/O can be submitted
> > — only writeback remains.
> 
> Why can you assume that synchronize_rcu() + drain_workqueue(lo->workqueue)
> is unnecessary? Since we don't know exact commit which is causing this
> problem, we don't know what has changed.

When sync_blockdev() returns, there can't be any inflight IO:

- no one can open this loop disk
- no dirty page cache any more

So why do you want to add rcu/drain_workqueue?

Fixes: 1fe0b1acb14d ("loop: only freeze the queue in __loop_clr_fd when needed")


Thanks,
Ming

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] loop: add sync_blockdev() in __loop_clr_fd() to prevent UAF
  2026-05-22  3:45   ` Ming Lei
@ 2026-05-22  6:54     ` Tetsuo Handa
  2026-05-22  9:33       ` Ming Lei
  0 siblings, 1 reply; 7+ messages in thread
From: Tetsuo Handa @ 2026-05-22  6:54 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, syzbot+cd8a9a308e879a4e2c28,
	syzbot+bc273027d5643e48e5b3, Hongling Zeng, Bart Van Assche,
	Andrew Morton

On 2026/05/22 12:45, Ming Lei wrote:
> On Fri, May 22, 2026 at 12:28:34PM +0900, Tetsuo Handa wrote:
>> On 2026/05/22 11:54, Ming Lei wrote:
>>> Fix this by adding sync_blockdev() in __loop_clr_fd() to flush all
>>> pending writeback I/O before clearing lo->lo_backing_file. Since the
>>> loop disk is already closed at this point, no new I/O can be submitted
>>> — only writeback remains.
>>
>> Why can you assume that synchronize_rcu() + drain_workqueue(lo->workqueue)
>> is unnecessary? Since we don't know exact commit which is causing this
>> problem, we don't know what has changed.
> 
> When sync_blockdev() returns, there can't be any inflight IO:
> 
> - no one can open this loop disk
> - no dirty page cache any more
> 
> So why do you want to add rcu/drain_workqueue?

The fact that __loop_clr_fd() is called before lo_rw_aio() completes suggests that
some works queued before __loop_clr_fd() is called have not been completed.

I don't know what sync_blockdev() guarantees, but even if sync_blockdev() can guarantee
both "loop_queue_rq() is no longer running" and "all pending AIO requests issued by
lo_rw_aio() have been completed", I think that sync_blockdev() cannot determine whether
lo->lo_device has pending work or not.

Therefore, I guess that at least drain_workqueue() is needed before sync_blockdev().
If sync_blockdev() cannot guarantee "loop_queue_rq() is no longer running",
I guess that both synchronize_rcu() + drain_workqueue() are needed before sync_blockdev().

(And I don't know whether sync_blockdev() is appropriate because we don't know what change
made this problem visible. We didn't hit this problem until Linux 7.0.)

> 
> Fixes: 1fe0b1acb14d ("loop: only freeze the queue in __loop_clr_fd when needed")

The Fixes: for these reports should be the commit which made this problem visible.
Or, this patch will be needlessly backported to stable kernels.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] loop: add sync_blockdev() in __loop_clr_fd() to prevent UAF
  2026-05-22  6:54     ` Tetsuo Handa
@ 2026-05-22  9:33       ` Ming Lei
  2026-05-22  9:54         ` Tetsuo Handa
  0 siblings, 1 reply; 7+ messages in thread
From: Ming Lei @ 2026-05-22  9:33 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Jens Axboe, linux-block, syzbot+cd8a9a308e879a4e2c28,
	syzbot+bc273027d5643e48e5b3, Hongling Zeng, Bart Van Assche,
	Andrew Morton

On Fri, May 22, 2026 at 03:54:54PM +0900, Tetsuo Handa wrote:
> On 2026/05/22 12:45, Ming Lei wrote:
> > On Fri, May 22, 2026 at 12:28:34PM +0900, Tetsuo Handa wrote:
> >> On 2026/05/22 11:54, Ming Lei wrote:
> >>> Fix this by adding sync_blockdev() in __loop_clr_fd() to flush all
> >>> pending writeback I/O before clearing lo->lo_backing_file. Since the
> >>> loop disk is already closed at this point, no new I/O can be submitted
> >>> — only writeback remains.
> >>
> >> Why can you assume that synchronize_rcu() + drain_workqueue(lo->workqueue)
> >> is unnecessary? Since we don't know exact commit which is causing this
> >> problem, we don't know what has changed.
> > 
> > When sync_blockdev() returns, there can't be any inflight IO:
> > 
> > - no one can open this loop disk
> > - no dirty page cache any more
> > 
> > So why do you want to add rcu/drain_workqueue?
> 
> The fact that __loop_clr_fd() is called before lo_rw_aio() completes suggests that
> some works queued before __loop_clr_fd() is called have not been completed.
> 
> I don't know what sync_blockdev() guarantees, but even if sync_blockdev() can guarantee
> both "loop_queue_rq() is no longer running" and "all pending AIO requests issued by
> lo_rw_aio() have been completed", I think that sync_blockdev() cannot determine whether
> lo->lo_device has pending work or not.

Please do take a look at the commit log of this one and commit 1fe0b1acb14d

> 
> Therefore, I guess that at least drain_workqueue() is needed before sync_blockdev().
> If sync_blockdev() cannot guarantee "loop_queue_rq() is no longer running",
> I guess that both synchronize_rcu() + drain_workqueue() are needed before sync_blockdev().

Not at all.

What matters is that resources cleared in __loop_clr_fd() are not used any
more, and the resource is only accessed when handling io command from both
submission and completion path.

Again:

When the loop disk is closed, no foreground IO is possible, meantime
no one can open it because of ->open_mutex. So the only remained IO are
writeback, sync_blockdev() writes out all dirty pages and waits until
all are done, then there can't be any new IO, and it is safe to clear
these resources.

work function may still be run, but command list includes nothing, so
resources to be cleared won't be touched.

> 
> (And I don't know whether sync_blockdev() is appropriate because we don't know what change
> made this problem visible. We didn't hit this problem until Linux 7.0.)
> 
> > 
> > Fixes: 1fe0b1acb14d ("loop: only freeze the queue in __loop_clr_fd when needed")
> 
> The Fixes: for these reports should be the commit which made this problem visible.
> Or, this patch will be needlessly backported to stable kernels.

Will add it in V2.


Thanks,
Ming

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] loop: add sync_blockdev() in __loop_clr_fd() to prevent UAF
  2026-05-22  9:33       ` Ming Lei
@ 2026-05-22  9:54         ` Tetsuo Handa
  2026-05-22 11:50           ` Ming Lei
  0 siblings, 1 reply; 7+ messages in thread
From: Tetsuo Handa @ 2026-05-22  9:54 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, syzbot+cd8a9a308e879a4e2c28,
	syzbot+bc273027d5643e48e5b3, Hongling Zeng, Bart Van Assche,
	Andrew Morton

On 2026/05/22 18:33, Ming Lei wrote:
> Please do take a look at the commit log of this one and commit 1fe0b1acb14d

My question is what commit broke "->release is only called after all outstanding
I/O has completed" in the patch description of commit 1fe0b1acb14d ("loop: only
freeze the queue in __loop_clr_fd when needed").

>>> Fixes: 1fe0b1acb14d ("loop: only freeze the queue in __loop_clr_fd when needed")
>>
>> The Fixes: for these reports should be the commit which made this problem visible.
>> Or, this patch will be needlessly backported to stable kernels.
> 
> Will add it in V2.

Commit 1fe0b1acb14d relies on commits before 1fe0b1acb14d. An not-yet-identified commit
which went to the merge window for 7.1 broke what commit 1fe0b1acb14d relies on.
"Fixes: 1fe0b1acb14d" is wrong, and "Fixes: not-yet-identified-commit which went to
the merge window for 7.1" is expected.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] loop: add sync_blockdev() in __loop_clr_fd() to prevent UAF
  2026-05-22  9:54         ` Tetsuo Handa
@ 2026-05-22 11:50           ` Ming Lei
  0 siblings, 0 replies; 7+ messages in thread
From: Ming Lei @ 2026-05-22 11:50 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Jens Axboe, linux-block, syzbot+cd8a9a308e879a4e2c28,
	syzbot+bc273027d5643e48e5b3, Hongling Zeng, Bart Van Assche,
	Andrew Morton

On Fri, May 22, 2026 at 06:54:39PM +0900, Tetsuo Handa wrote:
> On 2026/05/22 18:33, Ming Lei wrote:
> > Please do take a look at the commit log of this one and commit 1fe0b1acb14d
> 
> My question is what commit broke "->release is only called after all outstanding
> I/O has completed" in the patch description of commit 1fe0b1acb14d ("loop: only
> freeze the queue in __loop_clr_fd when needed").

oops, 1fe0b1acb14d ("loop: only freeze the queue in __loop_clr_fd when needed") is actually
correct.

The issue must be from somewhere else(writeback?), given blkdev_put_whole() does
run blkdev_flush_mapping() before calling ->release(). If you or anyone
can reproduce it, the following debug patch may provide some hint:

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 0000913f7efc..85f140c7f0a4 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1855,6 +1855,8 @@ static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx,
        struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);
        struct loop_device *lo = rq->q->queuedata;
 
+       WARN_ON_ONCE(!lo->lo_backing_file);
+
        blk_mq_start_request(rq);
 
        if (data_race(READ_ONCE(lo->lo_state)) != Lo_bound)


Thanks,
Ming

^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2026-05-22 11:50 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-22  2:54 [PATCH] loop: add sync_blockdev() in __loop_clr_fd() to prevent UAF Ming Lei
2026-05-22  3:28 ` Tetsuo Handa
2026-05-22  3:45   ` Ming Lei
2026-05-22  6:54     ` Tetsuo Handa
2026-05-22  9:33       ` Ming Lei
2026-05-22  9:54         ` Tetsuo Handa
2026-05-22 11:50           ` Ming Lei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox