Linux filesystem development
* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
       [not found]                 ` <d1b5a737-f0e3-4927-b762-430b37fbb2f9@I-love.SAKURA.ne.jp>
@ 2026-05-27  3:00                   ` Ming Lei
  2026-05-27 11:29                     ` Tetsuo Handa
  2026-05-28  5:43                     ` Hillf Danton
  0 siblings, 2 replies; 21+ messages in thread
From: Ming Lei @ 2026-05-27  3:00 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Jens Axboe, Bart Van Assche, Christoph Hellwig, Damien Le Moal,
	linux-block, LKML, Andrew Morton, Linus Torvalds, linux-btrfs,
	David Sterba, linux-fsdevel, Christian Brauner

On Wed, May 27, 2026 at 10:35:56AM +0900, Tetsuo Handa wrote:
> On 2026/05/27 10:20, Ming Lei wrote:
> >> Of course we should try to figure out the root cause first, but how can we do?
> > 
> > Definitely unexpected write IO(after umount & loop closed) from btrfs is more serious,
> > which may cause data loss, so CC btrfs list and maintainer.
> 
> Why do you assume that the culprit is btrfs?
> 
> https://syzkaller.appspot.com/bug?extid=bc273027d5643e48e5b3 indicated that
> this similar race is also happening with jfs.

I hadn't seen the above report on jfs.

It doesn't change anything; the same question still stands: unexpected write IO is issued
after, or crosses, umount & the last close of the loop disk.



Thanks,
Ming

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-05-27  3:00                   ` [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio() Ming Lei
@ 2026-05-27 11:29                     ` Tetsuo Handa
  2026-05-27 18:11                       ` Damien Le Moal
  2026-05-28  5:43                     ` Hillf Danton
  1 sibling, 1 reply; 21+ messages in thread
From: Tetsuo Handa @ 2026-05-27 11:29 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Bart Van Assche, Christoph Hellwig, Damien Le Moal,
	linux-block, LKML, Andrew Morton, Linus Torvalds, linux-btrfs,
	David Sterba, linux-fsdevel, Christian Brauner

On 2026/05/27 12:00, Ming Lei wrote:
> On Wed, May 27, 2026 at 10:35:56AM +0900, Tetsuo Handa wrote:
>> On 2026/05/27 10:20, Ming Lei wrote:
>>>> Of course we should try to figure out the root cause first, but how can we do?
>>>
>>> Definitely unexpected write IO(after umount & loop closed) from btrfs is more serious,
>>> which may cause data loss, so CC btrfs list and maintainer.
>>

I had a conversation with Google AI mode, and received the following response.

--------------------------------------------------------------------------------
Technical Analysis: lo_rw_aio Null Pointer Dereference / UAF since v7.1-rc1


1. The Root Cause of the Timing Shift

This regression was introduced during the v7.1-rc1 merge window, primarily exposed by
Commit 65565ca5f99b ("block: unify the synchronous bi_end_io callbacks"), along with
helper refactorings like Commit 92c3737a2473 ("block: add a bio_submit_or_kill helper").

Prior to v7.0, the synchronous I/O completion path inherently contained execution lags (due
to serialized completion handling and context switches) before notifying upper layers. This
latency accidentally acted as a natural safety barrier. It ensured that by the time a file
system completed its final sync_filesystem() and initiated umount, the loop driver's internal
workqueue (lo_rw_aio) had already finished processing everything.

In v7.1, the unification and optimization of bi_end_io significantly minimized this latency.
The filesystem now learns of "I/O completion" much faster. Consequently, highly-concurrent
execution pipelines like btrfs or jfs proceed rapidly through kill_sb() and blkdev_put(),
ultimately invoking lo_release() -> __loop_clr_fd() while the loop driver's backend kworker
is still in the middle of executing the last sub-millisecond asynchronous file-backed I/O
request.


2. Why the Block Layer's Built-in Quiesce/Freeze Fails

There is an implicit assumption that standard block layer freeze mechanisms (blk_mq_freeze_queue())
protect the device lifetime during release. However, the v7.1 BIO helper refactoring introduced
a synchronization gap:

  1. The filesystem triggers its final metadata or journal updates (e.g., txCommit in jfs or
     delayed refcount updates in btrfs) right during the unmount/close boundary.
  2. Due to the optimized execution path, these requests bypass the block layer's active
     request-tracking metrics at the exact moment blk_mq_freeze_queue() or state validation
     checks evaluated them as zero.
  3. The block layer assumes the queue is safe and silent, allowing __loop_clr_fd() to
     progress and nullify lo->lo_backing_file (or trigger fput()).
  4. The leaked asynchronous kworker wakes up a fraction of a millisecond too late, attempts
     to access lo->lo_backing_file or invokes kiocb_end_write() -> file_inode(), leading to
     either a general protection fault (Null pointer dereference) or a Use-After-Free (UAF).


3. Why This Isn't Just an "Unexpected FS Bug"

While the write I/O originates from file systems like btrfs and jfs post-close, blaming the
file systems entirely ignores the underlying infrastructure change. The core issue is that the
block layer altered its synchronization behavior, breaking the barrier contract that
VFS and file systems historically relied on during the device release path.

Papering over this inside individual file systems would require adding heavy, duplicated
barriers inside every single filesystem's unmount path.
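
The ordering in steps 1 to 4 above can be illustrated with a small
single-threaded userspace model. This is only a sketch: every model_* name is
hypothetical, nothing here is loop driver code, and it does not reproduce the
kernel race. It only shows why clearing the backing pointer while requests are
still pending ends in a NULL access, and why draining first avoids it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the backing file. */
struct model_file {
	int writes;			/* requests that reached the file */
};

/* Hypothetical stand-in for the loop device. */
struct model_loop {
	struct model_file *backing;	/* plays the role of lo->lo_backing_file */
	int pending;			/* queued, not yet processed requests */
	int null_hits;			/* requests that found backing == NULL */
};

/* Process one queued request, as a worker would. */
static void model_process_one(struct model_loop *lo)
{
	if (lo->pending == 0)
		return;
	lo->pending--;
	if (lo->backing == NULL) {
		/* In the kernel this would be the faulting access. */
		lo->null_hits++;
		return;
	}
	lo->backing->writes++;
}

/* Run every pending request to completion. */
static void model_drain(struct model_loop *lo)
{
	while (lo->pending > 0)
		model_process_one(lo);
}

/* Release the device: optionally drain first, then clear the pointer. */
static void model_release(struct model_loop *lo, int drain_first)
{
	if (drain_first)
		model_drain(lo);
	lo->backing = NULL;
}
```

With two pending requests, model_release(&lo, 0) followed by model_drain(&lo)
leaves null_hits at 2 and the file untouched. With drain_first set, both
requests reach the backing file before the pointer is cleared.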



* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-05-27 11:29                     ` Tetsuo Handa
@ 2026-05-27 18:11                       ` Damien Le Moal
  2026-05-28  8:38                         ` Christoph Hellwig
  0 siblings, 1 reply; 21+ messages in thread
From: Damien Le Moal @ 2026-05-27 18:11 UTC (permalink / raw)
  To: Tetsuo Handa, Ming Lei
  Cc: Jens Axboe, Bart Van Assche, Christoph Hellwig, linux-block, LKML,
	Andrew Morton, Linus Torvalds, linux-btrfs, David Sterba,
	linux-fsdevel, Christian Brauner

On 2026/05/27 20:29, Tetsuo Handa wrote:
> On 2026/05/27 12:00, Ming Lei wrote:
>> On Wed, May 27, 2026 at 10:35:56AM +0900, Tetsuo Handa wrote:
>>> On 2026/05/27 10:20, Ming Lei wrote:
>>>>> Of course we should try to figure out the root cause first, but how can we do?
>>>>
>>>> Definitely unexpected write IO(after umount & loop closed) from btrfs is more serious,
>>>> which may cause data loss, so CC btrfs list and maintainer.
>>>
> 
> I had a conversation with Google AI mode, and received the following response.
> 
> --------------------------------------------------------------------------------
> Technical Analysis: lo_rw_aio Null Pointer Dereference / UAF since v7.1-rc1
> 
> 
> 1. The Root Cause of the Timing Shift
> 
> This regression was introduced during the v7.1-rc1 merge window, primarily exposed by
> Commit 65565ca5f99b ("block: unify the synchronous bi_end_io callbacks"), along with
> helper refactorings like Commit 92c3737a2473 ("block: add a bio_submit_or_kill helper").
> 
> Prior to v7.0, the synchronous I/O completion path inherently contained execution lags (due
> to serialized completion handling and context switches) before notifying upper layers. This
> latency accidentally acted as a natural safety barrier. It ensured that by the time a file
> system completed its final sync_filesystem() and initiated umount, the loop driver's internal
> workqueue (lo_rw_aio) had already finished processing everything.
> 
> In v7.1, the unification and optimization of bi_end_io significantly minimized this latency.
> The filesystem now learns of "I/O completion" much faster. Consequently, highly-concurrent
> execution pipelines like btrfs or jfs proceed rapidly through kill_sb() and blkdev_put(),
> ultimately invoking lo_release() -> __loop_clr_fd() while the loop driver's backend kworker
> is still in the middle of executing the last sub-millisecond asynchronous file-backed I/O
> request.
> 
> 
> 2. Why the Block Layer's Built-in Quiesce/Freeze Fails
> 
> There is an implicit assumption that standard block layer freeze mechanisms (blk_mq_freeze_queue())
> protect the device lifetime during release. However, the v7.1 BIO helper refactoring introduced
> a synchronization gap:
> 
>   1. The filesystem triggers its final metadata or journal updates (e.g., txCommit in jfs or
>      delayed refcount updates in btrfs) right during the unmount/close boundary.
>   2. Due to the optimized execution path, these requests bypass the block layer's active
>      request-tracking metrics at the exact moment blk_mq_freeze_queue() or state validation
>      checks evaluated them as zero.
>   3. The block layer assumes the queue is safe and silent, allowing __loop_clr_fd() to
>      progress and nullify lo->lo_backing_file (or trigger fput()).
>   4. The leaked asynchronous kworker wakes up a fraction of a millisecond too late, attempts
>      to access lo->lo_backing_file or invokes kiocb_end_write() -> file_inode(), leading to
>      either a general protection fault (Null pointer dereference) or a Use-After-Free (UAF).
> 
> 
> 3. Why This Isn't Just an "Unexpected FS Bug"
> 
> While the write I/O originates from file systems like btrfs and jfs post-close, blaming the
> file systems entirely ignores the underlying infrastructure change. The core issue is that the
> block layer altered its synchronization behavior, breaking the barrier contract that
> VFS and file systems historically relied on during the device release path.
> 
> Papering over this inside individual file systems would require adding heavy, duplicated
> barriers inside every single filesystem's unmount path.

It sounds like the VFS unmount call needs to have something that waits for
sync() to complete. Though, it really feels very strange that an FS can complete
unmount without itself ensuring that there are no more IOs in flight. The
generic VFS layer cannot know what the FS needs to flush on unmount, so waiting
on a generic sync might not be enough.

It really feels like this is a btrfs and jfs issue, unless the same can be
reproduced with any file system (XFS, ext4, f2fs, ...).

Just my 2 cents.


-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-05-27  3:00                   ` [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio() Ming Lei
  2026-05-27 11:29                     ` Tetsuo Handa
@ 2026-05-28  5:43                     ` Hillf Danton
  2026-05-28 23:00                       ` Hillf Danton
  1 sibling, 1 reply; 21+ messages in thread
From: Hillf Danton @ 2026-05-28  5:43 UTC (permalink / raw)
  To: Ming Lei
  Cc: Tetsuo Handa, Jens Axboe, Bart Van Assche, Christoph Hellwig,
	Damien Le Moal, linux-block, LKML, Andrew Morton, Linus Torvalds,
	linux-btrfs, David Sterba, linux-fsdevel, Christian Brauner

On Tue, 26 May 2026 22:00:49 -0500 Ming Lei wrote:
>On Wed, May 27, 2026 at 10:35:56AM +0900, Tetsuo Handa wrote:
>> On 2026/05/27 10:20, Ming Lei wrote:
>> >> Of course we should try to figure out the root cause first, but how can we do?
>> > 
>> > Definitely unexpected write IO(after umount & loop closed) from btrfs is more serious,
>> > which may cause data loss, so CC btrfs list and maintainer.
>> 
>> Why do you assume that the culprit is btrfs?
>> 
>> https://syzkaller.appspot.com/bug?extid=bc273027d5643e48e5b3 indicated that
>> this similar race is also happening with jfs.
>
> I just didn't see the above report on jfs.
> 
> It doesn't change anything, the same question still stands: unexpected write IO is issued
> or crosses umount & last closing of loop disk.
>
Given the loop workqueue that triggered the jfs warning, can you specify
the reason why the workqueue in question is NOT flushed while closing disk?


* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-05-27 18:11                       ` Damien Le Moal
@ 2026-05-28  8:38                         ` Christoph Hellwig
  2026-05-28 10:16                           ` Qu Wenruo
  0 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2026-05-28  8:38 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Tetsuo Handa, Ming Lei, Jens Axboe, Bart Van Assche,
	Christoph Hellwig, linux-block, LKML, Andrew Morton,
	Linus Torvalds, linux-btrfs, David Sterba, linux-fsdevel,
	Christian Brauner

On Thu, May 28, 2026 at 03:11:05AM +0900, Damien Le Moal wrote:
> It sounds like the VFS unmount call needs to have something that waits for
> sync() to complete. Though, it really feels very strange that an FS can complete

I don't think this is the VFS-controlled file data writeback, which
we wait on, but some kind of fs-controlled metadata.  And yes, it looks
like those file systems are buggy in that area.  We definitely had
such bugs in XFS before and fixed them.

e.g. 9c7504aa72b6 ("xfs: track and serialize in-flight async buffers against
unmount")




* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-05-28  8:38                         ` Christoph Hellwig
@ 2026-05-28 10:16                           ` Qu Wenruo
  2026-06-01 14:40                             ` Christoph Hellwig
  2026-06-01 15:29                             ` Ming Lei
  0 siblings, 2 replies; 21+ messages in thread
From: Qu Wenruo @ 2026-05-28 10:16 UTC (permalink / raw)
  To: Christoph Hellwig, Damien Le Moal
  Cc: Tetsuo Handa, Ming Lei, Jens Axboe, Bart Van Assche, linux-block,
	LKML, Andrew Morton, Linus Torvalds, linux-btrfs, David Sterba,
	linux-fsdevel, Christian Brauner



On 2026/5/28 18:08, Christoph Hellwig wrote:
> On Thu, May 28, 2026 at 03:11:05AM +0900, Damien Le Moal wrote:
>> It sounds like the VFS unmount call needs to have something that waits for
>> sync() to complete. Though, it really feels very strange that an FS can complete
> 
> I don't think this is the VFS-controlled VFS file data writeback, which
> we wait on, but some kind of fs controlled metadata.  And yes, it looks
> like those file systems are buggy in that area.  We definitively had
> such bugs in XFS before and fixed them.
> 
> e.g. 9c7504aa72b6 ("xfs: track and serialize in-flight async buffers against
> unmount")
Considering the xfs fix is pretty old, it predates the fix hints in
fstests, so there is no such mention there.

Do you happen to know which test case is for that fix?
I'd like to adapt it for btrfs as a reproducer.

This syzbot report doesn't provide a reproducer.


Another thing is, if it's some btrfs bios on-the-fly after 
close_ctree(), the most common symptom should be NULL pointer 
dereference inside various btrfs endio functions.
As all those end_bbio_*() functions refer to either fs_info or
inode/eb, they should all cause use-after-free if the fs is unmounted
before the bio finishes.

The only exception is discard, which uses blkdev_issue_discard() and
thus holds no such reference to btrfs internal structures, but that's
beyond my understanding.

Thanks,
Qu


* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-05-28  5:43                     ` Hillf Danton
@ 2026-05-28 23:00                       ` Hillf Danton
  2026-05-29  0:14                         ` Tetsuo Handa
  0 siblings, 1 reply; 21+ messages in thread
From: Hillf Danton @ 2026-05-28 23:00 UTC (permalink / raw)
  To: Ming Lei
  Cc: Tetsuo Handa, Jens Axboe, Bart Van Assche, Christoph Hellwig,
	Damien Le Moal, linux-block, LKML, Andrew Morton, Linus Torvalds,
	linux-btrfs, David Sterba, linux-fsdevel, Christian Brauner

On Thu, 28 May 2026 13:43:31 +0800 Hillf Danton wrote:
>On Tue, 26 May 2026 22:00:49 -0500 Ming Lei wrote:
>>On Wed, May 27, 2026 at 10:35:56AM +0900, Tetsuo Handa wrote:
>>> On 2026/05/27 10:20, Ming Lei wrote:
>>> >> Of course we should try to figure out the root cause first, but how can we do?
>>> > 
>>> > Definitely unexpected write IO(after umount & loop closed) from btrfs is more serious,
>>> > which may cause data loss, so CC btrfs list and maintainer.
>>> 
>>> Why do you assume that the culprit is btrfs?
>>> 
>>> https://syzkaller.appspot.com/bug?extid=bc273027d5643e48e5b3 indicated that
>>> this similar race is also happening with jfs.
>>
>> I just didn't see the above report on jfs.
>> 
>> It doesn't change anything, the same question still stands: unexpected write IO is issued
>> or crosses umount & last closing of loop disk.
>>
> Given the loop workqueue that triggered the jfs warning, can you specify
> the reason why the workqueue in question is NOT flushed while closing disk?
>
Got it, the loop workqueue is NOT flushed to avoid deadlock, see d292dc80686a
("loop: don't destroy lo->workqueue in __loop_clr_fd") for detail.
And the deadlock can be reproduced by flushing the loop workqueue with
disk->open_mutex held [1].

[1] Subject: Re: [syzbot] possible deadlock in blkdev_put (3)
https://lore.kernel.org/lkml/000000000000ea753505da2658d5@google.com/


* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-05-28 23:00                       ` Hillf Danton
@ 2026-05-29  0:14                         ` Tetsuo Handa
  2026-05-29  7:04                           ` Hillf Danton
  0 siblings, 1 reply; 21+ messages in thread
From: Tetsuo Handa @ 2026-05-29  0:14 UTC (permalink / raw)
  To: Hillf Danton, Ming Lei
  Cc: Jens Axboe, Bart Van Assche, Christoph Hellwig, Damien Le Moal,
	linux-block, LKML, Andrew Morton, Linus Torvalds, linux-btrfs,
	David Sterba, linux-fsdevel, Christian Brauner

On 2026/05/29 8:00, Hillf Danton wrote:
>> Given the loop workqueue that triggered the jfs warning, can you specify
>> the reason why the workqueue in question is NOT flushed while closing disk?
>>
> Got it, the loop workqueue is NOT flushed to avoid deadlock, see d292dc80686a
> ("loop: don't destroy lo->workqueue in __loop_clr_fd") for detail.
> And the deadlock can be reproduced by flushing the loop workqueue with
> disk->open_mutex held [1].
> 
> [1] Subject: Re: [syzbot] possible deadlock in blkdev_put (3)
> https://lore.kernel.org/lkml/000000000000ea753505da2658d5@google.com/

We can avoid the following lockdep warnings (including [1] you mentioned)

  https://syzkaller.appspot.com/bug?extid=2f62807dc3239b8f584e
  https://syzkaller.appspot.com/bug?extid=c4e9d077bcc86bee08dc
  https://syzkaller.appspot.com/bug?extid=0f427123ae84b3ba6dc7
  https://syzkaller.appspot.com/bug?extid=4feabfc9641267769c97
  https://syzkaller.appspot.com/bug?extid=fb0ff9bfe34ad282ebd4

caused by "drain_workqueue() with disk->open_mutex held" if we assign a
caller-specific lockdep class to disk->open_mutex

  https://sourceforge.net/p/tomoyo/tomoyo.git/ci/c2245c765ebeba9dcb924d9171d8d470a9ac41c8/

.

Also, we can avoid lockdep warning caused by "drain_workqueue() with disk->open_mutex held" +
"holding system_transition_mutex" if we forbid binding to pseudo files as backing file
in the loop driver

  https://lkml.kernel.org/r/d38e4600-3c32-491f-aa49-905f4fad1bfb@I-love.SAKURA.ne.jp

which we can reproduce with

  echo 7:0 > /sys/power/resume
  losetup /dev/loop0 /sys/power/resume
  cat /dev/loop0 > /dev/null
  losetup -d /dev/loop0

.

Therefore, I think we can address this problem by "drain_workqueue() with disk->open_mutex
held" on the loop driver side.



However, the possibility that the last-millisecond writeback request
(which runs during the unmount operation) from the filesystem fails due to the

    if (data_race(READ_ONCE(lo->lo_state)) != Lo_bound)
        return BLK_STS_IOERR;

check in loop_queue_rq() will remain. Therefore, addressing this problem
within the individual filesystems would be the stricter solution. But guessing from
the pace at which jfs fixes bugs, it would take a long time before we stop seeing
this problem...
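
The effect of the quoted check can be shown with a small userspace model. It is
a sketch under stated assumptions: the model_* types and functions are
hypothetical stand-ins, and only the comparison against the bound state is
taken from the snippet above.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver's state and status values. */
enum model_lo_state { MODEL_LO_UNBOUND, MODEL_LO_BOUND, MODEL_LO_RUNDOWN };
enum model_status { MODEL_STS_OK = 0, MODEL_STS_IOERR = 1 };

struct model_loop {
	enum model_lo_state state;
	unsigned int accepted;	/* requests handed on for processing */
	unsigned int rejected;	/* requests failed at queue time */
};

/* Same shape as the quoted check: anything other than "bound" fails. */
static enum model_status model_queue_rq(struct model_loop *lo)
{
	if (lo->state != MODEL_LO_BOUND) {
		lo->rejected++;
		return MODEL_STS_IOERR;
	}
	lo->accepted++;
	return MODEL_STS_OK;
}

/* Models the last close: the device leaves the bound state. */
static void model_last_close(struct model_loop *lo)
{
	lo->state = MODEL_LO_RUNDOWN;
}
```

A request queued while the device is bound is accepted. After
model_last_close(), the same call returns MODEL_STS_IOERR, which is the failure
a late writeback request from the filesystem would see.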



* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-05-29  0:14                         ` Tetsuo Handa
@ 2026-05-29  7:04                           ` Hillf Danton
  2026-05-29 22:05                             ` Hillf Danton
  0 siblings, 1 reply; 21+ messages in thread
From: Hillf Danton @ 2026-05-29  7:04 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Jens Axboe, Bart Van Assche, Christoph Hellwig, Damien Le Moal,
	Ming Lei, linux-block, LKML, Andrew Morton, Linus Torvalds,
	linux-btrfs, David Sterba, linux-fsdevel, Christian Brauner

On Fri, 29 May 2026 09:14:47 +0900 Tetsuo Handa wrote:
>On 2026/05/29 8:00, Hillf Danton wrote:
>>> Given the loop workqueue that triggered the jfs warning, can you specify
>>> the reason why the workqueue in question is NOT flushed while closing disk?
>>>
>> Got it, the loop workqueue is NOT flushed to avoid deadlock, see d292dc80686a
>> ("loop: don't destroy lo->workqueue in __loop_clr_fd") for detail.
>> And the deadlock can be reproduced by flushing the loop workqueue with
>> disk->open_mutex held [1].
>> 
>> [1] Subject: Re: [syzbot] possible deadlock in blkdev_put (3)
>> https://lore.kernel.org/lkml/000000000000ea753505da2658d5@google.com/
>
>We can avoid the following lockdep warnings (including [1] you mentioned)
>
>  https://syzkaller.appspot.com/bug?extid=2f62807dc3239b8f584e
>  https://syzkaller.appspot.com/bug?extid=c4e9d077bcc86bee08dc
>  https://syzkaller.appspot.com/bug?extid=0f427123ae84b3ba6dc7
>  https://syzkaller.appspot.com/bug?extid=4feabfc9641267769c97
>  https://syzkaller.appspot.com/bug?extid=fb0ff9bfe34ad282ebd4
>
>caused by "drain_workqueue() with disk->open_mutex held" if we assign
>caller-specific lockdep class to disk->open_mutex
>
>  https://sourceforge.net/p/tomoyo/tomoyo.git/ci/c2245c765ebeba9dcb924d9171d8d470a9ac41c8/
>
>.
>
>Also, we can avoid lockdep warning caused by "drain_workqueue() with disk->open_mutex held" +
>"holding system_transition_mutex" if we forbid binding to pseudo files as backing file
>in the loop driver
>
>  https://lkml.kernel.org/r/d38e4600-3c32-491f-aa49-905f4fad1bfb@I-love.SAKURA.ne.jp
>
>which we can reproduce with
>
>  echo 7:0 > /sys/power/resume
>  losetup /dev/loop0 /sys/power/resume
>  cat /dev/loop0 > /dev/null
>  losetup -d /dev/loop0
>
>.
>
>Therefore, I think we can address this problem by "drain_workqueue() with disk->open_mutex
>held" in the loop driver side.
>
Good news.
>
>
>However, the possibility that the last milli-second writeback request
>(which runs during unmount operation) from filesystem fails due to
>
>    if (data_race(READ_ONCE(lo->lo_state)) != Lo_bound)
>        return BLK_STS_IOERR;
>
>check in loop_queue_rq() will remain.

This conflicts with "There is no need to destroy the workqueue when
clearing unbinding a loop device from a backing file." in d292dc80686a

>Therefore, addressing this problem
>within individual filesystem will be more strict solution. But guessing from

Conflicts with "Another thing is, if it's some btrfs bios on-the-fly after 
close_ctree(), the most common symptom should be NULL pointer 
dereference inside various btrfs endio functions." [2] once more.

And you need to pay the fs guys more than two cents I think for cooking
a FIX.

[2] Subject: Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
https://lore.kernel.org/lkml/36571f8a-4df8-4152-b078-d82dbff4ad7e@suse.com/

>the pace jfs fixes bugs, it would take long time before we stop seeing
>this problem...


* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-05-29  7:04                           ` Hillf Danton
@ 2026-05-29 22:05                             ` Hillf Danton
  2026-05-30 23:57                               ` Tetsuo Handa
  0 siblings, 1 reply; 21+ messages in thread
From: Hillf Danton @ 2026-05-29 22:05 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Jens Axboe, Bart Van Assche, Christoph Hellwig, Damien Le Moal,
	Ming Lei, linux-block, LKML, Andrew Morton, Linus Torvalds,
	linux-btrfs, David Sterba, linux-fsdevel, Christian Brauner,
	syzbot+78ad2c6a58c0a1faa5f5

On Fri, 29 May 2026 15:04:10 +0800 Hillf Danton wrote:
>On Fri, 29 May 2026 09:14:47 +0900 Tetsuo Handa wrote:
>>On 2026/05/29 8:00, Hillf Danton wrote:
>>>> Given the loop workqueue that triggered the jfs warning, can you specify
>>>> the reason why the workqueue in question is NOT flushed while closing disk?
>>>>
>>> Got it, the loop workqueue is NOT flushed to avoid deadlock, see d292dc80686a
>>> ("loop: don't destroy lo->workqueue in __loop_clr_fd") for detail.
>>> And the deadlock can be reproduced by flushing the loop workqueue with
>>> disk->open_mutex held [1].
>>> 
>>> [1] Subject: Re: [syzbot] possible deadlock in blkdev_put (3)
>>> https://lore.kernel.org/lkml/000000000000ea753505da2658d5@google.com/
>>
>>We can avoid the following lockdep warnings (including [1] you mentioned)
>>
>>  https://syzkaller.appspot.com/bug?extid=2f62807dc3239b8f584e
>>  https://syzkaller.appspot.com/bug?extid=c4e9d077bcc86bee08dc
>>  https://syzkaller.appspot.com/bug?extid=0f427123ae84b3ba6dc7
>>  https://syzkaller.appspot.com/bug?extid=4feabfc9641267769c97
>>  https://syzkaller.appspot.com/bug?extid=fb0ff9bfe34ad282ebd4
>>
>>caused by "drain_workqueue() with disk->open_mutex held" if we assign
>>caller-specific lockdep class to disk->open_mutex
>>
>>  https://sourceforge.net/p/tomoyo/tomoyo.git/ci/c2245c765ebeba9dcb924d9171d8d470a9ac41c8/
>>
>>.
>>
>>Also, we can avoid lockdep warning caused by "drain_workqueue() with disk->open_mutex held" +
>>"holding system_transition_mutex" if we forbid binding to pseudo files as backing file
>>in the loop driver
>>
>>  https://lkml.kernel.org/r/d38e4600-3c32-491f-aa49-905f4fad1bfb@I-love.SAKURA.ne.jp
>>
>>which we can reproduce with
>>
>>  echo 7:0 > /sys/power/resume
>>  losetup /dev/loop0 /sys/power/resume
>>  cat /dev/loop0 > /dev/null
>>  losetup -d /dev/loop0
>>
>>.
>>
>> Therefore, I think we can address this problem by "drain_workqueue() with disk->open_mutex
>> held" in the loop driver side.
>>
> Good news.
>
Bad news: Subject: [syzbot] [block?] possible deadlock in loop_process_work
[3] https://lore.kernel.org/lkml/6a19f5f7.5099cdd9.8e407.0004.GAE@google.com/

syzbot found the following issue on:

HEAD commit:    c1ecb239fa34 Add linux-next specific files for 20260522
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=12fa6336580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=77a9211ff284de54
dashboard link: https://syzkaller.appspot.com/bug?extid=78ad2c6a58c0a1faa5f5
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/4cb88c910144/disk-c1ecb239.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/4a9bc938cf88/vmlinux-c1ecb239.xz
kernel image: https://storage.googleapis.com/syzbot-assets/684f1e33f264/bzImage-c1ecb239.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+78ad2c6a58c0a1faa5f5@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
syzkaller #0 Tainted: G             L
------------------------------------------------------
kworker/u8:15/1491 is trying to acquire lock:
ffff88805e1a6480 (sb_writers#5){.+.+}-{0:0}, at: do_req_filebacked drivers/block/loop.c:433 [inline]
ffff88805e1a6480 (sb_writers#5){.+.+}-{0:0}, at: loop_handle_cmd drivers/block/loop.c:1941 [inline]
ffff88805e1a6480 (sb_writers#5){.+.+}-{0:0}, at: loop_process_work+0x637/0x11b0 drivers/block/loop.c:1976

but task is already holding lock:
ffffc90006e27c40 ((work_completion)(&worker->work)){+.+.}-{0:0}, at: process_one_work+0x8be/0x1630 kernel/workqueue.c:3294

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #7 ((work_completion)(&worker->work)){+.+.}-{0:0}:
       process_one_work+0x8d7/0x1630 kernel/workqueue.c:3294
       process_scheduled_works kernel/workqueue.c:3401 [inline]
       worker_thread+0xb49/0x1140 kernel/workqueue.c:3482
       kthread+0x388/0x470 kernel/kthread.c:436
       ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

-> #6 ((wq_completion)loop4){+.+.}-{0:0}:
       touch_wq_lockdep_map+0xcb/0x180 kernel/workqueue.c:4033
       __flush_workqueue+0x14b/0x14f0 kernel/workqueue.c:4075
       drain_workqueue+0xd3/0x390 kernel/workqueue.c:4239
       __loop_clr_fd drivers/block/loop.c:1130 [inline]
       lo_release+0x287/0x8f0 drivers/block/loop.c:1767
       bdev_release+0x541/0x660 block/bdev.c:-1
       blkdev_release+0x15/0x20 block/fops.c:705
       __fput+0x461/0xa70 fs/file_table.c:510
       fput_close_sync+0x11f/0x240 fs/file_table.c:615
       __do_sys_close fs/open.c:1511 [inline]
       __se_sys_close fs/open.c:1496 [inline]
       __x64_sys_close+0x7e/0x110 fs/open.c:1496
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0x15f/0x560 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #5 (&disk->open_mutex){+.+.}-{4:4}:
       __mutex_lock_common kernel/locking/rtmutex_api.c:559 [inline]
       mutex_lock_nested+0x5a/0x1d0 kernel/locking/rtmutex_api.c:578
       __del_gendisk+0x127/0x980 block/genhd.c:710
       del_gendisk+0xe7/0x160 block/genhd.c:823
       nbd_dev_remove drivers/block/nbd.c:268 [inline]
       nbd_dev_remove_work+0x47/0xe0 drivers/block/nbd.c:284
       process_one_work+0x98b/0x1630 kernel/workqueue.c:3318
       process_scheduled_works kernel/workqueue.c:3401 [inline]
       worker_thread+0xb49/0x1140 kernel/workqueue.c:3482
       kthread+0x388/0x470 kernel/kthread.c:436
       ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

-> #4 (&set->update_nr_hwq_lock){++++}-{4:4}:
       down_read+0x97/0x200 kernel/locking/rwsem.c:1568
       add_disk_fwnode+0xe7/0x480 block/genhd.c:596
       add_disk include/linux/blkdev.h:794 [inline]
       nbd_dev_add+0x72c/0xb50 drivers/block/nbd.c:1984
       nbd_genl_connect+0x965/0x1c80 drivers/block/nbd.c:2125
       genl_family_rcv_msg_doit+0x22a/0x330 net/netlink/genetlink.c:1114
       genl_family_rcv_msg net/netlink/genetlink.c:1194 [inline]
       genl_rcv_msg+0x61c/0x7a0 net/netlink/genetlink.c:1209
       netlink_rcv_skb+0x232/0x4b0 net/netlink/af_netlink.c:2551
       genl_rcv+0x28/0x40 net/netlink/genetlink.c:1218
       netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
       netlink_unicast+0x780/0x920 net/netlink/af_netlink.c:1345
       netlink_sendmsg+0x813/0xb40 net/netlink/af_netlink.c:1895
       sock_sendmsg_nosec+0x112/0x150 net/socket.c:797
       __sock_sendmsg net/socket.c:812 [inline]
       ____sys_sendmsg+0x55c/0x870 net/socket.c:2716
       ___sys_sendmsg+0x2a5/0x360 net/socket.c:2770
       __sys_sendmsg net/socket.c:2802 [inline]
       __do_sys_sendmsg net/socket.c:2807 [inline]
       __se_sys_sendmsg net/socket.c:2805 [inline]
       __x64_sys_sendmsg+0x1c3/0x2a0 net/socket.c:2805
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0x15f/0x560 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #3 (genl_mutex){+.+.}-{4:4}:
       __mutex_lock_common kernel/locking/rtmutex_api.c:559 [inline]
       mutex_lock_nested+0x5a/0x1d0 kernel/locking/rtmutex_api.c:578
       genl_lock net/netlink/genetlink.c:35 [inline]
       genl_lock_all net/netlink/genetlink.c:48 [inline]
       genl_register_family+0x7b9/0x17b0 net/netlink/genetlink.c:784
       vdpa_init+0x39/0x70 drivers/vdpa/vdpa.c:1565
       do_one_initcall+0x250/0x870 init/main.c:1347
       do_initcall_level+0x104/0x190 init/main.c:1409
       do_initcalls+0x59/0xa0 init/main.c:1425
       kernel_init_freeable+0x2a6/0x3e0 init/main.c:1658
       kernel_init+0x1d/0x1d0 init/main.c:1548
       ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

-> #2 (cb_lock){++++}-{4:4}:
       down_read+0x97/0x200 kernel/locking/rwsem.c:1568
       genl_rcv+0x19/0x40 net/netlink/genetlink.c:1217
       netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
       netlink_unicast+0x780/0x920 net/netlink/af_netlink.c:1345
       netlink_sendmsg+0x813/0xb40 net/netlink/af_netlink.c:1895
       sock_sendmsg_nosec+0x112/0x150 net/socket.c:797
       __sock_sendmsg net/socket.c:812 [inline]
       sock_sendmsg+0x1ca/0x2d0 net/socket.c:835
       splice_to_socket+0xae5/0x11f0 fs/splice.c:884
       do_splice_from fs/splice.c:936 [inline]
       do_splice+0xef8/0x1940 fs/splice.c:1349
       __do_splice fs/splice.c:1431 [inline]
       __do_sys_splice fs/splice.c:1634 [inline]
       __se_sys_splice+0x353/0x490 fs/splice.c:1616
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0x15f/0x560 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #1 (&pipe->mutex){+.+.}-{4:4}:
       __mutex_lock_common kernel/locking/rtmutex_api.c:559 [inline]
       mutex_lock_nested+0x5a/0x1d0 kernel/locking/rtmutex_api.c:578
       iter_file_splice_write+0x1f3/0x10f0 fs/splice.c:682
       do_splice_from fs/splice.c:936 [inline]
       do_splice+0xef8/0x1940 fs/splice.c:1349
       __do_splice fs/splice.c:1431 [inline]
       __do_sys_splice fs/splice.c:1634 [inline]
       __se_sys_splice+0x353/0x490 fs/splice.c:1616
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0x15f/0x560 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (sb_writers#5){.+.+}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3167 [inline]
       check_prevs_add kernel/locking/lockdep.c:3286 [inline]
       validate_chain kernel/locking/lockdep.c:3910 [inline]
       __lock_acquire+0x15a5/0x2d10 kernel/locking/lockdep.c:5239
       lock_acquire+0x106/0x350 kernel/locking/lockdep.c:5870
       percpu_down_read_internal include/linux/percpu-rwsem.h:53 [inline]
       percpu_down_read_freezable include/linux/percpu-rwsem.h:83 [inline]
       __sb_start_write include/linux/fs/super.h:19 [inline]
       sb_start_write include/linux/fs/super.h:125 [inline]
       kiocb_start_write include/linux/fs.h:2767 [inline]
       lo_rw_aio+0xb1b/0xf00 drivers/block/loop.c:401
       do_req_filebacked drivers/block/loop.c:433 [inline]
       loop_handle_cmd drivers/block/loop.c:1941 [inline]
       loop_process_work+0x637/0x11b0 drivers/block/loop.c:1976
       process_one_work+0x98b/0x1630 kernel/workqueue.c:3318
       process_scheduled_works kernel/workqueue.c:3401 [inline]
       worker_thread+0xb49/0x1140 kernel/workqueue.c:3482
       kthread+0x388/0x470 kernel/kthread.c:436
       ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

other info that might help us debug this:

Chain exists of:
  sb_writers#5 --> (wq_completion)loop4 --> (work_completion)(&worker->work)

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((work_completion)(&worker->work));
                               lock((wq_completion)loop4);
                               lock((work_completion)(&worker->work));
  rlock(sb_writers#5);

 *** DEADLOCK ***


* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-05-29 22:05                             ` Hillf Danton
@ 2026-05-30 23:57                               ` Tetsuo Handa
  2026-06-07 10:54                                 ` [PATCH v4] " Tetsuo Handa
  0 siblings, 1 reply; 21+ messages in thread
From: Tetsuo Handa @ 2026-05-30 23:57 UTC (permalink / raw)
  To: Hillf Danton, Jens Axboe
  Cc: Bart Van Assche, Christoph Hellwig, Damien Le Moal, Ming Lei,
	linux-block, LKML, Andrew Morton, Linus Torvalds, linux-btrfs,
	David Sterba, linux-fsdevel, Christian Brauner

On 2026/05/30 7:05, Hillf Danton wrote:
>>> Therefore, I think we can address this problem by "drain_workqueue() with disk->open_mutex
>>> held" in the loop driver side.
>>>
>> Good news.
>>
> Bad news: Subject: [syzbot] [block?] possible deadlock in loop_process_work
> [3] https://lore.kernel.org/lkml/6a19f5f7.5099cdd9.8e407.0004.GAE@google.com/
> 

OK. I sent two patches

  https://lkml.kernel.org/r/147ed056-03d9-4214-b925-0f10fc00cf27@I-love.SAKURA.ne.jp
  https://lkml.kernel.org/r/148efba2-a0b6-47d7-ac76-b19d2f4b696c@I-love.SAKURA.ne.jp

in preparation for evaluating the possibility of calling drain_workqueue() from
__loop_clr_fd(). But as far as syzbot has tested using the linux-next tree, the reports

  https://syzkaller.appspot.com/bug?extid=c4e9d077bcc86bee08dc
  https://syzkaller.appspot.com/bug?extid=4feabfc9641267769c97

seem to remain even with the above patches applied.

Therefore, I think that we need to call drain_workqueue() from __loop_clr_fd()
without holding disk->open_mutex (if we address this NULL pointer dereference
problem by updating the loop driver).

"[PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()" was an attempt to call
drain_workqueue() from __loop_clr_fd() without holding disk->open_mutex, but Sashiko's
review ( https://sashiko.dev/#/patchset/fda8abc8-6aa2-463b-bf72-865f6b838034%40I-love.SAKURA.ne.jp )
pointed out that the "module_put(THIS_MODULE);" executed as the last step of __loop_clr_fd()
opens a race window in which a module unload can be triggered concurrently, because this
module_put(THIS_MODULE) call can drop the module refcount of the loop driver to 0. In other
words, we cannot safely manage the refcount of the loop module without support from the
caller of lo_release() (i.e. bdev_release()).

  void bdev_release(struct file *bdev_file)
  {
  (...snipped...)
  	if (bdev_is_partition(bdev))
  		blkdev_put_part(bdev);
  	else
  		blkdev_put_whole(bdev);
  	mutex_unlock(&disk->open_mutex); // <= Holding disk->open_mutex until __loop_clr_fd() completes causes a circular locking problem.
  
  	module_put(disk->fops->owner); // <= Must be called after __loop_clr_fd() completes in order to manage the module refcount safely.
  put_no_open:
  	blkdev_put_no_open(bdev);
  }

Therefore, I think that the only robust and safe approach is, although you won't be
happy to see a layering violation / tricky code, either

  (a) allow __loop_clr_fd() to temporarily drop disk->open_mutex

or

  (b) add a new callback for the loop driver which is called between mutex_unlock(&disk->open_mutex) and module_put(disk->fops->owner)

Jens, what do you think?

One might argue that this problem should be fixed on the filesystem side by
ensuring all filesystems wait for I/O requests safely. However, from the
perspective of defensive programming, the loop driver should be robust enough
to handle incomplete I/O serialization from underlying layers and so prevent a
general protection fault (GPF).
Furthermore, without adding noisy debug printk() messages, it is extremely
difficult to pinpoint which specific layer or filesystem failed to wait for
the I/O requests.



* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-05-28 10:16                           ` Qu Wenruo
@ 2026-06-01 14:40                             ` Christoph Hellwig
  2026-06-01 16:29                               ` Brian Foster
  2026-06-01 15:29                             ` Ming Lei
  1 sibling, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2026-06-01 14:40 UTC (permalink / raw)
  To: Qu Wenruo
  Cc: Christoph Hellwig, Damien Le Moal, Tetsuo Handa, Ming Lei,
	Jens Axboe, Bart Van Assche, linux-block, LKML, Andrew Morton,
	Linus Torvalds, linux-btrfs, David Sterba, linux-fsdevel,
	Christian Brauner, Brian Foster

On Thu, May 28, 2026 at 07:46:24PM +0930, Qu Wenruo wrote:
>> e.g. 9c7504aa72b6 ("xfs: track and serialize in-flight async buffers against
>> unmount")
> Considering the xfs fix is pretty old, it's before the fix hint thus no 
> such mention in fstests.
>
> Do you happen to know which test case is for that fix?
> I'd like to adapt it for btrfs as a reproducer.

No.  Adding Brian who authored that commit.



* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-05-28 10:16                           ` Qu Wenruo
  2026-06-01 14:40                             ` Christoph Hellwig
@ 2026-06-01 15:29                             ` Ming Lei
  2026-06-01 21:51                               ` Hillf Danton
  1 sibling, 1 reply; 21+ messages in thread
From: Ming Lei @ 2026-06-01 15:29 UTC (permalink / raw)
  To: Qu Wenruo
  Cc: Christoph Hellwig, Damien Le Moal, Tetsuo Handa, Jens Axboe,
	Bart Van Assche, linux-block, LKML, Andrew Morton, Linus Torvalds,
	linux-btrfs, David Sterba, linux-fsdevel, Christian Brauner

On Thu, May 28, 2026 at 5:16 AM Qu Wenruo <wqu@suse.com> wrote:
>
>
>
> On 2026/5/28 18:08, Christoph Hellwig wrote:
> > On Thu, May 28, 2026 at 03:11:05AM +0900, Damien Le Moal wrote:
> >> It sounds like the VFS unmount call needs to have something that waits for
> >> sync() to complete. Though, it really feels very strange that an FS can complete
> >
> > I don't think this is the VFS-controlled VFS file data writeback, which
> > we wait on, but some kind of fs controlled metadata.  And yes, it looks
> > like those file systems are buggy in that area.  We definitively had
> > such bugs in XFS before and fixed them.
> >
> > e.g. 9c7504aa72b6 ("xfs: track and serialize in-flight async buffers against
> > unmount")
> Considering the xfs fix is pretty old, it's before the fix hint thus no
> such mention in fstests.
>
> Do you happen to know which test case is for that fix?
> I'd like to adapt it for btrfs as a reproducer.
>
> This syzbot report doesn't provide a reproducer.
>
>
> Another thing is, if it's some btrfs bios on-the-fly after
> close_ctree(), the most common symptom should be NULL pointer
> dereference inside various btrfs endio functions.
> As all those end_bbio_*() functions are referring to either fs_info or
> inode/eb, thus if the fs is unmounted before the bio finished, they
> should all cause use-after-free.
>
> The only exception is discard, which is using blkdev_issue_discard()
> thus has no such reference to btrfs internal structure, but that's out
> of my understanding.

syzbot log shows the null-ptr-deref  is on WRITE, instead of DISCARD.

https://syzkaller.appspot.com/bug?extid=cd8a9a308e879a4e2c28

Adding WARN_ON(!lo->lo_backing_file) in loop_queue_rq() might capture
this bio submission context if this req isn't issued via wq.

Thanks,
Ming Lei


* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-06-01 14:40                             ` Christoph Hellwig
@ 2026-06-01 16:29                               ` Brian Foster
  2026-06-01 22:27                                 ` Qu Wenruo
  0 siblings, 1 reply; 21+ messages in thread
From: Brian Foster @ 2026-06-01 16:29 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Qu Wenruo, Damien Le Moal, Tetsuo Handa, Ming Lei, Jens Axboe,
	Bart Van Assche, linux-block, LKML, Andrew Morton, Linus Torvalds,
	linux-btrfs, David Sterba, linux-fsdevel, Christian Brauner

On Mon, Jun 01, 2026 at 04:40:34PM +0200, Christoph Hellwig wrote:
> On Thu, May 28, 2026 at 07:46:24PM +0930, Qu Wenruo wrote:
> >> e.g. 9c7504aa72b6 ("xfs: track and serialize in-flight async buffers against
> >> unmount")
> > Considering the xfs fix is pretty old, it's before the fix hint thus no 
> > such mention in fstests.
> >
> > Do you happen to know which test case is for that fix?
> > I'd like to adapt it for btrfs as a reproducer.
> 
> No.  Adding Brian who authored that commit.
> 

I haven't followed through the full thread here... But if you're just
looking for an existing test case associated with the commit above on
XFS, I did some quick digging and xfs/311 is the original reproducer for
that one.

Brian



* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-06-01 15:29                             ` Ming Lei
@ 2026-06-01 21:51                               ` Hillf Danton
  2026-06-01 22:14                                 ` Ming Lei
  0 siblings, 1 reply; 21+ messages in thread
From: Hillf Danton @ 2026-06-01 21:51 UTC (permalink / raw)
  To: Ming Lei
  Cc: Qu Wenruo, Christoph Hellwig, Damien Le Moal, Tetsuo Handa,
	Jens Axboe, Bart Van Assche, linux-block, LKML, Andrew Morton,
	Linus Torvalds, linux-btrfs, David Sterba, linux-fsdevel,
	Christian Brauner

On Mon, 1 Jun 2026 10:29:25 -0500 Ming Lei wrote:
>On Thu, May 28, 2026 at 5:16 AM Qu Wenruo <wqu@suse.com> wrote:
>> On 2026/5/28 18:08, Christoph Hellwig wrote:
>> > On Thu, May 28, 2026 at 03:11:05AM +0900, Damien Le Moal wrote:
>> >> It sounds like the VFS unmount call needs to have something that waits for
>> >> sync() to complete. Though, it really feels very strange that an FS can complete
>> >
>> > I don't think this is the VFS-controlled VFS file data writeback, which
>> > we wait on, but some kind of fs controlled metadata.  And yes, it looks
>> > like those file systems are buggy in that area.  We definitively had
>> > such bugs in XFS before and fixed them.
>> >
>> > e.g. 9c7504aa72b6 ("xfs: track and serialize in-flight async buffers against
>> > unmount")
>> Considering the xfs fix is pretty old, it's before the fix hint thus no
>> such mention in fstests.
>>
>> Do you happen to know which test case is for that fix?
>> I'd like to adapt it for btrfs as a reproducer.
>>
>> This syzbot report doesn't provide a reproducer.
>>
>>
>> Another thing is, if it's some btrfs bios on-the-fly after
>> close_ctree(), the most common symptom should be NULL pointer
>> dereference inside various btrfs endio functions.
>> As all those end_bbio_*() functions are referring to either fs_info or
>> inode/eb, thus if the fs is unmounted before the bio finished, they
>> should all cause use-after-free.
>>
>> The only exception is discard, which is using blkdev_issue_discard()
>> thus has no such reference to btrfs internal structure, but that's out
>> of my understanding.
>
> syzbot log shows the null-ptr-deref  is on WRITE, instead of DISCARD.
>
> https://syzkaller.appspot.com/bug?extid=cd8a9a308e879a4e2c28
>
> Adding WARN_ON(!lo->lo_backing_file) in loop_queue_rq() might capture
> this bio submission context if this req isn't issued via wq.
>
I suspect this makes $.02 sense given the check of Lo_bound upon queuing rq.

static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx,
		const struct blk_mq_queue_data *bd)
{
	struct request *rq = bd->rq;
	struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);
	struct loop_device *lo = rq->q->queuedata;

	blk_mq_start_request(rq);

	if (data_race(READ_ONCE(lo->lo_state)) != Lo_bound)
		return BLK_STS_IOERR;


* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-06-01 21:51                               ` Hillf Danton
@ 2026-06-01 22:14                                 ` Ming Lei
  2026-06-01 23:17                                   ` Hillf Danton
  0 siblings, 1 reply; 21+ messages in thread
From: Ming Lei @ 2026-06-01 22:14 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Qu Wenruo, Christoph Hellwig, Damien Le Moal, Tetsuo Handa,
	Jens Axboe, Bart Van Assche, linux-block, LKML, Andrew Morton,
	Linus Torvalds, linux-btrfs, David Sterba, linux-fsdevel,
	Christian Brauner

On Tue, Jun 02, 2026 at 05:51:26AM +0800, Hillf Danton wrote:
> On Mon, 1 Jun 2026 10:29:25 -0500 Ming Lei wrote:
> >On Thu, May 28, 2026 at 5:16 AM Qu Wenruo <wqu@suse.com> wrote:
> >> On 2026/5/28 18:08, Christoph Hellwig wrote:
> >> > On Thu, May 28, 2026 at 03:11:05AM +0900, Damien Le Moal wrote:
> >> >> It sounds like the VFS unmount call needs to have something that waits for
> >> >> sync() to complete. Though, it really feels very strange that an FS can complete
> >> >
> >> > I don't think this is the VFS-controlled VFS file data writeback, which
> >> > we wait on, but some kind of fs controlled metadata.  And yes, it looks
> >> > like those file systems are buggy in that area.  We definitively had
> >> > such bugs in XFS before and fixed them.
> >> >
> >> > e.g. 9c7504aa72b6 ("xfs: track and serialize in-flight async buffers against
> >> > unmount")
> >> Considering the xfs fix is pretty old, it's before the fix hint thus no
> >> such mention in fstests.
> >>
> >> Do you happen to know which test case is for that fix?
> >> I'd like to adapt it for btrfs as a reproducer.
> >>
> >> This syzbot report doesn't provide a reproducer.
> >>
> >>
> >> Another thing is, if it's some btrfs bios on-the-fly after
> >> close_ctree(), the most common symptom should be NULL pointer
> >> dereference inside various btrfs endio functions.
> >> As all those end_bbio_*() functions are referring to either fs_info or
> >> inode/eb, thus if the fs is unmounted before the bio finished, they
> >> should all cause use-after-free.
> >>
> >> The only exception is discard, which is using blkdev_issue_discard()
> >> thus has no such reference to btrfs internal structure, but that's out
> >> of my understanding.
> >
> > syzbot log shows the null-ptr-deref  is on WRITE, instead of DISCARD.
> >
> > https://syzkaller.appspot.com/bug?extid=cd8a9a308e879a4e2c28
> >
> > Adding WARN_ON(!lo->lo_backing_file) in loop_queue_rq() might capture
> > this bio submission context if this req isn't issued via wq.
> >
> I suspect this makes $.02 sense given the check of Lo_bound upon queuing rq.

Can't lo->lo_state be updated after the check? It is totally lockless...


Thanks,
Ming
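
The check-then-act window asked about above can be replayed with a small userland model. This is only a sketch under stated assumptions: `struct loop_model`, the `LO_*` values and every `model_*` function are hypothetical stand-ins invented for illustration, not the loop driver's definitions, and the teardown step is collapsed into a single function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver state; not the kernel definitions. */
enum lo_state_model { LO_UNBOUND, LO_BOUND, LO_RUNDOWN };

struct loop_model {
	enum lo_state_model state;
	const char *backing_file;	/* NULL once the backing file is cleared */
};

/* Queue side: models the lockless "state != Lo_bound" check.
 * Returns 0 when the request is accepted, -1 when it is rejected. */
int model_queue_rq(const struct loop_model *lo)
{
	assert(lo != NULL);
	return lo->state == LO_BOUND ? 0 : -1;
}

/* Teardown side (simplified): clear the backing file and leave the bound state. */
void model_clr_fd(struct loop_model *lo)
{
	assert(lo != NULL);
	lo->backing_file = NULL;
	lo->state = LO_UNBOUND;
}

/* Worker side: runs some time after the request was accepted.
 * Returns 1 if it would have to dereference a NULL backing file. */
int model_worker_sees_null(const struct loop_model *lo)
{
	assert(lo != NULL);
	return lo->backing_file == NULL;
}

/* Replays the interleaving: the check passes, teardown runs, then the worker runs.
 * Returns 1 when the accepted request ends up facing a NULL backing file. */
int model_replay_race(void)
{
	struct loop_model lo = { LO_BOUND, "backing.img" };
	int accepted = (model_queue_rq(&lo) == 0);

	model_clr_fd(&lo);	/* state changes after the check already passed */
	return accepted && model_worker_sees_null(&lo);
}
```

In this model the state check alone cannot protect the worker, because nothing orders the check against the later teardown; whether the real driver is exposed in the same way depends on details the model leaves out.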


* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-06-01 16:29                               ` Brian Foster
@ 2026-06-01 22:27                                 ` Qu Wenruo
  0 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2026-06-01 22:27 UTC (permalink / raw)
  To: Brian Foster, Christoph Hellwig
  Cc: Damien Le Moal, Tetsuo Handa, Ming Lei, Jens Axboe,
	Bart Van Assche, linux-block, LKML, Andrew Morton, Linus Torvalds,
	linux-btrfs, David Sterba, linux-fsdevel, Christian Brauner



On 2026/6/2 01:59, Brian Foster wrote:
> On Mon, Jun 01, 2026 at 04:40:34PM +0200, Christoph Hellwig wrote:
>> On Thu, May 28, 2026 at 07:46:24PM +0930, Qu Wenruo wrote:
>>>> e.g. 9c7504aa72b6 ("xfs: track and serialize in-flight async buffers against
>>>> unmount")
>>> Considering the xfs fix is pretty old, it's before the fix hint thus no
>>> such mention in fstests.
>>>
>>> Do you happen to know which test case is for that fix?
>>> I'd like to adapt it for btrfs as a reproducer.
>>
>> No.  Adding Brian who authored that commit.
>>
> 
> I haven't followed through the full thread here... But if you're just
> looking for an existing test case associated with the commit above on
> XFS, I did some quick digging and xfs/311 is the original reproducer for
> that one.

Thanks a lot! I'll use the same delayed umount to verify the behavior of 
btrfs.

Thanks,
Qu

> 
> Brian
> 



* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-06-01 22:14                                 ` Ming Lei
@ 2026-06-01 23:17                                   ` Hillf Danton
  2026-06-01 23:36                                     ` Ming Lei
  0 siblings, 1 reply; 21+ messages in thread
From: Hillf Danton @ 2026-06-01 23:17 UTC (permalink / raw)
  To: Ming Lei
  Cc: Qu Wenruo, Christoph Hellwig, Damien Le Moal, Tetsuo Handa,
	Jens Axboe, Bart Van Assche, linux-block, LKML, Andrew Morton,
	Linus Torvalds, linux-btrfs, David Sterba, linux-fsdevel,
	Christian Brauner

On Mon, 1 Jun 2026 17:14:59 -0500 Ming Lei wrote:
> On Tue, Jun 02, 2026 at 05:51:26AM +0800, Hillf Danton wrote:
> > On Mon, 1 Jun 2026 10:29:25 -0500 Ming Lei wrote:
> > > syzbot log shows the null-ptr-deref  is on WRITE, instead of DISCARD.
> > >
> > > https://syzkaller.appspot.com/bug?extid=cd8a9a308e879a4e2c28
> > >
> > > Adding WARN_ON(!lo->lo_backing_file) in loop_queue_rq() might capture
> > > this bio submission context if this req isn't issued via wq.
> > >
> > I suspect this makes $.02 sense given the check of Lo_bound upon queuing rq.
> 
> Can't lo->lo_state be updated after the check? It is totally lockless...
>
Sounds good hm... do you mean it is UNWISE to not flush the loop workqueue
when closing disk?


* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-06-01 23:17                                   ` Hillf Danton
@ 2026-06-01 23:36                                     ` Ming Lei
  2026-06-02  2:02                                       ` Hillf Danton
  0 siblings, 1 reply; 21+ messages in thread
From: Ming Lei @ 2026-06-01 23:36 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Qu Wenruo, Christoph Hellwig, Damien Le Moal, Tetsuo Handa,
	Jens Axboe, Bart Van Assche, linux-block, LKML, Andrew Morton,
	Linus Torvalds, linux-btrfs, David Sterba, linux-fsdevel,
	Christian Brauner

On Tue, Jun 02, 2026 at 07:17:30AM +0800, Hillf Danton wrote:
> On Mon, 1 Jun 2026 17:14:59 -0500 Ming Lei wrote:
> > On Tue, Jun 02, 2026 at 05:51:26AM +0800, Hillf Danton wrote:
> > > On Mon, 1 Jun 2026 10:29:25 -0500 Ming Lei wrote:
> > > > syzbot log shows the null-ptr-deref  is on WRITE, instead of DISCARD.
> > > >
> > > > https://syzkaller.appspot.com/bug?extid=cd8a9a308e879a4e2c28
> > > >
> > > > Adding WARN_ON(!lo->lo_backing_file) in loop_queue_rq() might capture
> > > > this bio submission context if this req isn't issued via wq.
> > > >
> > > I suspect this makes $.02 sense given the check of Lo_bound upon queuing rq.
> > 
> > Can't lo->lo_state be updated after the check? It is totally lockless...
> >
> Sounds good hm... do you mean it is UNWISE to not flush the loop workqueue
> when closing disk?

Quite the opposite, it is wise to not flush wq in __loop_clr_fd(), please
see my previous comment.


Thanks,
Ming


* Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-06-01 23:36                                     ` Ming Lei
@ 2026-06-02  2:02                                       ` Hillf Danton
  0 siblings, 0 replies; 21+ messages in thread
From: Hillf Danton @ 2026-06-02  2:02 UTC (permalink / raw)
  To: Ming Lei
  Cc: Qu Wenruo, Christoph Hellwig, Damien Le Moal, Tetsuo Handa,
	Jens Axboe, Bart Van Assche, linux-block, LKML, Andrew Morton,
	Linus Torvalds, linux-btrfs, David Sterba, linux-fsdevel,
	Christian Brauner

On Mon, 1 Jun 2026 18:36:19 -0500 Ming Lei wrote:
> On Tue, Jun 02, 2026 at 07:17:30AM +0800, Hillf Danton wrote:
> > On Mon, 1 Jun 2026 17:14:59 -0500 Ming Lei wrote:
> > > On Tue, Jun 02, 2026 at 05:51:26AM +0800, Hillf Danton wrote:
> > > > On Mon, 1 Jun 2026 10:29:25 -0500 Ming Lei wrote:
> > > > > syzbot log shows the null-ptr-deref  is on WRITE, instead of DISCARD.
> > > > >
> > > > > https://syzkaller.appspot.com/bug?extid=cd8a9a308e879a4e2c28
> > > > >
> > > > > Adding WARN_ON(!lo->lo_backing_file) in loop_queue_rq() might capture
> > > > > this bio submission context if this req isn't issued via wq.
> > > > >
> > > > I suspect this makes $.02 sense given the check of Lo_bound upon queuing rq.
> > > 
> > > Can't lo->lo_state be updated after the check? It is totally lockless...
> > >
> > Sounds good hm... do you mean it is UNWISE to not flush the loop workqueue
> > when closing disk?
> 
> Quite the opposite, it is wise to not flush wq in __loop_clr_fd(), please
> see my previous comment.
>
When queuing rq, if lo_state is updated after checking Lo_bound, I see nothing
that prevents syzbot from reporting the null-ptr-deref. Can you pinpoint
why flush is NOT needed if you are right?


* [PATCH v4] loop: Fix NULL pointer dereference in lo_rw_aio()
  2026-05-30 23:57                               ` Tetsuo Handa
@ 2026-06-07 10:54                                 ` Tetsuo Handa
  0 siblings, 0 replies; 21+ messages in thread
From: Tetsuo Handa @ 2026-06-07 10:54 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Bart Van Assche, Christoph Hellwig, Damien Le Moal, Ming Lei,
	linux-block, LKML, Andrew Morton, Linus Torvalds, linux-btrfs,
	David Sterba, linux-fsdevel, Christian Brauner, Hillf Danton

syzbot is reporting NULL pointer dereference in lo_rw_aio() [1][2].
An analysis by the Gemini AI collaborator [3] considers that this problem
is caused by a timing shift primarily exposed by commit 65565ca5f99b
("block: unify the synchronous bi_end_io callbacks"), along with helper
refactorings like commit 92c3737a2473 ("block: add a bio_submit_or_kill
helper").

But because this race is difficult to reproduce, the discussion about what
is happening and how to fix this problem has stalled. Also, we haven't
identified how many filesystems are subject to this problem.

Therefore, this patch introduces a grace period for flushing pending I/O
requests (which should be a good thing from the perspective of defensive
programming) so that we won't hit the NULL pointer dereference problem.
It also emits a BUG: message in order to help filesystem developers
identify the caller of an I/O request that failed to wait for completion,
so that they can fix such a caller to wait for completion.

Note that the BUG: message is emitted only if CONFIG_KCOV=y, because
this check is a waste of computational resources for almost all users.

Link: https://syzkaller.appspot.com/bug?extid=cd8a9a308e879a4e2c28 [1]
Link: https://syzkaller.appspot.com/bug?extid=bc273027d5643e48e5b3 [2]
Link: https://lkml.kernel.org/r/fbb3edda-f108-4e5b-acf2-266f043f8125@I-love.SAKURA.ne.jp [3]
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 drivers/block/loop.c | 82 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 80 insertions(+), 2 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 0000913f7efc..4ff254d8b623 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -85,8 +85,26 @@ struct loop_cmd {
 	struct bio_vec *bvec;
 	struct cgroup_subsys_state *blkcg_css;
 	struct cgroup_subsys_state *memcg_css;
+#ifdef CONFIG_KCOV
+	unsigned long stack_entries[30];
+	int stack_nr;
+	pid_t pid;
+	char comm[TASK_COMM_LEN];
+#endif
 };
 
+static void loop_check_io_race(struct loop_device *lo, struct loop_cmd *cmd)
+{
+#ifdef CONFIG_KCOV
+	if (unlikely(data_race(READ_ONCE(lo->lo_state)) == Lo_rundown)) {
+		pr_err("BUG: %s/%u is doing I/O request on loop%d in Lo_rundown state.\n",
+		       cmd->comm, cmd->pid, lo->lo_number);
+		printk("Call trace:\n");
+		stack_trace_print(cmd->stack_entries, cmd->stack_nr, 4);
+	}
+#endif
+}
+
 #define LOOP_IDLE_WORKER_TIMEOUT (60 * HZ)
 #define LOOP_DEFAULT_HW_Q_DEPTH 128
 
@@ -1747,8 +1765,59 @@ static void lo_release(struct gendisk *disk)
 	need_clear = (lo->lo_state == Lo_rundown);
 	mutex_unlock(&lo->lo_mutex);
 
-	if (need_clear)
+	if (need_clear) {
+		/*
+		 * Temporarily release disk->open_mutex in order to flush pending I/O
+		 * requests before clearing the backing device.
+		 *
+		 * This is a layering violation. But since bdev->bd_disk->fops->release()
+		 * (which is mapped to lo_release()) is the final function which
+		 * blkdev_put_whole() from bdev_release() calls immediately before
+		 * releasing disk->open_mutex, this changes nothing except that it opens
+		 * a new race window in which disk->fops->open() (which is mapped to
+		 * lo_open()) can be called.
+		 *
+		 * Even if lo_open() is called from blkdev_get_whole() due to this race,
+		 * the Lo_rundown state guarantees that lo_open() will fail with -ENXIO.
+		 * Thus, there will be effectively no change caused by this violation.
+		 */
+		mutex_unlock(&lo->lo_disk->open_mutex);
+		/*
+		 * Now that loop_queue_rq() sees lo->lo_state != Lo_bound,
+		 * wait for already started loop_queue_rq() to complete.
+		 */
+		synchronize_rcu();
+		/*
+		 * Now that no more works are scheduled by loop_queue_rq(),
+		 * wait for already scheduled works to complete.
+		 */
+		drain_workqueue(lo->workqueue);
+		/*
+		 * Now that no more AIO requests are scheduled by lo_rw_aio(),
+		 * wait for already started AIO to complete.
+		 *
+		 * Due to synchronize_rcu() + drain_workqueue() sequence above,
+		 * calling blk_mq_unfreeze_queue() immediately after blk_mq_freeze_queue()
+		 * returns has to be safe, for loop_queue_rq() no longer schedules new
+		 * lo_rw_aio() works and lo_rw_aio() no longer submits new AIO requests.
+		 *
+		 * Deferring blk_mq_unfreeze_queue() does not help because we are about
+		 * to clear the backing device and drop the refcount for the backing device.
+		 * There is nothing we can do if blk_mq_freeze_queue() fails to flush.
+		 */
+		blk_mq_unfreeze_queue(lo->lo_queue, blk_mq_freeze_queue(lo->lo_queue));
+		/*
+		 * Perform remaining cleanup, with disk->open_mutex held.
+		 *
+		 * The lo->lo_state should remain Lo_rundown even though we temporarily
+		 * released disk->open_mutex, because this is the only and the last user
+		 * of this loop device and lo_open() cannot succeed.
+		 */
+		mutex_lock(&lo->lo_disk->open_mutex);
+		if (WARN_ON(data_race(READ_ONCE(lo->lo_state)) != Lo_rundown))
+			return;
 		__loop_clr_fd(lo);
+	}
 }
 
 static void lo_free_disk(struct gendisk *disk)
@@ -1855,10 +1924,18 @@ static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx,
 	struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);
 	struct loop_device *lo = rq->q->queuedata;
 
+#ifdef CONFIG_KCOV
+	cmd->stack_nr = stack_trace_save(cmd->stack_entries, ARRAY_SIZE(cmd->stack_entries), 0);
+	cmd->pid = current->pid;
+	get_task_comm(cmd->comm, current);
+#endif
+
 	blk_mq_start_request(rq);
 
-	if (data_race(READ_ONCE(lo->lo_state)) != Lo_bound)
+	if (data_race(READ_ONCE(lo->lo_state)) != Lo_bound) {
+		loop_check_io_race(lo, cmd);
 		return BLK_STS_IOERR;
+	}
 
 	switch (req_op(rq)) {
 	case REQ_OP_FLUSH:
@@ -1901,6 +1978,7 @@ static void loop_handle_cmd(struct loop_cmd *cmd)
 	int ret = 0;
 	struct mem_cgroup *old_memcg = NULL;
 
+	loop_check_io_race(lo, cmd);
 	if (write && (lo->lo_flags & LO_FLAGS_READ_ONLY)) {
 		ret = -EIO;
 		goto failed;
-- 
2.47.3
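
The ordering argument in the lo_release() comments above can be followed with a small userland model. It is a rough sketch, not kernel code: `struct teardown_model` and all `model_*` functions are hypothetical names invented here, each step is reduced to counter bookkeeping, and the model assumes every queued work submits exactly one AIO request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland model; none of these names exist in the kernel. */
struct teardown_model {
	int in_queue_rq;	/* queue_rq calls past the state check, work not yet queued */
	int queued_works;	/* works waiting on the workqueue */
	int inflight_aio;	/* AIO submitted to the backing file, not yet completed */
	bool backing_present;	/* false once the backing file is cleared */
};

/* Step 1 (synchronize_rcu() in the patch): in-progress queue_rq calls finish
 * and queue their work. */
void model_wait_queue_rq(struct teardown_model *m)
{
	assert(m != NULL);
	m->queued_works += m->in_queue_rq;
	m->in_queue_rq = 0;
}

/* Step 2 (drain_workqueue()): every queued work runs and submits one AIO. */
void model_drain_works(struct teardown_model *m)
{
	assert(m != NULL);
	m->inflight_aio += m->queued_works;
	m->queued_works = 0;
}

/* Step 3 (freeze/unfreeze pair): wait until every submitted AIO has completed. */
void model_wait_aio(struct teardown_model *m)
{
	assert(m != NULL);
	m->inflight_aio = 0;
}

/* Clear the backing file; returns how many requests could still touch it. */
int model_clear_backing(struct teardown_model *m)
{
	assert(m != NULL);
	int stragglers = m->in_queue_rq + m->queued_works + m->inflight_aio;

	m->backing_present = false;
	return stragglers;
}

/* Run the teardown with or without the three-step grace period. */
int model_teardown(bool with_grace_period)
{
	struct teardown_model m = { 1, 2, 3, true };

	if (with_grace_period) {
		model_wait_queue_rq(&m);
		model_drain_works(&m);
		model_wait_aio(&m);
	}
	return model_clear_backing(&m);
}
```

The order matters in the model because each step closes the source that feeds the next one: draining works before the in-progress queue_rq calls have finished would leave works that are queued only after the drain.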




Thread overview: 21+ messages
     [not found] <ag0lS_CbKO9R5CV8@fedora>
     [not found] ` <94076bc9-2c09-4bb6-8468-b6b8af419cb9@I-love.SAKURA.ne.jp>
     [not found]   ` <ag1nfIFcykmQHbkk@fedora>
     [not found]     ` <1ab8c579-eb76-4227-8a72-6ec819135219@I-love.SAKURA.ne.jp>
     [not found]       ` <ag1223nAa0wZ8ALC@fedora>
     [not found]         ` <fda8abc8-6aa2-463b-bf72-865f6b838034@I-love.SAKURA.ne.jp>
     [not found]           ` <ahRocb0Vs_m6RF_O@fedora>
     [not found]             ` <1a9f53d4-6f48-4df8-a3d8-2b0e442a163a@I-love.SAKURA.ne.jp>
     [not found]               ` <ahZGxoI6oHQ_vSrx@fedora>
     [not found]                 ` <d1b5a737-f0e3-4927-b762-430b37fbb2f9@I-love.SAKURA.ne.jp>
2026-05-27  3:00                   ` [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio() Ming Lei
2026-05-27 11:29                     ` Tetsuo Handa
2026-05-27 18:11                       ` Damien Le Moal
2026-05-28  8:38                         ` Christoph Hellwig
2026-05-28 10:16                           ` Qu Wenruo
2026-06-01 14:40                             ` Christoph Hellwig
2026-06-01 16:29                               ` Brian Foster
2026-06-01 22:27                                 ` Qu Wenruo
2026-06-01 15:29                             ` Ming Lei
2026-06-01 21:51                               ` Hillf Danton
2026-06-01 22:14                                 ` Ming Lei
2026-06-01 23:17                                   ` Hillf Danton
2026-06-01 23:36                                     ` Ming Lei
2026-06-02  2:02                                       ` Hillf Danton
2026-05-28  5:43                     ` Hillf Danton
2026-05-28 23:00                       ` Hillf Danton
2026-05-29  0:14                         ` Tetsuo Handa
2026-05-29  7:04                           ` Hillf Danton
2026-05-29 22:05                             ` Hillf Danton
2026-05-30 23:57                               ` Tetsuo Handa
2026-06-07 10:54                                 ` [PATCH v4] " Tetsuo Handa
