public inbox for linux-fsdevel@vger.kernel.org
* [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
@ 2026-03-15 11:24 John
  2026-03-17  0:15 ` Joanne Koong
  0 siblings, 1 reply; 9+ messages in thread
From: John @ 2026-03-15 11:24 UTC (permalink / raw)
  To: linux-fsdevel@vger.kernel.org
  Cc: Joanne Koong, linux-fuse@lists.sourceforge.net,
	linux-pm@vger.kernel.org, Miklos Szeredi

Kernel: 6.19.6-arch1-1
Component: fs/fuse, fs/fs-writeback

--- SUMMARY ---

A suspend-to-RAM hang in wb_wait_for_completion() via sync_inodes_sb() persists on 6.19.6 when fuse-overlayfs is mounted on tmpfs. The fix introduced in 6.19~rc6 for Debian bug #1120058 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") does not cover this code path.

--- BACKGROUND ---

This issue was originally reported in:
https://github.com/containers/fuse-overlayfs/issues/386

The fuse-overlayfs developer identified it as a kernel issue. It was subsequently bisected to commit 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree", merged in 6.16) and tracked in Debian as bug #1120058. The fix applied in 6.19~rc6 targets wait_sb_inodes(), but the hang in the reporter's original trace and in this report occurs in sync_inodes_sb() via a separate code path that the fix does not reach.

The issue is also known to syzbot as "INFO: task hung in sync_inodes_sb" with open instances on linux-5.15, linux-6.1, and linux-6.6:

https://syzkaller.appspot.com/bug?extid=e0232bd63c6e293aaf6a
https://syzkaller.appspot.com/bug?extid=4983a35cf671e5ed55b3

--- REPRODUCTION ---

mkdir -p /dev/shm/test/up
mkdir -p /dev/shm/test/tmp
mkdir -p /dev/shm/test/data

fuse-overlayfs -o "static_nlink,noacl,\
  upperdir=/dev/shm/test/up,\
  lowerdir=$HOME/.config/mozilla/firefox,\
  workdir=/dev/shm/test/tmp" \
  /dev/shm/test/data

firefox --profile /dev/shm/test/data/PROFILENAME

Browse to YouTube, start playing a video, and trigger suspend while the video is still playing. I used the suspend entry in the XFCE4 menu.

The display goes blank and does not recover. The system does not enter suspend. Switching to a tty shows the system is alive but X11 is frozen. Reboot is blocked:

# reboot
Call to Reboot failed: Action suspend already in progress, refusing requested reboot operation.

--- CALL TRACE (6.19.6) ---

Mar 15 06:44:42 kernel: INFO: task kworker/u128:0:106160 blocked for more than 122 seconds.
Mar 15 06:44:42 kernel:       Not tainted 6.19.6-arch1-1 #1
Mar 15 06:44:42 kernel: task:kworker/u128:0  state:D stack:0     pid:106160 tgid:106160 ppid:2
Mar 15 06:44:42 kernel: Workqueue: pm_fs_sync pm_fs_sync_work_fn
Mar 15 06:44:42 kernel: Call Trace:
Mar 15 06:44:42 kernel:  <TASK>
Mar 15 06:44:42 kernel:  __schedule+0x457/0x1720
Mar 15 06:44:42 kernel:  schedule+0x27/0xd0
Mar 15 06:44:42 kernel:  wb_wait_for_completion+0x97/0xe0
Mar 15 06:44:42 kernel:  sync_inodes_sb+0xf8/0x2e0
Mar 15 06:44:42 kernel:  __iterate_supers+0xdc/0x160
Mar 15 06:44:42 kernel:  ksys_sync+0x43/0xb0
Mar 15 06:44:42 kernel:  pm_fs_sync_work_fn+0x17/0xa0
Mar 15 06:44:42 kernel:  process_one_work+0x193/0x350
Mar 15 06:44:42 kernel:  worker_thread+0x1a1/0x310
Mar 15 06:44:42 kernel:  kthread+0xfc/0x240
Mar 15 06:44:42 kernel:  ret_from_fork+0x243/0x280
Mar 15 06:44:42 kernel:  ret_from_fork_asm+0x1a/0x30
Mar 15 06:44:42 kernel:  </TASK>
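
To check whether another kernel log shows the same signature, a filter
along these lines may help. This is only a minimal sketch: the function
name hung_tasks_in is hypothetical, and it assumes the log uses the
hung-task report format shown above ("INFO: task ... blocked for more
than", followed by symbol+offset stack lines).

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): read a kernel log on stdin
# and print the task name of every hung-task report whose stack contains
# the given symbol as a real frame ("? sym" guesses still match, but
# "__pfx_sym" entries do not, because the symbol must follow a space).
hung_tasks_in() {
    sym="$1"
    awk -v sym="$sym" '
        /INFO: task .* blocked for more than/ {
            # Start of a new report: extract the "name:pid" token.
            task = $0
            sub(/.*INFO: task /, "", task)
            sub(/ blocked for more than.*/, "", task)
            reported = 0
            next
        }
        # Print each task at most once, on the first matching frame.
        task != "" && !reported && index($0, " " sym "+") {
            print task
            reported = 1
        }
    '
}
```

Possible usage, assuming a systemd journal is available:
journalctl -k -b | hung_tasks_in wb_wait_for_completion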

--- MORE CONTEXT ---

Compared to the original report (which ran through systemd-sleep -> pm_suspend -> ksys_sync), the sync call in 6.19 has been moved into a kernel workqueue via pm_fs_sync_work_fn. The hang point is identical: wb_wait_for_completion() inside sync_inodes_sb() never returns because the FUSE daemon (fuse-overlayfs) is unable to complete writeback while the suspend freezer is in progress.

The AS_NO_DATA_INTEGRITY fix targets wait_sb_inodes(), which is a separate function from sync_inodes_sb(). The hung writeback completion wait in sync_inodes_sb() is not gated by the AS_NO_DATA_INTEGRITY check and remains unaddressed.

Note: prior to commit 0c58a97f919c, sync() on FUSE filesystems was effectively a no-op, which avoided this hang at the cost of correctness. The regression was introduced when that commit made sync() actually wait on FUSE writeback completion.

I am using Arch Linux, but users of Fedora 43 and Debian testing are also hitting this bug. See the references below.

References:
1. fuse-overlayfs upstream issue: https://github.com/containers/fuse-overlayfs/issues/386
2. profile-sync-daemon issue: https://github.com/graysky2/profile-sync-daemon/issues/411
3. Debian bug #1120058: https://bugs.debian.org/1120058

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
  2026-03-15 11:24 [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete) John
@ 2026-03-17  0:15 ` Joanne Koong
  2026-03-17 21:07   ` John
  2026-03-17 23:25   ` Joanne Koong
  0 siblings, 2 replies; 9+ messages in thread
From: Joanne Koong @ 2026-03-17  0:15 UTC (permalink / raw)
  To: John
  Cc: linux-fsdevel@vger.kernel.org, linux-fuse@lists.sourceforge.net,
	linux-pm@vger.kernel.org, Miklos Szeredi, Jan Kara

On Sun, Mar 15, 2026 at 4:24 AM John <therealgraysky@proton.me> wrote:
>

Hi John,

Thanks for your detailed report.

> Kernel: 6.19.6-arch1-1
> Component: fs/fuse, fs/fs-writeback
>
> --- SUMMARY ---
>
> A suspend-to-RAM hang in wb_wait_for_completion() via sync_inodes_sb() persists on 6.19.6 when fuse-overlayfs is mounted on tmpfs. The fix introduced in 6.19~rc6 for Debian bug #1120058 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") does not cover this code path.
>
> --- BACKGROUND ---
>
> This issue was originally reported in:
> https://github.com/containers/fuse-overlayfs/issues/386
>
> The fuse-overlayfs developer identified it as a kernel issue. It was subsequently bisected to commit 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree", merged in 6.16) and tracked in Debian as bug #1120058. The fix applied in 6.19~rc6 targets wait_sb_inodes(), but the hang in the reporter's original trace and in this report occurs in sync_inodes_sb() via a separate code path that the fix does not reach.
>
> The issue is also known to syzbot as "INFO: task hung in sync_inodes_sb" with open instances on linux-5.15, linux-6.1, and linux-6.6:
>
> https://syzkaller.appspot.com/bug?extid=e0232bd63c6e293aaf6a
> https://syzkaller.appspot.com/bug?extid=4983a35cf671e5ed55b3
>
> --- REPRODUCTION ---
>
> mkdir -p /dev/shm/test/up
> mkdir -p /dev/shm/test/tmp
> mkdir -p /dev/shm/test/data
>
> fuse-overlayfs -o "static_nlink,noacl,\
>   upperdir=/dev/shm/test/up,\
>   lowerdir=$HOME/.config/mozilla/firefox,\
>   workdir=/dev/shm/test/tmp" \
>   /dev/shm/test/data
>
> firefox --profile /dev/shm/test/data/PROFILENAME
>
> Browse to youtube and start playing a video then trigger suspend while the video is playing in the browser. I used XFCE4 suspend from the menu.
>
> The display goes blank and does not recover. The system does not enter suspend. Switching to a tty shows the system is alive but X11 is frozen. Reboot is blocked:
>
> # reboot
> Call to Reboot failed: Action suspend already in progress, refusing requested reboot operation.
>
> --- CALL TRACE (6.19.6) ---
>
> Mar 15 06:44:42 kernel: INFO: task kworker/u128:0:106160 blocked for more than 122 seconds.
> Mar 15 06:44:42 kernel:       Not tainted 6.19.6-arch1-1 #1
> Mar 15 06:44:42 kernel: task:kworker/u128:0  state:D stack:0     pid:106160 tgid:106160 ppid:2
> Mar 15 06:44:42 kernel: Workqueue: pm_fs_sync pm_fs_sync_work_fn
> Mar 15 06:44:42 kernel: Call Trace:
> Mar 15 06:44:42 kernel:  <TASK>
> Mar 15 06:44:42 kernel:  __schedule+0x457/0x1720
> Mar 15 06:44:42 kernel:  schedule+0x27/0xd0
> Mar 15 06:44:42 kernel:  wb_wait_for_completion+0x97/0xe0
> Mar 15 06:44:42 kernel:  sync_inodes_sb+0xf8/0x2e0
> Mar 15 06:44:42 kernel:  __iterate_supers+0xdc/0x160
> Mar 15 06:44:42 kernel:  ksys_sync+0x43/0xb0
> Mar 15 06:44:42 kernel:  pm_fs_sync_work_fn+0x17/0xa0
> Mar 15 06:44:42 kernel:  process_one_work+0x193/0x350
> Mar 15 06:44:42 kernel:  worker_thread+0x1a1/0x310
> Mar 15 06:44:42 kernel:  kthread+0xfc/0x240
> Mar 15 06:44:42 kernel:  ret_from_fork+0x243/0x280
> Mar 15 06:44:42 kernel:  ret_from_fork_asm+0x1a/0x30
> Mar 15 06:44:42 kernel:  </TASK>
>
> --- MORE CONTEXT ---
>
> Compared to the original report (which ran through systemd-sleep -> pm_suspend -> ksys_sync), the sync call in 6.19 has been moved into a kernel workqueue via pm_fs_sync_work_fn. The hang point is identical: wb_wait_for_completion() inside sync_inodes_sb() never returns because the FUSE daemon (fuse-overlayfs) is unable to complete writeback while the suspend freezer is in progress.
>
> The AS_NO_DATA_INTEGRITY fix targets wait_sb_inodes(), which is a separate function from sync_inodes_sb(). The hung writeback completion wait in sync_inodes_sb() is not gated by the AS_NO_DATA_INTEGRITY check and remains unaddressed.

I still need to run the repro to confirm, but I think the hang comes
from the following call chain, which is hit when a dirty folio that is
already under writeback needs to have writeback issued on it again:

wb_workfn()
  wb_do_writeback()
    wb_writeback()
      writeback_sb_inodes()
        __writeback_single_inode()
          do_writepages()
            fuse_writepages()
              iomap_writepages()
                writeback_iter()
                  writeback_get_folio()
                    folio_prepare_writeback()

where in the folio_prepare_writeback() logic:

        if (!folio_test_dirty(folio))
                return false;

        if (folio_test_writeback(folio)) {
                if (wbc->sync_mode == WB_SYNC_NONE)
                        return false;
                folio_wait_writeback(folio);


Could you verify if this fixes the issue?:

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 8f8069fb76ba..4cbb22d80acf 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1990,9 +1990,13 @@ static long writeback_sb_inodes(struct super_block *sb,
                 * Don't bother with new inodes or inodes being freed, first
                 * kind does not need periodic writeout yet, and for the latter
                 * kind writeout is handled by the freer.
+                *
+                * For sync(2), skip inodes whose mappings have no data
+                * integrity guarantees (eg FUSE).
                 */
                spin_lock(&inode->i_lock);
-               if (inode_state_read(inode) & (I_NEW | I_FREEING | I_WILL_FREE)) {
+               if ((inode_state_read(inode) & (I_NEW | I_FREEING | I_WILL_FREE)) ||
+                   (wbc.for_sync && mapping_no_data_integrity(inode->i_mapping))) {
                        redirty_tail_locked(inode, wb);
                        spin_unlock(&inode->i_lock);
                        continue;

I think the changes from commit f9a49aa302a0 ("fs/writeback: skip
AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") are still needed
even with the above patch since afaics, the writeback we wait on could
have been from a previous background/periodic writeback run before
sync was triggered.

Thanks,
Joanne

>
> Note: prior to commit 0c58a97f919c, sync() on FUSE filesystems was effectively a no-op, which avoided this hang at the cost of correctness. The regression was introduced when that commit made sync() actually wait on FUSE writeback completion.
>
> I am using Arch Linux but users of Fedora F43 and Debian testing are hitting this bug also. See refs below.
>
> References:
> 1. fuse-overlayfs upstream issue: https://github.com/containers/fuse-overlayfs/issues/386
> 2. profile-sync-daemon issue: https://github.com/graysky2/profile-sync-daemon/issues/411
> 3. Debian bug #1120058: https://bugs.debian.org/1120058

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
  2026-03-17  0:15 ` Joanne Koong
@ 2026-03-17 21:07   ` John
  2026-03-17 22:55     ` Joanne Koong
  2026-03-18  3:50     ` Joanne Koong
  2026-03-17 23:25   ` Joanne Koong
  1 sibling, 2 replies; 9+ messages in thread
From: John @ 2026-03-17 21:07 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel@vger.kernel.org, linux-fuse@lists.sourceforge.net,
	linux-pm@vger.kernel.org, Miklos Szeredi, Jan Kara


On Monday, March 16th, 2026 at 8:16 PM, Joanne Koong wrote:

> Could you verify if this fixes the issue?:
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 8f8069fb76ba..4cbb22d80acf 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1990,9 +1990,13 @@ static long writeback_sb_inodes(struct super_block *sb,
>                  * Don't bother with new inodes or inodes being freed, first
>                  * kind does not need periodic writeout yet, and for the latter
>                  * kind writeout is handled by the freer.
> +                *
> +                * For sync(2), skip inodes whose mappings have no data
> +                * integrity guarantees (eg FUSE).
>                  */
>                 spin_lock(&inode->i_lock);
> -               if (inode_state_read(inode) & (I_NEW | I_FREEING |
> I_WILL_FREE)) {
> +               if ((inode_state_read(inode) & (I_NEW | I_FREEING |
> I_WILL_FREE)) ||
> +                   (wbc.for_sync &&
> mapping_no_data_integrity(inode->i_mapping))) {
>                         redirty_tail_locked(inode, wb);
>                         spin_unlock(&inode->i_lock);
>                         continue;
>
> I think the changes from commit f9a49aa302a0 ("fs/writeback: skip
> AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") are still needed
> even with the above patch since afaics, the writeback we wait on could
> have been from a previous background/periodic writeback run before
> sync was triggered.

Yes, it does! Amazing!

I applied it to 6.19.8. After rebooting and following the steps I outlined above, I was able to suspend and wake up three consecutive times.

Is this something you plan to submit upstream and backport to the stable kernel series?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
  2026-03-17 21:07   ` John
@ 2026-03-17 22:55     ` Joanne Koong
  2026-03-18  3:50     ` Joanne Koong
  1 sibling, 0 replies; 9+ messages in thread
From: Joanne Koong @ 2026-03-17 22:55 UTC (permalink / raw)
  To: John
  Cc: linux-fsdevel@vger.kernel.org, linux-fuse@lists.sourceforge.net,
	linux-pm@vger.kernel.org, Miklos Szeredi, Jan Kara

On Tue, Mar 17, 2026 at 2:08 PM John <therealgraysky@proton.me> wrote:
>
>
> On Monday, March 16th, 2026 at 8:16 PM, Joanne Koong wrote:
>
> > Could you verify if this fixes the issue?:
> >
> > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> > index 8f8069fb76ba..4cbb22d80acf 100644
> > --- a/fs/fs-writeback.c
> > +++ b/fs/fs-writeback.c
> > @@ -1990,9 +1990,13 @@ static long writeback_sb_inodes(struct super_block *sb,
> >                  * Don't bother with new inodes or inodes being freed, first
> >                  * kind does not need periodic writeout yet, and for the latter
> >                  * kind writeout is handled by the freer.
> > +                *
> > +                * For sync(2), skip inodes whose mappings have no data
> > +                * integrity guarantees (eg FUSE).
> >                  */
> >                 spin_lock(&inode->i_lock);
> > -               if (inode_state_read(inode) & (I_NEW | I_FREEING |
> > I_WILL_FREE)) {
> > +               if ((inode_state_read(inode) & (I_NEW | I_FREEING |
> > I_WILL_FREE)) ||
> > +                   (wbc.for_sync &&
> > mapping_no_data_integrity(inode->i_mapping))) {
> >                         redirty_tail_locked(inode, wb);
> >                         spin_unlock(&inode->i_lock);
> >                         continue;
> >
> > I think the changes from commit f9a49aa302a0 ("fs/writeback: skip
> > AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") are still needed
> > even with the above patch since afaics, the writeback we wait on could
> > have been from a previous background/periodic writeback run before
> > sync was triggered.
>
> Yes, it does! Amazing!
>
> I applied it to 6.19.8, upon rebooting and following the steps I outlined above I was able suspend and wake up three consecutive times.

Thank you for testing this.

>
> Is this something you plan to submit/backport to the stable kernels series?

Definitely. I will submit this patch upstream to the fs mailing list
either today or tomorrow and cc stable@.

Thanks,
Joanne
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
  2026-03-17  0:15 ` Joanne Koong
  2026-03-17 21:07   ` John
@ 2026-03-17 23:25   ` Joanne Koong
  2026-03-18 22:31     ` Joanne Koong
  1 sibling, 1 reply; 9+ messages in thread
From: Joanne Koong @ 2026-03-17 23:25 UTC (permalink / raw)
  To: John
  Cc: linux-fsdevel@vger.kernel.org, linux-fuse@lists.sourceforge.net,
	linux-pm@vger.kernel.org, Miklos Szeredi, Jan Kara

On Mon, Mar 16, 2026 at 5:15 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Sun, Mar 15, 2026 at 4:24 AM John <therealgraysky@proton.me> wrote:
> >
>
> Hi John,
>
> Thanks for your detailed report.
>
> > Kernel: 6.19.6-arch1-1
> > Component: fs/fuse, fs/fs-writeback
> >
> > --- SUMMARY ---
> >
> > A suspend-to-RAM hang in wb_wait_for_completion() via sync_inodes_sb() persists on 6.19.6 when fuse-overlayfs is mounted on tmpfs. The fix introduced in 6.19~rc6 for Debian bug #1120058 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") does not cover this code path.
> >
> > --- BACKGROUND ---
> >
> > This issue was originally reported in:
> > https://github.com/containers/fuse-overlayfs/issues/386
> >
> > The fuse-overlayfs developer identified it as a kernel issue. It was subsequently bisected to commit 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree", merged in 6.16) and tracked in Debian as bug #1120058. The fix applied in 6.19~rc6 targets wait_sb_inodes(), but the hang in the reporter's original trace and in this report occurs in sync_inodes_sb() via a separate code path that the fix does not reach.
> >
> > The issue is also known to syzbot as "INFO: task hung in sync_inodes_sb" with open instances on linux-5.15, linux-6.1, and linux-6.6:
> >
> > https://syzkaller.appspot.com/bug?extid=e0232bd63c6e293aaf6a
> > https://syzkaller.appspot.com/bug?extid=4983a35cf671e5ed55b3
> >
> > --- REPRODUCTION ---
> >
> > mkdir -p /dev/shm/test/up
> > mkdir -p /dev/shm/test/tmp
> > mkdir -p /dev/shm/test/data
> >
> > fuse-overlayfs -o "static_nlink,noacl,\
> >   upperdir=/dev/shm/test/up,\
> >   lowerdir=$HOME/.config/mozilla/firefox,\
> >   workdir=/dev/shm/test/tmp" \
> >   /dev/shm/test/data
> >
> > firefox --profile /dev/shm/test/data/PROFILENAME
> >
> > Browse to youtube and start playing a video then trigger suspend while the video is playing in the browser. I used XFCE4 suspend from the menu.
> >
> > The display goes blank and does not recover. The system does not enter suspend. Switching to a tty shows the system is alive but X11 is frozen. Reboot is blocked:
> >
> > # reboot
> > Call to Reboot failed: Action suspend already in progress, refusing requested reboot operation.
> >
> > --- CALL TRACE (6.19.6) ---
> >
> > Mar 15 06:44:42 kernel: INFO: task kworker/u128:0:106160 blocked for more than 122 seconds.
> > Mar 15 06:44:42 kernel:       Not tainted 6.19.6-arch1-1 #1
> > Mar 15 06:44:42 kernel: task:kworker/u128:0  state:D stack:0     pid:106160 tgid:106160 ppid:2
> > Mar 15 06:44:42 kernel: Workqueue: pm_fs_sync pm_fs_sync_work_fn
> > Mar 15 06:44:42 kernel: Call Trace:
> > Mar 15 06:44:42 kernel:  <TASK>
> > Mar 15 06:44:42 kernel:  __schedule+0x457/0x1720
> > Mar 15 06:44:42 kernel:  schedule+0x27/0xd0
> > Mar 15 06:44:42 kernel:  wb_wait_for_completion+0x97/0xe0
> > Mar 15 06:44:42 kernel:  sync_inodes_sb+0xf8/0x2e0
> > Mar 15 06:44:42 kernel:  __iterate_supers+0xdc/0x160
> > Mar 15 06:44:42 kernel:  ksys_sync+0x43/0xb0
> > Mar 15 06:44:42 kernel:  pm_fs_sync_work_fn+0x17/0xa0
> > Mar 15 06:44:42 kernel:  process_one_work+0x193/0x350
> > Mar 15 06:44:42 kernel:  worker_thread+0x1a1/0x310
> > Mar 15 06:44:42 kernel:  kthread+0xfc/0x240
> > Mar 15 06:44:42 kernel:  ret_from_fork+0x243/0x280
> > Mar 15 06:44:42 kernel:  ret_from_fork_asm+0x1a/0x30
> > Mar 15 06:44:42 kernel:  </TASK>
> >
> > --- MORE CONTEXT ---
> >
> > Compared to the original report (which ran through systemd-sleep -> pm_suspend -> ksys_sync), the sync call in 6.19 has been moved into a kernel workqueue via pm_fs_sync_work_fn. The hang point is identical: wb_wait_for_completion() inside sync_inodes_sb() never returns because the FUSE daemon (fuse-overlayfs) is unable to complete writeback while the suspend freezer is in progress.
> >
> > The AS_NO_DATA_INTEGRITY fix targets wait_sb_inodes(), which is a separate function from sync_inodes_sb(). The hung writeback completion wait in sync_inodes_sb() is not gated by the AS_NO_DATA_INTEGRITY check and remains unaddressed.
>
> I'll need to run the repro and verify this is the issue, but I think
> it's because it's hitting this call chain if there's a dirty folio
> that is already under writeback that needs to have writeback issued on
> it again:
>
> wb_workfn()
>   wb_do_writeback()
>     wb_writeback()
>       writeback_sb_inodes()
>         __writeback_single_inode()
>           do_writepages()
>             fuse_writepages()
>               iomap_writepages()
>                 writeback_iter()
>                   writeback_get_folio()
>                     folio_prepare_writeback()
>
> where in the folio_prepare_writeback() logic:
>
>         if (!folio_test_dirty(folio))
>                 return false;
>
>         if (folio_test_writeback(folio)) {
>                 if (wbc->sync_mode == WB_SYNC_NONE)
>                         return false;
>                 folio_wait_writeback(folio);
>

I couldn't get the firefox+youtube scenario running because my VM
lacks graphics support and I couldn't figure out how to get that to
work, but I reproduced the bug using fuse-overlayfs with a synthetic
I/O workload:

(while true; do dd if=/dev/urandom of=/dev/shm/test/data/test.default/file1 bs=1M count=10 2>/dev/null; done) &
sleep 2
systemctl suspend

which gave pretty much the same stack trace:

[  154.892336] INFO: task kworker/u64:1:119 blocked for more than 100 seconds.
[  154.892967]       Not tainted 7.0.0-rc1-g6dcceeb72856-dirty #1792
[  154.893452] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  154.894048] task:kworker/u64:1   state:D stack:0     pid:119 tgid:119   ppid:2      task_flags:0x4208160 flags:0x00080000
[  154.894879] Workqueue: pm_fs_sync pm_fs_sync_work_fn
[  154.895309] Call Trace:
[  154.895514]  <TASK>
[  154.895759]  __schedule+0xeaa/0x3d60
[  154.896060]  ? __filemap_fdatawait_range+0xb0/0x160
[  154.896435]  ? __pfx___schedule+0x10/0x10
[  154.896744]  ? bdi_split_work_to_wbs+0x50b/0x990
[  154.897093]  schedule+0x7e/0x2e0
[  154.897344]  wb_wait_for_completion+0x14b/0x1e0
[  154.897690]  ? __pfx_wb_wait_for_completion+0x10/0x10
[  154.898112]  ? __pfx_autoremove_wake_function+0x10/0x10
[  154.898550]  ? __pfx_mutex_unlock+0x10/0x10
[  154.898915]  ? iput+0x61/0x600
[  154.899175]  sync_inodes_sb+0x1be/0x740
[  154.899490]  ? __pfx_down_read+0x10/0x10
[  154.899814]  ? __pfx_sync_inodes_sb+0x10/0x10
[  154.900165]  ? __pfx_super_lock+0x10/0x10
[  154.900495]  ? _raw_spin_lock+0x84/0xe0
[  154.900826]  ? __pfx__raw_spin_lock+0x10/0x10
[  154.901154]  __iterate_supers+0x176/0x280
[  154.901455]  ? __pfx_sync_inodes_one_sb+0x10/0x10
[  154.901845]  ? _raw_spin_unlock_irq+0xe/0x30
[  154.902169]  ksys_sync+0x87/0xf0
[  154.902428]  ? __pfx_ksys_sync+0x10/0x10
[  154.902772]  ? kvm_clock_get_cycles+0x18/0x30
[  154.903131]  ? ktime_get+0x65/0x140
[  154.903417]  pm_fs_sync_work_fn+0x17/0xc0
[  154.903750]  process_one_work+0x656/0x1150
[  154.904083]  ? assign_work+0x122/0x3e0
[  154.904367]  worker_thread+0x5de/0xcb0
[  154.904670]  ? __pfx_worker_thread+0x10/0x10
[  154.905052]  kthread+0x34d/0x450

Further debugging showed that the hang is not caused by the
folio_wait_writeback() call in folio_prepare_writeback() described in
my previous message. Instead, the fuse daemon is stuck and therefore
cannot service the ->write_inode() request issued by wb_workfn():

PID 121 (kworker/u64:2+flush-0:51):
[<0>] request_wait_answer+0x215/0x560 [fuse]
[<0>] __fuse_simple_request+0x341/0xc80 [fuse]
[<0>] fuse_flush_times+0x2ff/0x410 [fuse]
[<0>] fuse_write_inode+0x93/0x100 [fuse]
[<0>] __writeback_single_inode+0x5e1/0x930
[<0>] writeback_sb_inodes+0x53e/0xe40
[<0>] __writeback_inodes_wb+0xb7/0x200
[<0>] wb_writeback+0x571/0x790
[<0>] wb_workfn+0x73a/0xb90
[<0>] process_one_work+0x656/0x1150
[<0>] worker_thread+0x5de/0xcb0
[<0>] kthread+0x34d/0x450
[<0>] ret_from_fork+0x3ca/0x640
[<0>] ret_from_fork_asm+0x1a/0x30

In the kernel suspend code though, the sync (pm_sleep_fs_sync())
happens before freezing any userspace processes
(suspend_freeze_processes()) so it's odd for the fuse daemon to be
stuck.

It turns out the daemon is stuck because systemd freezes the user
session cgroups first before invoking the kernel suspend:

> ps aux | grep -i "fuse-overlayfs"
vmuser       826  0.0  0.0  12316  2616 ?        Ss   15:02   0:00 fuse-overlayfs -o ...

> cgroup=$(cat /proc/826/cgroup | head -1 | cut -d: -f3)
> echo "cgroup: $cgroup"
> cat /sys/fs/cgroup${cgroup}/cgroup.freeze 2>/dev/null || echo "No cgroup freeze file"
> cat /sys/fs/cgroup${cgroup}/cgroup.events 2>/dev/null | grep frozen
cgroup: /user.slice/user-1001.slice/session-3.scope
1
frozen 1

Even though the root cause is the cgroup freezer rather than the
folio_prepare_writeback() scenario, the fix proposed previously is
still the fix for this.
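
The check above can be wrapped in a small helper. This is a minimal
sketch, assuming cgroup v2 (the "0::" entry in /proc/<pid>/cgroup). The
function name and the PROC_ROOT / CGROUP_ROOT overrides are hypothetical
and exist only so the function can be exercised against a fake tree.

```shell
#!/bin/sh
# Hypothetical helper: print "frozen" or "running" for the cgroup v2
# group that a PID belongs to, or "unknown" if it cannot be determined.
cgroup_freeze_state() {
    pid="$1"
    proc_root="${PROC_ROOT:-/proc}"
    cgroup_root="${CGROUP_ROOT:-/sys/fs/cgroup}"

    # cgroup v2 entries look like "0::/user.slice/.../session-3.scope".
    path=$(sed -n 's/^0:://p' "$proc_root/$pid/cgroup" 2>/dev/null | head -n 1)
    [ -n "$path" ] || { echo "unknown"; return 1; }

    # cgroup.events carries a "frozen 0|1" line for non-root cgroups.
    events="$cgroup_root$path/cgroup.events"
    [ -r "$events" ] || { echo "unknown"; return 1; }

    if grep -q '^frozen 1$' "$events"; then
        echo "frozen"
    else
        echo "running"
    fi
}
```

Possible usage while a suspend attempt is hanging, from a tty:
cgroup_freeze_state "$(pidof fuse-overlayfs)"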

Thanks,
Joanne

>
> Could you verify if this fixes the issue?:
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 8f8069fb76ba..4cbb22d80acf 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1990,9 +1990,13 @@ static long writeback_sb_inodes(struct super_block *sb,
>                  * Don't bother with new inodes or inodes being freed, first
>                  * kind does not need periodic writeout yet, and for the latter
>                  * kind writeout is handled by the freer.
> +                *
> +                * For sync(2), skip inodes whose mappings have no data
> +                * integrity guarantees (eg FUSE).
>                  */
>                 spin_lock(&inode->i_lock);
> -               if (inode_state_read(inode) & (I_NEW | I_FREEING |
> I_WILL_FREE)) {
> +               if ((inode_state_read(inode) & (I_NEW | I_FREEING |
> I_WILL_FREE)) ||
> +                   (wbc.for_sync &&
> mapping_no_data_integrity(inode->i_mapping))) {
>                         redirty_tail_locked(inode, wb);
>                         spin_unlock(&inode->i_lock);
>                         continue;
>
> I think the changes from commit f9a49aa302a0 ("fs/writeback: skip
> AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") are still needed
> even with the above patch since afaics, the writeback we wait on could
> have been from a previous background/periodic writeback run before
> sync was triggered.
>
> Thanks,
> Joanne
>
> >
> > Note: prior to commit 0c58a97f919c, sync() on FUSE filesystems was effectively a no-op, which avoided this hang at the cost of correctness. The regression was introduced when that commit made sync() actually wait on FUSE writeback completion.
> >
> > I am using Arch Linux but users of Fedora F43 and Debian testing are hitting this bug also. See refs below.
> >
> > References:
> > 1. fuse-overlayfs upstream issue: https://github.com/containers/fuse-overlayfs/issues/386
> > 2. profile-sync-daemon issue: https://github.com/graysky2/profile-sync-daemon/issues/411
> > 3. Debian bug #1120058: https://bugs.debian.org/1120058

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
  2026-03-17 21:07   ` John
  2026-03-17 22:55     ` Joanne Koong
@ 2026-03-18  3:50     ` Joanne Koong
  2026-03-18 20:37       ` John
  1 sibling, 1 reply; 9+ messages in thread
From: Joanne Koong @ 2026-03-18  3:50 UTC (permalink / raw)
  To: John
  Cc: linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org,
	Miklos Szeredi, Jan Kara

On Tue, Mar 17, 2026 at 2:08 PM John <therealgraysky@proton.me> wrote:
>
>
> On Monday, March 16th, 2026 at 8:16 PM, Joanne Koong wrote:
>
> > Could you verify if this fixes the issue?:
> >
> > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> > index 8f8069fb76ba..4cbb22d80acf 100644
> > --- a/fs/fs-writeback.c
> > +++ b/fs/fs-writeback.c
> > @@ -1990,9 +1990,13 @@ static long writeback_sb_inodes(struct super_block *sb,
> >                  * Don't bother with new inodes or inodes being freed, first
> >                  * kind does not need periodic writeout yet, and for the latter
> >                  * kind writeout is handled by the freer.
> > +                *
> > +                * For sync(2), skip inodes whose mappings have no data
> > +                * integrity guarantees (eg FUSE).
> >                  */
> >                 spin_lock(&inode->i_lock);
> > -               if (inode_state_read(inode) & (I_NEW | I_FREEING |
> > I_WILL_FREE)) {
> > +               if ((inode_state_read(inode) & (I_NEW | I_FREEING |
> > I_WILL_FREE)) ||
> > +                   (wbc.for_sync &&
> > mapping_no_data_integrity(inode->i_mapping))) {
> >                         redirty_tail_locked(inode, wb);
> >                         spin_unlock(&inode->i_lock);
> >                         continue;
> >
> > I think the changes from commit f9a49aa302a0 ("fs/writeback: skip
> > AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") are still needed
> > even with the above patch since afaics, the writeback we wait on could
> > have been from a previous background/periodic writeback run before
> > sync was triggered.
>
> Yes, it does! Amazing!
>
> I applied it to 6.19.8, upon rebooting and following the steps I outlined above I was able suspend and wake up three consecutive times.
>

I think this is the cleaner fix:

writeback: skip sync(2) inode writeback for filesystems with no data integrity guarantees

Add SB_I_NO_DATA_INTEGRITY superblock flag for filesystems that cannot
guarantee data persistence on sync (eg fuse) and skip sync(2) inode
writeback for superblocks with this flag set.

This replaces the per-inode AS_NO_DATA_INTEGRITY mapping flag added in
commit f9a49aa302a0 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings
in wait_sb_inodes()"). The flag belongs at the superblock level because
data integrity is a filesystem-wide property, not a per-inode one. This
also allows sync_inodes_one_sb() to skip the entire filesystem
efficiently, rather than iterating every dirty inode only to skip each
one individually.

Without this, sync(2) triggers writeback on FUSE inodes and may block
waiting for the daemon to complete issued writeback or setattr (from
->write_inode()) requests.

This restores fuse to its prior behavior before tmp folios were removed,
where sync was essentially a no-op.

Reported-by: John <therealgraysky@proton.me>
Fixes: 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree")
Cc: <stable@vger.kernel.org>
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 fs/fs-writeback.c              |  7 +------
 fs/fuse/file.c                 |  4 +---
 fs/fuse/inode.c                |  1 +
 fs/sync.c                      |  2 +-
 include/linux/fs/super_types.h |  1 +
 include/linux/pagemap.h        | 11 -----------
 6 files changed, 5 insertions(+), 21 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 8f8069fb76ba..7a02483e0d8d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2774,13 +2774,8 @@ static void wait_sb_inodes(struct super_block *sb)
                 * The mapping can appear untagged while still on-list since we
                 * do not have the mapping lock. Skip it here, wb completion
                 * will remove it.
-                *
-                * If the mapping does not have data integrity semantics,
-                * there's no need to wait for the writeout to complete, as the
-                * mapping cannot guarantee that data is persistently stored.
                 */
-               if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK) ||
-                   mapping_no_data_integrity(mapping))
+               if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
                        continue;

                spin_unlock_irq(&sb->s_inode_wblist_lock);
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 7294bd347412..111ccc5bdda3 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3205,10 +3205,8 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)

        inode->i_fop = &fuse_file_operations;
        inode->i_data.a_ops = &fuse_file_aops;
-       if (fc->writeback_cache) {
+       if (fc->writeback_cache)
                mapping_set_writeback_may_deadlock_on_reclaim(&inode->i_data);
-               mapping_set_no_data_integrity(&inode->i_data);
-       }

        INIT_LIST_HEAD(&fi->write_files);
        INIT_LIST_HEAD(&fi->queued_writes);
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index af8ad96829fd..bae7d9ac3a43 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1769,6 +1769,7 @@ static void fuse_sb_defaults(struct super_block *sb)
        sb->s_export_op = &fuse_export_operations;
        sb->s_iflags |= SB_I_IMA_UNVERIFIABLE_SIGNATURE;
        sb->s_iflags |= SB_I_NOIDMAP;
+       sb->s_iflags |= SB_I_NO_DATA_INTEGRITY;
        if (sb->s_user_ns != &init_user_ns)
                sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER;
        sb->s_flags &= ~(SB_NOSEC | SB_I_VERSION);
diff --git a/fs/sync.c b/fs/sync.c
index 942a60cfedfb..88c08e2f76b2 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -73,7 +73,7 @@ EXPORT_SYMBOL(sync_filesystem);

 static void sync_inodes_one_sb(struct super_block *sb, void *arg)
 {
-       if (!sb_rdonly(sb))
+       if (!sb_rdonly(sb) && !(sb->s_iflags & SB_I_NO_DATA_INTEGRITY))
                sync_inodes_sb(sb);
 }

diff --git a/include/linux/fs/super_types.h b/include/linux/fs/super_types.h
index fa7638b81246..383050e7fdf5 100644
--- a/include/linux/fs/super_types.h
+++ b/include/linux/fs/super_types.h
@@ -338,5 +338,6 @@ struct super_block {
 #define SB_I_NOUMASK   0x00001000      /* VFS does not apply umask */
 #define SB_I_NOIDMAP   0x00002000      /* No idmapped mounts on this superblock */
 #define SB_I_ALLOW_HSM 0x00004000      /* Allow HSM events on this superblock */
+#define SB_I_NO_DATA_INTEGRITY 0x00008000 /* fs cannot guarantee data persistence on sync */

 #endif /* _LINUX_FS_SUPER_TYPES_H */
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ec442af3f886..31a848485ad9 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -210,7 +210,6 @@ enum mapping_flags {
        AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM = 9,
        AS_KERNEL_FILE = 10,    /* mapping for a fake kernel file that shouldn't
                                   account usage to user cgroups */
-       AS_NO_DATA_INTEGRITY = 11, /* no data integrity guarantees */
        /* Bits 16-25 are used for FOLIO_ORDER */
        AS_FOLIO_ORDER_BITS = 5,
        AS_FOLIO_ORDER_MIN = 16,
@@ -346,16 +345,6 @@ static inline bool mapping_writeback_may_deadlock_on_reclaim(const struct addres
        return test_bit(AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM, &mapping->flags);
 }

-static inline void mapping_set_no_data_integrity(struct address_space *mapping)
-{
-       set_bit(AS_NO_DATA_INTEGRITY, &mapping->flags);
-}
-
-static inline bool mapping_no_data_integrity(const struct address_space *mapping)
-{
-       return test_bit(AS_NO_DATA_INTEGRITY, &mapping->flags);
-}
-
 static inline gfp_t mapping_gfp_mask(const struct address_space *mapping)
 {
        return mapping->gfp_mask;

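As a rough illustration of the fs/sync.c hunk above, here is a userspace sketch of the skip decision. It is not kernel code: the model_* names, the MODEL_SB_RDONLY value, and the sync_calls counter are assumptions made up for this example, and model_sync_inodes_sb() only counts calls instead of issuing writeback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel flags; only the 0x00008000 value
 * is taken from the patch, the read-only bit is illustrative. */
#define MODEL_SB_RDONLY              0x1u
#define MODEL_SB_I_NO_DATA_INTEGRITY 0x00008000u

struct model_super_block {
	const char *name;
	unsigned int s_flags;   /* mount flags, e.g. read-only */
	unsigned int s_iflags;  /* internal superblock flags */
	int sync_calls;         /* times the per-sb sync was invoked */
};

/* Stand-in for sync_inodes_sb(): the real function issues writeback
 * and waits for completion; this one only records that it ran. */
static void model_sync_inodes_sb(struct model_super_block *sb)
{
	sb->sync_calls++;
}

/* Mirrors the patched condition in sync_inodes_one_sb(). */
static bool model_should_sync(const struct model_super_block *sb)
{
	assert(sb != NULL);
	return !(sb->s_flags & MODEL_SB_RDONLY) &&
	       !(sb->s_iflags & MODEL_SB_I_NO_DATA_INTEGRITY);
}

static void model_sync_inodes_one_sb(struct model_super_block *sb)
{
	if (model_should_sync(sb))
		model_sync_inodes_sb(sb);
}

/* Walk all superblocks, as sync(2) does, and return how many
 * were actually synced. */
static int model_sync_all(struct model_super_block *sbs, int n)
{
	int synced = 0;

	for (int i = 0; i < n; i++) {
		int before = sbs[i].sync_calls;

		model_sync_inodes_one_sb(&sbs[i]);
		synced += sbs[i].sync_calls - before;
	}
	return synced;
}
```

In this model a superblock carrying the flag is never handed to the sync routine, so a stuck daemon behind it cannot block the caller of sync(2).
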
I tested it locally on my simulated repro and it fixed the issue for
me. Could you verify that this patch fixes the issue for you as well?

Thanks,
Joanne

> Is this something you plan to submit/backport to the stable kernel series?
>


* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
  2026-03-18  3:50     ` Joanne Koong
@ 2026-03-18 20:37       ` John
  2026-03-18 20:45         ` Joanne Koong
  0 siblings, 1 reply; 9+ messages in thread
From: John @ 2026-03-18 20:37 UTC (permalink / raw)
  To: Joanne Koong
  Cc: linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org,
	Miklos Szeredi, Jan Kara


On Tuesday, March 17th, 2026 at 11:51 PM, Joanne Koong <joannelkoong@gmail.com> wrote:
> I think this is the cleaner fix:
> 
> writeback: skip sync(2) inode writeback for filesystems with no data
> integrity guarantees
> 
> Add SB_I_NO_DATA_INTEGRITY superblock flag for filesystems that cannot
> guarantee data persistence on sync (e.g. fuse) and skip sync(2) inode
> writeback for superblocks with this flag set.
> 
> This replaces the per-inode AS_NO_DATA_INTEGRITY mapping flag added in
> commit f9a49aa302a0 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings
> in wait_sb_inodes()"). The flag belongs at the superblock level because
> data integrity is a filesystem-wide property, not a per-inode one. This
> also allows sync_inodes_one_sb() to skip the entire filesystem
> efficiently, rather than iterating every dirty inode only to skip each
> one individually.
> 
> Without this, sync(2) triggers writeback on FUSE inodes and may block
> waiting for the daemon to complete issued writeback or setattr (from
> ->write_inode()) requests.
> 
> This restores fuse to its prior behavior before tmp folios were removed,
> where sync was essentially a no-op.
> 
> Reported-by: John <therealgraysky@proton.me>
> Fixes: 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
>  fs/fs-writeback.c              |  7 +------
>  fs/fuse/file.c                 |  4 +---
>  fs/fuse/inode.c                |  1 +
>  fs/sync.c                      |  2 +-
>  include/linux/fs/super_types.h |  1 +
>  include/linux/pagemap.h        | 11 -----------
>  6 files changed, 5 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 8f8069fb76ba..7a02483e0d8d 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -2774,13 +2774,8 @@ static void wait_sb_inodes(struct super_block *sb)
>                  * The mapping can appear untagged while still on-list since we
>                  * do not have the mapping lock. Skip it here, wb completion
>                  * will remove it.
> -                *
> -                * If the mapping does not have data integrity semantics,
> -                * there's no need to wait for the writeout to complete, as the
> -                * mapping cannot guarantee that data is persistently stored.
>                  */
> -               if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK) ||
> -                   mapping_no_data_integrity(mapping))
> +               if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
>                         continue;
> 
>                 spin_unlock_irq(&sb->s_inode_wblist_lock);
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 7294bd347412..111ccc5bdda3 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -3205,10 +3205,8 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
> 
>         inode->i_fop = &fuse_file_operations;
>         inode->i_data.a_ops = &fuse_file_aops;
> -       if (fc->writeback_cache) {
> +       if (fc->writeback_cache)
>                 mapping_set_writeback_may_deadlock_on_reclaim(&inode->i_data);
> -               mapping_set_no_data_integrity(&inode->i_data);
> -       }
> 
>         INIT_LIST_HEAD(&fi->write_files);
>         INIT_LIST_HEAD(&fi->queued_writes);
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index af8ad96829fd..bae7d9ac3a43 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -1769,6 +1769,7 @@ static void fuse_sb_defaults(struct super_block *sb)
>         sb->s_export_op = &fuse_export_operations;
>         sb->s_iflags |= SB_I_IMA_UNVERIFIABLE_SIGNATURE;
>         sb->s_iflags |= SB_I_NOIDMAP;
> +       sb->s_iflags |= SB_I_NO_DATA_INTEGRITY;
>         if (sb->s_user_ns != &init_user_ns)
>                 sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER;
>         sb->s_flags &= ~(SB_NOSEC | SB_I_VERSION);
> diff --git a/fs/sync.c b/fs/sync.c
> index 942a60cfedfb..88c08e2f76b2 100644
> --- a/fs/sync.c
> +++ b/fs/sync.c
> @@ -73,7 +73,7 @@ EXPORT_SYMBOL(sync_filesystem);
> 
>  static void sync_inodes_one_sb(struct super_block *sb, void *arg)
>  {
> -       if (!sb_rdonly(sb))
> +       if (!sb_rdonly(sb) && !(sb->s_iflags & SB_I_NO_DATA_INTEGRITY))
>                 sync_inodes_sb(sb);
>  }
> 
> diff --git a/include/linux/fs/super_types.h b/include/linux/fs/super_types.h
> index fa7638b81246..383050e7fdf5 100644
> --- a/include/linux/fs/super_types.h
> +++ b/include/linux/fs/super_types.h
> @@ -338,5 +338,6 @@ struct super_block {
>  #define SB_I_NOUMASK   0x00001000      /* VFS does not apply umask */
>  #define SB_I_NOIDMAP   0x00002000      /* No idmapped mounts on this superblock */
>  #define SB_I_ALLOW_HSM 0x00004000      /* Allow HSM events on this superblock */
> +#define SB_I_NO_DATA_INTEGRITY 0x00008000 /* fs cannot guarantee data persistence on sync */
> 
>  #endif /* _LINUX_FS_SUPER_TYPES_H */
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index ec442af3f886..31a848485ad9 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -210,7 +210,6 @@ enum mapping_flags {
>         AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM = 9,
>         AS_KERNEL_FILE = 10,    /* mapping for a fake kernel file that shouldn't
>                                    account usage to user cgroups */
> -       AS_NO_DATA_INTEGRITY = 11, /* no data integrity guarantees */
>         /* Bits 16-25 are used for FOLIO_ORDER */
>         AS_FOLIO_ORDER_BITS = 5,
>         AS_FOLIO_ORDER_MIN = 16,
> @@ -346,16 +345,6 @@ static inline bool mapping_writeback_may_deadlock_on_reclaim(const struct addres
>         return test_bit(AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM, &mapping->flags);
>  }
> 
> -static inline void mapping_set_no_data_integrity(struct address_space *mapping)
> -{
> -       set_bit(AS_NO_DATA_INTEGRITY, &mapping->flags);
> -}
> -
> -static inline bool mapping_no_data_integrity(const struct address_space *mapping)
> -{
> -       return test_bit(AS_NO_DATA_INTEGRITY, &mapping->flags);
> -}
> -
>  static inline gfp_t mapping_gfp_mask(const struct address_space *mapping)
>  {
>         return mapping->gfp_mask;
> 
> I tested it locally on my simulated repro and it fixed the issue for
> me. Could you verify that this patch fixes the issue for you as well?

This new patch also corrects the bug. Same test as before. Thank you!


* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
  2026-03-18 20:37       ` John
@ 2026-03-18 20:45         ` Joanne Koong
  0 siblings, 0 replies; 9+ messages in thread
From: Joanne Koong @ 2026-03-18 20:45 UTC (permalink / raw)
  To: John
  Cc: linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org,
	Miklos Szeredi, Jan Kara

On Wed, Mar 18, 2026 at 1:37 PM John <therealgraysky@proton.me> wrote:
>
>
> On Tuesday, March 17th, 2026 at 11:51 PM, Joanne Koong <joannelkoong@gmail.com> wrote:
> > I think this is the cleaner fix:
> >
> > writeback: skip sync(2) inode writeback for filesystems with no data
> > integrity guarantees
> >
> > Add SB_I_NO_DATA_INTEGRITY superblock flag for filesystems that cannot
> > guarantee data persistence on sync (e.g. fuse) and skip sync(2) inode
> > writeback for superblocks with this flag set.
> >
> > This replaces the per-inode AS_NO_DATA_INTEGRITY mapping flag added in
> > commit f9a49aa302a0 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings
> > in wait_sb_inodes()"). The flag belongs at the superblock level because
> > data integrity is a filesystem-wide property, not a per-inode one. This
> > also allows sync_inodes_one_sb() to skip the entire filesystem
> > efficiently, rather than iterating every dirty inode only to skip each
> > one individually.
> >
> > Without this, sync(2) triggers writeback on FUSE inodes and may block
> > waiting for the daemon to complete issued writeback or setattr (from
> > ->write_inode()) requests.
> >
> > This restores fuse to its prior behavior before tmp folios were removed,
> > where sync was essentially a no-op.
> >
> > Reported-by: John <therealgraysky@proton.me>
> > Fixes: 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree")
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > ---
> >  fs/fs-writeback.c              |  7 +------
> >  fs/fuse/file.c                 |  4 +---
> >  fs/fuse/inode.c                |  1 +
> >  fs/sync.c                      |  2 +-
> >  include/linux/fs/super_types.h |  1 +
> >  include/linux/pagemap.h        | 11 -----------
> >  6 files changed, 5 insertions(+), 21 deletions(-)
> >
> > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> > index 8f8069fb76ba..7a02483e0d8d 100644
> > --- a/fs/fs-writeback.c
> > +++ b/fs/fs-writeback.c
> > @@ -2774,13 +2774,8 @@ static void wait_sb_inodes(struct super_block *sb)
> >                  * The mapping can appear untagged while still on-list since we
> >                  * do not have the mapping lock. Skip it here, wb completion
> >                  * will remove it.
> > -                *
> > -                * If the mapping does not have data integrity semantics,
> > -                * there's no need to wait for the writeout to complete, as the
> > -                * mapping cannot guarantee that data is persistently stored.
> >                  */
> > -               if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK) ||
> > -                   mapping_no_data_integrity(mapping))
> > +               if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
> >                         continue;
> >
> >                 spin_unlock_irq(&sb->s_inode_wblist_lock);
> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > index 7294bd347412..111ccc5bdda3 100644
> > --- a/fs/fuse/file.c
> > +++ b/fs/fuse/file.c
> > @@ -3205,10 +3205,8 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
> >
> >         inode->i_fop = &fuse_file_operations;
> >         inode->i_data.a_ops = &fuse_file_aops;
> > -       if (fc->writeback_cache) {
> > +       if (fc->writeback_cache)
> >                 mapping_set_writeback_may_deadlock_on_reclaim(&inode->i_data);
> > -               mapping_set_no_data_integrity(&inode->i_data);
> > -       }
> >
> >         INIT_LIST_HEAD(&fi->write_files);
> >         INIT_LIST_HEAD(&fi->queued_writes);
> > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> > index af8ad96829fd..bae7d9ac3a43 100644
> > --- a/fs/fuse/inode.c
> > +++ b/fs/fuse/inode.c
> > @@ -1769,6 +1769,7 @@ static void fuse_sb_defaults(struct super_block *sb)
> >         sb->s_export_op = &fuse_export_operations;
> >         sb->s_iflags |= SB_I_IMA_UNVERIFIABLE_SIGNATURE;
> >         sb->s_iflags |= SB_I_NOIDMAP;
> > +       sb->s_iflags |= SB_I_NO_DATA_INTEGRITY;
> >         if (sb->s_user_ns != &init_user_ns)
> >                 sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER;
> >         sb->s_flags &= ~(SB_NOSEC | SB_I_VERSION);
> > diff --git a/fs/sync.c b/fs/sync.c
> > index 942a60cfedfb..88c08e2f76b2 100644
> > --- a/fs/sync.c
> > +++ b/fs/sync.c
> > @@ -73,7 +73,7 @@ EXPORT_SYMBOL(sync_filesystem);
> >
> >  static void sync_inodes_one_sb(struct super_block *sb, void *arg)
> >  {
> > -       if (!sb_rdonly(sb))
> > +       if (!sb_rdonly(sb) && !(sb->s_iflags & SB_I_NO_DATA_INTEGRITY))
> >                 sync_inodes_sb(sb);
> >  }
> >
> > diff --git a/include/linux/fs/super_types.h b/include/linux/fs/super_types.h
> > index fa7638b81246..383050e7fdf5 100644
> > --- a/include/linux/fs/super_types.h
> > +++ b/include/linux/fs/super_types.h
> > @@ -338,5 +338,6 @@ struct super_block {
> >  #define SB_I_NOUMASK   0x00001000      /* VFS does not apply umask */
> >  #define SB_I_NOIDMAP   0x00002000      /* No idmapped mounts on this superblock */
> >  #define SB_I_ALLOW_HSM 0x00004000      /* Allow HSM events on this superblock */
> > +#define SB_I_NO_DATA_INTEGRITY 0x00008000 /* fs cannot guarantee data persistence on sync */
> >
> >  #endif /* _LINUX_FS_SUPER_TYPES_H */
> > diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> > index ec442af3f886..31a848485ad9 100644
> > --- a/include/linux/pagemap.h
> > +++ b/include/linux/pagemap.h
> > @@ -210,7 +210,6 @@ enum mapping_flags {
> >         AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM = 9,
> >         AS_KERNEL_FILE = 10,    /* mapping for a fake kernel file that shouldn't
> >                                    account usage to user cgroups */
> > -       AS_NO_DATA_INTEGRITY = 11, /* no data integrity guarantees */
> >         /* Bits 16-25 are used for FOLIO_ORDER */
> >         AS_FOLIO_ORDER_BITS = 5,
> >         AS_FOLIO_ORDER_MIN = 16,
> > @@ -346,16 +345,6 @@ static inline bool mapping_writeback_may_deadlock_on_reclaim(const struct addres
> >         return test_bit(AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM, &mapping->flags);
> >  }
> >
> > -static inline void mapping_set_no_data_integrity(struct address_space *mapping)
> > -{
> > -       set_bit(AS_NO_DATA_INTEGRITY, &mapping->flags);
> > -}
> > -
> > -static inline bool mapping_no_data_integrity(const struct address_space *mapping)
> > -{
> > -       return test_bit(AS_NO_DATA_INTEGRITY, &mapping->flags);
> > -}
> > -
> >  static inline gfp_t mapping_gfp_mask(const struct address_space *mapping)
> >  {
> >         return mapping->gfp_mask;
> >
> > I tested it locally on my simulated repro and it fixed the issue for
> > me. Could you verify that this patch fixes the issue for you as well?
>
> This new patch also corrects the bug. Same test as before. Thank you!

Great, I will submit this patch to the mailing list today. Thank you
for verifying!


* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
  2026-03-17 23:25   ` Joanne Koong
@ 2026-03-18 22:31     ` Joanne Koong
  0 siblings, 0 replies; 9+ messages in thread
From: Joanne Koong @ 2026-03-18 22:31 UTC (permalink / raw)
  To: John
  Cc: linux-fsdevel@vger.kernel.org, linux-fuse@lists.sourceforge.net,
	linux-pm@vger.kernel.org, Miklos Szeredi, Jan Kara

On Tue, Mar 17, 2026 at 4:25 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Mon, Mar 16, 2026 at 5:15 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > On Sun, Mar 15, 2026 at 4:24 AM John <therealgraysky@proton.me> wrote:
> > >
> >
> > Hi John,
> >
> > Thanks for your detailed report.
> >
> > > Kernel: 6.19.6-arch1-1
> > > Component: fs/fuse, fs/fs-writeback
> > >
> > > --- SUMMARY ---
> > >
> > > A suspend-to-RAM hang in wb_wait_for_completion() via sync_inodes_sb() persists on 6.19.6 when fuse-overlayfs is mounted on tmpfs. The fix introduced in 6.19~rc6 for Debian bug #1120058 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") does not cover this code path.
> > >
> > > --- BACKGROUND ---
> > >
> > > This issue was originally reported in:
> > > https://github.com/containers/fuse-overlayfs/issues/386
> > >
> > > The fuse-overlayfs developer identified it as a kernel issue. It was subsequently bisected to commit 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree", merged in 6.16) and tracked in Debian as bug #1120058. The fix applied in 6.19~rc6 targets wait_sb_inodes(), but the hang in the reporter's original trace and in this report occurs in sync_inodes_sb() via a separate code path that the fix does not reach.
> > >
> > > The issue is also known to syzbot as "INFO: task hung in sync_inodes_sb" with open instances on linux-5.15, linux-6.1, and linux-6.6:
> > >
> > > https://syzkaller.appspot.com/bug?extid=e0232bd63c6e293aaf6a
> > > https://syzkaller.appspot.com/bug?extid=4983a35cf671e5ed55b3
> > >
> > > --- REPRODUCTION ---
> > >
> > > mkdir -p /dev/shm/test/up
> > > mkdir -p /dev/shm/test/tmp
> > > mkdir -p /dev/shm/test/data
> > >
> > > fuse-overlayfs -o "static_nlink,noacl,\
> > >   upperdir=/dev/shm/test/up,\
> > >   lowerdir=$HOME/.config/mozilla/firefox,\
> > >   workdir=/dev/shm/test/tmp" \
> > >   /dev/shm/test/data
> > >
> > > firefox --profile /dev/shm/test/data/PROFILENAME
> > >
> > > Browse to youtube and start playing a video then trigger suspend while the video is playing in the browser. I used XFCE4 suspend from the menu.
> > >
> > > The display goes blank and does not recover. The system does not enter suspend. Switching to a tty shows the system is alive but X11 is frozen. Reboot is blocked:
> > >
> > > # reboot
> > > Call to Reboot failed: Action suspend already in progress, refusing requested reboot operation.
> > >
> > > --- CALL TRACE (6.19.6) ---
> > >
> > > Mar 15 06:44:42 kernel: INFO: task kworker/u128:0:106160 blocked for more than 122 seconds.
> > > Mar 15 06:44:42 kernel:       Not tainted 6.19.6-arch1-1 #1
> > > Mar 15 06:44:42 kernel: task:kworker/u128:0  state:D stack:0     pid:106160 tgid:106160 ppid:2
> > > Mar 15 06:44:42 kernel: Workqueue: pm_fs_sync pm_fs_sync_work_fn
> > > Mar 15 06:44:42 kernel: Call Trace:
> > > Mar 15 06:44:42 kernel:  <TASK>
> > > Mar 15 06:44:42 kernel:  __schedule+0x457/0x1720
> > > Mar 15 06:44:42 kernel:  schedule+0x27/0xd0
> > > Mar 15 06:44:42 kernel:  wb_wait_for_completion+0x97/0xe0
> > > Mar 15 06:44:42 kernel:  sync_inodes_sb+0xf8/0x2e0
> > > Mar 15 06:44:42 kernel:  __iterate_supers+0xdc/0x160
> > > Mar 15 06:44:42 kernel:  ksys_sync+0x43/0xb0
> > > Mar 15 06:44:42 kernel:  pm_fs_sync_work_fn+0x17/0xa0
> > > Mar 15 06:44:42 kernel:  process_one_work+0x193/0x350
> > > Mar 15 06:44:42 kernel:  worker_thread+0x1a1/0x310
> > > Mar 15 06:44:42 kernel:  kthread+0xfc/0x240
> > > Mar 15 06:44:42 kernel:  ret_from_fork+0x243/0x280
> > > Mar 15 06:44:42 kernel:  ret_from_fork_asm+0x1a/0x30
> > > Mar 15 06:44:42 kernel:  </TASK>
> > >
> > > --- MORE CONTEXT ---
> > >
> > > Compared to the original report (which ran through systemd-sleep -> pm_suspend -> ksys_sync), the sync call in 6.19 has been moved into a kernel workqueue via pm_fs_sync_work_fn. The hang point is identical: wb_wait_for_completion() inside sync_inodes_sb() never returns because the FUSE daemon (fuse-overlayfs) is unable to complete writeback while the suspend freezer is in progress.
> > >
> > > The AS_NO_DATA_INTEGRITY fix targets wait_sb_inodes(), which is a separate function from sync_inodes_sb(). The hung writeback completion wait in sync_inodes_sb() is not gated by the AS_NO_DATA_INTEGRITY check and remains unaddressed.
> >
> > I'll need to run the repro and verify this is the issue, but I think
> > it's because it's hitting this call chain if there's a dirty folio
> > that is already under writeback that needs to have writeback issued on
> > it again:
> >
> > wb_workfn()
> >   wb_do_writeback()
> >     wb_writeback()
> >       writeback_sb_inodes()
> >         __writeback_single_inode()
> >           do_writepages()
> >             fuse_writepages()
> >               iomap_writepages()
> >                 writeback_iter()
> >                   writeback_get_folio()
> >                     folio_prepare_writeback()
> >
> > where in the folio_prepare_writeback() logic:
> >
> >         if (!folio_test_dirty(folio))
> >                 return false;
> >
> >         if (folio_test_writeback(folio)) {
> >                 if (wbc->sync_mode == WB_SYNC_NONE)
> >                         return false;
> >                 folio_wait_writeback(folio);
> >
>
> I couldn't get the firefox+youtube scenario running because my VM
> lacks graphics support and I couldn't figure out how to get that to
> work, but I reproduced the bug using fuse-overlayfs with a synthetic
> I/O workload:

(fwiw, my repro triggers the hang even prior to the tmp pages being
removed, so I think the firefox + youtube scenario you're running into
is indeed the folio_prepare_writeback() scenario)
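
For readers following along, the folio_prepare_writeback() branch referred to here can be modelled in userspace roughly as below. This is a sketch under assumptions: the model_* types and the waits counter are invented for illustration, the wait returns immediately instead of sleeping, and everything after the wait is collapsed to "return true" because the quoted kernel snippet stops at that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for WB_SYNC_NONE / WB_SYNC_ALL. */
enum model_sync_mode { MODEL_WB_SYNC_NONE, MODEL_WB_SYNC_ALL };

struct model_folio {
	bool dirty;      /* folio has new data to write */
	bool writeback;  /* an earlier writeback is still in flight */
	int waits;       /* times the caller had to wait on writeback */
};

/* Stand-in for folio_wait_writeback(). In the kernel this sleeps until
 * the in-flight writeback finishes, which for FUSE depends on the
 * daemon replying; here it completes at once and is only counted. */
static void model_folio_wait_writeback(struct model_folio *folio)
{
	folio->waits++;
	folio->writeback = false;
}

/* Returns true if the folio should be written out now. */
static bool model_folio_prepare_writeback(enum model_sync_mode mode,
					  struct model_folio *folio)
{
	assert(folio != NULL);

	if (!folio->dirty)
		return false;

	if (folio->writeback) {
		/* Background writeback skips busy folios ... */
		if (mode == MODEL_WB_SYNC_NONE)
			return false;
		/* ... but sync-mode writeback waits for them. */
		model_folio_wait_writeback(folio);
	}

	/* Simplification: the remaining kernel checks are omitted. */
	return true;
}
```

Only the sync-mode path reaches the wait, which is why a folio that is both dirty and under writeback matters for sync(2) but not for background flushing.
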

>
> (while true; do dd if=/dev/urandom of=/dev/shm/test/data/test.default/file1 bs=1M count=10 2>/dev/null; done) &
> sleep 2
> systemctl suspend
>
> which gave pretty much the same stack trace:
>
> [  154.892336] INFO: task kworker/u64:1:119 blocked for more than 100 seconds.
> [  154.892967]       Not tainted 7.0.0-rc1-g6dcceeb72856-dirty #1792
> [  154.893452] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  154.894048] task:kworker/u64:1   state:D stack:0     pid:119 tgid:119   ppid:2      task_flags:0x4208160 flags:0x00080000
> [  154.894879] Workqueue: pm_fs_sync pm_fs_sync_work_fn
> [  154.895309] Call Trace:
> [  154.895514]  <TASK>
> [  154.895759]  __schedule+0xeaa/0x3d60
> [  154.896060]  ? __filemap_fdatawait_range+0xb0/0x160
> [  154.896435]  ? __pfx___schedule+0x10/0x10
> [  154.896744]  ? bdi_split_work_to_wbs+0x50b/0x990
> [  154.897093]  schedule+0x7e/0x2e0
> [  154.897344]  wb_wait_for_completion+0x14b/0x1e0
> [  154.897690]  ? __pfx_wb_wait_for_completion+0x10/0x10
> [  154.898112]  ? __pfx_autoremove_wake_function+0x10/0x10
> [  154.898550]  ? __pfx_mutex_unlock+0x10/0x10
> [  154.898915]  ? iput+0x61/0x600
> [  154.899175]  sync_inodes_sb+0x1be/0x740
> [  154.899490]  ? __pfx_down_read+0x10/0x10
> [  154.899814]  ? __pfx_sync_inodes_sb+0x10/0x10
> [  154.900165]  ? __pfx_super_lock+0x10/0x10
> [  154.900495]  ? _raw_spin_lock+0x84/0xe0
> [  154.900826]  ? __pfx__raw_spin_lock+0x10/0x10
> [  154.901154]  __iterate_supers+0x176/0x280
> [  154.901455]  ? __pfx_sync_inodes_one_sb+0x10/0x10
> [  154.901845]  ? _raw_spin_unlock_irq+0xe/0x30
> [  154.902169]  ksys_sync+0x87/0xf0
> [  154.902428]  ? __pfx_ksys_sync+0x10/0x10
> [  154.902772]  ? kvm_clock_get_cycles+0x18/0x30
> [  154.903131]  ? ktime_get+0x65/0x140
> [  154.903417]  pm_fs_sync_work_fn+0x17/0xc0
> [  154.903750]  process_one_work+0x656/0x1150
> [  154.904083]  ? assign_work+0x122/0x3e0
> [  154.904367]  worker_thread+0x5de/0xcb0
> [  154.904670]  ? __pfx_worker_thread+0x10/0x10
> [  154.905052]  kthread+0x34d/0x450
>
> Further debugging showed that the hang is not caused by the
> folio_wait_writeback() call in folio_prepare_writeback() described in
> the previous message. It is caused by the fuse daemon being stuck and
> thus unable to service the ->write_inode() request issued by wb_workfn:
>
> PID 121 (kworker/u64:2+flush-0:51):
> [<0>] request_wait_answer+0x215/0x560 [fuse]
> [<0>] __fuse_simple_request+0x341/0xc80 [fuse]
> [<0>] fuse_flush_times+0x2ff/0x410 [fuse]
> [<0>] fuse_write_inode+0x93/0x100 [fuse]
> [<0>] __writeback_single_inode+0x5e1/0x930
> [<0>] writeback_sb_inodes+0x53e/0xe40
> [<0>] __writeback_inodes_wb+0xb7/0x200
> [<0>] wb_writeback+0x571/0x790
> [<0>] wb_workfn+0x73a/0xb90
> [<0>] process_one_work+0x656/0x1150
> [<0>] worker_thread+0x5de/0xcb0
> [<0>] kthread+0x34d/0x450
> [<0>] ret_from_fork+0x3ca/0x640
> [<0>] ret_from_fork_asm+0x1a/0x30
>
> In the kernel suspend code though, the sync (pm_sleep_fs_sync())
> happens before freezing any userspace processes
> (suspend_freeze_processes()) so it's odd for the fuse daemon to be
> stuck.
>
> It turns out the daemon is stuck because systemd freezes the user
> session cgroups first before invoking the kernel suspend:
>
> > ps aux | grep -i "fuse-overlayfs"
> vmuser       826  0.0  0.0  12316  2616 ?        Ss   15:02   0:00 fuse-overlayfs -o ...
>
> > cgroup=$(cat /proc/826/cgroup | head -1 | cut -d: -f3)
> > echo "cgroup: $cgroup"
> > cat /sys/fs/cgroup${cgroup}/cgroup.freeze 2>/dev/null || echo "No cgroup freeze file"
> > cat /sys/fs/cgroup${cgroup}/cgroup.events 2>/dev/null | grep frozen
> cgroup: /user.slice/user-1001.slice/session-3.scope
> 1
> frozen 1
>
> Even though the root cause is the cgroup freezer rather than the
> folio_prepare_writeback() scenario, the fix proposed previously is
> still the fix for this.
>
> Thanks,
> Joanne
>
> >
> > > Note: prior to commit 0c58a97f919c, sync() on FUSE filesystems was effectively a no-op, which avoided this hang at the cost of correctness. The regression was introduced when that commit made sync() actually wait on FUSE writeback completion.
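
The cgroup freezer check quoted above (reading /proc/<pid>/cgroup, then cgroup.events) could also be done from C. The helpers below are a minimal sketch, assuming a cgroup v2 hierarchy mounted at /sys/fs/cgroup and a "0::/path" line format; the function names are hypothetical and the helpers work on text already read from those files, so file I/O is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Extract the cgroup v2 path from one /proc/<pid>/cgroup line of the
 * form "0::/path". Returns false for other line formats or if the
 * output buffer is too small. */
static bool cgroup_v2_path(const char *line, char *out, size_t outlen)
{
	assert(line != NULL && out != NULL);

	if (strncmp(line, "0::", 3) != 0)
		return false;

	const char *path = line + 3;
	size_t len = strcspn(path, "\n");

	if (len == 0 || len >= outlen)
		return false;

	memcpy(out, path, len);
	out[len] = '\0';
	return true;
}

/* Build the cgroup.events path, assuming cgroup2 is mounted at
 * /sys/fs/cgroup (an assumption, matching the shell commands above). */
static bool cgroup_events_path(const char *cg, char *out, size_t outlen)
{
	int n = snprintf(out, outlen, "/sys/fs/cgroup%s/cgroup.events", cg);

	return n > 0 && (size_t)n < outlen;
}

/* Scan the contents of cgroup.events line by line for "frozen 1". */
static bool events_report_frozen(const char *events)
{
	assert(events != NULL);

	const char *p = events;

	while (*p) {
		if (strncmp(p, "frozen 1", 8) == 0 &&
		    (p[8] == '\n' || p[8] == '\0'))
			return true;
		p += strcspn(p, "\n");
		if (*p == '\n')
			p++;
	}
	return false;
}
```

A caller would read the first line of /proc/<pid>/cgroup for the daemon, pass it to cgroup_v2_path(), read the file named by cgroup_events_path(), and hand its contents to events_report_frozen().
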


end of thread

Thread overview: 9+ messages
2026-03-15 11:24 [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete) John
2026-03-17  0:15 ` Joanne Koong
2026-03-17 21:07   ` John
2026-03-17 22:55     ` Joanne Koong
2026-03-18  3:50     ` Joanne Koong
2026-03-18 20:37       ` John
2026-03-18 20:45         ` Joanne Koong
2026-03-17 23:25   ` Joanne Koong
2026-03-18 22:31     ` Joanne Koong
