* [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
@ 2026-03-15 11:24 John
2026-03-17 0:15 ` Joanne Koong
0 siblings, 1 reply; 9+ messages in thread
From: John @ 2026-03-15 11:24 UTC (permalink / raw)
To: linux-fsdevel@vger.kernel.org
Cc: Joanne Koong, linux-fuse@lists.sourceforge.net,
linux-pm@vger.kernel.org, Miklos Szeredi
Kernel: 6.19.6-arch1-1
Component: fs/fuse, fs/fs-writeback
--- SUMMARY ---
A suspend-to-RAM hang in wb_wait_for_completion() via sync_inodes_sb() persists on 6.19.6 when fuse-overlayfs is mounted on tmpfs. The fix introduced in 6.19~rc6 for Debian bug #1120058 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") does not cover this code path.
--- BACKGROUND ---
This issue was originally reported in:
https://github.com/containers/fuse-overlayfs/issues/386
The fuse-overlayfs developer identified it as a kernel issue. It was subsequently bisected to commit 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree", merged in 6.16) and tracked in Debian as bug #1120058. The fix applied in 6.19~rc6 targets wait_sb_inodes(), but the hang in the reporter's original trace and in this report occurs in sync_inodes_sb() via a separate code path that the fix does not reach.
The issue is also known to syzbot as "INFO: task hung in sync_inodes_sb" with open instances on linux-5.15, linux-6.1, and linux-6.6:
https://syzkaller.appspot.com/bug?extid=e0232bd63c6e293aaf6a
https://syzkaller.appspot.com/bug?extid=4983a35cf671e5ed55b3
--- REPRODUCTION ---
mkdir -p /dev/shm/test/up
mkdir -p /dev/shm/test/tmp
mkdir -p /dev/shm/test/data
fuse-overlayfs -o "static_nlink,noacl,\
upperdir=/dev/shm/test/up,\
lowerdir=$HOME/.config/mozilla/firefox,\
workdir=/dev/shm/test/tmp" \
/dev/shm/test/data
firefox --profile /dev/shm/test/data/PROFILENAME
Browse to YouTube, start playing a video, then trigger suspend while the video is still playing in the browser. I triggered suspend from the XFCE4 menu.
The display goes blank and does not recover. The system does not enter suspend. Switching to a tty shows the system is alive but X11 is frozen. Reboot is blocked:
# reboot
Call to Reboot failed: Action suspend already in progress, refusing requested reboot operation.
--- CALL TRACE (6.19.6) ---
Mar 15 06:44:42 kernel: INFO: task kworker/u128:0:106160 blocked for more than 122 seconds.
Mar 15 06:44:42 kernel: Not tainted 6.19.6-arch1-1 #1
Mar 15 06:44:42 kernel: task:kworker/u128:0 state:D stack:0 pid:106160 tgid:106160 ppid:2
Mar 15 06:44:42 kernel: Workqueue: pm_fs_sync pm_fs_sync_work_fn
Mar 15 06:44:42 kernel: Call Trace:
Mar 15 06:44:42 kernel: <TASK>
Mar 15 06:44:42 kernel: __schedule+0x457/0x1720
Mar 15 06:44:42 kernel: schedule+0x27/0xd0
Mar 15 06:44:42 kernel: wb_wait_for_completion+0x97/0xe0
Mar 15 06:44:42 kernel: sync_inodes_sb+0xf8/0x2e0
Mar 15 06:44:42 kernel: __iterate_supers+0xdc/0x160
Mar 15 06:44:42 kernel: ksys_sync+0x43/0xb0
Mar 15 06:44:42 kernel: pm_fs_sync_work_fn+0x17/0xa0
Mar 15 06:44:42 kernel: process_one_work+0x193/0x350
Mar 15 06:44:42 kernel: worker_thread+0x1a1/0x310
Mar 15 06:44:42 kernel: kthread+0xfc/0x240
Mar 15 06:44:42 kernel: ret_from_fork+0x243/0x280
Mar 15 06:44:42 kernel: ret_from_fork_asm+0x1a/0x30
Mar 15 06:44:42 kernel: </TASK>
--- MORE CONTEXT ---
Compared to the original report (which ran through systemd-sleep -> pm_suspend -> ksys_sync), the sync call in 6.19 has been moved into a kernel workqueue via pm_fs_sync_work_fn. The hang point is identical: wb_wait_for_completion() inside sync_inodes_sb() never returns because the FUSE daemon (fuse-overlayfs) is unable to complete writeback while the suspend freezer is in progress.
The AS_NO_DATA_INTEGRITY fix targets wait_sb_inodes(), which is a separate function from sync_inodes_sb(). The hung writeback completion wait in sync_inodes_sb() is not gated by the AS_NO_DATA_INTEGRITY check and remains unaddressed.
Note: prior to commit 0c58a97f919c, sync() on FUSE filesystems was effectively a no-op, which avoided this hang at the cost of correctness. The regression was introduced when that commit made sync() actually wait on FUSE writeback completion.
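As a rough illustration of the ordering argued above, here is a small userspace C sketch. It is a model, not kernel code: it assumes that sync_inodes_sb() first queues writeback and waits in wb_wait_for_completion(), and only afterwards calls wait_sb_inodes(). The type and function names (model_sb, model_sync_inodes_sb, the blocked_in enum) are hypothetical and exist only for this example, and "daemon cannot reply" is a simplification of whatever keeps fuse-overlayfs from completing requests during suspend.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these types exist in the kernel. */
enum blocked_in {
	BLOCKED_NOWHERE,
	BLOCKED_IN_WB_WAIT_FOR_COMPLETION,
	BLOCKED_IN_WAIT_SB_INODES,
};

struct model_sb {
	bool daemon_can_reply;      /* false while the FUSE daemon cannot service requests */
	bool sync_issues_writeback; /* sync has dirty data it must hand to the daemon */
	bool writeback_in_flight;   /* older writeback is still pending */
	bool no_data_integrity;     /* mapping is marked AS_NO_DATA_INTEGRITY */
};

/* Returns the first step of the modeled sync that would never return. */
static enum blocked_in model_sync_inodes_sb(const struct model_sb *sb)
{
	/*
	 * Step 1: queue writeback work and wait for it to complete.
	 * The AS_NO_DATA_INTEGRITY flag is not consulted in this step.
	 */
	if (sb->sync_issues_writeback && !sb->daemon_can_reply)
		return BLOCKED_IN_WB_WAIT_FOR_COMPLETION;

	/*
	 * Step 2: wait on writeback that is already in flight. This is the
	 * step where the existing fix skips no-data-integrity mappings.
	 */
	if (sb->writeback_in_flight && !sb->daemon_can_reply &&
	    !sb->no_data_integrity)
		return BLOCKED_IN_WAIT_SB_INODES;

	return BLOCKED_NOWHERE;
}
```

In this model, setting no_data_integrity only helps once step 1 has returned, which matches the trace above where the task is stuck in wb_wait_for_completion().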
I am using Arch Linux, but users of Fedora 43 and Debian testing are also hitting this bug. See the references below.
References:
1. fuse-overlayfs upstream issue: https://github.com/containers/fuse-overlayfs/issues/386
2. profile-sync-daemon issue: https://github.com/graysky2/profile-sync-daemon/issues/411
3. Debian bug #1120058: https://bugs.debian.org/1120058
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
From: Joanne Koong @ 2026-03-17 0:15 UTC (permalink / raw)
To: John
Cc: linux-fsdevel@vger.kernel.org, linux-fuse@lists.sourceforge.net,
linux-pm@vger.kernel.org, Miklos Szeredi, Jan Kara

On Sun, Mar 15, 2026 at 4:24 AM John <therealgraysky@proton.me> wrote:

Hi John,

Thanks for your detailed report.

> The AS_NO_DATA_INTEGRITY fix targets wait_sb_inodes(), which is a separate function from sync_inodes_sb(). The hung writeback completion wait in sync_inodes_sb() is not gated by the AS_NO_DATA_INTEGRITY check and remains unaddressed.

I'll need to run the repro and verify this is the issue, but I think it's because it's hitting this call chain if there's a dirty folio that is already under writeback that needs to have writeback issued on it again:

wb_workfn()
  wb_do_writeback()
    wb_writeback()
      writeback_sb_inodes()
        __writeback_single_inode()
          do_writepages()
            fuse_writepages()
              iomap_writepages()
                writeback_iter()
                  writeback_get_folio()
                    folio_prepare_writeback()

where in the folio_prepare_writeback() logic:

	if (!folio_test_dirty(folio))
		return false;
	if (folio_test_writeback(folio)) {
		if (wbc->sync_mode == WB_SYNC_NONE)
			return false;
		folio_wait_writeback(folio);

Could you verify if this fixes the issue?:

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 8f8069fb76ba..4cbb22d80acf 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1990,9 +1990,13 @@ static long writeback_sb_inodes(struct super_block *sb,
 		 * Don't bother with new inodes or inodes being freed, first
 		 * kind does not need periodic writeout yet, and for the latter
 		 * kind writeout is handled by the freer.
+		 *
+		 * For sync(2), skip inodes whose mappings have no data
+		 * integrity guarantees (eg FUSE).
 		 */
 		spin_lock(&inode->i_lock);
-		if (inode_state_read(inode) & (I_NEW | I_FREEING | I_WILL_FREE)) {
+		if ((inode_state_read(inode) & (I_NEW | I_FREEING | I_WILL_FREE)) ||
+		    (wbc.for_sync && mapping_no_data_integrity(inode->i_mapping))) {
 			redirty_tail_locked(inode, wb);
 			spin_unlock(&inode->i_lock);
 			continue;

I think the changes from commit f9a49aa302a0 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") are still needed even with the above patch since afaics, the writeback we wait on could have been from a previous background/periodic writeback run before sync was triggered.

Thanks,
Joanne
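The two decisions discussed in this message can be sketched in userspace C as follows. This is only a model under stated assumptions: the enum and function names are hypothetical stand-ins, the folio decision covers just the quoted excerpt of folio_prepare_writeback() (whatever follows folio_wait_writeback() in the kernel is collapsed into a single "write" outcome), and the inode predicate mirrors only the condition in the proposed diff.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's writeback sync modes. */
enum model_sync_mode { MODEL_WB_SYNC_NONE, MODEL_WB_SYNC_ALL };

enum folio_action {
	FOLIO_SKIP,            /* clean, or busy and the caller will not wait */
	FOLIO_WRITE,           /* dirty and idle: issue writeback now */
	FOLIO_WAIT_THEN_WRITE, /* dirty and already under writeback: wait first */
};

/* Models the quoted excerpt of folio_prepare_writeback(). */
static enum folio_action model_prepare_writeback(bool dirty,
						 bool under_writeback,
						 enum model_sync_mode mode)
{
	if (!dirty)
		return FOLIO_SKIP;
	if (under_writeback) {
		if (mode == MODEL_WB_SYNC_NONE)
			return FOLIO_SKIP;
		/* Data-integrity writeback waits here, possibly forever. */
		return FOLIO_WAIT_THEN_WRITE;
	}
	return FOLIO_WRITE;
}

/*
 * Models the condition in the proposed writeback_sb_inodes() change:
 * skip inodes that are new or being freed, and, for sync(2) only,
 * inodes whose mapping has no data integrity guarantees.
 */
static bool model_skip_inode(bool new_or_freeing, bool for_sync,
			     bool no_data_integrity)
{
	return new_or_freeing || (for_sync && no_data_integrity);
}
```

Under this model the wait is reachable only for a dirty folio already under writeback in the WB_SYNC_ALL case, and the proposed predicate keeps such inodes from being visited at all during sync(2) while leaving background writeback unaffected.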
* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
From: John @ 2026-03-17 21:07 UTC (permalink / raw)
To: Joanne Koong
Cc: linux-fsdevel@vger.kernel.org, linux-fuse@lists.sourceforge.net,
linux-pm@vger.kernel.org, Miklos Szeredi, Jan Kara

On Monday, March 16th, 2026 at 8:16 PM, Joanne Koong wrote:
> Could you verify if this fixes the issue?:

Yes, it does! Amazing!

I applied it to 6.19.8; after rebooting and following the steps I outlined above, I was able to suspend and wake up three consecutive times.

Is this something you plan to submit/backport to the stable kernel series?
* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
From: Joanne Koong @ 2026-03-17 22:55 UTC (permalink / raw)
To: John
Cc: linux-fsdevel@vger.kernel.org, linux-fuse@lists.sourceforge.net,
linux-pm@vger.kernel.org, Miklos Szeredi, Jan Kara

On Tue, Mar 17, 2026 at 2:08 PM John <therealgraysky@proton.me> wrote:
> I applied it to 6.19.8; after rebooting and following the steps I outlined above, I was able to suspend and wake up three consecutive times.

Thank you for testing this.

> Is this something you plan to submit/backport to the stable kernel series?

Definitely. I will submit this patch upstream to the fs mailing list either today or tomorrow and cc stable@.

Thanks,
Joanne
* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
From: Joanne Koong @ 2026-03-18 3:50 UTC (permalink / raw)
To: John
Cc: linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org,
Miklos Szeredi, Jan Kara

On Tue, Mar 17, 2026 at 2:08 PM John <therealgraysky@proton.me> wrote:
> I applied it to 6.19.8; after rebooting and following the steps I outlined above, I was able to suspend and wake up three consecutive times.

I think this is the cleaner fix:

writeback: skip sync(2) inode writeback for filesystems with no data integrity guarantees

Add SB_I_NO_DATA_INTEGRITY superblock flag for filesystems that cannot
guarantee data persistence on sync (eg fuse) and skip sync(2) inode
writeback for superblocks with this flag set.

This replaces the per-inode AS_NO_DATA_INTEGRITY mapping flag added in
commit f9a49aa302a0 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings
in wait_sb_inodes()"). The flag belongs at the superblock level because
data integrity is a filesystem-wide property, not a per-inode one. This
also allows sync_inodes_one_sb() to skip the entire filesystem
efficiently, rather than iterating every dirty inode only to skip each
one individually.

Without this, sync(2) triggers writeback on FUSE inodes and may block
waiting for the daemon to complete issued writeback or setattr (from
->write_inode()) requests.

This restores fuse to its prior behavior before tmp folios were removed,
where sync was essentially a no-op.

Reported-by: John <therealgraysky@proton.me>
Fixes: 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree")
Cc: <stable@vger.kernel.org>
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 fs/fs-writeback.c              |  7 +------
 fs/fuse/file.c                 |  4 +---
 fs/fuse/inode.c                |  1 +
 fs/sync.c                      |  2 +-
 include/linux/fs/super_types.h |  1 +
 include/linux/pagemap.h        | 11 -----------
 6 files changed, 5 insertions(+), 21 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 8f8069fb76ba..7a02483e0d8d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2774,13 +2774,8 @@ static void wait_sb_inodes(struct super_block *sb)
 		 * The mapping can appear untagged while still on-list since we
 		 * do not have the mapping lock. Skip it here, wb completion
 		 * will remove it.
-		 *
-		 * If the mapping does not have data integrity semantics,
-		 * there's no need to wait for the writeout to complete, as the
-		 * mapping cannot guarantee that data is persistently stored.
 		 */
-		if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK) ||
-		    mapping_no_data_integrity(mapping))
+		if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
 			continue;
 
 		spin_unlock_irq(&sb->s_inode_wblist_lock);
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 7294bd347412..111ccc5bdda3 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3205,10 +3205,8 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
 
 	inode->i_fop = &fuse_file_operations;
 	inode->i_data.a_ops = &fuse_file_aops;
-	if (fc->writeback_cache) {
+	if (fc->writeback_cache)
 		mapping_set_writeback_may_deadlock_on_reclaim(&inode->i_data);
-		mapping_set_no_data_integrity(&inode->i_data);
-	}
 
 	INIT_LIST_HEAD(&fi->write_files);
 	INIT_LIST_HEAD(&fi->queued_writes);
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index af8ad96829fd..bae7d9ac3a43 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1769,6 +1769,7 @@ static void fuse_sb_defaults(struct super_block *sb)
 	sb->s_export_op = &fuse_export_operations;
 	sb->s_iflags |= SB_I_IMA_UNVERIFIABLE_SIGNATURE;
 	sb->s_iflags |= SB_I_NOIDMAP;
+	sb->s_iflags |= SB_I_NO_DATA_INTEGRITY;
 	if (sb->s_user_ns != &init_user_ns)
 		sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER;
 	sb->s_flags &= ~(SB_NOSEC | SB_I_VERSION);
diff --git a/fs/sync.c b/fs/sync.c
index 942a60cfedfb..88c08e2f76b2 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -73,7 +73,7 @@ EXPORT_SYMBOL(sync_filesystem);
 
 static void sync_inodes_one_sb(struct super_block *sb, void *arg)
 {
-	if (!sb_rdonly(sb))
+	if (!sb_rdonly(sb) && !(sb->s_iflags & SB_I_NO_DATA_INTEGRITY))
 		sync_inodes_sb(sb);
 }
 
diff --git a/include/linux/fs/super_types.h b/include/linux/fs/super_types.h
index fa7638b81246..383050e7fdf5 100644
--- a/include/linux/fs/super_types.h
+++ b/include/linux/fs/super_types.h
@@ -338,5 +338,6 @@ struct super_block {
 #define SB_I_NOUMASK	0x00001000	/* VFS does not apply umask */
 #define SB_I_NOIDMAP	0x00002000	/* No idmapped mounts on this superblock */
 #define SB_I_ALLOW_HSM	0x00004000	/* Allow HSM events on this superblock */
+#define SB_I_NO_DATA_INTEGRITY	0x00008000 /* fs cannot guarantee data persistence on sync */
 
 #endif /* _LINUX_FS_SUPER_TYPES_H */
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ec442af3f886..31a848485ad9 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -210,7 +210,6 @@ enum mapping_flags {
 	AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM = 9,
 	AS_KERNEL_FILE = 10,	/* mapping for a fake kernel file that shouldn't
 				   account usage to user cgroups */
-	AS_NO_DATA_INTEGRITY = 11, /* no data integrity guarantees */
 	/* Bits 16-25 are used for FOLIO_ORDER */
 	AS_FOLIO_ORDER_BITS = 5,
 	AS_FOLIO_ORDER_MIN = 16,
@@ -346,16 +345,6 @@ static inline bool mapping_writeback_may_deadlock_on_reclaim(const struct addres
 	return test_bit(AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM, &mapping->flags);
 }
 
-static inline void mapping_set_no_data_integrity(struct address_space *mapping)
-{
-	set_bit(AS_NO_DATA_INTEGRITY, &mapping->flags);
-}
-
-static inline bool mapping_no_data_integrity(const struct address_space *mapping)
-{
-	return test_bit(AS_NO_DATA_INTEGRITY, &mapping->flags);
-}
-
 static inline gfp_t mapping_gfp_mask(const struct address_space *mapping)
 {
 	return mapping->gfp_mask;

I tested it locally on my simulated repro and it fixed the issue for me. Could you verify that this patch fixes the issue for you as well?

Thanks,
Joanne
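The superblock-level gate in the patch above can be modeled in userspace C roughly as follows. This is a sketch, not the kernel implementation: model_super_block and the model_* functions are hypothetical, the rdonly field stands in for sb_rdonly(), and only the two flag values that appear in the diff are reproduced.

```c
#include <stdbool.h>

/* Flag values copied from the diff; the MODEL_ prefix marks them as stand-ins. */
#define MODEL_SB_I_NOIDMAP           0x00002000UL
#define MODEL_SB_I_NO_DATA_INTEGRITY 0x00008000UL

/* Hypothetical, heavily reduced stand-in for struct super_block. */
struct model_super_block {
	bool rdonly;            /* stands in for sb_rdonly(sb) */
	unsigned long s_iflags; /* internal SB_I_* style flags */
};

/* Mirrors the patched condition in sync_inodes_one_sb(). */
static bool model_should_sync_inodes(const struct model_super_block *sb)
{
	return !sb->rdonly && !(sb->s_iflags & MODEL_SB_I_NO_DATA_INTEGRITY);
}

/* Counts how many superblocks a modeled sync(2) pass would write back. */
static int model_count_synced(const struct model_super_block *sbs, int n)
{
	int synced = 0;

	for (int i = 0; i < n; i++)
		if (model_should_sync_inodes(&sbs[i]))
			synced++;
	return synced;
}
```

With one writable superblock that lacks the flag, one read-only superblock, and one FUSE-like superblock carrying MODEL_SB_I_NO_DATA_INTEGRITY, only the first is counted, so the flagged filesystem is skipped as a whole rather than inode by inode.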
* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
From: John @ 2026-03-18 20:37 UTC (permalink / raw)
To: Joanne Koong
Cc: linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org,
Miklos Szeredi, Jan Kara

On Tuesday, March 17th, 2026 at 11:51 PM, Joanne Koong <joannelkoong@gmail.com> wrote:
> I think this is the cleaner fix:
>
> writeback: skip sync(2) inode writeback for filesystems with no data
> integrity guarantees
>
> I tested it locally on my simulated repro and it fixed the issue for
> me. Could you verify that this patch fixes the issue for you as well?

This new patch also corrects the bug. Same test as before. Thank you!
* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
From: Joanne Koong @ 2026-03-18 20:45 UTC (permalink / raw)
To: John
Cc: linux-fsdevel@vger.kernel.org, linux-pm@vger.kernel.org,
Miklos Szeredi, Jan Kara

On Wed, Mar 18, 2026 at 1:37 PM John <therealgraysky@proton.me> wrote:
> This new patch also corrects the bug. Same test as before. Thank you!

Great, I will submit this patch to the mailing list today. Thank you for verifying!
* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
2026-03-17 0:15 ` Joanne Koong
2026-03-17 21:07 ` John
@ 2026-03-17 23:25 ` Joanne Koong
2026-03-18 22:31 ` Joanne Koong
1 sibling, 1 reply; 9+ messages in thread
From: Joanne Koong @ 2026-03-17 23:25 UTC (permalink / raw)
To: John
Cc: linux-fsdevel@vger.kernel.org, linux-fuse@lists.sourceforge.net,
linux-pm@vger.kernel.org, Miklos Szeredi, Jan Kara

On Mon, Mar 16, 2026 at 5:15 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Sun, Mar 15, 2026 at 4:24 AM John <therealgraysky@proton.me> wrote:
> >
>
> Hi John,
>
> Thanks for your detailed report.
>
> > Kernel: 6.19.6-arch1-1
> > Component: fs/fuse, fs/fs-writeback
> >
> > --- SUMMARY ---
> >
> > A suspend-to-RAM hang in wb_wait_for_completion() via sync_inodes_sb() persists on 6.19.6 when fuse-overlayfs is mounted on tmpfs. The fix introduced in 6.19~rc6 for Debian bug #1120058 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") does not cover this code path.
> >
> > --- BACKGROUND ---
> >
> > This issue was originally reported in:
> > https://github.com/containers/fuse-overlayfs/issues/386
> >
> > The fuse-overlayfs developer identified it as a kernel issue. It was subsequently bisected to commit 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree", merged in 6.16) and tracked in Debian as bug #1120058. The fix applied in 6.19~rc6 targets wait_sb_inodes(), but the hang in the reporter's original trace and in this report occurs in sync_inodes_sb() via a separate code path that the fix does not reach.
> >
> > The issue is also known to syzbot as "INFO: task hung in sync_inodes_sb" with open instances on linux-5.15, linux-6.1, and linux-6.6:
> >
> > https://syzkaller.appspot.com/bug?extid=e0232bd63c6e293aaf6a
> > https://syzkaller.appspot.com/bug?extid=4983a35cf671e5ed55b3
> >
> > --- REPRODUCTION ---
> >
> > mkdir -p /dev/shm/test/up
> > mkdir -p /dev/shm/test/tmp
> > mkdir -p /dev/shm/test/data
> >
> > fuse-overlayfs -o "static_nlink,noacl,\
> > upperdir=/dev/shm/test/up,\
> > lowerdir=$HOME/.config/mozilla/firefox,\
> > workdir=/dev/shm/test/tmp" \
> > /dev/shm/test/data
> >
> > firefox --profile /dev/shm/test/data/PROFILENAME
> >
> > Browse to youtube and start playing a video then trigger suspend while the video is playing in the browser. I used XFCE4 suspend from the menu.
> >
> > The display goes blank and does not recover. The system does not enter suspend. Switching to a tty shows the system is alive but X11 is frozen. Reboot is blocked:
> >
> > # reboot
> > Call to Reboot failed: Action suspend already in progress, refusing requested reboot operation.
> >
> > --- CALL TRACE (6.19.6) ---
> >
> > Mar 15 06:44:42 kernel: INFO: task kworker/u128:0:106160 blocked for more than 122 seconds.
> > Mar 15 06:44:42 kernel: Not tainted 6.19.6-arch1-1 #1
> > Mar 15 06:44:42 kernel: task:kworker/u128:0 state:D stack:0 pid:106160 tgid:106160 ppid:2
> > Mar 15 06:44:42 kernel: Workqueue: pm_fs_sync pm_fs_sync_work_fn
> > Mar 15 06:44:42 kernel: Call Trace:
> > Mar 15 06:44:42 kernel: <TASK>
> > Mar 15 06:44:42 kernel: __schedule+0x457/0x1720
> > Mar 15 06:44:42 kernel: schedule+0x27/0xd0
> > Mar 15 06:44:42 kernel: wb_wait_for_completion+0x97/0xe0
> > Mar 15 06:44:42 kernel: sync_inodes_sb+0xf8/0x2e0
> > Mar 15 06:44:42 kernel: __iterate_supers+0xdc/0x160
> > Mar 15 06:44:42 kernel: ksys_sync+0x43/0xb0
> > Mar 15 06:44:42 kernel: pm_fs_sync_work_fn+0x17/0xa0
> > Mar 15 06:44:42 kernel: process_one_work+0x193/0x350
> > Mar 15 06:44:42 kernel: worker_thread+0x1a1/0x310
> > Mar 15 06:44:42 kernel: kthread+0xfc/0x240
> > Mar 15 06:44:42 kernel: ret_from_fork+0x243/0x280
> > Mar 15 06:44:42 kernel: ret_from_fork_asm+0x1a/0x30
> > Mar 15 06:44:42 kernel: </TASK>
> >
> > --- MORE CONTEXT ---
> >
> > Compared to the original report (which ran through systemd-sleep -> pm_suspend -> ksys_sync), the sync call in 6.19 has been moved into a kernel workqueue via pm_fs_sync_work_fn. The hang point is identical: wb_wait_for_completion() inside sync_inodes_sb() never returns because the FUSE daemon (fuse-overlayfs) is unable to complete writeback while the suspend freezer is in progress.
> >
> > The AS_NO_DATA_INTEGRITY fix targets wait_sb_inodes(), which is a separate function from sync_inodes_sb(). The hung writeback completion wait in sync_inodes_sb() is not gated by the AS_NO_DATA_INTEGRITY check and remains unaddressed.
>
> I'll need to run the repro and verify this is the issue, but I think
> it's because it's hitting this call chain if there's a dirty folio
> that is already under writeback that needs to have writeback issued on
> it again:
>
> wb_workfn()
>   wb_do_writeback()
>     wb_writeback()
>       writeback_sb_inodes()
>         __writeback_single_inode()
>           do_writepages()
>             fuse_writepages()
>               iomap_writepages()
>                 writeback_iter()
>                   writeback_get_folio()
>                     folio_prepare_writeback()
>
> where in the folio_prepare_writeback() logic:
>
> if (!folio_test_dirty(folio))
>         return false;
>
> if (folio_test_writeback(folio)) {
>         if (wbc->sync_mode == WB_SYNC_NONE)
>                 return false;
>         folio_wait_writeback(folio);
>

I couldn't get the firefox+youtube scenario running because my VM
lacks graphics support and I couldn't figure out how to get that to
work, but I reproduced the bug using fuse-overlayfs with a synthetic
I/O workload:

(while true; do dd if=/dev/urandom of=/dev/shm/test/data/test.default/file1 bs=1M count=10 2>/dev/null; done) &
sleep 2
systemctl suspend

which gave pretty much the same stack trace:

[ 154.892336] INFO: task kworker/u64:1:119 blocked for more than 100 seconds.
[ 154.892967] Not tainted 7.0.0-rc1-g6dcceeb72856-dirty #1792
[ 154.893452] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 154.894048] task:kworker/u64:1 state:D stack:0 pid:119 tgid:119 ppid:2 task_flags:0x4208160 flags:0x00080000
[ 154.894879] Workqueue: pm_fs_sync pm_fs_sync_work_fn
[ 154.895309] Call Trace:
[ 154.895514] <TASK>
[ 154.895759] __schedule+0xeaa/0x3d60
[ 154.896060] ? __filemap_fdatawait_range+0xb0/0x160
[ 154.896435] ? __pfx___schedule+0x10/0x10
[ 154.896744] ? bdi_split_work_to_wbs+0x50b/0x990
[ 154.897093] schedule+0x7e/0x2e0
[ 154.897344] wb_wait_for_completion+0x14b/0x1e0
[ 154.897690] ? __pfx_wb_wait_for_completion+0x10/0x10
[ 154.898112] ? __pfx_autoremove_wake_function+0x10/0x10
[ 154.898550] ? __pfx_mutex_unlock+0x10/0x10
[ 154.898915] ? iput+0x61/0x600
[ 154.899175] sync_inodes_sb+0x1be/0x740
[ 154.899490] ? __pfx_down_read+0x10/0x10
[ 154.899814] ? __pfx_sync_inodes_sb+0x10/0x10
[ 154.900165] ? __pfx_super_lock+0x10/0x10
[ 154.900495] ? _raw_spin_lock+0x84/0xe0
[ 154.900826] ? __pfx__raw_spin_lock+0x10/0x10
[ 154.901154] __iterate_supers+0x176/0x280
[ 154.901455] ? __pfx_sync_inodes_one_sb+0x10/0x10
[ 154.901845] ? _raw_spin_unlock_irq+0xe/0x30
[ 154.902169] ksys_sync+0x87/0xf0
[ 154.902428] ? __pfx_ksys_sync+0x10/0x10
[ 154.902772] ? kvm_clock_get_cycles+0x18/0x30
[ 154.903131] ? ktime_get+0x65/0x140
[ 154.903417] pm_fs_sync_work_fn+0x17/0xc0
[ 154.903750] process_one_work+0x656/0x1150
[ 154.904083] ? assign_work+0x122/0x3e0
[ 154.904367] worker_thread+0x5de/0xcb0
[ 154.904670] ? __pfx_worker_thread+0x10/0x10
[ 154.905052] kthread+0x34d/0x450

Further debugging showed it's not due to that call chain with the
folio_wait_writeback() in folio_prepare_writeback() scenario described
in the previous message, but due to the fuse daemon being stuck and
thus unable to fully service the ->write_inode() by the wb_workfn:

PID 121 (kworker/u64:2+flush-0:51):
[<0>] request_wait_answer+0x215/0x560 [fuse]
[<0>] __fuse_simple_request+0x341/0xc80 [fuse]
[<0>] fuse_flush_times+0x2ff/0x410 [fuse]
[<0>] fuse_write_inode+0x93/0x100 [fuse]
[<0>] __writeback_single_inode+0x5e1/0x930
[<0>] writeback_sb_inodes+0x53e/0xe40
[<0>] __writeback_inodes_wb+0xb7/0x200
[<0>] wb_writeback+0x571/0x790
[<0>] wb_workfn+0x73a/0xb90
[<0>] process_one_work+0x656/0x1150
[<0>] worker_thread+0x5de/0xcb0
[<0>] kthread+0x34d/0x450
[<0>] ret_from_fork+0x3ca/0x640
[<0>] ret_from_fork_asm+0x1a/0x30

In the kernel suspend code though, the sync (pm_sleep_fs_sync())
happens before freezing any userspace processes
(suspend_freeze_processes()) so it's odd for the fuse daemon to be
stuck.

It turns out the daemon is stuck because systemd freezes the user
session cgroups first before invoking the kernel suspend:

> ps aux | grep -i "fuse-overlayfs"
vmuser 826 0.0 0.0 12316 2616 ? Ss 15:02 0:00 fuse-overlayfs -o ...

> cgroup=$(cat /proc/826/cgroup | head -1 | cut -d: -f3)
> echo "cgroup: $cgroup"
> cat /sys/fs/cgroup${cgroup}/cgroup.freeze 2>/dev/null || echo "No cgroup freeze file"
> cat /sys/fs/cgroup${cgroup}/cgroup.events 2>/dev/null | grep frozen
cgroup: /user.slice/user-1001.slice/session-3.scope
1
frozen 1

Even though the root cause is the cgroup freezer rather than the
folio_prepare_writeback() scenario, the fix proposed previously is
still the fix for this.

Thanks,
Joanne

>
> Could you verify if this fixes the issue?:
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 8f8069fb76ba..4cbb22d80acf 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1990,9 +1990,13 @@ static long writeback_sb_inodes(struct super_block *sb,
>                   * Don't bother with new inodes or inodes being freed, first
>                   * kind does not need periodic writeout yet, and for the latter
>                   * kind writeout is handled by the freer.
> +                 *
> +                 * For sync(2), skip inodes whose mappings have no data
> +                 * integrity guarantees (eg FUSE).
>                   */
>                  spin_lock(&inode->i_lock);
> -                if (inode_state_read(inode) & (I_NEW | I_FREEING | I_WILL_FREE)) {
> +                if ((inode_state_read(inode) & (I_NEW | I_FREEING | I_WILL_FREE)) ||
> +                    (wbc.for_sync && mapping_no_data_integrity(inode->i_mapping))) {
>                          redirty_tail_locked(inode, wb);
>                          spin_unlock(&inode->i_lock);
>                          continue;
>
> I think the changes from commit f9a49aa302a0 ("fs/writeback: skip
> AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") are still needed
> even with the above patch since afaics, the writeback we wait on could
> have been from a previous background/periodic writeback run before
> sync was triggered.
>
> Thanks,
> Joanne
>
> >
> > Note: prior to commit 0c58a97f919c, sync() on FUSE filesystems was effectively a no-op, which avoided this hang at the cost of correctness. The regression was introduced when that commit made sync() actually wait on FUSE writeback completion.
> >
> > I am using Arch Linux but users of Fedora F43 and Debian testing are hitting this bug also. See refs below.
> >
> > References:
> > 1. fuse-overlayfs upstream issue: https://github.com/containers/fuse-overlayfs/issues/386
> > 2. profile-sync-daemon issue: https://github.com/graysky2/profile-sync-daemon/issues/411
> > 3. Debian bug #1120058: https://bugs.debian.org/1120058
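
A minimal sketch of the cgroup freezer check from the message above, wrapped in small shell helpers so it can be repeated against any FUSE daemon PID. It assumes a cgroup v2 unified hierarchy mounted at /sys/fs/cgroup, where /proc/<pid>/cgroup carries a "0::<path>" entry. The helper names (cgroup_v2_path, cgroup_events_frozen, report_frozen) are hypothetical and do not come from the thread.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not from the thread.

# Print the cgroup v2 path from a /proc/<pid>/cgroup style file.
# On a unified hierarchy the entry has the form "0::<path>".
cgroup_v2_path() {
    sed -n 's/^0:://p' "$1" | head -n 1
}

# Succeed (exit 0) if a cgroup.events style file contains "frozen 1".
cgroup_events_frozen() {
    grep -q '^frozen 1$' "$1"
}

# Report whether the process with the given PID sits in a frozen cgroup.
# Assumes cgroup v2 is mounted at /sys/fs/cgroup.
report_frozen() {
    pid=$1
    path=$(cgroup_v2_path "/proc/$pid/cgroup")
    events="/sys/fs/cgroup${path}/cgroup.events"
    if [ -z "$path" ] || [ ! -r "$events" ]; then
        echo "pid $pid: cannot read cgroup.events"
        return 2
    fi
    if cgroup_events_frozen "$events"; then
        echo "pid $pid: cgroup $path is frozen"
    else
        echo "pid $pid: cgroup $path is not frozen"
    fi
}
```

After sourcing the file, "report_frozen 826" would check the fuse-overlayfs PID shown in the message. It has to be run from a shell that is not itself inside the frozen session cgroup.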
* Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
2026-03-17 23:25 ` Joanne Koong
@ 2026-03-18 22:31 ` Joanne Koong
0 siblings, 0 replies; 9+ messages in thread
From: Joanne Koong @ 2026-03-18 22:31 UTC (permalink / raw)
To: John
Cc: linux-fsdevel@vger.kernel.org, linux-fuse@lists.sourceforge.net,
linux-pm@vger.kernel.org, Miklos Szeredi, Jan Kara

On Tue, Mar 17, 2026 at 4:25 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> [...]
>
> I couldn't get the firefox+youtube scenario running because my VM
> lacks graphics support and I couldn't figure out how to get that to
> work, but I reproduced the bug using fuse-overlayfs with a synthetic
> I/O workload:

(fwiw, my repro triggers the hang even prior to the tmp pages being
removed, so I think the firefox + youtube scenario you're running into
is indeed the folio_prepare_writeback() scenario)

> [...]