Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)

From: Joanne Koong <joannelkoong@gmail.com>
To: John <therealgraysky@proton.me>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	 "linux-fuse@lists.sourceforge.net"
	<linux-fuse@lists.sourceforge.net>,
	 "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	Miklos Szeredi <miklos@szeredi.hu>, Jan Kara <jack@suse.cz>
Subject: Re: [BUG] fuse: wb_wait_for_completion hang on suspend with fuse-overlayfs on tmpfs persists in 6.19.6 (AS_NO_DATA_INTEGRITY fix incomplete)
Date: Wed, 18 Mar 2026 15:31:21 -0700
Message-ID: <CAJnrk1YMcRfrMRLVc+jVE9RqB9344+mVL0rj+_ESKGzp2ZdMRg@mail.gmail.com>
In-Reply-To: <CAJnrk1a-asuvfrbKXbEwwDSctvemF+6zfhdnuzO65Pt8HsFSRw@mail.gmail.com>

On Tue, Mar 17, 2026 at 4:25 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Mon, Mar 16, 2026 at 5:15 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > On Sun, Mar 15, 2026 at 4:24 AM John <therealgraysky@proton.me> wrote:
> > >
> >
> > Hi John,
> >
> > Thanks for your detailed report.
> >
> > > Kernel: 6.19.6-arch1-1
> > > Component: fs/fuse, fs/fs-writeback
> > >
> > > --- SUMMARY ---
> > >
> > > A suspend-to-RAM hang in wb_wait_for_completion() via sync_inodes_sb() persists on 6.19.6 when fuse-overlayfs is mounted on tmpfs. The fix introduced in 6.19~rc6 for Debian bug #1120058 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()") does not cover this code path.
> > >
> > > --- BACKGROUND ---
> > >
> > > This issue was originally reported in:
> > > https://github.com/containers/fuse-overlayfs/issues/386
> > >
> > > The fuse-overlayfs developer identified it as a kernel issue. It was subsequently bisected to commit 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree", merged in 6.16) and tracked in Debian as bug #1120058. The fix applied in 6.19~rc6 targets wait_sb_inodes(), but the hang in the reporter's original trace and in this report occurs in sync_inodes_sb() via a separate code path that the fix does not reach.
> > >
> > > The issue is also known to syzbot as "INFO: task hung in sync_inodes_sb" with open instances on linux-5.15, linux-6.1, and linux-6.6:
> > >
> > > https://syzkaller.appspot.com/bug?extid=e0232bd63c6e293aaf6a
> > > https://syzkaller.appspot.com/bug?extid=4983a35cf671e5ed55b3
> > >
> > > --- REPRODUCTION ---
> > >
> > > mkdir -p /dev/shm/test/up
> > > mkdir -p /dev/shm/test/tmp
> > > mkdir -p /dev/shm/test/data
> > >
> > > fuse-overlayfs -o "static_nlink,noacl,\
> > >   upperdir=/dev/shm/test/up,\
> > >   lowerdir=$HOME/.config/mozilla/firefox,\
> > >   workdir=/dev/shm/test/tmp" \
> > >   /dev/shm/test/data
> > >
> > > firefox --profile /dev/shm/test/data/PROFILENAME
> > >
> > > Browse to youtube and start playing a video then trigger suspend while the video is playing in the browser. I used XFCE4 suspend from the menu.
> > >
> > > The display goes blank and does not recover. The system does not enter suspend. Switching to a tty shows the system is alive but X11 is frozen. Reboot is blocked:
> > >
> > > # reboot
> > > Call to Reboot failed: Action suspend already in progress, refusing requested reboot operation.
> > >
> > > --- CALL TRACE (6.19.6) ---
> > >
> > > Mar 15 06:44:42 kernel: INFO: task kworker/u128:0:106160 blocked for more than 122 seconds.
> > > Mar 15 06:44:42 kernel:       Not tainted 6.19.6-arch1-1 #1
> > > Mar 15 06:44:42 kernel: task:kworker/u128:0  state:D stack:0     pid:106160 tgid:106160 ppid:2
> > > Mar 15 06:44:42 kernel: Workqueue: pm_fs_sync pm_fs_sync_work_fn
> > > Mar 15 06:44:42 kernel: Call Trace:
> > > Mar 15 06:44:42 kernel:  <TASK>
> > > Mar 15 06:44:42 kernel:  __schedule+0x457/0x1720
> > > Mar 15 06:44:42 kernel:  schedule+0x27/0xd0
> > > Mar 15 06:44:42 kernel:  wb_wait_for_completion+0x97/0xe0
> > > Mar 15 06:44:42 kernel:  sync_inodes_sb+0xf8/0x2e0
> > > Mar 15 06:44:42 kernel:  __iterate_supers+0xdc/0x160
> > > Mar 15 06:44:42 kernel:  ksys_sync+0x43/0xb0
> > > Mar 15 06:44:42 kernel:  pm_fs_sync_work_fn+0x17/0xa0
> > > Mar 15 06:44:42 kernel:  process_one_work+0x193/0x350
> > > Mar 15 06:44:42 kernel:  worker_thread+0x1a1/0x310
> > > Mar 15 06:44:42 kernel:  kthread+0xfc/0x240
> > > Mar 15 06:44:42 kernel:  ret_from_fork+0x243/0x280
> > > Mar 15 06:44:42 kernel:  ret_from_fork_asm+0x1a/0x30
> > > Mar 15 06:44:42 kernel:  </TASK>
> > >
> > > --- MORE CONTEXT ---
> > >
> > > Compared to the original report (which ran through systemd-sleep -> pm_suspend -> ksys_sync), the sync call in 6.19 has been moved into a kernel workqueue via pm_fs_sync_work_fn. The hang point is identical: wb_wait_for_completion() inside sync_inodes_sb() never returns because the FUSE daemon (fuse-overlayfs) is unable to complete writeback while the suspend freezer is in progress.
> > >
> > > The AS_NO_DATA_INTEGRITY fix targets wait_sb_inodes(), which is a separate function from sync_inodes_sb(). The hung writeback completion wait in sync_inodes_sb() is not gated by the AS_NO_DATA_INTEGRITY check and remains unaddressed.
> >
> > I'll need to run the repro to verify this, but I think the hang
> > comes from the call chain below, which is hit when a dirty folio
> > that is already under writeback needs to have writeback issued on
> > it again:
> >
> > wb_workfn()
> >   wb_do_writeback()
> >     wb_writeback()
> >       writeback_sb_inodes()
> >         __writeback_single_inode()
> >           do_writepages()
> >             fuse_writepages()
> >               iomap_writepages()
> >                 writeback_iter()
> >                   writeback_get_folio()
> >                     folio_prepare_writeback()
> >
> > where in the folio_prepare_writeback() logic:
> >
> >         if (!folio_test_dirty(folio))
> >                 return false;
> >
> >         if (folio_test_writeback(folio)) {
> >                 if (wbc->sync_mode == WB_SYNC_NONE)
> >                         return false;
> >                 folio_wait_writeback(folio);
> >
>
> I couldn't get the firefox+youtube scenario running because my VM
> lacks graphics support and I couldn't figure out how to get that to
> work, but I reproduced the bug using fuse-overlayfs with a synthetic
> I/O workload:

(FWIW, my repro triggers the hang even on kernels from before the tmp
pages were removed, so it looks like a separate problem from the
regression you bisected. I think the firefox + youtube scenario you're
running into is indeed the folio_prepare_writeback() scenario.)

>
> (while true; do
>     dd if=/dev/urandom of=/dev/shm/test/data/test.default/file1 \
>         bs=1M count=10 2>/dev/null
> done) &
> sleep 2
> systemctl suspend
>
> which gave pretty much the same stack trace:
>
> [  154.892336] INFO: task kworker/u64:1:119 blocked for more than 100 seconds.
> [  154.892967]       Not tainted 7.0.0-rc1-g6dcceeb72856-dirty #1792
> [  154.893452] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  154.894048] task:kworker/u64:1   state:D stack:0     pid:119 tgid:119   ppid:2      task_flags:0x4208160 flags:0x00080000
> [  154.894879] Workqueue: pm_fs_sync pm_fs_sync_work_fn
> [  154.895309] Call Trace:
> [  154.895514]  <TASK>
> [  154.895759]  __schedule+0xeaa/0x3d60
> [  154.896060]  ? __filemap_fdatawait_range+0xb0/0x160
> [  154.896435]  ? __pfx___schedule+0x10/0x10
> [  154.896744]  ? bdi_split_work_to_wbs+0x50b/0x990
> [  154.897093]  schedule+0x7e/0x2e0
> [  154.897344]  wb_wait_for_completion+0x14b/0x1e0
> [  154.897690]  ? __pfx_wb_wait_for_completion+0x10/0x10
> [  154.898112]  ? __pfx_autoremove_wake_function+0x10/0x10
> [  154.898550]  ? __pfx_mutex_unlock+0x10/0x10
> [  154.898915]  ? iput+0x61/0x600
> [  154.899175]  sync_inodes_sb+0x1be/0x740
> [  154.899490]  ? __pfx_down_read+0x10/0x10
> [  154.899814]  ? __pfx_sync_inodes_sb+0x10/0x10
> [  154.900165]  ? __pfx_super_lock+0x10/0x10
> [  154.900495]  ? _raw_spin_lock+0x84/0xe0
> [  154.900826]  ? __pfx__raw_spin_lock+0x10/0x10
> [  154.901154]  __iterate_supers+0x176/0x280
> [  154.901455]  ? __pfx_sync_inodes_one_sb+0x10/0x10
> [  154.901845]  ? _raw_spin_unlock_irq+0xe/0x30
> [  154.902169]  ksys_sync+0x87/0xf0
> [  154.902428]  ? __pfx_ksys_sync+0x10/0x10
> [  154.902772]  ? kvm_clock_get_cycles+0x18/0x30
> [  154.903131]  ? ktime_get+0x65/0x140
> [  154.903417]  pm_fs_sync_work_fn+0x17/0xc0
> [  154.903750]  process_one_work+0x656/0x1150
> [  154.904083]  ? assign_work+0x122/0x3e0
> [  154.904367]  worker_thread+0x5de/0xcb0
> [  154.904670]  ? __pfx_worker_thread+0x10/0x10
> [  154.905052]  kthread+0x34d/0x450
>
> Further debugging showed that the hang is not caused by the
> folio_wait_writeback() call in folio_prepare_writeback() described in
> the previous message. Instead, the fuse daemon is stuck and so cannot
> service the ->write_inode() request issued by wb_workfn():
>
> PID 121 (kworker/u64:2+flush-0:51):
> [<0>] request_wait_answer+0x215/0x560 [fuse]
> [<0>] __fuse_simple_request+0x341/0xc80 [fuse]
> [<0>] fuse_flush_times+0x2ff/0x410 [fuse]
> [<0>] fuse_write_inode+0x93/0x100 [fuse]
> [<0>] __writeback_single_inode+0x5e1/0x930
> [<0>] writeback_sb_inodes+0x53e/0xe40
> [<0>] __writeback_inodes_wb+0xb7/0x200
> [<0>] wb_writeback+0x571/0x790
> [<0>] wb_workfn+0x73a/0xb90
> [<0>] process_one_work+0x656/0x1150
> [<0>] worker_thread+0x5de/0xcb0
> [<0>] kthread+0x34d/0x450
> [<0>] ret_from_fork+0x3ca/0x640
> [<0>] ret_from_fork_asm+0x1a/0x30
>
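
For anyone who wants to collect the same kind of per-task stack on their
own machine, here is a rough sketch. It assumes the stacks are read from
/proc/<pid>/stack (which the "[<0>]" prefixes suggest, and which needs
root). The function names and the PROC_ROOT override are made up for
illustration; PROC_ROOT exists only so the helpers can be pointed at
sample files.

```shell
#!/bin/sh
# PROC_ROOT is a hypothetical override for testing; it defaults to /proc.
PROC_ROOT=${PROC_ROOT:-/proc}

# Print the PIDs whose status file reports "State: D" (uninterruptible sleep).
list_dstate_pids() {
    for status in "$PROC_ROOT"/[0-9]*/status; do
        [ -r "$status" ] || continue
        if awk '$1 == "State:" && $2 == "D" { found = 1 } END { exit !found }' "$status"; then
            dir=${status%/status}
            echo "${dir##*/}"
        fi
    done
}

# Print "PID <pid> (<comm>):" followed by the kernel stack of each D-state task.
dump_dstate_stacks() {
    for pid in $(list_dstate_pids); do
        printf 'PID %s (%s):\n' "$pid" "$(cat "$PROC_ROOT/$pid/comm" 2>/dev/null)"
        cat "$PROC_ROOT/$pid/stack" 2>/dev/null
    done
}
```

Running dump_dstate_stacks as root while the suspend is stuck should list
the blocked tasks, including the flush worker if it is waiting on the
fuse daemon.
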
> In the kernel suspend code though, the sync (pm_sleep_fs_sync())
> happens before freezing any userspace processes
> (suspend_freeze_processes()) so it's odd for the fuse daemon to be
> stuck.
>
> It turns out the daemon is stuck because systemd freezes the user
> session cgroups first before invoking the kernel suspend:
>
> > ps aux | grep -i "fuse-overlayfs"
> vmuser       826  0.0  0.0  12316  2616 ?        Ss   15:02   0:00 fuse-overlayfs -o ...
>
> > cgroup=$(cat /proc/826/cgroup | head -1 | cut -d: -f3)
> > echo "cgroup: $cgroup"
> > cat /sys/fs/cgroup${cgroup}/cgroup.freeze 2>/dev/null || echo "No cgroup freeze file"
> > cat /sys/fs/cgroup${cgroup}/cgroup.events 2>/dev/null | grep frozen
> cgroup: /user.slice/user-1001.slice/session-3.scope
> 1
> frozen 1
>
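
The check above can be wrapped in a small helper for reuse. This is only
a sketch: it assumes a cgroup v2 (unified) hierarchy, where the task's
group is the "0::<path>" entry in /proc/<pid>/cgroup, and it reads the
"frozen" field of cgroup.events. The function names and the PROC_ROOT /
CGROUP_ROOT overrides are hypothetical and exist only so the helpers can
be run against sample files.

```shell
#!/bin/sh
# Hypothetical overrides for testing; the defaults are the real mount points.
PROC_ROOT=${PROC_ROOT:-/proc}
CGROUP_ROOT=${CGROUP_ROOT:-/sys/fs/cgroup}

# Print the cgroup v2 path of a PID (the "0::<path>" entry).
cgroup_v2_path() {
    sed -n 's/^0:://p' "$PROC_ROOT/$1/cgroup" 2>/dev/null | head -n 1
}

# Print the "frozen" field (0 or 1) from the cgroup.events file of that group.
# Returns non-zero if the path cannot be determined.
cgroup_frozen() {
    cg_path=$(cgroup_v2_path "$1")
    [ -n "$cg_path" ] || return 1
    awk '$1 == "frozen" { print $2 }' "$CGROUP_ROOT$cg_path/cgroup.events"
}
```

For example, cgroup_frozen 826 would print 1 for the daemon shown above
if its session scope is frozen at the time of the check.
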
> Even though the root cause is the cgroup freezer rather than the
> folio_prepare_writeback() scenario, the fix proposed previously is
> still the fix for this.
>
> Thanks,
> Joanne
>
> >
> > > Note: prior to commit 0c58a97f919c, sync() on FUSE filesystems was effectively a no-op, which avoided this hang at the cost of correctness. The regression was introduced when that commit made sync() actually wait on FUSE writeback completion.
