* [PATCH v2 0/1] writeback: don't block sync(2) for filesystems with no data integrity guarantees
From: Joanne Koong @ 2026-03-19 19:45 UTC
To: brauner
Cc: linux-fsdevel, jack, miklos, david, therealgraysky, linux-pm,
stable
Changelog
---------
v1: https://lore.kernel.org/linux-fsdevel/20260318225604.71545-1-joannelkoong@gmail.com/
v1 -> v2:
* Still kick off flusher threads for writeback for SB_I_NO_DATA_INTEGRITY,
instead of skipping that entirely (Jan)
* Improve commit message (me)
Joanne Koong (1):
writeback: don't block sync(2) for filesystems with no data integrity
guarantees
fs/fs-writeback.c | 7 +------
fs/fuse/file.c | 4 +---
fs/fuse/inode.c | 1 +
fs/sync.c | 7 ++++++-
include/linux/fs/super_types.h | 1 +
include/linux/pagemap.h | 11 -----------
6 files changed, 10 insertions(+), 21 deletions(-)
--
2.52.0
* [PATCH v2 1/1] writeback: don't block sync(2) for filesystems with no data integrity guarantees
From: Joanne Koong @ 2026-03-19 19:45 UTC
To: brauner
Cc: linux-fsdevel, jack, miklos, david, therealgraysky, linux-pm,
stable
Add a SB_I_NO_DATA_INTEGRITY superblock flag for filesystems that cannot
guarantee data persistence on sync (eg fuse). For superblocks with this
flag set, sync(2) kicks off writeback of dirty inodes but does not wait
for the flusher threads to complete the writeback.
This replaces the per-inode AS_NO_DATA_INTEGRITY mapping flag added in
commit f9a49aa302a0 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings
in wait_sb_inodes()"). The flag belongs at the superblock level because
data integrity is a filesystem-wide property, not a per-inode one.
Having this flag at the superblock level allows us to skip the logic in
sync_inodes_sb() entirely, rather than iterating every dirty inode in
wait_sb_inodes() only to skip each inode individually.
This also addresses a recent report [1] for a suspend-to-RAM hang seen
on fuse-overlayfs:
Workqueue: pm_fs_sync pm_fs_sync_work_fn
Call Trace:
<TASK>
__schedule+0x457/0x1720
schedule+0x27/0xd0
wb_wait_for_completion+0x97/0xe0
sync_inodes_sb+0xf8/0x2e0
__iterate_supers+0xdc/0x160
ksys_sync+0x43/0xb0
pm_fs_sync_work_fn+0x17/0xa0
process_one_work+0x193/0x350
worker_thread+0x1a1/0x310
kthread+0xfc/0x240
ret_from_fork+0x243/0x280
ret_from_fork_asm+0x1a/0x30
</TASK>
Prior to this commit, mappings with no data integrity guarantees skipped
waiting on writeback completion, but sync still waited on the flusher
threads to finish initiating the writeback. On fuse this is problematic
because even though the writeback requests are non-blocking background
requests, there are still paths that may cause the flusher thread to
block. Two examples:
* systemd freezes the user session cgroups, which freezes the fuse
  daemon, before invoking the kernel suspend. The kernel suspend
  triggers ->write_inode(), which on fuse issues a synchronous setattr
  request that cannot be processed while the daemon is frozen.
* The daemon is buggy and does not properly complete writeback.
  Initiating writeback on a dirty folio already under writeback leads
  to writeback_get_folio() -> folio_prepare_writeback() ->
  unconditional wait on writeback to finish, which causes a hang.
This commit restores fuse to its prior behavior before tmp folios were
removed, where sync was essentially a no-op.
[1] https://lore.kernel.org/linux-fsdevel/CAJnrk1a-asuvfrbKXbEwwDSctvemF+6zfhdnuzO65Pt8HsFSRw@mail.gmail.com/T/#m632c4648e9cafc4239299887109ebd880ac6c5c1
Fixes: 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree")
Reported-by: John <therealgraysky@proton.me>
Cc: <stable@vger.kernel.org>
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
fs/fs-writeback.c | 7 +------
fs/fuse/file.c | 4 +---
fs/fuse/inode.c | 1 +
fs/sync.c | 7 ++++++-
include/linux/fs/super_types.h | 1 +
include/linux/pagemap.h | 11 -----------
6 files changed, 10 insertions(+), 21 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 7c75ed7e8979..154249e4e5ce 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2775,13 +2775,8 @@ static void wait_sb_inodes(struct super_block *sb)
* The mapping can appear untagged while still on-list since we
* do not have the mapping lock. Skip it here, wb completion
* will remove it.
- *
- * If the mapping does not have data integrity semantics,
- * there's no need to wait for the writeout to complete, as the
- * mapping cannot guarantee that data is persistently stored.
*/
- if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK) ||
- mapping_no_data_integrity(mapping))
+ if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
continue;
spin_unlock_irq(&sb->s_inode_wblist_lock);
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index a9c836d7f586..f6240f24b814 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3202,10 +3202,8 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
inode->i_fop = &fuse_file_operations;
inode->i_data.a_ops = &fuse_file_aops;
- if (fc->writeback_cache) {
+ if (fc->writeback_cache)
mapping_set_writeback_may_deadlock_on_reclaim(&inode->i_data);
- mapping_set_no_data_integrity(&inode->i_data);
- }
INIT_LIST_HEAD(&fi->write_files);
INIT_LIST_HEAD(&fi->queued_writes);
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index e57b8af06be9..c795abe47a4f 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1709,6 +1709,7 @@ static void fuse_sb_defaults(struct super_block *sb)
sb->s_export_op = &fuse_export_operations;
sb->s_iflags |= SB_I_IMA_UNVERIFIABLE_SIGNATURE;
sb->s_iflags |= SB_I_NOIDMAP;
+ sb->s_iflags |= SB_I_NO_DATA_INTEGRITY;
if (sb->s_user_ns != &init_user_ns)
sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER;
sb->s_flags &= ~(SB_NOSEC | SB_I_VERSION);
diff --git a/fs/sync.c b/fs/sync.c
index 942a60cfedfb..aedbf723830a 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -73,7 +73,12 @@ EXPORT_SYMBOL(sync_filesystem);
static void sync_inodes_one_sb(struct super_block *sb, void *arg)
{
- if (!sb_rdonly(sb))
+ if (sb_rdonly(sb))
+ return;
+
+ if (sb->s_iflags & SB_I_NO_DATA_INTEGRITY)
+ wakeup_flusher_threads_bdi(sb->s_bdi, WB_REASON_SYNC);
+ else
sync_inodes_sb(sb);
}
diff --git a/include/linux/fs/super_types.h b/include/linux/fs/super_types.h
index fa7638b81246..383050e7fdf5 100644
--- a/include/linux/fs/super_types.h
+++ b/include/linux/fs/super_types.h
@@ -338,5 +338,6 @@ struct super_block {
#define SB_I_NOUMASK 0x00001000 /* VFS does not apply umask */
#define SB_I_NOIDMAP 0x00002000 /* No idmapped mounts on this superblock */
#define SB_I_ALLOW_HSM 0x00004000 /* Allow HSM events on this superblock */
+#define SB_I_NO_DATA_INTEGRITY 0x00008000 /* fs cannot guarantee data persistence on sync */
#endif /* _LINUX_FS_SUPER_TYPES_H */
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ec442af3f886..31a848485ad9 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -210,7 +210,6 @@ enum mapping_flags {
AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM = 9,
AS_KERNEL_FILE = 10, /* mapping for a fake kernel file that shouldn't
account usage to user cgroups */
- AS_NO_DATA_INTEGRITY = 11, /* no data integrity guarantees */
/* Bits 16-25 are used for FOLIO_ORDER */
AS_FOLIO_ORDER_BITS = 5,
AS_FOLIO_ORDER_MIN = 16,
@@ -346,16 +345,6 @@ static inline bool mapping_writeback_may_deadlock_on_reclaim(const struct addres
return test_bit(AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM, &mapping->flags);
}
-static inline void mapping_set_no_data_integrity(struct address_space *mapping)
-{
- set_bit(AS_NO_DATA_INTEGRITY, &mapping->flags);
-}
-
-static inline bool mapping_no_data_integrity(const struct address_space *mapping)
-{
- return test_bit(AS_NO_DATA_INTEGRITY, &mapping->flags);
-}
-
static inline gfp_t mapping_gfp_mask(const struct address_space *mapping)
{
return mapping->gfp_mask;
--
2.52.0
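
The decision made by the fs/sync.c hunk above can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct mock_sb`, `enum sync_action` and `sync_inodes_one_sb_model()` are hypothetical names invented for illustration, and only the flag name and value are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag name and value as added by the patch; everything else is a mock. */
#define SB_I_NO_DATA_INTEGRITY 0x00008000

/* Hypothetical labels for the three outcomes of sync_inodes_one_sb(). */
enum sync_action {
	SYNC_SKIP,		/* read-only superblock: nothing to do */
	SYNC_KICK_ONLY,		/* wake the flusher threads, do not wait */
	SYNC_KICK_AND_WAIT,	/* full sync_inodes_sb() path */
};

/* Hypothetical stand-in for the two struct super_block properties used. */
struct mock_sb {
	bool rdonly;
	unsigned long s_iflags;
};

/* Mirrors the order of the checks in the patched sync_inodes_one_sb(). */
static enum sync_action sync_inodes_one_sb_model(const struct mock_sb *sb)
{
	assert(sb != NULL);

	if (sb->rdonly)
		return SYNC_SKIP;

	if (sb->s_iflags & SB_I_NO_DATA_INTEGRITY)
		return SYNC_KICK_ONLY;

	return SYNC_KICK_AND_WAIT;
}
```

In this model a writable fuse-like superblock maps to SYNC_KICK_ONLY, which corresponds to calling wakeup_flusher_threads_bdi() without waiting, while any other writable superblock still takes the sync_inodes_sb() path.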
* Re: [PATCH v2 1/1] writeback: don't block sync(2) for filesystems with no data integrity guarantees
From: David Hildenbrand (Arm) @ 2026-03-19 21:26 UTC
To: Joanne Koong, brauner
Cc: linux-fsdevel, jack, miklos, therealgraysky, linux-pm, stable
On 3/19/26 20:45, Joanne Koong wrote:
> Add a SB_I_NO_DATA_INTEGRITY superblock flag for filesystems that cannot
> guarantee data persistence on sync (eg fuse). For superblocks with this
> flag set, sync(2) kicks off writeback of dirty inodes but does not wait
> for the flusher threads to complete the writeback.
>
> This replaces the per-inode AS_NO_DATA_INTEGRITY mapping flag added in
> commit f9a49aa302a0 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings
> in wait_sb_inodes()"). The flag belongs at the superblock level because
> data integrity is a filesystem-wide property, not a per-inode one.
> Having this flag at the superblock level allows us to skip the logic in
> sync_inodes_sb() entirely, rather than iterating every dirty inode in
> wait_sb_inodes() only to skip each inode individually.
Makes sense to me.
[...]
> if (sb->s_user_ns != &init_user_ns)
> sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER;
> sb->s_flags &= ~(SB_NOSEC | SB_I_VERSION);
> diff --git a/fs/sync.c b/fs/sync.c
> index 942a60cfedfb..aedbf723830a 100644
> --- a/fs/sync.c
> +++ b/fs/sync.c
> @@ -73,7 +73,12 @@ EXPORT_SYMBOL(sync_filesystem);
>
> static void sync_inodes_one_sb(struct super_block *sb, void *arg)
> {
> - if (!sb_rdonly(sb))
> + if (sb_rdonly(sb))
> + return;
> +
Should we move some of the comment you're deleting over here?
> + if (sb->s_iflags & SB_I_NO_DATA_INTEGRITY)
> + wakeup_flusher_threads_bdi(sb->s_bdi, WB_REASON_SYNC);
> + else
> sync_inodes_sb(sb);
> }
I was wondering whether that handling should be moved to
sync_inodes_sb(), so it would catch any (existing+future) callers.
Alternatively, we could catch abuse by adding a warning to sync_inodes_sb.
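
To make the warning alternative concrete, here is a rough userspace sketch. It is not kernel code: `struct mock_sb`, `misuse_warnings` and `sync_inodes_sb_guarded()` are hypothetical names, and returning early after the warning is an assumption, since nothing above specifies it. In the kernel the check would presumably use something like WARN_ON_ONCE() rather than fprintf().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Flag name and value as added by the patch; everything else is a mock. */
#define SB_I_NO_DATA_INTEGRITY 0x00008000

/* Hypothetical stand-in for the one struct super_block field used here. */
struct mock_sb {
	unsigned long s_iflags;
};

/* Counts how often the guard fired, so the behavior can be checked. */
static int misuse_warnings;

/*
 * Model of a guarded sync_inodes_sb(): a caller that passes a superblock
 * without data integrity guarantees gets a warning instead of a wait.
 * Returns true if the (modeled) kick-and-wait path would run.
 */
static bool sync_inodes_sb_guarded(const struct mock_sb *sb)
{
	assert(sb != NULL);

	if (sb->s_iflags & SB_I_NO_DATA_INTEGRITY) {
		misuse_warnings++;
		fprintf(stderr, "sync_inodes_sb: called on a superblock with no data integrity guarantees\n");
		return false;	/* assumption: bail out rather than wait */
	}

	return true;
}
```

The warning only reports callers that reach the function with such a superblock; each of those callers would still have to be changed by hand.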
--
Cheers,
David
* Re: [PATCH v2 1/1] writeback: don't block sync(2) for filesystems with no data integrity guarantees
From: Joanne Koong @ 2026-03-19 23:55 UTC
To: David Hildenbrand (Arm)
Cc: brauner, linux-fsdevel, jack, miklos, therealgraysky, linux-pm,
stable
On Thu, Mar 19, 2026 at 2:26 PM David Hildenbrand (Arm)
<david@kernel.org> wrote:
>
> On 3/19/26 20:45, Joanne Koong wrote:
> > Add a SB_I_NO_DATA_INTEGRITY superblock flag for filesystems that cannot
> > guarantee data persistence on sync (eg fuse). For superblocks with this
> > flag set, sync(2) kicks off writeback of dirty inodes but does not wait
> > for the flusher threads to complete the writeback.
> >
> > This replaces the per-inode AS_NO_DATA_INTEGRITY mapping flag added in
> > commit f9a49aa302a0 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings
> > in wait_sb_inodes()"). The flag belongs at the superblock level because
> > data integrity is a filesystem-wide property, not a per-inode one.
> > Having this flag at the superblock level allows us to skip the logic in
> > sync_inodes_sb() entirely, rather than iterating every dirty inode in
> > wait_sb_inodes() only to skip each inode individually.
>
> Makes sense to me.
>
> [...]
>
> > if (sb->s_user_ns != &init_user_ns)
> > sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER;
> > sb->s_flags &= ~(SB_NOSEC | SB_I_VERSION);
> > diff --git a/fs/sync.c b/fs/sync.c
> > index 942a60cfedfb..aedbf723830a 100644
> > --- a/fs/sync.c
> > +++ b/fs/sync.c
> > @@ -73,7 +73,12 @@ EXPORT_SYMBOL(sync_filesystem);
> >
> > static void sync_inodes_one_sb(struct super_block *sb, void *arg)
> > {
> > - if (!sb_rdonly(sb))
> > + if (sb_rdonly(sb))
> > + return;
> > +
>
> Should we move some of the comment you're deleting over here?
That's a good idea - I'll bring back the comment block that was
previously in wait_sb_inodes() and move it here.
>
> > + if (sb->s_iflags & SB_I_NO_DATA_INTEGRITY)
> > + wakeup_flusher_threads_bdi(sb->s_bdi, WB_REASON_SYNC);
> > + else
> > sync_inodes_sb(sb);
> > }
> I was wondering whether that handling should be moved to
> sync_inodes_sb(), so it would catch any (existing+future) callers.
>
> Alternatively, we could catch abuse by adding a warning to sync_inodes_sb.
Thinking about this some more, I think you're right.
At the vfs layer, the only other relevant callers are
generic_shutdown_super() when handling an unmount and syncfs for
syncing a specific fd. For these, I was thinking it'd be more useful
to return after writeback has actually completed, since the caller has
to directly invoke the operation on a specific fd/mount. But that's
wrong: we shouldn't be picking and choosing which syncs are ok vs not
ok to wait for. And I'm realizing now there are also the
hibernate/suspend call paths that call filesystems_freeze_callback(),
which can also call into sync_filesystem() -> sync_inodes_sb()...
I'll make this change for v3.
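
As a rough userspace sketch of why putting the check in the callee covers every caller (this is a model, not the v3 patch: `struct mock_sb`, its counters and all the `*_model()` functions are hypothetical names, and kicking writeback before returning is an assumption carried over from the v2 behavior):

```c
#include <assert.h>
#include <stddef.h>

/* Flag name and value as added by the patch; everything else is a mock. */
#define SB_I_NO_DATA_INTEGRITY 0x00008000

/* Hypothetical stand-in for a superblock, with counters for inspection. */
struct mock_sb {
	unsigned long s_iflags;
	int kicks;	/* times writeback was kicked off */
	int waits;	/* times a caller blocked waiting for completion */
};

/* Model of sync_inodes_sb() with the flag check inside the callee. */
static void sync_inodes_sb_model(struct mock_sb *sb)
{
	assert(sb != NULL);

	sb->kicks++;				/* writeback is still started */
	if (sb->s_iflags & SB_I_NO_DATA_INTEGRITY)
		return;				/* but nobody waits for it */
	sb->waits++;
}

/* Hypothetical stand-ins for the call paths mentioned above. */
static void sync_model(struct mock_sb *sb)	{ sync_inodes_sb_model(sb); }	/* sync(2) */
static void syncfs_model(struct mock_sb *sb)	{ sync_inodes_sb_model(sb); }	/* syncfs on an fd */
static void shutdown_model(struct mock_sb *sb)	{ sync_inodes_sb_model(sb); }	/* unmount */
static void freeze_model(struct mock_sb *sb)	{ sync_inodes_sb_model(sb); }	/* suspend/hibernate */
```

None of the four modeled callers needs its own flag check, and a caller added later would pick up the same behavior without any change.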
Thanks,
Joanne
>
> --
> Cheers,
>
> David