linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/2] fuse: inode blocksize fixes for iomap integration
@ 2025-08-15 18:25 Joanne Koong
  2025-08-15 18:25 ` [PATCH v3 1/2] fuse: reflect cached blocksize if blocksize was changed Joanne Koong
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Joanne Koong @ 2025-08-15 18:25 UTC (permalink / raw)
  To: brauner; +Cc: miklos, djwong, linux-fsdevel, kernel-team

These 2 patches are fixes for inode blocksize usage in fuse.
The first is neededed to maintain current stat() behavior for clients in the
case where fuse uses cached values for stat, and the second is needed for
fuseblk filesystems as a workaround until fuse implements iomap for reads.

These patches are on top of Christian's vfs.fixes tree.

Thanks,
Joanne

Changelog
v2: https://lore.kernel.org/linux-fsdevel/20250813223521.734817-1-joannelkoong@gmail.com/
v2 -> v3:
- use u8 instead of unsigned char for blkbits type (Darrick)

v1: https://lore.kernel.org/linux-fsdevel/20250812214614.2674485-1-joannelkoong@gmail.com/
v1 -> v2:
- fix spacing to avoid overly long line
- add ctx->blksize check

Joanne Koong (2):
  fuse: reflect cached blocksize if blocksize was changed
  fuse: fix fuseblk i_blkbits for iomap partial writes

 fs/fuse/dir.c    |  3 ++-
 fs/fuse/fuse_i.h | 14 ++++++++++++++
 fs/fuse/inode.c  | 16 ++++++++++++++++
 3 files changed, 32 insertions(+), 1 deletion(-)

-- 
2.47.3


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 1/2] fuse: reflect cached blocksize if blocksize was changed
  2025-08-15 18:25 [PATCH v3 0/2] fuse: inode blocksize fixes for iomap integration Joanne Koong
@ 2025-08-15 18:25 ` Joanne Koong
  2025-08-18  8:32   ` Chunsheng Luo
  2025-08-15 18:25 ` [PATCH v3 2/2] fuse: fix fuseblk i_blkbits for iomap partial writes Joanne Koong
  2025-08-19  8:26 ` [PATCH v3 0/2] fuse: inode blocksize fixes for iomap integration Christian Brauner
  2 siblings, 1 reply; 8+ messages in thread
From: Joanne Koong @ 2025-08-15 18:25 UTC (permalink / raw)
  To: brauner; +Cc: miklos, djwong, linux-fsdevel, kernel-team

As pointed out by Miklos[1], in the fuse_update_get_attr() path, the
attributes returned to stat may be cached values instead of fresh ones
fetched from the server. In the case where the server returned a
modified blocksize value, we need to cache it and reflect it back to
stat if values are not re-fetched since we now no longer directly change
inode->i_blkbits.

Link: https://lore.kernel.org/linux-fsdevel/CAJfpeguCOxeVX88_zPd1hqziB_C+tmfuDhZP5qO2nKmnb-dTUA@mail.gmail.com/ [1]

Fixes: 542ede096e48 ("fuse: keep inode->i_blkbits constant)
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/fuse/dir.c    | 1 +
 fs/fuse/fuse_i.h | 6 ++++++
 fs/fuse/inode.c  | 5 +++++
 3 files changed, 12 insertions(+)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 2d817d7cab26..ebee7e0b1cd3 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1377,6 +1377,7 @@ static int fuse_update_get_attr(struct mnt_idmap *idmap, struct inode *inode,
 		generic_fillattr(idmap, request_mask, inode, stat);
 		stat->mode = fi->orig_i_mode;
 		stat->ino = fi->orig_ino;
+		stat->blksize = 1 << fi->cached_i_blkbits;
 		if (test_bit(FUSE_I_BTIME, &fi->state)) {
 			stat->btime = fi->i_btime;
 			stat->result_mask |= STATX_BTIME;
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index ec248d13c8bf..1647eb7ca6fa 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -210,6 +210,12 @@ struct fuse_inode {
 	/** Reference to backing file in passthrough mode */
 	struct fuse_backing *fb;
 #endif
+
+	/*
+	 * The underlying inode->i_blkbits value will not be modified,
+	 * so preserve the blocksize specified by the server.
+	 */
+	u8 cached_i_blkbits;
 };
 
 /** FUSE inode state bits */
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 67c2318bfc42..3bfd83469d9f 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -289,6 +289,11 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
 		}
 	}
 
+	if (attr->blksize)
+		fi->cached_i_blkbits = ilog2(attr->blksize);
+	else
+		fi->cached_i_blkbits = inode->i_sb->s_blocksize_bits;
+
 	/*
 	 * Don't set the sticky bit in i_mode, unless we want the VFS
 	 * to check permissions.  This prevents failures due to the
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 2/2] fuse: fix fuseblk i_blkbits for iomap partial writes
  2025-08-15 18:25 [PATCH v3 0/2] fuse: inode blocksize fixes for iomap integration Joanne Koong
  2025-08-15 18:25 ` [PATCH v3 1/2] fuse: reflect cached blocksize if blocksize was changed Joanne Koong
@ 2025-08-15 18:25 ` Joanne Koong
  2025-08-19  8:26 ` [PATCH v3 0/2] fuse: inode blocksize fixes for iomap integration Christian Brauner
  2 siblings, 0 replies; 8+ messages in thread
From: Joanne Koong @ 2025-08-15 18:25 UTC (permalink / raw)
  To: brauner; +Cc: miklos, djwong, linux-fsdevel, kernel-team

On regular fuse filesystems, i_blkbits is set to PAGE_SHIFT which means
any iomap partial writes will mark the entire folio as uptodate. However
fuseblk filesystems work differently and allow the blocksize to be less
than the page size. As such, this may lead to data corruption if fuseblk
sets its blocksize to less than the page size, uses the writeback cache,
and does a partial write, then a read and the read happens before the
write has undergone writeback, since the folio will not be marked
uptodate from the partial write so the read will read in the entire
folio from disk, which will overwrite the partial write.

The long-term solution for this, which will also be needed for fuse to
enable large folios with the writeback cache on, is to have fuse also
use iomap for folio reads, but until that is done, the cleanest
workaround is to use the page size for fuseblk's internal kernel inode
blksize/blkbits values while maintaining current behavior for stat().

This was verified using ntfs-3g:
$ sudo mkfs.ntfs -f -c 512 /dev/vdd1
$ sudo ntfs-3g /dev/vdd1 ~/fuseblk
$ stat ~/fuseblk/hi.txt
IO Block: 512

Fixes: a4c9ab1d4975 ("fuse: use iomap for buffered writes")
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/fuse/dir.c    |  2 +-
 fs/fuse/fuse_i.h |  8 ++++++++
 fs/fuse/inode.c  | 13 ++++++++++++-
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index ebee7e0b1cd3..5c569c3cb53f 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1199,7 +1199,7 @@ static void fuse_fillattr(struct mnt_idmap *idmap, struct inode *inode,
 	if (attr->blksize != 0)
 		blkbits = ilog2(attr->blksize);
 	else
-		blkbits = inode->i_sb->s_blocksize_bits;
+		blkbits = fc->blkbits;
 
 	stat->blksize = 1 << blkbits;
 }
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 1647eb7ca6fa..cc428d04be3e 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -975,6 +975,14 @@ struct fuse_conn {
 		/* Request timeout (in jiffies). 0 = no timeout */
 		unsigned int req_timeout;
 	} timeout;
+
+	/*
+	 * This is a workaround until fuse uses iomap for reads.
+	 * For fuseblk servers, this represents the blocksize passed in at
+	 * mount time and for regular fuse servers, this is equivalent to
+	 * inode->i_blkbits.
+	 */
+	u8 blkbits;
 };
 
 /*
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 3bfd83469d9f..7ddfd2b3cc9c 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -292,7 +292,7 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
 	if (attr->blksize)
 		fi->cached_i_blkbits = ilog2(attr->blksize);
 	else
-		fi->cached_i_blkbits = inode->i_sb->s_blocksize_bits;
+		fi->cached_i_blkbits = fc->blkbits;
 
 	/*
 	 * Don't set the sticky bit in i_mode, unless we want the VFS
@@ -1810,10 +1810,21 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
 		err = -EINVAL;
 		if (!sb_set_blocksize(sb, ctx->blksize))
 			goto err;
+		/*
+		 * This is a workaround until fuse hooks into iomap for reads.
+		 * Use PAGE_SIZE for the blocksize else if the writeback cache
+		 * is enabled, buffered writes go through iomap and a read may
+		 * overwrite partially written data if blocksize < PAGE_SIZE
+		 */
+		fc->blkbits = sb->s_blocksize_bits;
+		if (ctx->blksize != PAGE_SIZE &&
+		    !sb_set_blocksize(sb, PAGE_SIZE))
+			goto err;
 #endif
 	} else {
 		sb->s_blocksize = PAGE_SIZE;
 		sb->s_blocksize_bits = PAGE_SHIFT;
+		fc->blkbits = sb->s_blocksize_bits;
 	}
 
 	sb->s_subtype = ctx->subtype;
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/2] fuse: reflect cached blocksize if blocksize was changed
  2025-08-15 18:25 ` [PATCH v3 1/2] fuse: reflect cached blocksize if blocksize was changed Joanne Koong
@ 2025-08-18  8:32   ` Chunsheng Luo
  2025-08-19 14:43     ` Miklos Szeredi
  0 siblings, 1 reply; 8+ messages in thread
From: Chunsheng Luo @ 2025-08-18  8:32 UTC (permalink / raw)
  To: joannelkoong; +Cc: brauner, djwong, kernel-team, linux-fsdevel, miklos

On Fri, 15 Aug 2025 11:25:38 Joanne Koong wrote:
>diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
>index ec248d13c8bf..1647eb7ca6fa 100644
>--- a/fs/fuse/fuse_i.h
>+++ b/fs/fuse/fuse_i.h
>@@ -210,6 +210,12 @@ struct fuse_inode {
>        /** Reference to backing file in passthrough mode */
>        struct fuse_backing *fb;
> #endif
>+
>+       /*
>+        * The underlying inode->i_blkbits value will not be modified,
>+        * so preserve the blocksize specified by the server.
>+        */
>+       u8 cached_i_blkbits;
> };

Does the `cached_i_blkbits` member also need to be initialized in the
`fuse_alloc_inode` function like `orig_ino`? 

And I am also confused, why does `orig_ino` need to be initialized in
`fuse_alloc_inode`, but the `orig_i_mode` member does not need it?

Thanks
Chunsheng Luo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 0/2] fuse: inode blocksize fixes for iomap integration
  2025-08-15 18:25 [PATCH v3 0/2] fuse: inode blocksize fixes for iomap integration Joanne Koong
  2025-08-15 18:25 ` [PATCH v3 1/2] fuse: reflect cached blocksize if blocksize was changed Joanne Koong
  2025-08-15 18:25 ` [PATCH v3 2/2] fuse: fix fuseblk i_blkbits for iomap partial writes Joanne Koong
@ 2025-08-19  8:26 ` Christian Brauner
  2025-08-19  8:35   ` Miklos Szeredi
  2 siblings, 1 reply; 8+ messages in thread
From: Christian Brauner @ 2025-08-19  8:26 UTC (permalink / raw)
  To: Joanne Koong; +Cc: miklos, djwong, linux-fsdevel, kernel-team

On Fri, Aug 15, 2025 at 11:25:37AM -0700, Joanne Koong wrote:
> These 2 patches are fixes for inode blocksize usage in fuse.
> The first is neededed to maintain current stat() behavior for clients in the
> case where fuse uses cached values for stat, and the second is needed for
> fuseblk filesystems as a workaround until fuse implements iomap for reads.
> 
> These patches are on top of Christian's vfs.fixes tree.

Thanks! This all looks straightforward to me.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 0/2] fuse: inode blocksize fixes for iomap integration
  2025-08-19  8:26 ` [PATCH v3 0/2] fuse: inode blocksize fixes for iomap integration Christian Brauner
@ 2025-08-19  8:35   ` Miklos Szeredi
  0 siblings, 0 replies; 8+ messages in thread
From: Miklos Szeredi @ 2025-08-19  8:35 UTC (permalink / raw)
  To: Christian Brauner; +Cc: Joanne Koong, djwong, linux-fsdevel, kernel-team

On Tue, 19 Aug 2025 at 10:26, Christian Brauner <brauner@kernel.org> wrote:
>
> On Fri, Aug 15, 2025 at 11:25:37AM -0700, Joanne Koong wrote:
> > These 2 patches are fixes for inode blocksize usage in fuse.
> > The first is neededed to maintain current stat() behavior for clients in the
> > case where fuse uses cached values for stat, and the second is needed for
> > fuseblk filesystems as a workaround until fuse implements iomap for reads.
> >
> > These patches are on top of Christian's vfs.fixes tree.
>
> Thanks! This all looks straightforward to me.

I'll take these into the fuse tree to avoid conflicts with other work.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/2] fuse: reflect cached blocksize if blocksize was changed
  2025-08-18  8:32   ` Chunsheng Luo
@ 2025-08-19 14:43     ` Miklos Szeredi
  2025-08-20  2:38       ` Chunsheng Luo
  0 siblings, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2025-08-19 14:43 UTC (permalink / raw)
  To: Chunsheng Luo; +Cc: joannelkoong, brauner, djwong, kernel-team, linux-fsdevel

On Mon, 18 Aug 2025 at 10:32, Chunsheng Luo <luochunsheng@ustc.edu> wrote:
>
> On Fri, 15 Aug 2025 11:25:38 Joanne Koong wrote:
> >diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> >index ec248d13c8bf..1647eb7ca6fa 100644
> >--- a/fs/fuse/fuse_i.h
> >+++ b/fs/fuse/fuse_i.h
> >@@ -210,6 +210,12 @@ struct fuse_inode {
> >        /** Reference to backing file in passthrough mode */
> >        struct fuse_backing *fb;
> > #endif
> >+
> >+       /*
> >+        * The underlying inode->i_blkbits value will not be modified,
> >+        * so preserve the blocksize specified by the server.
> >+        */
> >+       u8 cached_i_blkbits;
> > };
>
> Does the `cached_i_blkbits` member also need to be initialized in the
> `fuse_alloc_inode` function like `orig_ino`?
>
> And I am also confused, why does `orig_ino` need to be initialized in
> `fuse_alloc_inode`, but the `orig_i_mode` member does not need it?

Strictly speaking neither one needs initialization, since these
shouldn't be accessed until the in-core inode is set up in lookup or
create.

But having random values in there is not nice, so I prefer having
everything initialized in fuse_alloc_inode().  See patch below
(whitespace damage(TM) by gmail).

Thanks,
Miklos

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 19fc58cb84dc..9d26a5bc394d 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -101,14 +101,11 @@ static struct inode *fuse_alloc_inode(struct
super_block *sb)
        if (!fi)
                return NULL;

-       fi->i_time = 0;
+       /* Initialize private data (i.e. everything except fi->inode) */
+       BUILD_BUG_ON(offsetof(struct fuse_inode, inode) != 0);
+       memset((void *) fi + sizeof(fi->inode), 0, sizeof(*fi) -
sizeof(fi->inode));
+
        fi->inval_mask = ~0;
-       fi->nodeid = 0;
-       fi->nlookup = 0;
-       fi->attr_version = 0;
-       fi->orig_ino = 0;
-       fi->state = 0;
-       fi->submount_lookup = NULL;
        mutex_init(&fi->mutex);
        spin_lock_init(&fi->lock);
        fi->forget = fuse_alloc_forget();

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/2] fuse: reflect cached blocksize if blocksize was changed
  2025-08-19 14:43     ` Miklos Szeredi
@ 2025-08-20  2:38       ` Chunsheng Luo
  0 siblings, 0 replies; 8+ messages in thread
From: Chunsheng Luo @ 2025-08-20  2:38 UTC (permalink / raw)
  To: miklos
  Cc: brauner, djwong, joannelkoong, kernel-team, linux-fsdevel,
	luochunsheng

On Tue, 19 Aug 2025 16:43:05 Miklos wrote:

> On Mon, 18 Aug 2025 at 10:32, Chunsheng Luo <luochunsheng@ustc.edu> wrote:
> >
> > On Fri, 15 Aug 2025 11:25:38 Joanne Koong wrote:
> > >diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> > >index ec248d13c8bf..1647eb7ca6fa 100644
> > >--- a/fs/fuse/fuse_i.h
> > >+++ b/fs/fuse/fuse_i.h
> > >@@ -210,6 +210,12 @@ struct fuse_inode {
> > >        /** Reference to backing file in passthrough mode */
> > >        struct fuse_backing *fb;
> > > #endif
> > >+
> > >+       /*
> > >+        * The underlying inode->i_blkbits value will not be modified,
> > >+        * so preserve the blocksize specified by the server.
> > >+        */
> > >+       u8 cached_i_blkbits;
> > > };
> >
> > Does the `cached_i_blkbits` member also need to be initialized in the
> > `fuse_alloc_inode` function like `orig_ino`?
> >
> > And I am also confused, why does `orig_ino` need to be initialized in
> > `fuse_alloc_inode`, but the `orig_i_mode` member does not need it?
> 
> Strictly speaking neither one needs initialization, since these
> shouldn't be accessed until the in-core inode is set up in lookup or
> create.
> 
> But having random values in there is not nice, so I prefer having
> everything initialized in fuse_alloc_inode().  See patch below
> (whitespace damage(TM) by gmail).
> 
> Thanks,
> Miklos
> 

Understood. Thanks for the explanation. The below patch looks good to me.

Thanks. 
Chunsheng Luo

> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index 19fc58cb84dc..9d26a5bc394d 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -101,14 +101,11 @@ static struct inode *fuse_alloc_inode(struct
> super_block *sb)
>         if (!fi)
>                 return NULL;
> 
> -       fi->i_time = 0;
> +       /* Initialize private data (i.e. everything except fi->inode) */
> +       BUILD_BUG_ON(offsetof(struct fuse_inode, inode) != 0);
> +       memset((void *) fi + sizeof(fi->inode), 0, sizeof(*fi) -
> sizeof(fi->inode));
> +
>         fi->inval_mask = ~0;
> -       fi->nodeid = 0;
> -       fi->nlookup = 0;
> -       fi->attr_version = 0;
> -       fi->orig_ino = 0;
> -       fi->state = 0;
> -       fi->submount_lookup = NULL;
>         mutex_init(&fi->mutex);
>         spin_lock_init(&fi->lock);
>         fi->forget = fuse_alloc_forget();
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-08-20  2:38 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-15 18:25 [PATCH v3 0/2] fuse: inode blocksize fixes for iomap integration Joanne Koong
2025-08-15 18:25 ` [PATCH v3 1/2] fuse: reflect cached blocksize if blocksize was changed Joanne Koong
2025-08-18  8:32   ` Chunsheng Luo
2025-08-19 14:43     ` Miklos Szeredi
2025-08-20  2:38       ` Chunsheng Luo
2025-08-15 18:25 ` [PATCH v3 2/2] fuse: fix fuseblk i_blkbits for iomap partial writes Joanne Koong
2025-08-19  8:26 ` [PATCH v3 0/2] fuse: inode blocksize fixes for iomap integration Christian Brauner
2025-08-19  8:35   ` Miklos Szeredi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).