* [PATCH 01/15] fs,fsverity: reject size changes on fsverity files in setattr_prepare
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-02-17 21:14 ` [f2fs-dev] [PATCH 01/15] fs, fsverity: " patchwork-bot+f2fs
2026-01-28 15:26 ` [PATCH 02/15] fs,fsverity: clear out fsverity_info from common code Christoph Hellwig
` (14 subsequent siblings)
15 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity, Darrick J. Wong
Add the check to reject truncates of fsverity files directly to
setattr_prepare instead of requiring the file system to handle it.
Besides removing boilerplate code, this also fixes the complete lack of
such a check in btrfs.
Fixes: 146054090b08 ("btrfs: initial fsverity support")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
---
fs/attr.c | 12 +++++++++++-
fs/ext4/inode.c | 4 ----
fs/f2fs/file.c | 4 ----
fs/verity/open.c | 8 --------
include/linux/fsverity.h | 25 -------------------------
5 files changed, 11 insertions(+), 42 deletions(-)
diff --git a/fs/attr.c b/fs/attr.c
index b9ec6b47bab2..e7d7c6d19fe9 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -169,7 +169,17 @@ int setattr_prepare(struct mnt_idmap *idmap, struct dentry *dentry,
* ATTR_FORCE.
*/
if (ia_valid & ATTR_SIZE) {
- int error = inode_newsize_ok(inode, attr->ia_size);
+ int error;
+
+ /*
+ * Verity files are immutable, so deny truncates. This isn't
+ * covered by the open-time check because sys_truncate() takes a
+ * path, not an open file.
+ */
+ if (IS_ENABLED(CONFIG_FS_VERITY) && IS_VERITY(inode))
+ return -EPERM;
+
+ error = inode_newsize_ok(inode, attr->ia_size);
if (error)
return error;
}
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0c466ccbed69..8c2ef98fa530 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5835,10 +5835,6 @@ int ext4_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
if (error)
return error;
- error = fsverity_prepare_setattr(dentry, attr);
- if (error)
- return error;
-
if (is_quota_modification(idmap, inode, attr)) {
error = dquot_initialize(inode);
if (error)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index d7047ca6b98d..da029fed4e5a 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1074,10 +1074,6 @@ int f2fs_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
if (err)
return err;
- err = fsverity_prepare_setattr(dentry, attr);
- if (err)
- return err;
-
if (unlikely(IS_IMMUTABLE(inode)))
return -EPERM;
diff --git a/fs/verity/open.c b/fs/verity/open.c
index 77b1c977af02..2aa5eae5a540 100644
--- a/fs/verity/open.c
+++ b/fs/verity/open.c
@@ -384,14 +384,6 @@ int __fsverity_file_open(struct inode *inode, struct file *filp)
}
EXPORT_SYMBOL_GPL(__fsverity_file_open);
-int __fsverity_prepare_setattr(struct dentry *dentry, struct iattr *attr)
-{
- if (attr->ia_valid & ATTR_SIZE)
- return -EPERM;
- return 0;
-}
-EXPORT_SYMBOL_GPL(__fsverity_prepare_setattr);
-
void __fsverity_cleanup_inode(struct inode *inode)
{
struct fsverity_info **vi_addr = fsverity_info_addr(inode);
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 5bc7280425a7..86fb1708676b 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -179,7 +179,6 @@ int fsverity_get_digest(struct inode *inode,
/* open.c */
int __fsverity_file_open(struct inode *inode, struct file *filp);
-int __fsverity_prepare_setattr(struct dentry *dentry, struct iattr *attr);
void __fsverity_cleanup_inode(struct inode *inode);
/**
@@ -251,12 +250,6 @@ static inline int __fsverity_file_open(struct inode *inode, struct file *filp)
return -EOPNOTSUPP;
}
-static inline int __fsverity_prepare_setattr(struct dentry *dentry,
- struct iattr *attr)
-{
- return -EOPNOTSUPP;
-}
-
static inline void fsverity_cleanup_inode(struct inode *inode)
{
}
@@ -338,22 +331,4 @@ static inline int fsverity_file_open(struct inode *inode, struct file *filp)
return 0;
}
-/**
- * fsverity_prepare_setattr() - prepare to change a verity inode's attributes
- * @dentry: dentry through which the inode is being changed
- * @attr: attributes to change
- *
- * Verity files are immutable, so deny truncates. This isn't covered by the
- * open-time check because sys_truncate() takes a path, not a file descriptor.
- *
- * Return: 0 on success, -errno on failure
- */
-static inline int fsverity_prepare_setattr(struct dentry *dentry,
- struct iattr *attr)
-{
- if (IS_VERITY(d_inode(dentry)))
- return __fsverity_prepare_setattr(dentry, attr);
- return 0;
-}
-
#endif /* _LINUX_FSVERITY_H */
--
2.47.3
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [f2fs-dev] [PATCH 01/15] fs, fsverity: reject size changes on fsverity files in setattr_prepare
2026-01-28 15:26 ` [PATCH 01/15] fs,fsverity: reject size changes on fsverity files in setattr_prepare Christoph Hellwig
@ 2026-02-17 21:14 ` patchwork-bot+f2fs
0 siblings, 0 replies; 27+ messages in thread
From: patchwork-bot+f2fs @ 2026-02-17 21:14 UTC (permalink / raw)
To: Christoph Hellwig
Cc: ebiggers, fsverity, brauner, tytso, djwong, aalbersh, willy,
linux-f2fs-devel, linux-fsdevel, viro, jaegeuk, dsterba, jack,
linux-ext4, linux-btrfs
Hello:
This series was applied to jaegeuk/f2fs.git (dev)
by Eric Biggers <ebiggers@kernel.org>:
On Wed, 28 Jan 2026 16:26:13 +0100 you wrote:
> Add the check to reject truncates of fsverity files directly to
> setattr_prepare instead of requiring the file system to handle it.
> Besides removing boilerplate code, this also fixes the complete lack of
> such a check in btrfs.
>
> Fixes: 146054090b08 ("btrfs: initial fsverity support")
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Jan Kara <jack@suse.cz>
> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
>
> [...]
Here is the summary with links:
- [f2fs-dev,01/15] fs, fsverity: reject size changes on fsverity files in setattr_prepare
https://git.kernel.org/jaegeuk/f2fs/c/e9734653c523
- [f2fs-dev,02/15] fs, fsverity: clear out fsverity_info from common code
https://git.kernel.org/jaegeuk/f2fs/c/70098d932714
- [f2fs-dev,03/15] ext4: don't build the fsverity work handler for !CONFIG_FS_VERITY
https://git.kernel.org/jaegeuk/f2fs/c/fb2661645909
- [f2fs-dev,04/15] f2fs: don't build the fsverity work handler for !CONFIG_FS_VERITY
https://git.kernel.org/jaegeuk/f2fs/c/6f9fae2f738c
- [f2fs-dev,05/15] fsverity: pass struct file to ->write_merkle_tree_block
(no matching commit)
- [f2fs-dev,06/15] fsverity: start consolidating pagecache code
(no matching commit)
- [f2fs-dev,07/15] fsverity: don't issue readahead for non-ENOENT errors from __filemap_get_folio
(no matching commit)
- [f2fs-dev,08/15] fsverity: kick off hash readahead at data I/O submission time
(no matching commit)
- [f2fs-dev,09/15] fsverity: deconstify the inode pointer in struct fsverity_info
https://git.kernel.org/jaegeuk/f2fs/c/7e36e044958d
- [f2fs-dev,10/15] fsverity: push out fsverity_info lookup
(no matching commit)
- [f2fs-dev,11/15] fs: consolidate fsverity_info lookup in buffer.c
https://git.kernel.org/jaegeuk/f2fs/c/f6ae956dfb34
- [f2fs-dev,12/15] ext4: consolidate fsverity_info lookup
(no matching commit)
- [f2fs-dev,13/15] f2fs: consolidate fsverity_info lookup
(no matching commit)
- [f2fs-dev,14/15] btrfs: consolidate fsverity_info lookup
https://git.kernel.org/jaegeuk/f2fs/c/b0160e4501bb
- [f2fs-dev,15/15] fsverity: use a hashtable to find the fsverity_info
(no matching commit)
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 02/15] fs,fsverity: clear out fsverity_info from common code
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
2026-01-28 15:26 ` [PATCH 01/15] fs,fsverity: reject size changes on fsverity files in setattr_prepare Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-28 15:26 ` [PATCH 03/15] ext4: don't build the fsverity work handler for !CONFIG_FS_VERITY Christoph Hellwig
` (13 subsequent siblings)
15 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity, Darrick J. Wong
Directly remove the fsverity_info from the hash and free it from
clear_inode instead of requiring file systems to handle it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
---
fs/btrfs/inode.c | 10 +++-------
fs/ext4/super.c | 1 -
fs/f2fs/inode.c | 1 -
fs/inode.c | 9 +++++++++
fs/verity/open.c | 3 +--
include/linux/fsverity.h | 26 ++------------------------
6 files changed, 15 insertions(+), 35 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index a2b5b440637e..67c64efc5099 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -34,7 +34,6 @@
#include <linux/sched/mm.h>
#include <linux/iomap.h>
#include <linux/unaligned.h>
-#include <linux/fsverity.h>
#include "misc.h"
#include "ctree.h"
#include "disk-io.h"
@@ -5571,11 +5570,8 @@ void btrfs_evict_inode(struct inode *inode)
trace_btrfs_inode_evict(inode);
- if (!root) {
- fsverity_cleanup_inode(inode);
- clear_inode(inode);
- return;
- }
+ if (!root)
+ goto clear_inode;
fs_info = inode_to_fs_info(inode);
evict_inode_truncate_pages(inode);
@@ -5675,7 +5671,7 @@ void btrfs_evict_inode(struct inode *inode)
* to retry these periodically in the future.
*/
btrfs_remove_delayed_node(BTRFS_I(inode));
- fsverity_cleanup_inode(inode);
+clear_inode:
clear_inode(inode);
}
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 87205660c5d0..86131f4d8718 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1527,7 +1527,6 @@ void ext4_clear_inode(struct inode *inode)
EXT4_I(inode)->jinode = NULL;
}
fscrypt_put_encryption_info(inode);
- fsverity_cleanup_inode(inode);
}
static struct inode *ext4_nfs_get_inode(struct super_block *sb,
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 38b8994bc1b2..ee332b994348 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -1000,7 +1000,6 @@ void f2fs_evict_inode(struct inode *inode)
}
out_clear:
fscrypt_put_encryption_info(inode);
- fsverity_cleanup_inode(inode);
clear_inode(inode);
}
diff --git a/fs/inode.c b/fs/inode.c
index 379f4c19845c..38dbdfbb09ba 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -14,6 +14,7 @@
#include <linux/cdev.h>
#include <linux/memblock.h>
#include <linux/fsnotify.h>
+#include <linux/fsverity.h>
#include <linux/mount.h>
#include <linux/posix_acl.h>
#include <linux/buffer_head.h> /* for inode_has_buffers */
@@ -773,6 +774,14 @@ void dump_mapping(const struct address_space *mapping)
void clear_inode(struct inode *inode)
{
+ /*
+ * Only IS_VERITY() inodes can have verity info, so start by checking
+ * for IS_VERITY() (which is faster than retrieving the pointer to the
+ * verity info). This minimizes overhead for non-verity inodes.
+ */
+ if (IS_ENABLED(CONFIG_FS_VERITY) && IS_VERITY(inode))
+ fsverity_cleanup_inode(inode);
+
/*
* We have to cycle the i_pages lock here because reclaim can be in the
* process of removing the last page (in __filemap_remove_folio())
diff --git a/fs/verity/open.c b/fs/verity/open.c
index 2aa5eae5a540..090cb77326ee 100644
--- a/fs/verity/open.c
+++ b/fs/verity/open.c
@@ -384,14 +384,13 @@ int __fsverity_file_open(struct inode *inode, struct file *filp)
}
EXPORT_SYMBOL_GPL(__fsverity_file_open);
-void __fsverity_cleanup_inode(struct inode *inode)
+void fsverity_cleanup_inode(struct inode *inode)
{
struct fsverity_info **vi_addr = fsverity_info_addr(inode);
fsverity_free_info(*vi_addr);
*vi_addr = NULL;
}
-EXPORT_SYMBOL_GPL(__fsverity_cleanup_inode);
void __init fsverity_init_info_cache(void)
{
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 86fb1708676b..ea1ed2e6c2f9 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -179,26 +179,6 @@ int fsverity_get_digest(struct inode *inode,
/* open.c */
int __fsverity_file_open(struct inode *inode, struct file *filp);
-void __fsverity_cleanup_inode(struct inode *inode);
-
-/**
- * fsverity_cleanup_inode() - free the inode's verity info, if present
- * @inode: an inode being evicted
- *
- * Filesystems must call this on inode eviction to free the inode's verity info.
- */
-static inline void fsverity_cleanup_inode(struct inode *inode)
-{
- /*
- * Only IS_VERITY() inodes can have verity info, so start by checking
- * for IS_VERITY() (which is faster than retrieving the pointer to the
- * verity info). This minimizes overhead for non-verity inodes.
- */
- if (IS_VERITY(inode))
- __fsverity_cleanup_inode(inode);
- else
- VFS_WARN_ON_ONCE(*fsverity_info_addr(inode) != NULL);
-}
/* read_metadata.c */
@@ -250,10 +230,6 @@ static inline int __fsverity_file_open(struct inode *inode, struct file *filp)
return -EOPNOTSUPP;
}
-static inline void fsverity_cleanup_inode(struct inode *inode)
-{
-}
-
/* read_metadata.c */
static inline int fsverity_ioctl_read_metadata(struct file *filp,
@@ -331,4 +307,6 @@ static inline int fsverity_file_open(struct inode *inode, struct file *filp)
return 0;
}
+void fsverity_cleanup_inode(struct inode *inode);
+
#endif /* _LINUX_FSVERITY_H */
--
2.47.3
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 03/15] ext4: don't build the fsverity work handler for !CONFIG_FS_VERITY
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
2026-01-28 15:26 ` [PATCH 01/15] fs,fsverity: reject size changes on fsverity files in setattr_prepare Christoph Hellwig
2026-01-28 15:26 ` [PATCH 02/15] fs,fsverity: clear out fsverity_info from common code Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-28 15:26 ` [PATCH 04/15] f2fs: " Christoph Hellwig
` (12 subsequent siblings)
15 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity, Darrick J. Wong
Use IS_ENABLED to disable this code, leading to a slight size reduction:
text data bss dec hex filename
4121 376 16 4513 11a1 fs/ext4/readpage.o.old
4030 328 16 4374 1116 fs/ext4/readpage.o
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
---
fs/ext4/readpage.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index e7f2350c725b..267594ef0b2c 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -130,7 +130,8 @@ static void bio_post_read_processing(struct bio_post_read_ctx *ctx)
ctx->cur_step++;
fallthrough;
case STEP_VERITY:
- if (ctx->enabled_steps & (1 << STEP_VERITY)) {
+ if (IS_ENABLED(CONFIG_FS_VERITY) &&
+ ctx->enabled_steps & (1 << STEP_VERITY)) {
INIT_WORK(&ctx->work, verity_work);
fsverity_enqueue_verify_work(&ctx->work);
return;
--
2.47.3
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 04/15] f2fs: don't build the fsverity work handler for !CONFIG_FS_VERITY
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
` (2 preceding siblings ...)
2026-01-28 15:26 ` [PATCH 03/15] ext4: don't build the fsverity work handler for !CONFIG_FS_VERITY Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-28 15:26 ` [PATCH 05/15] fsverity: pass struct file to ->write_merkle_tree_block Christoph Hellwig
` (11 subsequent siblings)
15 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity
Use IS_ENABLED to disable this code, leading to a slight size reduction:
text data bss dec hex filename
25709 2412 24 28145 6df1 fs/f2fs/compress.o.old
25198 2252 24 27474 6b52 fs/f2fs/compress.o
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/f2fs/compress.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 7b68bf22989d..40a62f1dee4d 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -1833,7 +1833,7 @@ void f2fs_decompress_end_io(struct decompress_io_ctx *dic, bool failed,
{
int i;
- if (!failed && dic->need_verity) {
+ if (IS_ENABLED(CONFIG_FS_VERITY) && !failed && dic->need_verity) {
/*
* Note that to avoid deadlocks, the verity work can't be done
* on the decompression workqueue. This is because verifying
--
2.47.3
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 05/15] fsverity: pass struct file to ->write_merkle_tree_block
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
` (3 preceding siblings ...)
2026-01-28 15:26 ` [PATCH 04/15] f2fs: " Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-28 15:26 ` [PATCH 06/15] fsverity: start consolidating pagecache code Christoph Hellwig
` (10 subsequent siblings)
15 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity, Darrick J. Wong
This will make an iomap implementation of the method easier.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
---
fs/btrfs/verity.c | 5 +++--
fs/ext4/verity.c | 6 +++---
fs/f2fs/verity.c | 6 +++---
fs/verity/enable.c | 9 +++++----
include/linux/fsverity.h | 4 ++--
5 files changed, 16 insertions(+), 14 deletions(-)
diff --git a/fs/btrfs/verity.c b/fs/btrfs/verity.c
index a2ac3fb68bc8..e7643c22a6bf 100644
--- a/fs/btrfs/verity.c
+++ b/fs/btrfs/verity.c
@@ -774,16 +774,17 @@ static struct page *btrfs_read_merkle_tree_page(struct inode *inode,
/*
* fsverity op that writes a Merkle tree block into the btree.
*
- * @inode: inode to write a Merkle tree block for
+ * @file: file to write a Merkle tree block for
* @buf: Merkle tree block to write
* @pos: the position of the block in the Merkle tree (in bytes)
* @size: the Merkle tree block size (in bytes)
*
* Returns 0 on success or negative error code on failure
*/
-static int btrfs_write_merkle_tree_block(struct inode *inode, const void *buf,
+static int btrfs_write_merkle_tree_block(struct file *file, const void *buf,
u64 pos, unsigned int size)
{
+ struct inode *inode = file_inode(file);
loff_t merkle_pos = merkle_file_pos(inode);
if (merkle_pos < 0)
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index 415d9c4d8a32..2ce4cf8a1e31 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -380,12 +380,12 @@ static struct page *ext4_read_merkle_tree_page(struct inode *inode,
return folio_file_page(folio, index);
}
-static int ext4_write_merkle_tree_block(struct inode *inode, const void *buf,
+static int ext4_write_merkle_tree_block(struct file *file, const void *buf,
u64 pos, unsigned int size)
{
- pos += ext4_verity_metadata_pos(inode);
+ pos += ext4_verity_metadata_pos(file_inode(file));
- return pagecache_write(inode, buf, size, pos);
+ return pagecache_write(file_inode(file), buf, size, pos);
}
const struct fsverity_operations ext4_verityops = {
diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
index 05b935b55216..c1c4d8044681 100644
--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -278,12 +278,12 @@ static struct page *f2fs_read_merkle_tree_page(struct inode *inode,
return folio_file_page(folio, index);
}
-static int f2fs_write_merkle_tree_block(struct inode *inode, const void *buf,
+static int f2fs_write_merkle_tree_block(struct file *file, const void *buf,
u64 pos, unsigned int size)
{
- pos += f2fs_verity_metadata_pos(inode);
+ pos += f2fs_verity_metadata_pos(file_inode(file));
- return pagecache_write(inode, buf, size, pos);
+ return pagecache_write(file_inode(file), buf, size, pos);
}
const struct fsverity_operations f2fs_verityops = {
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index 95ec42b84797..c56c18e2605b 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -41,14 +41,15 @@ static int hash_one_block(const struct merkle_tree_params *params,
return 0;
}
-static int write_merkle_tree_block(struct inode *inode, const u8 *buf,
+static int write_merkle_tree_block(struct file *file, const u8 *buf,
unsigned long index,
const struct merkle_tree_params *params)
{
+ struct inode *inode = file_inode(file);
u64 pos = (u64)index << params->log_blocksize;
int err;
- err = inode->i_sb->s_vop->write_merkle_tree_block(inode, buf, pos,
+ err = inode->i_sb->s_vop->write_merkle_tree_block(file, buf, pos,
params->block_size);
if (err)
fsverity_err(inode, "Error %d writing Merkle tree block %lu",
@@ -135,7 +136,7 @@ static int build_merkle_tree(struct file *filp,
err = hash_one_block(params, &buffers[level]);
if (err)
goto out;
- err = write_merkle_tree_block(inode,
+ err = write_merkle_tree_block(filp,
buffers[level].data,
level_offset[level],
params);
@@ -155,7 +156,7 @@ static int build_merkle_tree(struct file *filp,
err = hash_one_block(params, &buffers[level]);
if (err)
goto out;
- err = write_merkle_tree_block(inode,
+ err = write_merkle_tree_block(filp,
buffers[level].data,
level_offset[level],
params);
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index ea1ed2e6c2f9..e22cf84fe83a 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -116,7 +116,7 @@ struct fsverity_operations {
/**
* Write a Merkle tree block to the given inode.
*
- * @inode: the inode for which the Merkle tree is being built
+ * @file: the file for which the Merkle tree is being built
* @buf: the Merkle tree block to write
* @pos: the position of the block in the Merkle tree (in bytes)
* @size: the Merkle tree block size (in bytes)
@@ -126,7 +126,7 @@ struct fsverity_operations {
*
* Return: 0 on success, -errno on failure
*/
- int (*write_merkle_tree_block)(struct inode *inode, const void *buf,
+ int (*write_merkle_tree_block)(struct file *file, const void *buf,
u64 pos, unsigned int size);
};
--
2.47.3
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 06/15] fsverity: start consolidating pagecache code
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
` (4 preceding siblings ...)
2026-01-28 15:26 ` [PATCH 05/15] fsverity: pass struct file to ->write_merkle_tree_block Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-28 15:26 ` [PATCH 07/15] fsverity: don't issue readahead for non-ENOENT errors from __filemap_get_folio Christoph Hellwig
` (9 subsequent siblings)
15 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity, Darrick J. Wong
ext4 and f2fs are largely using the same code to read a page full
of Merkle tree blocks from the page cache, and the upcoming xfs
fsverity support would add another copy.
Move the ext4 code to fs/verity/ and use it in f2fs as well. For f2fs
this removes the previous f2fs-specific error injection, but otherwise
the behavior remains unchanged.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
---
fs/ext4/verity.c | 17 +----------------
fs/f2fs/verity.c | 17 +----------------
fs/verity/Makefile | 1 +
fs/verity/pagecache.c | 38 ++++++++++++++++++++++++++++++++++++++
include/linux/fsverity.h | 3 +++
5 files changed, 44 insertions(+), 32 deletions(-)
create mode 100644 fs/verity/pagecache.c
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index 2ce4cf8a1e31..a071860ad36a 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -361,23 +361,8 @@ static struct page *ext4_read_merkle_tree_page(struct inode *inode,
pgoff_t index,
unsigned long num_ra_pages)
{
- struct folio *folio;
-
index += ext4_verity_metadata_pos(inode) >> PAGE_SHIFT;
-
- folio = __filemap_get_folio(inode->i_mapping, index, FGP_ACCESSED, 0);
- if (IS_ERR(folio) || !folio_test_uptodate(folio)) {
- DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, index);
-
- if (!IS_ERR(folio))
- folio_put(folio);
- else if (num_ra_pages > 1)
- page_cache_ra_unbounded(&ractl, num_ra_pages, 0);
- folio = read_mapping_folio(inode->i_mapping, index, NULL);
- if (IS_ERR(folio))
- return ERR_CAST(folio);
- }
- return folio_file_page(folio, index);
+ return generic_read_merkle_tree_page(inode, index, num_ra_pages);
}
static int ext4_write_merkle_tree_block(struct file *file, const void *buf,
diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
index c1c4d8044681..d37e584423af 100644
--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -259,23 +259,8 @@ static struct page *f2fs_read_merkle_tree_page(struct inode *inode,
pgoff_t index,
unsigned long num_ra_pages)
{
- struct folio *folio;
-
index += f2fs_verity_metadata_pos(inode) >> PAGE_SHIFT;
-
- folio = f2fs_filemap_get_folio(inode->i_mapping, index, FGP_ACCESSED, 0);
- if (IS_ERR(folio) || !folio_test_uptodate(folio)) {
- DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, index);
-
- if (!IS_ERR(folio))
- folio_put(folio);
- else if (num_ra_pages > 1)
- page_cache_ra_unbounded(&ractl, num_ra_pages, 0);
- folio = read_mapping_folio(inode->i_mapping, index, NULL);
- if (IS_ERR(folio))
- return ERR_CAST(folio);
- }
- return folio_file_page(folio, index);
+ return generic_read_merkle_tree_page(inode, index, num_ra_pages);
}
static int f2fs_write_merkle_tree_block(struct file *file, const void *buf,
diff --git a/fs/verity/Makefile b/fs/verity/Makefile
index 435559a4fa9e..ddb4a88a0d60 100644
--- a/fs/verity/Makefile
+++ b/fs/verity/Makefile
@@ -5,6 +5,7 @@ obj-$(CONFIG_FS_VERITY) += enable.o \
init.o \
measure.o \
open.o \
+ pagecache.o \
read_metadata.o \
verify.o
diff --git a/fs/verity/pagecache.c b/fs/verity/pagecache.c
new file mode 100644
index 000000000000..f67248e9e768
--- /dev/null
+++ b/fs/verity/pagecache.c
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2019 Google LLC
+ */
+
+#include <linux/fsverity.h>
+#include <linux/pagemap.h>
+
+/**
+ * generic_read_merkle_tree_page - generic ->read_merkle_tree_page helper
+ * @inode: inode containing the Merkle tree
+ * @index: 0-based index of the Merkle tree page in the inode
+ * @num_ra_pages: The number of Merkle tree pages that should be prefetched.
+ *
+ * The caller needs to adjust @index from the Merkle-tree relative index passed
+ * to ->read_merkle_tree_page to the actual index where the Merkle tree is
+ * stored in the page cache for @inode.
+ */
+struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index,
+ unsigned long num_ra_pages)
+{
+ struct folio *folio;
+
+ folio = __filemap_get_folio(inode->i_mapping, index, FGP_ACCESSED, 0);
+ if (IS_ERR(folio) || !folio_test_uptodate(folio)) {
+ DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, index);
+
+ if (!IS_ERR(folio))
+ folio_put(folio);
+ else if (num_ra_pages > 1)
+ page_cache_ra_unbounded(&ractl, num_ra_pages, 0);
+ folio = read_mapping_folio(inode->i_mapping, index, NULL);
+ if (IS_ERR(folio))
+ return ERR_CAST(folio);
+ }
+ return folio_file_page(folio, index);
+}
+EXPORT_SYMBOL_GPL(generic_read_merkle_tree_page);
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index e22cf84fe83a..121703625cc8 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -309,4 +309,7 @@ static inline int fsverity_file_open(struct inode *inode, struct file *filp)
void fsverity_cleanup_inode(struct inode *inode);
+struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index,
+ unsigned long num_ra_pages);
+
#endif /* _LINUX_FSVERITY_H */
--
2.47.3
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 07/15] fsverity: don't issue readahead for non-ENOENT errors from __filemap_get_folio
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
` (5 preceding siblings ...)
2026-01-28 15:26 ` [PATCH 06/15] fsverity: start consolidating pagecache code Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-28 16:32 ` Darrick J. Wong
2026-01-28 23:48 ` Eric Biggers
2026-01-28 15:26 ` [PATCH 08/15] fsverity: kick off hash readahead at data I/O submission time Christoph Hellwig
` (8 subsequent siblings)
15 siblings, 2 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity
Issuing more reads on errors is not a good idea, especially when the
most common error here is -ENOMEM.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/verity/pagecache.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/verity/pagecache.c b/fs/verity/pagecache.c
index f67248e9e768..eae419d8d091 100644
--- a/fs/verity/pagecache.c
+++ b/fs/verity/pagecache.c
@@ -22,7 +22,8 @@ struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index,
struct folio *folio;
folio = __filemap_get_folio(inode->i_mapping, index, FGP_ACCESSED, 0);
- if (IS_ERR(folio) || !folio_test_uptodate(folio)) {
+ if (folio == ERR_PTR(-ENOENT) ||
+ (!IS_ERR(folio) && !folio_test_uptodate(folio))) {
DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, index);
if (!IS_ERR(folio))
--
2.47.3
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 07/15] fsverity: don't issue readahead for non-ENOENT errors from __filemap_get_folio
2026-01-28 15:26 ` [PATCH 07/15] fsverity: don't issue readahead for non-ENOENT errors from __filemap_get_folio Christoph Hellwig
@ 2026-01-28 16:32 ` Darrick J. Wong
2026-01-28 23:48 ` Eric Biggers
1 sibling, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2026-01-28 16:32 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Eric Biggers, Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity
On Wed, Jan 28, 2026 at 04:26:19PM +0100, Christoph Hellwig wrote:
> Issuing more reads on errors is not a good idea, especially when the
> most common error here is -ENOMEM.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks fine, and I still hate the C type system and all its
barely-mentioned subtleties that cause endless discussion
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/verity/pagecache.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/verity/pagecache.c b/fs/verity/pagecache.c
> index f67248e9e768..eae419d8d091 100644
> --- a/fs/verity/pagecache.c
> +++ b/fs/verity/pagecache.c
> @@ -22,7 +22,8 @@ struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index,
> struct folio *folio;
>
> folio = __filemap_get_folio(inode->i_mapping, index, FGP_ACCESSED, 0);
> - if (IS_ERR(folio) || !folio_test_uptodate(folio)) {
> + if (folio == ERR_PTR(-ENOENT) ||
> + (!IS_ERR(folio) && !folio_test_uptodate(folio))) {
> DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, index);
>
> if (!IS_ERR(folio))
> --
> 2.47.3
>
>
* Re: [PATCH 07/15] fsverity: don't issue readahead for non-ENOENT errors from __filemap_get_folio
2026-01-28 15:26 ` [PATCH 07/15] fsverity: don't issue readahead for non-ENOENT errors from __filemap_get_folio Christoph Hellwig
2026-01-28 16:32 ` Darrick J. Wong
@ 2026-01-28 23:48 ` Eric Biggers
2026-01-30 5:48 ` Christoph Hellwig
1 sibling, 1 reply; 27+ messages in thread
From: Eric Biggers @ 2026-01-28 23:48 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity
On Wed, Jan 28, 2026 at 04:26:19PM +0100, Christoph Hellwig wrote:
> Issuing more reads on errors is not a good idea, especially when the
> most common error here is -ENOMEM.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/verity/pagecache.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/verity/pagecache.c b/fs/verity/pagecache.c
> index f67248e9e768..eae419d8d091 100644
> --- a/fs/verity/pagecache.c
> +++ b/fs/verity/pagecache.c
> @@ -22,7 +22,8 @@ struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index,
> struct folio *folio;
>
> folio = __filemap_get_folio(inode->i_mapping, index, FGP_ACCESSED, 0);
> - if (IS_ERR(folio) || !folio_test_uptodate(folio)) {
> + if (folio == ERR_PTR(-ENOENT) ||
> + (!IS_ERR(folio) && !folio_test_uptodate(folio))) {
> DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, index);
This patch is still incorrect: when IS_ERR(folio) && folio !=
ERR_PTR(-ENOENT) it falls through to folio_file_page(), which crashes.
See https://lore.kernel.org/r/20260126205301.GD30838@quark/ for a
correct suggestion.
- Eric
* Re: [PATCH 07/15] fsverity: don't issue readahead for non-ENOENT errors from __filemap_get_folio
2026-01-28 23:48 ` Eric Biggers
@ 2026-01-30 5:48 ` Christoph Hellwig
0 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-30 5:48 UTC (permalink / raw)
To: Eric Biggers
Cc: Christoph Hellwig, Al Viro, Christian Brauner, Jan Kara,
David Sterba, Theodore Ts'o, Jaegeuk Kim, Chao Yu,
Andrey Albershteyn, Matthew Wilcox, linux-fsdevel, linux-btrfs,
linux-ext4, linux-f2fs-devel, fsverity
On Wed, Jan 28, 2026 at 03:48:14PM -0800, Eric Biggers wrote:
> This patch is still incorrect: when IS_ERR(folio) && folio !=
> ERR_PTR(-ENOENT) it falls through to folio_file_page(), which crashes.
> See https://lore.kernel.org/r/20260126205301.GD30838@quark/ for a
> correct suggestion.
Sorry, I missed that part that only applies to the version after this
patch. Fixed now.
* [PATCH 08/15] fsverity: kick off hash readahead at data I/O submission time
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
` (6 preceding siblings ...)
2026-01-28 15:26 ` [PATCH 07/15] fsverity: don't issue readahead for non-ENOENT errors from __filemap_get_folio Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-28 16:33 ` Darrick J. Wong
2026-01-28 22:56 ` Eric Biggers
2026-01-28 15:26 ` [PATCH 09/15] fsverity: deconstify the inode pointer in struct fsverity_info Christoph Hellwig
` (7 subsequent siblings)
15 siblings, 2 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity
Currently all reads of the fsverity hashes are kicked off from the data
I/O completion handler, leading to needlessly dependent I/O. This is
worked around a bit by performing readahead on the level 0 nodes, but
that is still fairly ineffective.
Switch to a model where the ->read_folio and ->readahead methods instead
kick off explicit readahead of the fsverity hashes so they are usually
available at I/O completion time.
For 64k sequential reads on my test VM this improves read performance
from 2.4GB/s - 2.6GB/s to 3.5GB/s - 3.9GB/s. The improvements for
random reads are likely to be even bigger.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
---
fs/btrfs/verity.c | 4 +--
fs/ext4/readpage.c | 7 ++++
fs/ext4/verity.c | 13 +++++--
fs/f2fs/data.c | 7 ++++
fs/f2fs/verity.c | 13 +++++--
fs/verity/pagecache.c | 39 ++++++++++++++------
fs/verity/read_metadata.c | 17 ++++++---
fs/verity/verify.c | 76 +++++++++++++++++++++++++--------------
include/linux/fsverity.h | 29 ++++++++++-----
9 files changed, 146 insertions(+), 59 deletions(-)
diff --git a/fs/btrfs/verity.c b/fs/btrfs/verity.c
index e7643c22a6bf..c152bef71e8b 100644
--- a/fs/btrfs/verity.c
+++ b/fs/btrfs/verity.c
@@ -697,7 +697,6 @@ int btrfs_get_verity_descriptor(struct inode *inode, void *buf, size_t buf_size)
*
* @inode: inode to read a merkle tree page for
* @index: page index relative to the start of the merkle tree
- * @num_ra_pages: number of pages to readahead. Optional, we ignore it
*
* The Merkle tree is stored in the filesystem btree, but its pages are cached
* with a logical position past EOF in the inode's mapping.
@@ -705,8 +704,7 @@ int btrfs_get_verity_descriptor(struct inode *inode, void *buf, size_t buf_size)
* Returns the page we read, or an ERR_PTR on error.
*/
static struct page *btrfs_read_merkle_tree_page(struct inode *inode,
- pgoff_t index,
- unsigned long num_ra_pages)
+ pgoff_t index)
{
struct folio *folio;
u64 off = (u64)index << PAGE_SHIFT;
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index 267594ef0b2c..e99072c8a619 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -227,6 +227,7 @@ int ext4_mpage_readpages(struct inode *inode,
unsigned relative_block = 0;
struct ext4_map_blocks map;
unsigned int nr_pages, folio_pages;
+ bool first_folio = true;
map.m_pblk = 0;
map.m_lblk = 0;
@@ -242,6 +243,12 @@ int ext4_mpage_readpages(struct inode *inode,
if (rac)
folio = readahead_folio(rac);
+ if (first_folio) {
+ if (ext4_need_verity(inode, folio->index))
+ fsverity_readahead(folio, nr_pages);
+ first_folio = false;
+ }
+
folio_pages = folio_nr_pages(folio);
prefetchw(&folio->flags);
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index a071860ad36a..54ae4d4a176c 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -358,11 +358,17 @@ static int ext4_get_verity_descriptor(struct inode *inode, void *buf,
}
static struct page *ext4_read_merkle_tree_page(struct inode *inode,
- pgoff_t index,
- unsigned long num_ra_pages)
+ pgoff_t index)
{
index += ext4_verity_metadata_pos(inode) >> PAGE_SHIFT;
- return generic_read_merkle_tree_page(inode, index, num_ra_pages);
+ return generic_read_merkle_tree_page(inode, index);
+}
+
+static void ext4_readahead_merkle_tree(struct inode *inode, pgoff_t index,
+ unsigned long nr_pages)
+{
+ index += ext4_verity_metadata_pos(inode) >> PAGE_SHIFT;
+ generic_readahead_merkle_tree(inode, index, nr_pages);
}
static int ext4_write_merkle_tree_block(struct file *file, const void *buf,
@@ -380,5 +386,6 @@ const struct fsverity_operations ext4_verityops = {
.end_enable_verity = ext4_end_enable_verity,
.get_verity_descriptor = ext4_get_verity_descriptor,
.read_merkle_tree_page = ext4_read_merkle_tree_page,
+ .readahead_merkle_tree = ext4_readahead_merkle_tree,
.write_merkle_tree_block = ext4_write_merkle_tree_block,
};
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index c30e69392a62..49bdc7e771f2 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2359,6 +2359,7 @@ static int f2fs_mpage_readpages(struct inode *inode,
unsigned nr_pages = rac ? readahead_count(rac) : 1;
unsigned max_nr_pages = nr_pages;
int ret = 0;
+ bool first_folio = true;
#ifdef CONFIG_F2FS_FS_COMPRESSION
if (f2fs_compressed_file(inode)) {
@@ -2383,6 +2384,12 @@ static int f2fs_mpage_readpages(struct inode *inode,
prefetchw(&folio->flags);
}
+ if (first_folio) {
+ if (f2fs_need_verity(inode, folio->index))
+ fsverity_readahead(folio, nr_pages);
+ first_folio = false;
+ }
+
#ifdef CONFIG_F2FS_FS_COMPRESSION
index = folio->index;
diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
index d37e584423af..628e8eafa96a 100644
--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -256,11 +256,17 @@ static int f2fs_get_verity_descriptor(struct inode *inode, void *buf,
}
static struct page *f2fs_read_merkle_tree_page(struct inode *inode,
- pgoff_t index,
- unsigned long num_ra_pages)
+ pgoff_t index)
{
index += f2fs_verity_metadata_pos(inode) >> PAGE_SHIFT;
- return generic_read_merkle_tree_page(inode, index, num_ra_pages);
+ return generic_read_merkle_tree_page(inode, index);
+}
+
+static void f2fs_readahead_merkle_tree(struct inode *inode, pgoff_t index,
+ unsigned long nr_pages)
+{
+ index += f2fs_verity_metadata_pos(inode) >> PAGE_SHIFT;
+ generic_readahead_merkle_tree(inode, index, nr_pages);
}
static int f2fs_write_merkle_tree_block(struct file *file, const void *buf,
@@ -278,5 +284,6 @@ const struct fsverity_operations f2fs_verityops = {
.end_enable_verity = f2fs_end_enable_verity,
.get_verity_descriptor = f2fs_get_verity_descriptor,
.read_merkle_tree_page = f2fs_read_merkle_tree_page,
+ .readahead_merkle_tree = f2fs_readahead_merkle_tree,
.write_merkle_tree_block = f2fs_write_merkle_tree_block,
};
diff --git a/fs/verity/pagecache.c b/fs/verity/pagecache.c
index eae419d8d091..196072bbe284 100644
--- a/fs/verity/pagecache.c
+++ b/fs/verity/pagecache.c
@@ -16,8 +16,30 @@
* to ->read_merkle_tree_page to the actual index where the Merkle tree is
* stored in the page cache for @inode.
*/
-struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index,
- unsigned long num_ra_pages)
+struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index)
+{
+ struct folio *folio;
+
+ folio = read_mapping_folio(inode->i_mapping, index, NULL);
+ if (IS_ERR(folio))
+ return ERR_CAST(folio);
+ return folio_file_page(folio, index);
+}
+EXPORT_SYMBOL_GPL(generic_read_merkle_tree_page);
+
+/**
+ * generic_readahead_merkle_tree() - generic ->readahead_merkle_tree helper
+ * @inode: inode containing the Merkle tree
+ * @index: 0-based index of the first Merkle tree page to read ahead in the
+ * inode
+ * @nr_pages: the number of Merkle tree pages that should be read ahead
+ *
+ * The caller needs to adjust @index from the Merkle-tree relative index passed
+ * to ->read_merkle_tree_page to the actual index where the Merkle tree is
+ * stored in the page cache for @inode.
+ */
+void generic_readahead_merkle_tree(struct inode *inode, pgoff_t index,
+ unsigned long nr_pages)
{
struct folio *folio;
@@ -26,14 +48,9 @@ struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index,
(!IS_ERR(folio) && !folio_test_uptodate(folio))) {
DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, index);
- if (!IS_ERR(folio))
- folio_put(folio);
- else if (num_ra_pages > 1)
- page_cache_ra_unbounded(&ractl, num_ra_pages, 0);
- folio = read_mapping_folio(inode->i_mapping, index, NULL);
- if (IS_ERR(folio))
- return ERR_CAST(folio);
+ page_cache_ra_unbounded(&ractl, nr_pages, 0);
}
- return folio_file_page(folio, index);
+ if (!IS_ERR(folio))
+ folio_put(folio);
}
-EXPORT_SYMBOL_GPL(generic_read_merkle_tree_page);
+EXPORT_SYMBOL_GPL(generic_readahead_merkle_tree);
diff --git a/fs/verity/read_metadata.c b/fs/verity/read_metadata.c
index cba5d6af4e04..81b82e9ddb1d 100644
--- a/fs/verity/read_metadata.c
+++ b/fs/verity/read_metadata.c
@@ -28,24 +28,31 @@ static int fsverity_read_merkle_tree(struct inode *inode,
if (offset >= end_offset)
return 0;
offs_in_page = offset_in_page(offset);
+ index = offset >> PAGE_SHIFT;
last_index = (end_offset - 1) >> PAGE_SHIFT;
+ /*
+ * Kick off readahead for the range we are going to read to ensure a
+ * single large sequential read instead of lots of small ones.
+ */
+ if (inode->i_sb->s_vop->readahead_merkle_tree) {
+ inode->i_sb->s_vop->readahead_merkle_tree(inode, index,
+ last_index - index + 1);
+ }
+
/*
* Iterate through each Merkle tree page in the requested range and copy
* the requested portion to userspace. Note that the Merkle tree block
* size isn't important here, as we are returning a byte stream; i.e.,
* we can just work with pages even if the tree block size != PAGE_SIZE.
*/
- for (index = offset >> PAGE_SHIFT; index <= last_index; index++) {
- unsigned long num_ra_pages =
- min_t(unsigned long, last_index - index + 1,
- inode->i_sb->s_bdi->io_pages);
+ for (; index <= last_index; index++) {
unsigned int bytes_to_copy = min_t(u64, end_offset - offset,
PAGE_SIZE - offs_in_page);
struct page *page;
const void *virt;
- page = vops->read_merkle_tree_page(inode, index, num_ra_pages);
+ page = vops->read_merkle_tree_page(inode, index);
if (IS_ERR(page)) {
err = PTR_ERR(page);
fsverity_err(inode,
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 86067c8b40cf..f5bea750b427 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -9,6 +9,7 @@
#include <linux/bio.h>
#include <linux/export.h>
+#include <linux/pagemap.h>
#define FS_VERITY_MAX_PENDING_BLOCKS 2
@@ -21,7 +22,6 @@ struct fsverity_pending_block {
struct fsverity_verification_context {
struct inode *inode;
struct fsverity_info *vi;
- unsigned long max_ra_pages;
/*
* This is the queue of data blocks that are pending verification. When
@@ -37,6 +37,49 @@ struct fsverity_verification_context {
static struct workqueue_struct *fsverity_read_workqueue;
+/**
+ * fsverity_readahead() - kick off readahead on fsverity hashes
+ * @folio: first file data folio that is being read
+ * @nr_pages: number of file data pages to be read
+ *
+ * Start readahead on the fsverity hashes that are needed to verify the file
+ * data in the range from folio->index to folio->index + nr_pages.
+ *
+ * To be called from the file systems' ->read_folio and ->readahead methods to
+ * ensure that the hashes are already cached on completion of the file data
+ * read if possible.
+ */
+void fsverity_readahead(struct folio *folio, unsigned long nr_pages)
+{
+ struct inode *inode = folio->mapping->host;
+ const struct fsverity_info *vi = *fsverity_info_addr(inode);
+ const struct merkle_tree_params *params = &vi->tree_params;
+ u64 start_hidx = (u64)folio->index << params->log_blocks_per_page;
+ u64 end_hidx = (((u64)folio->index + nr_pages) <<
+ params->log_blocks_per_page) - 1;
+ int level;
+
+ if (!inode->i_sb->s_vop->readahead_merkle_tree)
+ return;
+
+ for (level = 0; level < params->num_levels; level++) {
+ unsigned long level_start = params->level_start[level];
+ unsigned long next_start_hidx = start_hidx >> params->log_arity;
+ unsigned long next_end_hidx = end_hidx >> params->log_arity;
+ pgoff_t start_idx = (level_start + next_start_hidx) >>
+ params->log_blocks_per_page;
+ pgoff_t end_idx = (level_start + next_end_hidx) >>
+ params->log_blocks_per_page;
+
+ inode->i_sb->s_vop->readahead_merkle_tree(inode, start_idx,
+ end_idx - start_idx + 1);
+
+ start_hidx = next_start_hidx;
+ end_hidx = next_end_hidx;
+ }
+}
+EXPORT_SYMBOL_GPL(fsverity_readahead);
+
/*
* Returns true if the hash block with index @hblock_idx in the tree, located in
* @hpage, has already been verified.
@@ -114,8 +157,7 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
* Return: %true if the data block is valid, else %false.
*/
static bool verify_data_block(struct inode *inode, struct fsverity_info *vi,
- const struct fsverity_pending_block *dblock,
- unsigned long max_ra_pages)
+ const struct fsverity_pending_block *dblock)
{
const u64 data_pos = dblock->pos;
const struct merkle_tree_params *params = &vi->tree_params;
@@ -200,8 +242,7 @@ static bool verify_data_block(struct inode *inode, struct fsverity_info *vi,
(params->block_size - 1);
hpage = inode->i_sb->s_vop->read_merkle_tree_page(inode,
- hpage_idx, level == 0 ? min(max_ra_pages,
- params->tree_pages - hpage_idx) : 0);
+ hpage_idx);
if (IS_ERR(hpage)) {
fsverity_err(inode,
"Error %ld reading Merkle tree page %lu",
@@ -272,14 +313,12 @@ static bool verify_data_block(struct inode *inode, struct fsverity_info *vi,
static void
fsverity_init_verification_context(struct fsverity_verification_context *ctx,
- struct inode *inode,
- unsigned long max_ra_pages)
+ struct inode *inode)
{
struct fsverity_info *vi = *fsverity_info_addr(inode);
ctx->inode = inode;
ctx->vi = vi;
- ctx->max_ra_pages = max_ra_pages;
ctx->num_pending = 0;
if (vi->tree_params.hash_alg->algo_id == HASH_ALGO_SHA256 &&
sha256_finup_2x_is_optimized())
@@ -322,8 +361,7 @@ fsverity_verify_pending_blocks(struct fsverity_verification_context *ctx)
}
for (i = 0; i < ctx->num_pending; i++) {
- if (!verify_data_block(ctx->inode, vi, &ctx->pending_blocks[i],
- ctx->max_ra_pages))
+ if (!verify_data_block(ctx->inode, vi, &ctx->pending_blocks[i]))
return false;
}
fsverity_clear_pending_blocks(ctx);
@@ -373,7 +411,7 @@ bool fsverity_verify_blocks(struct folio *folio, size_t len, size_t offset)
{
struct fsverity_verification_context ctx;
- fsverity_init_verification_context(&ctx, folio->mapping->host, 0);
+ fsverity_init_verification_context(&ctx, folio->mapping->host);
if (fsverity_add_data_blocks(&ctx, folio, len, offset) &&
fsverity_verify_pending_blocks(&ctx))
@@ -403,22 +441,8 @@ void fsverity_verify_bio(struct bio *bio)
struct inode *inode = bio_first_folio_all(bio)->mapping->host;
struct fsverity_verification_context ctx;
struct folio_iter fi;
- unsigned long max_ra_pages = 0;
-
- if (bio->bi_opf & REQ_RAHEAD) {
- /*
- * If this bio is for data readahead, then we also do readahead
- * of the first (largest) level of the Merkle tree. Namely,
- * when a Merkle tree page is read, we also try to piggy-back on
- * some additional pages -- up to 1/4 the number of data pages.
- *
- * This improves sequential read performance, as it greatly
- * reduces the number of I/O requests made to the Merkle tree.
- */
- max_ra_pages = bio->bi_iter.bi_size >> (PAGE_SHIFT + 2);
- }
- fsverity_init_verification_context(&ctx, inode, max_ra_pages);
+ fsverity_init_verification_context(&ctx, inode);
bio_for_each_folio_all(fi, bio) {
if (!fsverity_add_data_blocks(&ctx, fi.folio, fi.length,
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 121703625cc8..bade511cf3aa 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -97,10 +97,6 @@ struct fsverity_operations {
*
* @inode: the inode
* @index: 0-based index of the page within the Merkle tree
- * @num_ra_pages: The number of Merkle tree pages that should be
- * prefetched starting at @index if the page at @index
- * isn't already cached. Implementations may ignore this
- * argument; it's only a performance optimization.
*
* This can be called at any time on an open verity file. It may be
* called by multiple processes concurrently, even with the same page.
@@ -110,8 +106,23 @@ struct fsverity_operations {
* Return: the page on success, ERR_PTR() on failure
*/
struct page *(*read_merkle_tree_page)(struct inode *inode,
- pgoff_t index,
- unsigned long num_ra_pages);
+ pgoff_t index);
+
+ /**
+ * Perform readahead of a Merkle tree for the given inode.
+ *
+ * @inode: the inode
+ * @index: 0-based index of the first page within the Merkle tree
+ * @nr_pages: number of pages to be read ahead.
+ *
+ * This can be called at any time on an open verity file. It may be
+ * called by multiple processes concurrently, even with the same range.
+ *
+ * Optional method so that ->read_merkle_tree_page preferably finds
+ * cached data instead of issuing dependent I/O.
+ */
+ void (*readahead_merkle_tree)(struct inode *inode, pgoff_t index,
+ unsigned long nr_pages);
/**
* Write a Merkle tree block to the given inode.
@@ -308,8 +319,10 @@ static inline int fsverity_file_open(struct inode *inode, struct file *filp)
}
void fsverity_cleanup_inode(struct inode *inode);
+void fsverity_readahead(struct folio *folio, unsigned long nr_pages);
-struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index,
- unsigned long num_ra_pages);
+struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index);
+void generic_readahead_merkle_tree(struct inode *inode, pgoff_t index,
+ unsigned long nr_pages);
#endif /* _LINUX_FSVERITY_H */
--
2.47.3
* Re: [PATCH 08/15] fsverity: kick off hash readahead at data I/O submission time
2026-01-28 15:26 ` [PATCH 08/15] fsverity: kick off hash readahead at data I/O submission time Christoph Hellwig
@ 2026-01-28 16:33 ` Darrick J. Wong
2026-01-28 22:56 ` Eric Biggers
1 sibling, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2026-01-28 16:33 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Eric Biggers, Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity
On Wed, Jan 28, 2026 at 04:26:20PM +0100, Christoph Hellwig wrote:
> Currently all reads of the fsverity hashes are kicked off from the data
> I/O completion handler, leading to needlessly dependent I/O. This is
> worked around a bit by performing readahead on the level 0 nodes, but
> that is still fairly ineffective.
>
> Switch to a model where the ->read_folio and ->readahead methods instead
> kick off explicit readahead of the fsverity hashes so they are usually
> available at I/O completion time.
>
> For 64k sequential reads on my test VM this improves read performance
> from 2.4GB/s - 2.6GB/s to 3.5GB/s - 3.9GB/s. The improvements for
> random reads are likely to be even bigger.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Thanks for updating the kerneldoc and fixing the 'pgoff_t long' thing
from the last round;
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/btrfs/verity.c | 4 +--
> fs/ext4/readpage.c | 7 ++++
> fs/ext4/verity.c | 13 +++++--
> fs/f2fs/data.c | 7 ++++
> fs/f2fs/verity.c | 13 +++++--
> fs/verity/pagecache.c | 39 ++++++++++++++------
> fs/verity/read_metadata.c | 17 ++++++---
> fs/verity/verify.c | 76 +++++++++++++++++++++++++--------------
> include/linux/fsverity.h | 29 ++++++++++-----
> 9 files changed, 146 insertions(+), 59 deletions(-)
>
> diff --git a/fs/btrfs/verity.c b/fs/btrfs/verity.c
> index e7643c22a6bf..c152bef71e8b 100644
> --- a/fs/btrfs/verity.c
> +++ b/fs/btrfs/verity.c
> @@ -697,7 +697,6 @@ int btrfs_get_verity_descriptor(struct inode *inode, void *buf, size_t buf_size)
> *
> * @inode: inode to read a merkle tree page for
> * @index: page index relative to the start of the merkle tree
> - * @num_ra_pages: number of pages to readahead. Optional, we ignore it
> *
> * The Merkle tree is stored in the filesystem btree, but its pages are cached
> * with a logical position past EOF in the inode's mapping.
> @@ -705,8 +704,7 @@ int btrfs_get_verity_descriptor(struct inode *inode, void *buf, size_t buf_size)
> * Returns the page we read, or an ERR_PTR on error.
> */
> static struct page *btrfs_read_merkle_tree_page(struct inode *inode,
> - pgoff_t index,
> - unsigned long num_ra_pages)
> + pgoff_t index)
> {
> struct folio *folio;
> u64 off = (u64)index << PAGE_SHIFT;
> diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
> index 267594ef0b2c..e99072c8a619 100644
> --- a/fs/ext4/readpage.c
> +++ b/fs/ext4/readpage.c
> @@ -227,6 +227,7 @@ int ext4_mpage_readpages(struct inode *inode,
> unsigned relative_block = 0;
> struct ext4_map_blocks map;
> unsigned int nr_pages, folio_pages;
> + bool first_folio = true;
>
> map.m_pblk = 0;
> map.m_lblk = 0;
> @@ -242,6 +243,12 @@ int ext4_mpage_readpages(struct inode *inode,
> if (rac)
> folio = readahead_folio(rac);
>
> + if (first_folio) {
> + if (ext4_need_verity(inode, folio->index))
> + fsverity_readahead(folio, nr_pages);
> + first_folio = false;
> + }
> +
> folio_pages = folio_nr_pages(folio);
> prefetchw(&folio->flags);
>
> diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
> index a071860ad36a..54ae4d4a176c 100644
> --- a/fs/ext4/verity.c
> +++ b/fs/ext4/verity.c
> @@ -358,11 +358,17 @@ static int ext4_get_verity_descriptor(struct inode *inode, void *buf,
> }
>
> static struct page *ext4_read_merkle_tree_page(struct inode *inode,
> - pgoff_t index,
> - unsigned long num_ra_pages)
> + pgoff_t index)
> {
> index += ext4_verity_metadata_pos(inode) >> PAGE_SHIFT;
> - return generic_read_merkle_tree_page(inode, index, num_ra_pages);
> + return generic_read_merkle_tree_page(inode, index);
> +}
> +
> +static void ext4_readahead_merkle_tree(struct inode *inode, pgoff_t index,
> + unsigned long nr_pages)
> +{
> + index += ext4_verity_metadata_pos(inode) >> PAGE_SHIFT;
> + generic_readahead_merkle_tree(inode, index, nr_pages);
> }
>
> static int ext4_write_merkle_tree_block(struct file *file, const void *buf,
> @@ -380,5 +386,6 @@ const struct fsverity_operations ext4_verityops = {
> .end_enable_verity = ext4_end_enable_verity,
> .get_verity_descriptor = ext4_get_verity_descriptor,
> .read_merkle_tree_page = ext4_read_merkle_tree_page,
> + .readahead_merkle_tree = ext4_readahead_merkle_tree,
> .write_merkle_tree_block = ext4_write_merkle_tree_block,
> };
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index c30e69392a62..49bdc7e771f2 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -2359,6 +2359,7 @@ static int f2fs_mpage_readpages(struct inode *inode,
> unsigned nr_pages = rac ? readahead_count(rac) : 1;
> unsigned max_nr_pages = nr_pages;
> int ret = 0;
> + bool first_folio = true;
>
> #ifdef CONFIG_F2FS_FS_COMPRESSION
> if (f2fs_compressed_file(inode)) {
> @@ -2383,6 +2384,12 @@ static int f2fs_mpage_readpages(struct inode *inode,
> prefetchw(&folio->flags);
> }
>
> + if (first_folio) {
> + if (f2fs_need_verity(inode, folio->index))
> + fsverity_readahead(folio, nr_pages);
> + first_folio = false;
> + }
> +
> #ifdef CONFIG_F2FS_FS_COMPRESSION
> index = folio->index;
>
> diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
> index d37e584423af..628e8eafa96a 100644
> --- a/fs/f2fs/verity.c
> +++ b/fs/f2fs/verity.c
> @@ -256,11 +256,17 @@ static int f2fs_get_verity_descriptor(struct inode *inode, void *buf,
> }
>
> static struct page *f2fs_read_merkle_tree_page(struct inode *inode,
> - pgoff_t index,
> - unsigned long num_ra_pages)
> + pgoff_t index)
> {
> index += f2fs_verity_metadata_pos(inode) >> PAGE_SHIFT;
> - return generic_read_merkle_tree_page(inode, index, num_ra_pages);
> + return generic_read_merkle_tree_page(inode, index);
> +}
> +
> +static void f2fs_readahead_merkle_tree(struct inode *inode, pgoff_t index,
> + unsigned long nr_pages)
> +{
> + index += f2fs_verity_metadata_pos(inode) >> PAGE_SHIFT;
> + generic_readahead_merkle_tree(inode, index, nr_pages);
> }
>
> static int f2fs_write_merkle_tree_block(struct file *file, const void *buf,
> @@ -278,5 +284,6 @@ const struct fsverity_operations f2fs_verityops = {
> .end_enable_verity = f2fs_end_enable_verity,
> .get_verity_descriptor = f2fs_get_verity_descriptor,
> .read_merkle_tree_page = f2fs_read_merkle_tree_page,
> + .readahead_merkle_tree = f2fs_readahead_merkle_tree,
> .write_merkle_tree_block = f2fs_write_merkle_tree_block,
> };
> diff --git a/fs/verity/pagecache.c b/fs/verity/pagecache.c
> index eae419d8d091..196072bbe284 100644
> --- a/fs/verity/pagecache.c
> +++ b/fs/verity/pagecache.c
> @@ -16,8 +16,30 @@
> * to ->read_merkle_tree_page to the actual index where the Merkle tree is
> * stored in the page cache for @inode.
> */
> -struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index,
> - unsigned long num_ra_pages)
> +struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index)
> +{
> + struct folio *folio;
> +
> + folio = read_mapping_folio(inode->i_mapping, index, NULL);
> + if (IS_ERR(folio))
> + return ERR_CAST(folio);
> + return folio_file_page(folio, index);
> +}
> +EXPORT_SYMBOL_GPL(generic_read_merkle_tree_page);
> +
> +/**
> + * generic_readahead_merkle_tree() - generic ->readahead_merkle_tree helper
> + * @inode: inode containing the Merkle tree
> + * @index: 0-based index of the first Merkle tree page to read ahead in the
> + * inode
> + * @nr_pages: the number of Merkle tree pages that should be read ahead
> + *
> + * The caller needs to adjust @index from the Merkle-tree relative index passed
> + * to ->read_merkle_tree_page to the actual index where the Merkle tree is
> + * stored in the page cache for @inode.
> + */
> +void generic_readahead_merkle_tree(struct inode *inode, pgoff_t index,
> + unsigned long nr_pages)
> {
> struct folio *folio;
>
> @@ -26,14 +48,9 @@ struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index,
> (!IS_ERR(folio) && !folio_test_uptodate(folio))) {
> DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, index);
>
> - if (!IS_ERR(folio))
> - folio_put(folio);
> - else if (num_ra_pages > 1)
> - page_cache_ra_unbounded(&ractl, num_ra_pages, 0);
> - folio = read_mapping_folio(inode->i_mapping, index, NULL);
> - if (IS_ERR(folio))
> - return ERR_CAST(folio);
> + page_cache_ra_unbounded(&ractl, nr_pages, 0);
> }
> - return folio_file_page(folio, index);
> + if (!IS_ERR(folio))
> + folio_put(folio);
> }
> -EXPORT_SYMBOL_GPL(generic_read_merkle_tree_page);
> +EXPORT_SYMBOL_GPL(generic_readahead_merkle_tree);
> diff --git a/fs/verity/read_metadata.c b/fs/verity/read_metadata.c
> index cba5d6af4e04..81b82e9ddb1d 100644
> --- a/fs/verity/read_metadata.c
> +++ b/fs/verity/read_metadata.c
> @@ -28,24 +28,31 @@ static int fsverity_read_merkle_tree(struct inode *inode,
> if (offset >= end_offset)
> return 0;
> offs_in_page = offset_in_page(offset);
> + index = offset >> PAGE_SHIFT;
> last_index = (end_offset - 1) >> PAGE_SHIFT;
>
> + /*
> + * Kick off readahead for the range we are going to read to ensure a
> + * single large sequential read instead of lots of small ones.
> + */
> + if (inode->i_sb->s_vop->readahead_merkle_tree) {
> + inode->i_sb->s_vop->readahead_merkle_tree(inode, index,
> + last_index - index + 1);
> + }
> +
> /*
> * Iterate through each Merkle tree page in the requested range and copy
> * the requested portion to userspace. Note that the Merkle tree block
> * size isn't important here, as we are returning a byte stream; i.e.,
> * we can just work with pages even if the tree block size != PAGE_SIZE.
> */
> - for (index = offset >> PAGE_SHIFT; index <= last_index; index++) {
> - unsigned long num_ra_pages =
> - min_t(unsigned long, last_index - index + 1,
> - inode->i_sb->s_bdi->io_pages);
> + for (; index <= last_index; index++) {
> unsigned int bytes_to_copy = min_t(u64, end_offset - offset,
> PAGE_SIZE - offs_in_page);
> struct page *page;
> const void *virt;
>
> - page = vops->read_merkle_tree_page(inode, index, num_ra_pages);
> + page = vops->read_merkle_tree_page(inode, index);
> if (IS_ERR(page)) {
> err = PTR_ERR(page);
> fsverity_err(inode,
> diff --git a/fs/verity/verify.c b/fs/verity/verify.c
> index 86067c8b40cf..f5bea750b427 100644
> --- a/fs/verity/verify.c
> +++ b/fs/verity/verify.c
> @@ -9,6 +9,7 @@
>
> #include <linux/bio.h>
> #include <linux/export.h>
> +#include <linux/pagemap.h>
>
> #define FS_VERITY_MAX_PENDING_BLOCKS 2
>
> @@ -21,7 +22,6 @@ struct fsverity_pending_block {
> struct fsverity_verification_context {
> struct inode *inode;
> struct fsverity_info *vi;
> - unsigned long max_ra_pages;
>
> /*
> * This is the queue of data blocks that are pending verification. When
> @@ -37,6 +37,49 @@ struct fsverity_verification_context {
>
> static struct workqueue_struct *fsverity_read_workqueue;
>
> +/**
> + * fsverity_readahead() - kick off readahead on fsverity hashes
> + * @folio: first file data folio that is being read
> + * @nr_pages: number of file data pages to be read
> + *
> + * Start readahead on the fsverity hashes that are needed to verify the file
> + * data in the range from folio->index to folio->index + nr_pages.
> + *
> + * To be called from the file systems' ->read_folio and ->readahead methods to
> + * ensure that the hashes are already cached on completion of the file data
> + * read if possible.
> + */
> +void fsverity_readahead(struct folio *folio, unsigned long nr_pages)
> +{
> + struct inode *inode = folio->mapping->host;
> + const struct fsverity_info *vi = *fsverity_info_addr(inode);
> + const struct merkle_tree_params *params = &vi->tree_params;
> + u64 start_hidx = (u64)folio->index << params->log_blocks_per_page;
> + u64 end_hidx = (((u64)folio->index + nr_pages) <<
> + params->log_blocks_per_page) - 1;
> + int level;
> +
> + if (!inode->i_sb->s_vop->readahead_merkle_tree)
> + return;
> +
> + for (level = 0; level < params->num_levels; level++) {
> + unsigned long level_start = params->level_start[level];
> + unsigned long next_start_hidx = start_hidx >> params->log_arity;
> + unsigned long next_end_hidx = end_hidx >> params->log_arity;
> + pgoff_t start_idx = (level_start + next_start_hidx) >>
> + params->log_blocks_per_page;
> + pgoff_t end_idx = (level_start + next_end_hidx) >>
> + params->log_blocks_per_page;
> +
> + inode->i_sb->s_vop->readahead_merkle_tree(inode, start_idx,
> + end_idx - start_idx + 1);
> +
> + start_hidx = next_start_hidx;
> + end_hidx = next_end_hidx;
> + }
> +}
> +EXPORT_SYMBOL_GPL(fsverity_readahead);
> +
> /*
> * Returns true if the hash block with index @hblock_idx in the tree, located in
> * @hpage, has already been verified.
> @@ -114,8 +157,7 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
> * Return: %true if the data block is valid, else %false.
> */
> static bool verify_data_block(struct inode *inode, struct fsverity_info *vi,
> - const struct fsverity_pending_block *dblock,
> - unsigned long max_ra_pages)
> + const struct fsverity_pending_block *dblock)
> {
> const u64 data_pos = dblock->pos;
> const struct merkle_tree_params *params = &vi->tree_params;
> @@ -200,8 +242,7 @@ static bool verify_data_block(struct inode *inode, struct fsverity_info *vi,
> (params->block_size - 1);
>
> hpage = inode->i_sb->s_vop->read_merkle_tree_page(inode,
> - hpage_idx, level == 0 ? min(max_ra_pages,
> - params->tree_pages - hpage_idx) : 0);
> + hpage_idx);
> if (IS_ERR(hpage)) {
> fsverity_err(inode,
> "Error %ld reading Merkle tree page %lu",
> @@ -272,14 +313,12 @@ static bool verify_data_block(struct inode *inode, struct fsverity_info *vi,
>
> static void
> fsverity_init_verification_context(struct fsverity_verification_context *ctx,
> - struct inode *inode,
> - unsigned long max_ra_pages)
> + struct inode *inode)
> {
> struct fsverity_info *vi = *fsverity_info_addr(inode);
>
> ctx->inode = inode;
> ctx->vi = vi;
> - ctx->max_ra_pages = max_ra_pages;
> ctx->num_pending = 0;
> if (vi->tree_params.hash_alg->algo_id == HASH_ALGO_SHA256 &&
> sha256_finup_2x_is_optimized())
> @@ -322,8 +361,7 @@ fsverity_verify_pending_blocks(struct fsverity_verification_context *ctx)
> }
>
> for (i = 0; i < ctx->num_pending; i++) {
> - if (!verify_data_block(ctx->inode, vi, &ctx->pending_blocks[i],
> - ctx->max_ra_pages))
> + if (!verify_data_block(ctx->inode, vi, &ctx->pending_blocks[i]))
> return false;
> }
> fsverity_clear_pending_blocks(ctx);
> @@ -373,7 +411,7 @@ bool fsverity_verify_blocks(struct folio *folio, size_t len, size_t offset)
> {
> struct fsverity_verification_context ctx;
>
> - fsverity_init_verification_context(&ctx, folio->mapping->host, 0);
> + fsverity_init_verification_context(&ctx, folio->mapping->host);
>
> if (fsverity_add_data_blocks(&ctx, folio, len, offset) &&
> fsverity_verify_pending_blocks(&ctx))
> @@ -403,22 +441,8 @@ void fsverity_verify_bio(struct bio *bio)
> struct inode *inode = bio_first_folio_all(bio)->mapping->host;
> struct fsverity_verification_context ctx;
> struct folio_iter fi;
> - unsigned long max_ra_pages = 0;
> -
> - if (bio->bi_opf & REQ_RAHEAD) {
> - /*
> - * If this bio is for data readahead, then we also do readahead
> - * of the first (largest) level of the Merkle tree. Namely,
> - * when a Merkle tree page is read, we also try to piggy-back on
> - * some additional pages -- up to 1/4 the number of data pages.
> - *
> - * This improves sequential read performance, as it greatly
> - * reduces the number of I/O requests made to the Merkle tree.
> - */
> - max_ra_pages = bio->bi_iter.bi_size >> (PAGE_SHIFT + 2);
> - }
>
> - fsverity_init_verification_context(&ctx, inode, max_ra_pages);
> + fsverity_init_verification_context(&ctx, inode);
>
> bio_for_each_folio_all(fi, bio) {
> if (!fsverity_add_data_blocks(&ctx, fi.folio, fi.length,
> diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
> index 121703625cc8..bade511cf3aa 100644
> --- a/include/linux/fsverity.h
> +++ b/include/linux/fsverity.h
> @@ -97,10 +97,6 @@ struct fsverity_operations {
> *
> * @inode: the inode
> * @index: 0-based index of the page within the Merkle tree
> - * @num_ra_pages: The number of Merkle tree pages that should be
> - * prefetched starting at @index if the page at @index
> - * isn't already cached. Implementations may ignore this
> - * argument; it's only a performance optimization.
> *
> * This can be called at any time on an open verity file. It may be
> * called by multiple processes concurrently, even with the same page.
> @@ -110,8 +106,23 @@ struct fsverity_operations {
> * Return: the page on success, ERR_PTR() on failure
> */
> struct page *(*read_merkle_tree_page)(struct inode *inode,
> - pgoff_t index,
> - unsigned long num_ra_pages);
> + pgoff_t index);
> +
> + /**
> + * Perform readahead of a Merkle tree for the given inode.
> + *
> + * @inode: the inode
> + * @index: 0-based index of the first page within the Merkle tree
> + * @nr_pages: number of pages to be read ahead.
> + *
> + * This can be called at any time on an open verity file. It may be
> + * called by multiple processes concurrently, even with the same range.
> + *
> + * Optional method; implementing it means that ->read_merkle_tree_page
> + * will usually find cached data instead of issuing dependent I/O.
> + */
> + void (*readahead_merkle_tree)(struct inode *inode, pgoff_t index,
> + unsigned long nr_pages);
>
> /**
> * Write a Merkle tree block to the given inode.
> @@ -308,8 +319,10 @@ static inline int fsverity_file_open(struct inode *inode, struct file *filp)
> }
>
> void fsverity_cleanup_inode(struct inode *inode);
> +void fsverity_readahead(struct folio *folio, unsigned long nr_pages);
>
> -struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index,
> - unsigned long num_ra_pages);
> +struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index);
> +void generic_readahead_merkle_tree(struct inode *inode, pgoff_t index,
> + unsigned long nr_pages);
>
> #endif /* _LINUX_FSVERITY_H */
> --
> 2.47.3
>
>
^ permalink raw reply [flat|nested] 27+ messages in thread* Re: [PATCH 08/15] fsverity: kick off hash readahead at data I/O submission time
2026-01-28 15:26 ` [PATCH 08/15] fsverity: kick off hash readahead at data I/O submission time Christoph Hellwig
2026-01-28 16:33 ` Darrick J. Wong
@ 2026-01-28 22:56 ` Eric Biggers
2026-01-28 23:22 ` Darrick J. Wong
2026-01-30 5:51 ` Christoph Hellwig
1 sibling, 2 replies; 27+ messages in thread
From: Eric Biggers @ 2026-01-28 22:56 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity
On Wed, Jan 28, 2026 at 04:26:20PM +0100, Christoph Hellwig wrote:
> Currently all reads of the fsverity hashes are kicked off from the data
> I/O completion handler, leading to needlessly dependent I/O. This is
> worked around a bit by performing readahead on the level 0 nodes, but
> it is still fairly ineffective.
>
> Switch to a model where the ->read_folio and ->readahead methods instead
> kick off explicit readahead of the fsverity hashes so they are usually
> available at I/O completion time.
>
> For 64k sequential reads on my test VM this improves read performance
> from 2.4GB/s - 2.6GB/s to 3.5GB/s - 3.9GB/s. The improvements for
> random reads are likely to be even bigger.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Unfortunately, this patch causes recursive down_read() of
address_space::invalidate_lock. How was this meant to work?
[ 20.563185] ============================================
[ 20.564179] WARNING: possible recursive locking detected
[ 20.565170] 6.19.0-rc7-00041-g7bd72c6393ab #2 Not tainted
[ 20.566180] --------------------------------------------
[ 20.567169] cmp/2320 is trying to acquire lock:
[ 20.568019] ffff888108465030 (mapping.invalidate_lock#2){++++}-{4:4}, at: page_cache_ra_unbounded+0x6f/0x280
[ 20.569828]
[ 20.569828] but task is already holding lock:
[ 20.570914] ffff888108465030 (mapping.invalidate_lock#2){++++}-{4:4}, at: page_cache_ra_unbounded+0x6f/0x280
[ 20.572739]
[ 20.572739] other info that might help us debug this:
[ 20.573938] Possible unsafe locking scenario:
[ 20.573938]
[ 20.575042] CPU0
[ 20.575522] ----
[ 20.576003] lock(mapping.invalidate_lock#2);
[ 20.576849] lock(mapping.invalidate_lock#2);
[ 20.577698]
[ 20.577698] *** DEADLOCK ***
[ 20.577698]
[ 20.578795] May be due to missing lock nesting notation
[ 20.578795]
[ 20.580045] 1 lock held by cmp/2320:
[ 20.580726] #0: ffff888108465030 (mapping.invalidate_lock#2){++++}-{4:4}, at: page_cache_ra_unbounded+0x6f/0x20
[ 20.582596]
[ 20.582596] stack backtrace:
[ 20.583428] CPU: 0 UID: 0 PID: 2320 Comm: cmp Not tainted 6.19.0-rc7-00041-g7bd72c6393ab #2 PREEMPT(none)
[ 20.583433] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.17.0-2-2 04/01/2014
[ 20.583435] Call Trace:
[ 20.583437] <TASK>
[ 20.583438] show_stack+0x48/0x60
[ 20.583446] dump_stack_lvl+0x75/0xb0
[ 20.583451] dump_stack+0x14/0x1a
[ 20.583452] print_deadlock_bug.cold+0xc0/0xca
[ 20.583457] validate_chain+0x4ca/0x970
[ 20.583463] __lock_acquire+0x587/0xc40
[ 20.583465] ? find_held_lock+0x31/0x90
[ 20.583470] lock_acquire.part.0+0xaf/0x230
[ 20.583472] ? page_cache_ra_unbounded+0x6f/0x280
[ 20.583474] ? debug_smp_processor_id+0x1b/0x30
[ 20.583481] lock_acquire+0x67/0x140
[ 20.583483] ? page_cache_ra_unbounded+0x6f/0x280
[ 20.583484] down_read+0x40/0x180
[ 20.583487] ? page_cache_ra_unbounded+0x6f/0x280
[ 20.583489] page_cache_ra_unbounded+0x6f/0x280
[ 20.583491] ? lock_acquire.part.0+0xaf/0x230
[ 20.583492] ? __this_cpu_preempt_check+0x17/0x20
[ 20.583495] generic_readahead_merkle_tree+0x133/0x140
[ 20.583501] ext4_readahead_merkle_tree+0x2a/0x30
[ 20.583507] fsverity_readahead+0x9d/0xc0
[ 20.583510] ext4_mpage_readpages+0x194/0x9b0
[ 20.583515] ? __lock_release.isra.0+0x5e/0x160
[ 20.583517] ext4_readahead+0x3a/0x40
[ 20.583521] read_pages+0x84/0x370
[ 20.583523] page_cache_ra_unbounded+0x16c/0x280
[ 20.583525] page_cache_ra_order+0x10c/0x170
[ 20.583527] page_cache_sync_ra+0x1a1/0x360
[ 20.583528] filemap_get_pages+0x141/0x4c0
[ 20.583532] ? __this_cpu_preempt_check+0x17/0x20
[ 20.583534] filemap_read+0x11f/0x540
[ 20.583536] ? __folio_batch_add_and_move+0x7c/0x330
[ 20.583539] ? __this_cpu_preempt_check+0x17/0x20
[ 20.583541] generic_file_read_iter+0xc1/0x110
[ 20.583543] ? do_pte_missing+0x13a/0x450
[ 20.583547] ext4_file_read_iter+0x51/0x17
^ permalink raw reply [flat|nested] 27+ messages in thread* Re: [PATCH 08/15] fsverity: kick off hash readahead at data I/O submission time
2026-01-28 22:56 ` Eric Biggers
@ 2026-01-28 23:22 ` Darrick J. Wong
2026-01-30 5:55 ` Christoph Hellwig
2026-01-30 5:51 ` Christoph Hellwig
1 sibling, 1 reply; 27+ messages in thread
From: Darrick J. Wong @ 2026-01-28 23:22 UTC (permalink / raw)
To: Eric Biggers
Cc: Christoph Hellwig, Al Viro, Christian Brauner, Jan Kara,
David Sterba, Theodore Ts'o, Jaegeuk Kim, Chao Yu,
Andrey Albershteyn, Matthew Wilcox, linux-fsdevel, linux-btrfs,
linux-ext4, linux-f2fs-devel, fsverity
On Wed, Jan 28, 2026 at 02:56:02PM -0800, Eric Biggers wrote:
> On Wed, Jan 28, 2026 at 04:26:20PM +0100, Christoph Hellwig wrote:
> > Currently all reads of the fsverity hashes are kicked off from the data
> > I/O completion handler, leading to needlessly dependent I/O. This is
> > worked around a bit by performing readahead on the level 0 nodes, but
> > it is still fairly ineffective.
> >
> > Switch to a model where the ->read_folio and ->readahead methods instead
> > kick off explicit readahead of the fsverity hashes so they are usually
> > available at I/O completion time.
> >
> > For 64k sequential reads on my test VM this improves read performance
> > from 2.4GB/s - 2.6GB/s to 3.5GB/s - 3.9GB/s. The improvements for
> > random reads are likely to be even bigger.
> >
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > Acked-by: David Sterba <dsterba@suse.com> [btrfs]
>
> Unfortunately, this patch causes recursive down_read() of
> address_space::invalidate_lock. How was this meant to work?
Usually the filesystem calls filemap_invalidate_lock{,_shared} if it
needs to coordinate truncate vs. page removal (i.e. fallocate hole
punch). That said, there are a few places where the pagecache itself
will take that lock too...
> [ 20.563185] ============================================
> [ 20.564179] WARNING: possible recursive locking detected
> [ 20.565170] 6.19.0-rc7-00041-g7bd72c6393ab #2 Not tainted
> [ 20.566180] --------------------------------------------
> [ 20.567169] cmp/2320 is trying to acquire lock:
> [ 20.568019] ffff888108465030 (mapping.invalidate_lock#2){++++}-{4:4}, at: page_cache_ra_unbounded+0x6f/0x280
> [ 20.569828]
> [ 20.569828] but task is already holding lock:
> [ 20.570914] ffff888108465030 (mapping.invalidate_lock#2){++++}-{4:4}, at: page_cache_ra_unbounded+0x6f/0x280
> [ 20.572739]
> [ 20.572739] other info that might help us debug this:
> [ 20.573938] Possible unsafe locking scenario:
> [ 20.573938]
> [ 20.575042] CPU0
> [ 20.575522] ----
> [ 20.576003] lock(mapping.invalidate_lock#2);
> [ 20.576849] lock(mapping.invalidate_lock#2);
> [ 20.577698]
> [ 20.577698] *** DEADLOCK ***
> [ 20.577698]
> [ 20.578795] May be due to missing lock nesting notation
> [ 20.578795]
> [ 20.580045] 1 lock held by cmp/2320:
> [ 20.580726] #0: ffff888108465030 (mapping.invalidate_lock#2){++++}-{4:4}, at: page_cache_ra_unbounded+0x6f/0x20
> [ 20.582596]
> [ 20.582596] stack backtrace:
> [ 20.583428] CPU: 0 UID: 0 PID: 2320 Comm: cmp Not tainted 6.19.0-rc7-00041-g7bd72c6393ab #2 PREEMPT(none)
> [ 20.583433] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.17.0-2-2 04/01/2014
> [ 20.583435] Call Trace:
> [ 20.583437] <TASK>
> [ 20.583438] show_stack+0x48/0x60
> [ 20.583446] dump_stack_lvl+0x75/0xb0
> [ 20.583451] dump_stack+0x14/0x1a
> [ 20.583452] print_deadlock_bug.cold+0xc0/0xca
> [ 20.583457] validate_chain+0x4ca/0x970
> [ 20.583463] __lock_acquire+0x587/0xc40
> [ 20.583465] ? find_held_lock+0x31/0x90
> [ 20.583470] lock_acquire.part.0+0xaf/0x230
> [ 20.583472] ? page_cache_ra_unbounded+0x6f/0x280
> [ 20.583474] ? debug_smp_processor_id+0x1b/0x30
> [ 20.583481] lock_acquire+0x67/0x140
> [ 20.583483] ? page_cache_ra_unbounded+0x6f/0x280
> [ 20.583484] down_read+0x40/0x180
> [ 20.583487] ? page_cache_ra_unbounded+0x6f/0x280
> [ 20.583489] page_cache_ra_unbounded+0x6f/0x280
...and it looks like this is one of those places where the pagecache
takes it for us...
> [ 20.583491] ? lock_acquire.part.0+0xaf/0x230
> [ 20.583492] ? __this_cpu_preempt_check+0x17/0x20
> [ 20.583495] generic_readahead_merkle_tree+0x133/0x140
> [ 20.583501] ext4_readahead_merkle_tree+0x2a/0x30
> [ 20.583507] fsverity_readahead+0x9d/0xc0
> [ 20.583510] ext4_mpage_readpages+0x194/0x9b0
> [ 20.583515] ? __lock_release.isra.0+0x5e/0x160
> [ 20.583517] ext4_readahead+0x3a/0x40
> [ 20.583521] read_pages+0x84/0x370
> [ 20.583523] page_cache_ra_unbounded+0x16c/0x280
...except that page_cache_ra_unbounded is being called recursively from
an actual file data read. My guess is that we'd need a flag or
something to ask for "unlocked" readahead if we still want readahead to
spur more readahead.
--D
> [ 20.583525] page_cache_ra_order+0x10c/0x170
> [ 20.583527] page_cache_sync_ra+0x1a1/0x360
> [ 20.583528] filemap_get_pages+0x141/0x4c0
> [ 20.583532] ? __this_cpu_preempt_check+0x17/0x20
> [ 20.583534] filemap_read+0x11f/0x540
> [ 20.583536] ? __folio_batch_add_and_move+0x7c/0x330
> [ 20.583539] ? __this_cpu_preempt_check+0x17/0x20
> [ 20.583541] generic_file_read_iter+0xc1/0x110
> [ 20.583543] ? do_pte_missing+0x13a/0x450
> [ 20.583547] ext4_file_read_iter+0x51/0x17
>
^ permalink raw reply [flat|nested] 27+ messages in thread* Re: [PATCH 08/15] fsverity: kick off hash readahead at data I/O submission time
2026-01-28 23:22 ` Darrick J. Wong
@ 2026-01-30 5:55 ` Christoph Hellwig
0 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-30 5:55 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Eric Biggers, Christoph Hellwig, Al Viro, Christian Brauner,
Jan Kara, David Sterba, Theodore Ts'o, Jaegeuk Kim, Chao Yu,
Andrey Albershteyn, Matthew Wilcox, linux-fsdevel, linux-btrfs,
linux-ext4, linux-f2fs-devel, fsverity
On Wed, Jan 28, 2026 at 03:22:13PM -0800, Darrick J. Wong wrote:
> > Unfortunately, this patch causes recursive down_read() of
> > address_space::invalidate_lock. How was this meant to work?
>
> Usually the filesystem calls filemap_invalidate_lock{,_shared} if it
> needs to coordinate truncate vs. page removal (i.e. fallocate hole
> punch). That said, there are a few places where the pagecache itself
> will take that lock too...
> [...]
> ...except that pagecache_ra_unbounded is being called recursively from
> an actual file data read. My guess is that we'd need a flag or
> something to ask for "unlocked" readahead if we still want readahead to
> spur more readahead.
Basically just move it out of page_cache_ra_unbounded. With the
consolidation in the earlier patches there are just two callers
of page_cache_ra_unbounded left, this and the redirty_blocks() in f2fs.
I'd kinda wish to kill the latter, as the past-EOF reading is something
that should be restricted to core code, but I can't really think of
an easy way to do that.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 08/15] fsverity: kick off hash readahead at data I/O submission time
2026-01-28 22:56 ` Eric Biggers
2026-01-28 23:22 ` Darrick J. Wong
@ 2026-01-30 5:51 ` Christoph Hellwig
1 sibling, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-30 5:51 UTC (permalink / raw)
To: Eric Biggers
Cc: Christoph Hellwig, Al Viro, Christian Brauner, Jan Kara,
David Sterba, Theodore Ts'o, Jaegeuk Kim, Chao Yu,
Andrey Albershteyn, Matthew Wilcox, linux-fsdevel, linux-btrfs,
linux-ext4, linux-f2fs-devel, fsverity
On Wed, Jan 28, 2026 at 02:56:02PM -0800, Eric Biggers wrote:
> On Wed, Jan 28, 2026 at 04:26:20PM +0100, Christoph Hellwig wrote:
> > Currently all reads of the fsverity hashes is kicked off from the data
> > I/O completion handler, leading to needlessly dependent I/O. This is
> > worked around a bit by performing readahead on the level 0 nodes, but
> > still fairly ineffective.
> >
> > Switch to a model where the ->read_folio and ->readahead methods instead
> > kick off explicit readahead of the fsverity hashed so they are usually
> > available at I/O completion time.
> >
> > For 64k sequential reads on my test VM this improves read performance
> > from 2.4GB/s - 2.6GB/s to 3.5GB/s - 3.9GB/s. The improvements for
> > random reads are likely to be even bigger.
> >
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > Acked-by: David Sterba <dsterba@suse.com> [btrfs]
>
> Unfortunately, this patch causes recursive down_read() of
> address_space::invalidate_lock. How was this meant to work?
It worked by chance, in that nested down_read calls generally work,
except when they don't because a write is queued up in between,
and nothing in xfstests hits that case.
I'll look into reworking it to avoid that.
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 09/15] fsverity: deconstify the inode pointer in struct fsverity_info
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
` (7 preceding siblings ...)
2026-01-28 15:26 ` [PATCH 08/15] fsverity: kick off hash readahead at data I/O submission time Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-28 15:26 ` [PATCH 10/15] fsverity: push out fsverity_info lookup Christoph Hellwig
` (6 subsequent siblings)
15 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity, Darrick J. Wong
A lot of file system code expects a non-const inode pointer. Dropping
the const qualifier here allows using the inode pointer in
verify_data_block and prepares for further argument reductions.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
---
fs/verity/fsverity_private.h | 4 ++--
fs/verity/open.c | 2 +-
fs/verity/verify.c | 5 +++--
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index dd20b138d452..f9f3936b0a89 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -73,7 +73,7 @@ struct fsverity_info {
struct merkle_tree_params tree_params;
u8 root_hash[FS_VERITY_MAX_DIGEST_SIZE];
u8 file_digest[FS_VERITY_MAX_DIGEST_SIZE];
- const struct inode *inode;
+ struct inode *inode;
unsigned long *hash_block_verified;
};
@@ -124,7 +124,7 @@ int fsverity_init_merkle_tree_params(struct merkle_tree_params *params,
unsigned int log_blocksize,
const u8 *salt, size_t salt_size);
-struct fsverity_info *fsverity_create_info(const struct inode *inode,
+struct fsverity_info *fsverity_create_info(struct inode *inode,
struct fsverity_descriptor *desc);
void fsverity_set_info(struct inode *inode, struct fsverity_info *vi);
diff --git a/fs/verity/open.c b/fs/verity/open.c
index 090cb77326ee..128502cf0a23 100644
--- a/fs/verity/open.c
+++ b/fs/verity/open.c
@@ -175,7 +175,7 @@ static void compute_file_digest(const struct fsverity_hash_alg *hash_alg,
* appended builtin signature), and check the signature if present. The
* fsverity_descriptor must have already undergone basic validation.
*/
-struct fsverity_info *fsverity_create_info(const struct inode *inode,
+struct fsverity_info *fsverity_create_info(struct inode *inode,
struct fsverity_descriptor *desc)
{
struct fsverity_info *vi;
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index f5bea750b427..6248d25a1f89 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -156,9 +156,10 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
*
* Return: %true if the data block is valid, else %false.
*/
-static bool verify_data_block(struct inode *inode, struct fsverity_info *vi,
+static bool verify_data_block(struct fsverity_info *vi,
const struct fsverity_pending_block *dblock)
{
+ struct inode *inode = vi->inode;
const u64 data_pos = dblock->pos;
const struct merkle_tree_params *params = &vi->tree_params;
const unsigned int hsize = params->digest_size;
@@ -361,7 +362,7 @@ fsverity_verify_pending_blocks(struct fsverity_verification_context *ctx)
}
for (i = 0; i < ctx->num_pending; i++) {
- if (!verify_data_block(ctx->inode, vi, &ctx->pending_blocks[i]))
+ if (!verify_data_block(vi, &ctx->pending_blocks[i]))
return false;
}
fsverity_clear_pending_blocks(ctx);
--
2.47.3
^ permalink raw reply related [flat|nested] 27+ messages in thread* [PATCH 10/15] fsverity: push out fsverity_info lookup
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
` (8 preceding siblings ...)
2026-01-28 15:26 ` [PATCH 09/15] fsverity: deconstify the inode pointer in struct fsverity_info Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-28 15:26 ` [PATCH 11/15] fs: consolidate fsverity_info lookup in buffer.c Christoph Hellwig
` (5 subsequent siblings)
15 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity, Darrick J. Wong
Pass a struct fsverity_info to the verification and readahead helpers,
and push the lookup into the callers. Right now this is a very
dumb, almost mechanical move that open codes a lot of fsverity_info_addr()
calls in the file systems. The subsequent patches will clean this up.
This prepares for reducing the number of fsverity_info lookups, which
will allow amortizing them better when using a more expensive lookup
method.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
---
fs/btrfs/extent_io.c | 3 ++-
fs/buffer.c | 4 +++-
fs/ext4/readpage.c | 10 +++++++---
fs/f2fs/compress.c | 4 +++-
fs/f2fs/data.c | 15 +++++++++++----
fs/verity/verify.c | 23 ++++++++++++-----------
include/linux/fsverity.h | 31 ++++++++++++++++++++++---------
7 files changed, 60 insertions(+), 30 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index a4b74023618d..21430b7d8f27 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -484,7 +484,8 @@ static bool btrfs_verify_folio(struct folio *folio, u64 start, u32 len)
btrfs_folio_test_uptodate(fs_info, folio, start, len) ||
start >= i_size_read(folio->mapping->host))
return true;
- return fsverity_verify_folio(folio);
+ return fsverity_verify_folio(*fsverity_info_addr(folio->mapping->host),
+ folio);
}
static void end_folio_read(struct folio *folio, bool uptodate, u64 start, u32 len)
diff --git a/fs/buffer.c b/fs/buffer.c
index 838c0c571022..3982253b6805 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -309,9 +309,11 @@ static void verify_bh(struct work_struct *work)
struct postprocess_bh_ctx *ctx =
container_of(work, struct postprocess_bh_ctx, work);
struct buffer_head *bh = ctx->bh;
+ struct inode *inode = bh->b_folio->mapping->host;
bool valid;
- valid = fsverity_verify_blocks(bh->b_folio, bh->b_size, bh_offset(bh));
+ valid = fsverity_verify_blocks(*fsverity_info_addr(inode), bh->b_folio,
+ bh->b_size, bh_offset(bh));
end_buffer_async_read(bh, valid);
kfree(ctx);
}
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index e99072c8a619..bf65562da9c2 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -96,6 +96,7 @@ static void verity_work(struct work_struct *work)
struct bio_post_read_ctx *ctx =
container_of(work, struct bio_post_read_ctx, work);
struct bio *bio = ctx->bio;
+ struct inode *inode = bio_first_folio_all(bio)->mapping->host;
/*
* fsverity_verify_bio() may call readahead() again, and although verity
@@ -108,7 +109,7 @@ static void verity_work(struct work_struct *work)
mempool_free(ctx, bio_post_read_ctx_pool);
bio->bi_private = NULL;
- fsverity_verify_bio(bio);
+ fsverity_verify_bio(*fsverity_info_addr(inode), bio);
__read_end_io(bio);
}
@@ -245,7 +246,8 @@ int ext4_mpage_readpages(struct inode *inode,
if (first_folio) {
if (ext4_need_verity(inode, folio->index))
- fsverity_readahead(folio, nr_pages);
+ fsverity_readahead(*fsverity_info_addr(inode),
+ folio, nr_pages);
first_folio = false;
}
@@ -337,7 +339,9 @@ int ext4_mpage_readpages(struct inode *inode,
folio_size(folio));
if (first_hole == 0) {
if (ext4_need_verity(inode, folio->index) &&
- !fsverity_verify_folio(folio))
+ !fsverity_verify_folio(
+ *fsverity_info_addr(inode),
+ folio))
goto set_error_page;
folio_end_read(folio, true);
continue;
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 40a62f1dee4d..3de4a7e66959 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -1814,7 +1814,9 @@ static void f2fs_verify_cluster(struct work_struct *work)
if (!rpage)
continue;
- if (fsverity_verify_page(rpage))
+ if (fsverity_verify_page(
+ *fsverity_info_addr(rpage->mapping->host),
+ rpage))
SetPageUptodate(rpage);
else
ClearPageUptodate(rpage);
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 49bdc7e771f2..bca1e34d327a 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -185,15 +185,19 @@ static void f2fs_verify_bio(struct work_struct *work)
bio_for_each_folio_all(fi, bio) {
struct folio *folio = fi.folio;
+ struct fsverity_info *vi =
+ *fsverity_info_addr(folio->mapping->host);
if (!f2fs_is_compressed_page(folio) &&
- !fsverity_verify_page(&folio->page)) {
+ !fsverity_verify_page(vi, &folio->page)) {
bio->bi_status = BLK_STS_IOERR;
break;
}
}
} else {
- fsverity_verify_bio(bio);
+ struct inode *inode = bio_first_folio_all(bio)->mapping->host;
+
+ fsverity_verify_bio(*fsverity_info_addr(inode), bio);
}
f2fs_finish_read_bio(bio, true);
@@ -2121,7 +2125,9 @@ static int f2fs_read_single_page(struct inode *inode, struct folio *folio,
zero_out:
folio_zero_segment(folio, 0, folio_size(folio));
if (f2fs_need_verity(inode, index) &&
- !fsverity_verify_folio(folio)) {
+ !fsverity_verify_folio(
+ *fsverity_info_addr(folio->mapping->host),
+ folio)) {
ret = -EIO;
goto out;
}
@@ -2386,7 +2392,8 @@ static int f2fs_mpage_readpages(struct inode *inode,
if (first_folio) {
if (f2fs_need_verity(inode, folio->index))
- fsverity_readahead(folio, nr_pages);
+ fsverity_readahead(*fsverity_info_addr(inode),
+ folio, nr_pages);
first_folio = false;
}
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 6248d25a1f89..98685bbb21f6 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -39,6 +39,7 @@ static struct workqueue_struct *fsverity_read_workqueue;
/**
* fsverity_readahead() - kick off readahead on fsverity hashes
+ * @vi: fsverity_info for the inode to be read
* @folio: first file data folio that is being read
* @nr_pages: number of file data pages to be read
*
@@ -49,10 +50,10 @@ static struct workqueue_struct *fsverity_read_workqueue;
* ensure that the hashes are already cached on completion of the file data
* read if possible.
*/
-void fsverity_readahead(struct folio *folio, unsigned long nr_pages)
+void fsverity_readahead(struct fsverity_info *vi, struct folio *folio,
+ unsigned long nr_pages)
{
struct inode *inode = folio->mapping->host;
- const struct fsverity_info *vi = *fsverity_info_addr(inode);
const struct merkle_tree_params *params = &vi->tree_params;
u64 start_hidx = (u64)folio->index << params->log_blocks_per_page;
u64 end_hidx = (((u64)folio->index + nr_pages) <<
@@ -314,11 +315,9 @@ static bool verify_data_block(struct fsverity_info *vi,
static void
fsverity_init_verification_context(struct fsverity_verification_context *ctx,
- struct inode *inode)
+ struct fsverity_info *vi)
{
- struct fsverity_info *vi = *fsverity_info_addr(inode);
-
- ctx->inode = inode;
+ ctx->inode = vi->inode;
ctx->vi = vi;
ctx->num_pending = 0;
if (vi->tree_params.hash_alg->algo_id == HASH_ALGO_SHA256 &&
@@ -398,6 +397,7 @@ static bool fsverity_add_data_blocks(struct fsverity_verification_context *ctx,
/**
* fsverity_verify_blocks() - verify data in a folio
+ * @vi: fsverity_info for the inode to be read
* @folio: the folio containing the data to verify
* @len: the length of the data to verify in the folio
* @offset: the offset of the data to verify in the folio
@@ -408,11 +408,12 @@ static bool fsverity_add_data_blocks(struct fsverity_verification_context *ctx,
*
* Return: %true if the data is valid, else %false.
*/
-bool fsverity_verify_blocks(struct folio *folio, size_t len, size_t offset)
+bool fsverity_verify_blocks(struct fsverity_info *vi, struct folio *folio,
+ size_t len, size_t offset)
{
struct fsverity_verification_context ctx;
- fsverity_init_verification_context(&ctx, folio->mapping->host);
+ fsverity_init_verification_context(&ctx, vi);
if (fsverity_add_data_blocks(&ctx, folio, len, offset) &&
fsverity_verify_pending_blocks(&ctx))
@@ -425,6 +426,7 @@ EXPORT_SYMBOL_GPL(fsverity_verify_blocks);
#ifdef CONFIG_BLOCK
/**
* fsverity_verify_bio() - verify a 'read' bio that has just completed
+ * @vi: fsverity_info for the inode to be read
* @bio: the bio to verify
*
* Verify the bio's data against the file's Merkle tree. All bio data segments
@@ -437,13 +439,12 @@ EXPORT_SYMBOL_GPL(fsverity_verify_blocks);
* filesystems) must instead call fsverity_verify_page() directly on each page.
* All filesystems must also call fsverity_verify_page() on holes.
*/
-void fsverity_verify_bio(struct bio *bio)
+void fsverity_verify_bio(struct fsverity_info *vi, struct bio *bio)
{
- struct inode *inode = bio_first_folio_all(bio)->mapping->host;
struct fsverity_verification_context ctx;
struct folio_iter fi;
- fsverity_init_verification_context(&ctx, inode);
+ fsverity_init_verification_context(&ctx, vi);
bio_for_each_folio_all(fi, bio) {
if (!fsverity_add_data_blocks(&ctx, fi.folio, fi.length,
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index bade511cf3aa..1d70b270e90a 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -197,12 +197,20 @@ int fsverity_ioctl_read_metadata(struct file *filp, const void __user *uarg);
/* verify.c */
-bool fsverity_verify_blocks(struct folio *folio, size_t len, size_t offset);
-void fsverity_verify_bio(struct bio *bio);
+bool fsverity_verify_blocks(struct fsverity_info *vi, struct folio *folio,
+ size_t len, size_t offset);
+void fsverity_verify_bio(struct fsverity_info *vi, struct bio *bio);
void fsverity_enqueue_verify_work(struct work_struct *work);
#else /* !CONFIG_FS_VERITY */
+/*
+ * Provide a stub to allow code using this to compile. All callsites should be
+ * guarded by compiler dead code elimination, and this forces a link error if
+ * not.
+ */
+struct fsverity_info **fsverity_info_addr(const struct inode *inode);
+
static inline struct fsverity_info *fsverity_get_info(const struct inode *inode)
{
return NULL;
@@ -251,14 +259,16 @@ static inline int fsverity_ioctl_read_metadata(struct file *filp,
/* verify.c */
-static inline bool fsverity_verify_blocks(struct folio *folio, size_t len,
+static inline bool fsverity_verify_blocks(struct fsverity_info *vi,
+ struct folio *folio, size_t len,
size_t offset)
{
WARN_ON_ONCE(1);
return false;
}
-static inline void fsverity_verify_bio(struct bio *bio)
+static inline void fsverity_verify_bio(struct fsverity_info *vi,
+ struct bio *bio)
{
WARN_ON_ONCE(1);
}
@@ -270,14 +280,16 @@ static inline void fsverity_enqueue_verify_work(struct work_struct *work)
#endif /* !CONFIG_FS_VERITY */
-static inline bool fsverity_verify_folio(struct folio *folio)
+static inline bool fsverity_verify_folio(struct fsverity_info *vi,
+ struct folio *folio)
{
- return fsverity_verify_blocks(folio, folio_size(folio), 0);
+ return fsverity_verify_blocks(vi, folio, folio_size(folio), 0);
}
-static inline bool fsverity_verify_page(struct page *page)
+static inline bool fsverity_verify_page(struct fsverity_info *vi,
+ struct page *page)
{
- return fsverity_verify_blocks(page_folio(page), PAGE_SIZE, 0);
+ return fsverity_verify_blocks(vi, page_folio(page), PAGE_SIZE, 0);
}
/**
@@ -319,7 +331,8 @@ static inline int fsverity_file_open(struct inode *inode, struct file *filp)
}
void fsverity_cleanup_inode(struct inode *inode);
-void fsverity_readahead(struct folio *folio, unsigned long nr_pages);
+void fsverity_readahead(struct fsverity_info *vi, struct folio *folio,
+ unsigned long nr_pages);
struct page *generic_read_merkle_tree_page(struct inode *inode, pgoff_t index);
void generic_readahead_merkle_tree(struct inode *inode, pgoff_t index,
--
2.47.3
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 11/15] fs: consolidate fsverity_info lookup in buffer.c
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
` (9 preceding siblings ...)
2026-01-28 15:26 ` [PATCH 10/15] fsverity: push out fsverity_info lookup Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-28 15:26 ` [PATCH 12/15] ext4: consolidate fsverity_info lookup Christoph Hellwig
` (4 subsequent siblings)
15 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity, Darrick J. Wong
Look up the fsverity_info once in end_buffer_async_read_io, and then
pass it along to the I/O completion workqueue in
struct postprocess_bh_ctx.
This amortizes the lookup better once it becomes less efficient.
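The shape of the change can be sketched in plain C. All structures below are simplified stand-ins for the kernel types, and the page-size arithmetic mirrors the `DIV_ROUND_UP(inode->i_size, PAGE_SIZE)` bound kept "for ext4"; this is an illustration of the one-lookup-then-cache pattern, not the real fs/buffer.c code:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures involved. */
struct fsverity_info { int lookups; };

struct inode {
	long long i_size;
	struct fsverity_info *vi;	/* set once fs-verity is enabled */
};

/* Models fsverity_get_info(): returns NULL when verity is not active. */
static struct fsverity_info *fsverity_get_info(struct inode *inode)
{
	if (inode->vi)
		inode->vi->lookups++;	/* count how often the lookup is paid */
	return inode->vi;
}

/* Mirrors the field added to struct postprocess_bh_ctx by this patch. */
struct postprocess_bh_ctx {
	struct inode *inode;
	struct fsverity_info *vi;	/* looked up once at I/O completion */
};

/* Deferred work: uses the cached pointer, no second lookup. */
static int verify_bh(struct postprocess_bh_ctx *ctx)
{
	return ctx->vi != NULL;		/* would call fsverity_verify_blocks() */
}

/* Models end_buffer_async_read_io(): one lookup, stashed in the ctx. */
static int end_buffer_async_read_io(struct inode *inode, long long folio_index)
{
	struct postprocess_bh_ctx ctx = { .inode = inode };

	if (folio_index < (inode->i_size + 4095) / 4096)	/* needed by ext4 */
		ctx.vi = fsverity_get_info(inode);
	return verify_bh(&ctx);
}
```

The worker never touches `bh->b_folio->mapping->host` again; the completion handler pays for the lookup exactly once per buffer.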
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
---
fs/buffer.c | 27 +++++++++++----------------
1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index 3982253b6805..f4b3297ef1b1 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -302,6 +302,7 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
struct postprocess_bh_ctx {
struct work_struct work;
struct buffer_head *bh;
+ struct fsverity_info *vi;
};
static void verify_bh(struct work_struct *work)
@@ -309,25 +310,14 @@ static void verify_bh(struct work_struct *work)
struct postprocess_bh_ctx *ctx =
container_of(work, struct postprocess_bh_ctx, work);
struct buffer_head *bh = ctx->bh;
- struct inode *inode = bh->b_folio->mapping->host;
bool valid;
- valid = fsverity_verify_blocks(*fsverity_info_addr(inode), bh->b_folio,
- bh->b_size, bh_offset(bh));
+ valid = fsverity_verify_blocks(ctx->vi, bh->b_folio, bh->b_size,
+ bh_offset(bh));
end_buffer_async_read(bh, valid);
kfree(ctx);
}
-static bool need_fsverity(struct buffer_head *bh)
-{
- struct folio *folio = bh->b_folio;
- struct inode *inode = folio->mapping->host;
-
- return fsverity_active(inode) &&
- /* needed by ext4 */
- folio->index < DIV_ROUND_UP(inode->i_size, PAGE_SIZE);
-}
-
static void decrypt_bh(struct work_struct *work)
{
struct postprocess_bh_ctx *ctx =
@@ -337,7 +327,7 @@ static void decrypt_bh(struct work_struct *work)
err = fscrypt_decrypt_pagecache_blocks(bh->b_folio, bh->b_size,
bh_offset(bh));
- if (err == 0 && need_fsverity(bh)) {
+ if (err == 0 && ctx->vi) {
/*
* We use different work queues for decryption and for verity
* because verity may require reading metadata pages that need
@@ -359,15 +349,20 @@ static void end_buffer_async_read_io(struct buffer_head *bh, int uptodate)
{
struct inode *inode = bh->b_folio->mapping->host;
bool decrypt = fscrypt_inode_uses_fs_layer_crypto(inode);
- bool verify = need_fsverity(bh);
+ struct fsverity_info *vi = NULL;
+
+ /* needed by ext4 */
+ if (bh->b_folio->index < DIV_ROUND_UP(inode->i_size, PAGE_SIZE))
+ vi = fsverity_get_info(inode);
/* Decrypt (with fscrypt) and/or verify (with fsverity) if needed. */
- if (uptodate && (decrypt || verify)) {
+ if (uptodate && (decrypt || vi)) {
struct postprocess_bh_ctx *ctx =
kmalloc(sizeof(*ctx), GFP_ATOMIC);
if (ctx) {
ctx->bh = bh;
+ ctx->vi = vi;
if (decrypt) {
INIT_WORK(&ctx->work, decrypt_bh);
fscrypt_enqueue_decrypt_work(&ctx->work);
--
2.47.3
^ permalink raw reply related [flat|nested] 27+ messages in thread* [PATCH 12/15] ext4: consolidate fsverity_info lookup
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
` (10 preceding siblings ...)
2026-01-28 15:26 ` [PATCH 11/15] fs: consolidate fsverity_info lookup in buffer.c Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-28 16:28 ` Jan Kara
2026-01-28 15:26 ` [PATCH 13/15] f2fs: " Christoph Hellwig
` (3 subsequent siblings)
15 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity, Darrick J. Wong
Look up the fsverity_info once in ext4_mpage_readpages, and then use it
for the readahead, local verification of holes and pass it along to the
I/O completion workqueue in struct bio_post_read_ctx.
This amortizes the lookup better once it becomes less efficient.
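The effect on the bio post-read setup can be sketched as follows. The types and the helper name are simplified stand-ins for the ext4 code, but the logic matches the patch: the per-folio `ext4_need_verity()` check is replaced by a plain pointer test on the cached lookup result:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the structures touched by this patch. */
struct fsverity_info { int dummy; };

enum bio_post_read_step { STEP_DECRYPT, STEP_VERITY };

struct bio_post_read_ctx {
	struct fsverity_info *vi;	/* new field: cached lookup result */
	unsigned int enabled_steps;
};

/*
 * After the patch, the ctx setup takes the cached pointer instead of
 * re-deriving fs-verity state from the inode and folio index.
 */
static void set_bio_post_read_ctx(struct bio_post_read_ctx *ctx,
				  int uses_fs_layer_crypto,
				  struct fsverity_info *vi)
{
	unsigned int steps = 0;

	if (uses_fs_layer_crypto)
		steps |= 1U << STEP_DECRYPT;
	if (vi)				/* plain pointer test, no inode lookup */
		steps |= 1U << STEP_VERITY;
	ctx->vi = vi;
	ctx->enabled_steps = steps;
}
```

The one remaining lookup happens on the first folio of the readahead batch, and every later decision (hole verification, STEP_VERITY, the workqueue handler) just tests `vi` for NULL.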
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
---
fs/ext4/readpage.c | 32 ++++++++++++++------------------
1 file changed, 14 insertions(+), 18 deletions(-)
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index bf65562da9c2..17920f14e2c2 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -61,6 +61,7 @@ enum bio_post_read_step {
struct bio_post_read_ctx {
struct bio *bio;
+ struct fsverity_info *vi;
struct work_struct work;
unsigned int cur_step;
unsigned int enabled_steps;
@@ -96,7 +97,7 @@ static void verity_work(struct work_struct *work)
struct bio_post_read_ctx *ctx =
container_of(work, struct bio_post_read_ctx, work);
struct bio *bio = ctx->bio;
- struct inode *inode = bio_first_folio_all(bio)->mapping->host;
+ struct fsverity_info *vi = ctx->vi;
/*
* fsverity_verify_bio() may call readahead() again, and although verity
@@ -109,7 +110,7 @@ static void verity_work(struct work_struct *work)
mempool_free(ctx, bio_post_read_ctx_pool);
bio->bi_private = NULL;
- fsverity_verify_bio(*fsverity_info_addr(inode), bio);
+ fsverity_verify_bio(vi, bio);
__read_end_io(bio);
}
@@ -173,22 +174,16 @@ static void mpage_end_io(struct bio *bio)
__read_end_io(bio);
}
-static inline bool ext4_need_verity(const struct inode *inode, pgoff_t idx)
-{
- return fsverity_active(inode) &&
- idx < DIV_ROUND_UP(inode->i_size, PAGE_SIZE);
-}
-
static void ext4_set_bio_post_read_ctx(struct bio *bio,
const struct inode *inode,
- pgoff_t first_idx)
+ struct fsverity_info *vi)
{
unsigned int post_read_steps = 0;
if (fscrypt_inode_uses_fs_layer_crypto(inode))
post_read_steps |= 1 << STEP_DECRYPT;
- if (ext4_need_verity(inode, first_idx))
+ if (vi)
post_read_steps |= 1 << STEP_VERITY;
if (post_read_steps) {
@@ -197,6 +192,7 @@ static void ext4_set_bio_post_read_ctx(struct bio *bio,
mempool_alloc(bio_post_read_ctx_pool, GFP_NOFS);
ctx->bio = bio;
+ ctx->vi = vi;
ctx->enabled_steps = post_read_steps;
bio->bi_private = ctx;
}
@@ -224,6 +220,7 @@ int ext4_mpage_readpages(struct inode *inode,
sector_t first_block;
unsigned page_block;
struct block_device *bdev = inode->i_sb->s_bdev;
+ struct fsverity_info *vi = NULL;
int length;
unsigned relative_block = 0;
struct ext4_map_blocks map;
@@ -245,9 +242,11 @@ int ext4_mpage_readpages(struct inode *inode,
folio = readahead_folio(rac);
if (first_folio) {
- if (ext4_need_verity(inode, folio->index))
- fsverity_readahead(*fsverity_info_addr(inode),
- folio, nr_pages);
+ if (folio->index <
+ DIV_ROUND_UP(inode->i_size, PAGE_SIZE))
+ vi = fsverity_get_info(inode);
+ if (vi)
+ fsverity_readahead(vi, folio, nr_pages);
first_folio = false;
}
@@ -338,10 +337,7 @@ int ext4_mpage_readpages(struct inode *inode,
folio_zero_segment(folio, first_hole << blkbits,
folio_size(folio));
if (first_hole == 0) {
- if (ext4_need_verity(inode, folio->index) &&
- !fsverity_verify_folio(
- *fsverity_info_addr(inode),
- folio))
+ if (vi && !fsverity_verify_folio(vi, folio))
goto set_error_page;
folio_end_read(folio, true);
continue;
@@ -369,7 +365,7 @@ int ext4_mpage_readpages(struct inode *inode,
REQ_OP_READ, GFP_KERNEL);
fscrypt_set_bio_crypt_ctx(bio, inode, next_block,
GFP_KERNEL);
- ext4_set_bio_post_read_ctx(bio, inode, folio->index);
+ ext4_set_bio_post_read_ctx(bio, inode, vi);
bio->bi_iter.bi_sector = first_block << (blkbits - 9);
bio->bi_end_io = mpage_end_io;
if (rac)
--
2.47.3
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 12/15] ext4: consolidate fsverity_info lookup
2026-01-28 15:26 ` [PATCH 12/15] ext4: consolidate fsverity_info lookup Christoph Hellwig
@ 2026-01-28 16:28 ` Jan Kara
0 siblings, 0 replies; 27+ messages in thread
From: Jan Kara @ 2026-01-28 16:28 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Eric Biggers, Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity, Darrick J. Wong
On Wed 28-01-26 16:26:24, Christoph Hellwig wrote:
> Look up the fsverity_info once in ext4_mpage_readpages, and then use it
> for the readahead, local verification of holes and pass it along to the
> I/O completion workqueue in struct bio_post_read_ctx.
>
> This amortizes the lookup better once it becomes less efficient.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/readpage.c | 32 ++++++++++++++------------------
> 1 file changed, 14 insertions(+), 18 deletions(-)
>
> diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
> index bf65562da9c2..17920f14e2c2 100644
> --- a/fs/ext4/readpage.c
> +++ b/fs/ext4/readpage.c
> @@ -61,6 +61,7 @@ enum bio_post_read_step {
>
> struct bio_post_read_ctx {
> struct bio *bio;
> + struct fsverity_info *vi;
> struct work_struct work;
> unsigned int cur_step;
> unsigned int enabled_steps;
> @@ -96,7 +97,7 @@ static void verity_work(struct work_struct *work)
> struct bio_post_read_ctx *ctx =
> container_of(work, struct bio_post_read_ctx, work);
> struct bio *bio = ctx->bio;
> - struct inode *inode = bio_first_folio_all(bio)->mapping->host;
> + struct fsverity_info *vi = ctx->vi;
>
> /*
> * fsverity_verify_bio() may call readahead() again, and although verity
> @@ -109,7 +110,7 @@ static void verity_work(struct work_struct *work)
> mempool_free(ctx, bio_post_read_ctx_pool);
> bio->bi_private = NULL;
>
> - fsverity_verify_bio(*fsverity_info_addr(inode), bio);
> + fsverity_verify_bio(vi, bio);
>
> __read_end_io(bio);
> }
> @@ -173,22 +174,16 @@ static void mpage_end_io(struct bio *bio)
> __read_end_io(bio);
> }
>
> -static inline bool ext4_need_verity(const struct inode *inode, pgoff_t idx)
> -{
> - return fsverity_active(inode) &&
> - idx < DIV_ROUND_UP(inode->i_size, PAGE_SIZE);
> -}
> -
> static void ext4_set_bio_post_read_ctx(struct bio *bio,
> const struct inode *inode,
> - pgoff_t first_idx)
> + struct fsverity_info *vi)
> {
> unsigned int post_read_steps = 0;
>
> if (fscrypt_inode_uses_fs_layer_crypto(inode))
> post_read_steps |= 1 << STEP_DECRYPT;
>
> - if (ext4_need_verity(inode, first_idx))
> + if (vi)
> post_read_steps |= 1 << STEP_VERITY;
>
> if (post_read_steps) {
> @@ -197,6 +192,7 @@ static void ext4_set_bio_post_read_ctx(struct bio *bio,
> mempool_alloc(bio_post_read_ctx_pool, GFP_NOFS);
>
> ctx->bio = bio;
> + ctx->vi = vi;
> ctx->enabled_steps = post_read_steps;
> bio->bi_private = ctx;
> }
> @@ -224,6 +220,7 @@ int ext4_mpage_readpages(struct inode *inode,
> sector_t first_block;
> unsigned page_block;
> struct block_device *bdev = inode->i_sb->s_bdev;
> + struct fsverity_info *vi = NULL;
> int length;
> unsigned relative_block = 0;
> struct ext4_map_blocks map;
> @@ -245,9 +242,11 @@ int ext4_mpage_readpages(struct inode *inode,
> folio = readahead_folio(rac);
>
> if (first_folio) {
> - if (ext4_need_verity(inode, folio->index))
> - fsverity_readahead(*fsverity_info_addr(inode),
> - folio, nr_pages);
> + if (folio->index <
> + DIV_ROUND_UP(inode->i_size, PAGE_SIZE))
> + vi = fsverity_get_info(inode);
> + if (vi)
> + fsverity_readahead(vi, folio, nr_pages);
> first_folio = false;
> }
>
> @@ -338,10 +337,7 @@ int ext4_mpage_readpages(struct inode *inode,
> folio_zero_segment(folio, first_hole << blkbits,
> folio_size(folio));
> if (first_hole == 0) {
> - if (ext4_need_verity(inode, folio->index) &&
> - !fsverity_verify_folio(
> - *fsverity_info_addr(inode),
> - folio))
> + if (vi && !fsverity_verify_folio(vi, folio))
> goto set_error_page;
> folio_end_read(folio, true);
> continue;
> @@ -369,7 +365,7 @@ int ext4_mpage_readpages(struct inode *inode,
> REQ_OP_READ, GFP_KERNEL);
> fscrypt_set_bio_crypt_ctx(bio, inode, next_block,
> GFP_KERNEL);
> - ext4_set_bio_post_read_ctx(bio, inode, folio->index);
> + ext4_set_bio_post_read_ctx(bio, inode, vi);
> bio->bi_iter.bi_sector = first_block << (blkbits - 9);
> bio->bi_end_io = mpage_end_io;
> if (rac)
> --
> 2.47.3
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 13/15] f2fs: consolidate fsverity_info lookup
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
` (11 preceding siblings ...)
2026-01-28 15:26 ` [PATCH 12/15] ext4: consolidate fsverity_info lookup Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-28 15:26 ` [PATCH 14/15] btrfs: " Christoph Hellwig
` (2 subsequent siblings)
15 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity
Look up the fsverity_info once in f2fs_mpage_readpages, and then use it
for the readahead, local verification of holes and pass it along to the
I/O completion workqueue in struct bio_post_read_ctx. Do the same
thing in f2fs_get_read_data_folio for reads that come from garbage
collection and other background activities.
This amortizes the lookup better once it becomes less efficient.
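The central piece of this patch is turning the boolean `f2fs_need_verity()` helper into one that returns the `fsverity_info` pointer itself, so callers can cache and reuse it. A stand-alone sketch with simplified stand-in types:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096ULL
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Simplified stand-ins for the kernel structures. */
struct fsverity_info { int dummy; };

struct inode {
	unsigned long long i_size;
	struct fsverity_info *vi;	/* set once fs-verity is enabled */
};

/* Models fsverity_get_info(): NULL when verity is not active. */
static struct fsverity_info *fsverity_get_info(const struct inode *inode)
{
	return inode->vi;
}

/*
 * The reworked helper: instead of returning bool, it hands back the
 * fsverity_info pointer (or NULL), so the caller can stash it in
 * struct compress_ctx / bio_post_read_ctx and never look it up again.
 */
static struct fsverity_info *f2fs_need_verity(const struct inode *inode,
					      unsigned long long idx)
{
	if (idx < DIV_ROUND_UP(inode->i_size, PAGE_SIZE))
		return fsverity_get_info(inode);
	return NULL;
}
```

Write paths pass an explicit NULL (fs-verity files are immutable, so writes can never need verification), which is why the patch annotates those call sites with "can't write to fsverity files".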
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/f2fs/compress.c | 9 +++---
fs/f2fs/data.c | 74 +++++++++++++++++++++++++---------------------
fs/f2fs/f2fs.h | 9 ++----
3 files changed, 46 insertions(+), 46 deletions(-)
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 3de4a7e66959..ef1225af2acf 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -1181,6 +1181,7 @@ int f2fs_prepare_compress_overwrite(struct inode *inode,
.cluster_idx = index >> F2FS_I(inode)->i_log_cluster_size,
.rpages = NULL,
.nr_rpages = 0,
+ .vi = NULL, /* can't write to fsverity files */
};
return prepare_compress_overwrite(&cc, pagep, index, fsdata);
@@ -1716,7 +1717,7 @@ struct decompress_io_ctx *f2fs_alloc_dic(struct compress_ctx *cc)
dic->nr_cpages = cc->nr_cpages;
refcount_set(&dic->refcnt, 1);
dic->failed = false;
- dic->need_verity = f2fs_need_verity(cc->inode, start_idx);
+ dic->vi = cc->vi;
for (i = 0; i < dic->cluster_size; i++)
dic->rpages[i] = cc->rpages[i];
@@ -1814,9 +1815,7 @@ static void f2fs_verify_cluster(struct work_struct *work)
if (!rpage)
continue;
- if (fsverity_verify_page(
- *fsverity_info_addr(rpage->mapping->host),
- rpage))
+ if (fsverity_verify_page(dic->vi, rpage))
SetPageUptodate(rpage);
else
ClearPageUptodate(rpage);
@@ -1835,7 +1834,7 @@ void f2fs_decompress_end_io(struct decompress_io_ctx *dic, bool failed,
{
int i;
- if (IS_ENABLED(CONFIG_FS_VERITY) && !failed && dic->need_verity) {
+ if (IS_ENABLED(CONFIG_FS_VERITY) && !failed && dic->vi) {
/*
* Note that to avoid deadlocks, the verity work can't be done
* on the decompression workqueue. This is because verifying
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index bca1e34d327a..d9a8d633d83c 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -109,6 +109,7 @@ enum bio_post_read_step {
struct bio_post_read_ctx {
struct bio *bio;
struct f2fs_sb_info *sbi;
+ struct fsverity_info *vi;
struct work_struct work;
unsigned int enabled_steps;
/*
@@ -165,6 +166,7 @@ static void f2fs_verify_bio(struct work_struct *work)
container_of(work, struct bio_post_read_ctx, work);
struct bio *bio = ctx->bio;
bool may_have_compressed_pages = (ctx->enabled_steps & STEP_DECOMPRESS);
+ struct fsverity_info *vi = ctx->vi;
/*
* fsverity_verify_bio() may call readahead() again, and while verity
@@ -185,8 +187,6 @@ static void f2fs_verify_bio(struct work_struct *work)
bio_for_each_folio_all(fi, bio) {
struct folio *folio = fi.folio;
- struct fsverity_info *vi =
- *fsverity_info_addr(folio->mapping->host);
if (!f2fs_is_compressed_page(folio) &&
!fsverity_verify_page(vi, &folio->page)) {
@@ -195,9 +195,7 @@ static void f2fs_verify_bio(struct work_struct *work)
}
}
} else {
- struct inode *inode = bio_first_folio_all(bio)->mapping->host;
-
- fsverity_verify_bio(*fsverity_info_addr(inode), bio);
+ fsverity_verify_bio(vi, bio);
}
f2fs_finish_read_bio(bio, true);
@@ -1040,9 +1038,9 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio)
f2fs_up_write(&io->io_rwsem);
}
-static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
- unsigned nr_pages, blk_opf_t op_flag,
- pgoff_t first_idx, bool for_write)
+static struct bio *f2fs_grab_read_bio(struct inode *inode,
+ struct fsverity_info *vi, block_t blkaddr, unsigned nr_pages,
+ blk_opf_t op_flag, pgoff_t first_idx, bool for_write)
{
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct bio *bio;
@@ -1061,7 +1059,7 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
if (fscrypt_inode_uses_fs_layer_crypto(inode))
post_read_steps |= STEP_DECRYPT;
- if (f2fs_need_verity(inode, first_idx))
+ if (vi)
post_read_steps |= STEP_VERITY;
/*
@@ -1076,6 +1074,7 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
ctx = mempool_alloc(bio_post_read_ctx_pool, GFP_NOFS);
ctx->bio = bio;
ctx->sbi = sbi;
+ ctx->vi = vi;
ctx->enabled_steps = post_read_steps;
ctx->fs_blkaddr = blkaddr;
ctx->decompression_attempted = false;
@@ -1087,15 +1086,15 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
}
/* This can handle encryption stuffs */
-static void f2fs_submit_page_read(struct inode *inode, struct folio *folio,
- block_t blkaddr, blk_opf_t op_flags,
- bool for_write)
+static void f2fs_submit_page_read(struct inode *inode, struct fsverity_info *vi,
+ struct folio *folio, block_t blkaddr, blk_opf_t op_flags,
+ bool for_write)
{
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct bio *bio;
- bio = f2fs_grab_read_bio(inode, blkaddr, 1, op_flags,
- folio->index, for_write);
+ bio = f2fs_grab_read_bio(inode, vi, blkaddr, 1, op_flags, folio->index,
+ for_write);
/* wait for GCed page writeback via META_MAPPING */
f2fs_wait_on_block_writeback(inode, blkaddr);
@@ -1197,6 +1196,14 @@ int f2fs_reserve_block(struct dnode_of_data *dn, pgoff_t index)
return err;
}
+static inline struct fsverity_info *f2fs_need_verity(const struct inode *inode,
+ pgoff_t idx)
+{
+ if (idx < DIV_ROUND_UP(inode->i_size, PAGE_SIZE))
+ return fsverity_get_info(inode);
+ return NULL;
+}
+
struct folio *f2fs_get_read_data_folio(struct inode *inode, pgoff_t index,
blk_opf_t op_flags, bool for_write, pgoff_t *next_pgofs)
{
@@ -1262,8 +1269,8 @@ struct folio *f2fs_get_read_data_folio(struct inode *inode, pgoff_t index,
return folio;
}
- f2fs_submit_page_read(inode, folio, dn.data_blkaddr,
- op_flags, for_write);
+ f2fs_submit_page_read(inode, f2fs_need_verity(inode, folio->index),
+ folio, dn.data_blkaddr, op_flags, for_write);
return folio;
put_err:
@@ -2067,12 +2074,10 @@ static inline blk_opf_t f2fs_ra_op_flags(struct readahead_control *rac)
return rac ? REQ_RAHEAD : 0;
}
-static int f2fs_read_single_page(struct inode *inode, struct folio *folio,
- unsigned nr_pages,
- struct f2fs_map_blocks *map,
- struct bio **bio_ret,
- sector_t *last_block_in_bio,
- struct readahead_control *rac)
+static int f2fs_read_single_page(struct inode *inode, struct fsverity_info *vi,
+ struct folio *folio, unsigned nr_pages,
+ struct f2fs_map_blocks *map, struct bio **bio_ret,
+ sector_t *last_block_in_bio, struct readahead_control *rac)
{
struct bio *bio = *bio_ret;
const unsigned int blocksize = F2FS_BLKSIZE;
@@ -2124,10 +2129,7 @@ static int f2fs_read_single_page(struct inode *inode, struct folio *folio,
} else {
zero_out:
folio_zero_segment(folio, 0, folio_size(folio));
- if (f2fs_need_verity(inode, index) &&
- !fsverity_verify_folio(
- *fsverity_info_addr(folio->mapping->host),
- folio)) {
+ if (vi && !fsverity_verify_folio(vi, folio)) {
ret = -EIO;
goto out;
}
@@ -2149,7 +2151,7 @@ static int f2fs_read_single_page(struct inode *inode, struct folio *folio,
bio = NULL;
}
if (bio == NULL)
- bio = f2fs_grab_read_bio(inode, block_nr, nr_pages,
+ bio = f2fs_grab_read_bio(inode, vi, block_nr, nr_pages,
f2fs_ra_op_flags(rac), index,
false);
@@ -2301,8 +2303,8 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret,
}
if (!bio)
- bio = f2fs_grab_read_bio(inode, blkaddr, nr_pages - i,
- f2fs_ra_op_flags(rac),
+ bio = f2fs_grab_read_bio(inode, cc->vi, blkaddr,
+ nr_pages - i, f2fs_ra_op_flags(rac),
folio->index, for_write);
if (!bio_add_folio(bio, folio, blocksize, 0))
@@ -2364,6 +2366,7 @@ static int f2fs_mpage_readpages(struct inode *inode,
#endif
unsigned nr_pages = rac ? readahead_count(rac) : 1;
unsigned max_nr_pages = nr_pages;
+ struct fsverity_info *vi = NULL;
int ret = 0;
bool first_folio = true;
@@ -2391,9 +2394,9 @@ static int f2fs_mpage_readpages(struct inode *inode,
}
if (first_folio) {
- if (f2fs_need_verity(inode, folio->index))
- fsverity_readahead(*fsverity_info_addr(inode),
- folio, nr_pages);
+ vi = f2fs_need_verity(inode, folio->index);
+ if (vi)
+ fsverity_readahead(vi, folio, nr_pages);
first_folio = false;
}
@@ -2405,6 +2408,7 @@ static int f2fs_mpage_readpages(struct inode *inode,
/* there are remained compressed pages, submit them */
if (!f2fs_cluster_can_merge_page(&cc, index)) {
+ cc.vi = vi;
ret = f2fs_read_multi_pages(&cc, &bio,
max_nr_pages,
&last_block_in_bio,
@@ -2438,8 +2442,8 @@ static int f2fs_mpage_readpages(struct inode *inode,
read_single_page:
#endif
- ret = f2fs_read_single_page(inode, folio, max_nr_pages, &map,
- &bio, &last_block_in_bio, rac);
+ ret = f2fs_read_single_page(inode, vi, folio, max_nr_pages,
+ &map, &bio, &last_block_in_bio, rac);
if (ret) {
#ifdef CONFIG_F2FS_FS_COMPRESSION
set_error_page:
@@ -2455,6 +2459,7 @@ static int f2fs_mpage_readpages(struct inode *inode,
if (f2fs_compressed_file(inode)) {
/* last page */
if (nr_pages == 1 && !f2fs_cluster_is_empty(&cc)) {
+ cc.vi = vi;
ret = f2fs_read_multi_pages(&cc, &bio,
max_nr_pages,
&last_block_in_bio,
@@ -3653,6 +3658,7 @@ static int f2fs_write_begin(const struct kiocb *iocb,
}
f2fs_submit_page_read(use_cow ?
F2FS_I(inode)->cow_inode : inode,
+ NULL, /* can't write to fsverity files */
folio, blkaddr, 0, true);
folio_lock(folio);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 20edbb99b814..f2fcadc7a6fe 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1603,6 +1603,7 @@ struct compress_ctx {
size_t clen; /* valid data length in cbuf */
void *private; /* payload buffer for specified compression algorithm */
void *private2; /* extra payload buffer */
+ struct fsverity_info *vi; /* verity info if needed */
};
/* compress context for write IO path */
@@ -1658,7 +1659,7 @@ struct decompress_io_ctx {
refcount_t refcnt;
bool failed; /* IO error occurred before decompression? */
- bool need_verity; /* need fs-verity verification after decompression? */
+ struct fsverity_info *vi; /* fs-verity context if needed */
unsigned char compress_algorithm; /* backup algorithm type */
void *private; /* payload buffer for specified decompression algorithm */
void *private2; /* extra payload buffer */
@@ -4886,12 +4887,6 @@ static inline bool f2fs_allow_multi_device_dio(struct f2fs_sb_info *sbi,
return sbi->aligned_blksize;
}
-static inline bool f2fs_need_verity(const struct inode *inode, pgoff_t idx)
-{
- return fsverity_active(inode) &&
- idx < DIV_ROUND_UP(inode->i_size, PAGE_SIZE);
-}
-
#ifdef CONFIG_F2FS_FAULT_INJECTION
extern int f2fs_build_fault_attr(struct f2fs_sb_info *sbi, unsigned long rate,
unsigned long type, enum fault_option fo);
--
2.47.3
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 14/15] btrfs: consolidate fsverity_info lookup
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
` (12 preceding siblings ...)
2026-01-28 15:26 ` [PATCH 13/15] f2fs: " Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-28 15:26 ` [PATCH 15/15] fsverity: use a hashtable to find the fsverity_info Christoph Hellwig
2026-01-29 0:07 ` fsverity cleanups, speedup and memory usage optimization v4 Eric Biggers
15 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity
Look up the fsverity_info once in btrfs_do_readpage, and then use it
for all operations performed there, and do the same in end_bbio_data_read
for all folios processed there. end_bbio_data_read is also changed to
derive the inode from the btrfs_bio - while bbio->inode is optional, it
is always set for buffered reads.
This amortizes the lookup better once it becomes less efficient.
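The completion-side consolidation can be sketched in plain C. The types are simplified stand-ins, and the loop stands in for `bio_for_each_folio_all()`; the point is that one lookup, gated on `bbio->file_offset < i_size`, serves every folio in the bio:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct fsverity_info { int lookups; };

struct inode {
	long long i_size;
	struct fsverity_info *vi;	/* set once fs-verity is enabled */
};

/* Models fsverity_get_info(): NULL when verity is not active. */
static struct fsverity_info *fsverity_get_info(struct inode *inode)
{
	if (inode->vi)
		inode->vi->lookups++;	/* count how often the lookup is paid */
	return inode->vi;
}

/*
 * Sketch of the reworked completion: the inode comes from the bbio
 * (always set for buffered reads), the lookup happens once, and each
 * folio in the bio reuses the cached pointer.
 */
static int end_bbio_data_read(struct inode *inode, long long file_offset,
			      int nr_folios)
{
	struct fsverity_info *vi = NULL;
	int verified = 0;
	int i;

	if (file_offset < inode->i_size)
		vi = fsverity_get_info(inode);

	for (i = 0; i < nr_folios; i++)
		if (vi)			/* would call fsverity_verify_folio(vi, ...) */
			verified++;
	return verified;
}
```

Before the patch, `btrfs_verify_folio()` re-derived both the inode and the fsverity_info from `folio->mapping->host` for every folio in the bio; afterwards both are resolved once per bio.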
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David Sterba <dsterba@suse.com>
---
fs/btrfs/extent_io.c | 54 +++++++++++++++++++++++++++-----------------
1 file changed, 33 insertions(+), 21 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 21430b7d8f27..24988520521c 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -476,26 +476,25 @@ void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
end, page_ops);
}
-static bool btrfs_verify_folio(struct folio *folio, u64 start, u32 len)
+static bool btrfs_verify_folio(struct fsverity_info *vi, struct folio *folio,
+ u64 start, u32 len)
{
struct btrfs_fs_info *fs_info = folio_to_fs_info(folio);
- if (!fsverity_active(folio->mapping->host) ||
- btrfs_folio_test_uptodate(fs_info, folio, start, len) ||
- start >= i_size_read(folio->mapping->host))
+ if (!vi || btrfs_folio_test_uptodate(fs_info, folio, start, len))
return true;
- return fsverity_verify_folio(*fsverity_info_addr(folio->mapping->host),
- folio);
+ return fsverity_verify_folio(vi, folio);
}
-static void end_folio_read(struct folio *folio, bool uptodate, u64 start, u32 len)
+static void end_folio_read(struct fsverity_info *vi, struct folio *folio,
+ bool uptodate, u64 start, u32 len)
{
struct btrfs_fs_info *fs_info = folio_to_fs_info(folio);
ASSERT(folio_pos(folio) <= start &&
start + len <= folio_next_pos(folio));
- if (uptodate && btrfs_verify_folio(folio, start, len))
+ if (uptodate && btrfs_verify_folio(vi, folio, start, len))
btrfs_folio_set_uptodate(fs_info, folio, start, len);
else
btrfs_folio_clear_uptodate(fs_info, folio, start, len);
@@ -575,14 +574,19 @@ static void begin_folio_read(struct btrfs_fs_info *fs_info, struct folio *folio)
static void end_bbio_data_read(struct btrfs_bio *bbio)
{
struct btrfs_fs_info *fs_info = bbio->inode->root->fs_info;
+ struct inode *inode = &bbio->inode->vfs_inode;
struct bio *bio = &bbio->bio;
+ struct fsverity_info *vi = NULL;
struct folio_iter fi;
ASSERT(!bio_flagged(bio, BIO_CLONED));
+
+ if (bbio->file_offset < i_size_read(inode))
+ vi = fsverity_get_info(inode);
+
bio_for_each_folio_all(fi, &bbio->bio) {
bool uptodate = !bio->bi_status;
struct folio *folio = fi.folio;
- struct inode *inode = folio->mapping->host;
u64 start = folio_pos(folio) + fi.offset;
btrfs_debug(fs_info,
@@ -617,7 +621,7 @@ static void end_bbio_data_read(struct btrfs_bio *bbio)
}
/* Update page status and unlock. */
- end_folio_read(folio, uptodate, start, fi.length);
+ end_folio_read(vi, folio, uptodate, start, fi.length);
}
bio_put(bio);
}
@@ -992,7 +996,8 @@ static void btrfs_readahead_expand(struct readahead_control *ractl,
* return 0 on success, otherwise return error
*/
static int btrfs_do_readpage(struct folio *folio, struct extent_map **em_cached,
- struct btrfs_bio_ctrl *bio_ctrl)
+ struct btrfs_bio_ctrl *bio_ctrl,
+ struct fsverity_info *vi)
{
struct inode *inode = folio->mapping->host;
struct btrfs_fs_info *fs_info = inode_to_fs_info(inode);
@@ -1030,16 +1035,16 @@ static int btrfs_do_readpage(struct folio *folio, struct extent_map **em_cached,
ASSERT(IS_ALIGNED(cur, fs_info->sectorsize));
if (cur >= last_byte) {
folio_zero_range(folio, pg_offset, end - cur + 1);
- end_folio_read(folio, true, cur, end - cur + 1);
+ end_folio_read(vi, folio, true, cur, end - cur + 1);
break;
}
if (btrfs_folio_test_uptodate(fs_info, folio, cur, blocksize)) {
- end_folio_read(folio, true, cur, blocksize);
+ end_folio_read(vi, folio, true, cur, blocksize);
continue;
}
em = get_extent_map(BTRFS_I(inode), folio, cur, end - cur + 1, em_cached);
if (IS_ERR(em)) {
- end_folio_read(folio, false, cur, end + 1 - cur);
+ end_folio_read(vi, folio, false, cur, end + 1 - cur);
return PTR_ERR(em);
}
extent_offset = cur - em->start;
@@ -1116,12 +1121,12 @@ static int btrfs_do_readpage(struct folio *folio, struct extent_map **em_cached,
/* we've found a hole, just zero and go on */
if (block_start == EXTENT_MAP_HOLE) {
folio_zero_range(folio, pg_offset, blocksize);
- end_folio_read(folio, true, cur, blocksize);
+ end_folio_read(vi, folio, true, cur, blocksize);
continue;
}
/* the get_extent function already copied into the folio */
if (block_start == EXTENT_MAP_INLINE) {
- end_folio_read(folio, true, cur, blocksize);
+ end_folio_read(vi, folio, true, cur, blocksize);
continue;
}
@@ -1318,7 +1323,8 @@ static void lock_extents_for_read(struct btrfs_inode *inode, u64 start, u64 end,
int btrfs_read_folio(struct file *file, struct folio *folio)
{
- struct btrfs_inode *inode = folio_to_inode(folio);
+ struct inode *vfs_inode = folio->mapping->host;
+ struct btrfs_inode *inode = BTRFS_I(vfs_inode);
const u64 start = folio_pos(folio);
const u64 end = start + folio_size(folio) - 1;
struct extent_state *cached_state = NULL;
@@ -1327,10 +1333,13 @@ int btrfs_read_folio(struct file *file, struct folio *folio)
.last_em_start = U64_MAX,
};
struct extent_map *em_cached = NULL;
+ struct fsverity_info *vi = NULL;
int ret;
lock_extents_for_read(inode, start, end, &cached_state);
- ret = btrfs_do_readpage(folio, &em_cached, &bio_ctrl);
+ if (folio_pos(folio) < i_size_read(vfs_inode))
+ vi = fsverity_get_info(vfs_inode);
+ ret = btrfs_do_readpage(folio, &em_cached, &bio_ctrl, vi);
btrfs_unlock_extent(&inode->io_tree, start, end, &cached_state);
btrfs_free_extent_map(em_cached);
@@ -2697,16 +2706,19 @@ void btrfs_readahead(struct readahead_control *rac)
.last_em_start = U64_MAX,
};
struct folio *folio;
- struct btrfs_inode *inode = BTRFS_I(rac->mapping->host);
+ struct inode *vfs_inode = rac->mapping->host;
+ struct btrfs_inode *inode = BTRFS_I(vfs_inode);
const u64 start = readahead_pos(rac);
const u64 end = start + readahead_length(rac) - 1;
struct extent_state *cached_state = NULL;
struct extent_map *em_cached = NULL;
+ struct fsverity_info *vi = NULL;
lock_extents_for_read(inode, start, end, &cached_state);
-
+ if (start < i_size_read(vfs_inode))
+ vi = fsverity_get_info(vfs_inode);
while ((folio = readahead_folio(rac)) != NULL)
- btrfs_do_readpage(folio, &em_cached, &bio_ctrl);
+ btrfs_do_readpage(folio, &em_cached, &bio_ctrl, vi);
btrfs_unlock_extent(&inode->io_tree, start, end, &cached_state);
--
2.47.3
* [PATCH 15/15] fsverity: use a hashtable to find the fsverity_info
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
` (13 preceding siblings ...)
2026-01-28 15:26 ` [PATCH 14/15] btrfs: " Christoph Hellwig
@ 2026-01-28 15:26 ` Christoph Hellwig
2026-01-29 0:07 ` fsverity cleanups, speedup and memory usage optimization v4 Eric Biggers
15 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2026-01-28 15:26 UTC (permalink / raw)
To: Eric Biggers
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity, Darrick J. Wong
Use the kernel's resizable hash table (rhashtable) to find the
fsverity_info. This way file systems that want to support fsverity don't
have to bloat every inode in the system with an extra pointer. The
trade-off is that looking up the fsverity_info is a bit more expensive
now, but the main operations are still dominated by I/O and hashing
overhead.
The rhashtable implementation requires no external synchronization, and
the _fast versions of the APIs provide the RCU critical sections required
by the implementation. Because struct fsverity_info is only removed on
inode eviction and does not contain a reference count, there is no need
for an extended critical section to grab a reference or validate the
object state. The file open path uses rhashtable_lookup_get_insert_fast,
which can either find an existing object for the hash key or insert a
new one in a single atomic operation, so that concurrent opens never
instantiate duplicate fsverity_info structures.  FS_IOC_ENABLE_VERITY must
already be synchronized by a combination of i_rwsem and file system flags
and uses rhashtable_lookup_insert_fast, which errors out on an existing
object for the hash key as an additional safety check.
Because insertion into the hash table now happens before S_VERITY is set,
fsverity_active() just becomes a barrier and a flag check and doesn't have to look
up the fsverity_info at all, so there is only a single lookup per
->read_folio or ->readahead invocation. For btrfs there is an additional
one for each bio completion, while for ext4 and f2fs the fsverity_info
is stored in the per-I/O context and reused for the completion workqueue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
---
fs/btrfs/btrfs_inode.h | 4 --
fs/btrfs/inode.c | 3 --
fs/btrfs/verity.c | 2 -
fs/ext4/ext4.h | 4 --
fs/ext4/super.c | 3 --
fs/ext4/verity.c | 2 -
fs/f2fs/f2fs.h | 3 --
fs/f2fs/super.c | 3 --
fs/f2fs/verity.c | 2 -
fs/verity/enable.c | 30 +++++++-----
fs/verity/fsverity_private.h | 17 +++----
fs/verity/open.c | 75 +++++++++++++++++++-----------
fs/verity/verify.c | 2 +-
include/linux/fsverity.h | 90 ++++++++++++++----------------------
14 files changed, 109 insertions(+), 131 deletions(-)
diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 73602ee8de3f..55c272fe5d92 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -339,10 +339,6 @@ struct btrfs_inode {
struct rw_semaphore i_mmap_lock;
-#ifdef CONFIG_FS_VERITY
- struct fsverity_info *i_verity_info;
-#endif
-
struct inode vfs_inode;
};
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 67c64efc5099..93b2ce75fb06 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8097,9 +8097,6 @@ static void init_once(void *foo)
struct btrfs_inode *ei = foo;
inode_init_once(&ei->vfs_inode);
-#ifdef CONFIG_FS_VERITY
- ei->i_verity_info = NULL;
-#endif
}
void __cold btrfs_destroy_cachep(void)
diff --git a/fs/btrfs/verity.c b/fs/btrfs/verity.c
index c152bef71e8b..cd96fac4739f 100644
--- a/fs/btrfs/verity.c
+++ b/fs/btrfs/verity.c
@@ -795,8 +795,6 @@ static int btrfs_write_merkle_tree_block(struct file *file, const void *buf,
}
const struct fsverity_operations btrfs_verityops = {
- .inode_info_offs = (int)offsetof(struct btrfs_inode, i_verity_info) -
- (int)offsetof(struct btrfs_inode, vfs_inode),
.begin_enable_verity = btrfs_begin_enable_verity,
.end_enable_verity = btrfs_end_enable_verity,
.get_verity_descriptor = btrfs_get_verity_descriptor,
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 56112f201cac..60c549bc894e 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1205,10 +1205,6 @@ struct ext4_inode_info {
#ifdef CONFIG_FS_ENCRYPTION
struct fscrypt_inode_info *i_crypt_info;
#endif
-
-#ifdef CONFIG_FS_VERITY
- struct fsverity_info *i_verity_info;
-#endif
};
/*
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 86131f4d8718..1fb0c90c7a4b 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1484,9 +1484,6 @@ static void init_once(void *foo)
#ifdef CONFIG_FS_ENCRYPTION
ei->i_crypt_info = NULL;
#endif
-#ifdef CONFIG_FS_VERITY
- ei->i_verity_info = NULL;
-#endif
}
static int __init init_inodecache(void)
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index 54ae4d4a176c..e3ab3ba8799b 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -380,8 +380,6 @@ static int ext4_write_merkle_tree_block(struct file *file, const void *buf,
}
const struct fsverity_operations ext4_verityops = {
- .inode_info_offs = (int)offsetof(struct ext4_inode_info, i_verity_info) -
- (int)offsetof(struct ext4_inode_info, vfs_inode),
.begin_enable_verity = ext4_begin_enable_verity,
.end_enable_verity = ext4_end_enable_verity,
.get_verity_descriptor = ext4_get_verity_descriptor,
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index f2fcadc7a6fe..8ee8a7bc012c 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -974,9 +974,6 @@ struct f2fs_inode_info {
#ifdef CONFIG_FS_ENCRYPTION
struct fscrypt_inode_info *i_crypt_info; /* filesystem encryption info */
#endif
-#ifdef CONFIG_FS_VERITY
- struct fsverity_info *i_verity_info; /* filesystem verity info */
-#endif
};
static inline void get_read_extent_info(struct extent_info *ext,
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index c4c225e09dc4..cd00d030edda 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -504,9 +504,6 @@ static void init_once(void *foo)
#ifdef CONFIG_FS_ENCRYPTION
fi->i_crypt_info = NULL;
#endif
-#ifdef CONFIG_FS_VERITY
- fi->i_verity_info = NULL;
-#endif
}
#ifdef CONFIG_QUOTA
diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
index 628e8eafa96a..4f5230d871f7 100644
--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -278,8 +278,6 @@ static int f2fs_write_merkle_tree_block(struct file *file, const void *buf,
}
const struct fsverity_operations f2fs_verityops = {
- .inode_info_offs = (int)offsetof(struct f2fs_inode_info, i_verity_info) -
- (int)offsetof(struct f2fs_inode_info, vfs_inode),
.begin_enable_verity = f2fs_begin_enable_verity,
.end_enable_verity = f2fs_end_enable_verity,
.get_verity_descriptor = f2fs_get_verity_descriptor,
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index c56c18e2605b..94c88c419054 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -265,9 +265,24 @@ static int enable_verity(struct file *filp,
goto rollback;
}
+ /*
+ * Add the fsverity_info into the hash table before finishing the
+ * initialization so that we don't have to undo the enabling when memory
+ * allocation for the hash table fails. This is safe because looking up
+ * the fsverity_info always first checks the S_VERITY flag on the inode,
+ * which will only be set at the very end of the ->end_enable_verity
+ * method.
+ */
+ err = fsverity_set_info(vi);
+ if (err)
+ goto rollback;
+
/*
* Tell the filesystem to finish enabling verity on the file.
- * Serialized with ->begin_enable_verity() by the inode lock.
+ * Serialized with ->begin_enable_verity() by the inode lock. The file
+ * system needs to set the S_VERITY flag on the inode at the very end of
+ * the method, at which point the fsverity information can be accessed
+ * by other threads.
*/
inode_lock(inode);
err = vops->end_enable_verity(filp, desc, desc_size, params.tree_size);
@@ -275,19 +290,10 @@ static int enable_verity(struct file *filp,
if (err) {
fsverity_err(inode, "%ps() failed with err %d",
vops->end_enable_verity, err);
- fsverity_free_info(vi);
+ fsverity_remove_info(vi);
} else if (WARN_ON_ONCE(!IS_VERITY(inode))) {
+ fsverity_remove_info(vi);
err = -EINVAL;
- fsverity_free_info(vi);
- } else {
- /* Successfully enabled verity */
-
- /*
- * Readers can start using the inode's verity info immediately,
- * so it can't be rolled back once set. So don't set it until
- * just after the filesystem has successfully enabled verity.
- */
- fsverity_set_info(inode, vi);
}
out:
kfree(params.hashstate);
diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index f9f3936b0a89..4d4a0a560562 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -11,6 +11,7 @@
#define pr_fmt(fmt) "fs-verity: " fmt
#include <linux/fsverity.h>
+#include <linux/rhashtable.h>
/*
* Implementation limit: maximum depth of the Merkle tree. For now 8 is plenty;
@@ -63,13 +64,14 @@ struct merkle_tree_params {
* fsverity_info - cached verity metadata for an inode
*
* When a verity file is first opened, an instance of this struct is allocated
- * and a pointer to it is stored in the file's in-memory inode. It remains
- * until the inode is evicted. It caches information about the Merkle tree
- * that's needed to efficiently verify data read from the file. It also caches
- * the file digest. The Merkle tree pages themselves are not cached here, but
- * the filesystem may cache them.
+ * and a pointer to it is stored in the global hash table, indexed by the inode
+ * pointer value. It remains alive until the inode is evicted. It caches
+ * information about the Merkle tree that's needed to efficiently verify data
+ * read from the file. It also caches the file digest. The Merkle tree pages
+ * themselves are not cached here, but the filesystem may cache them.
*/
struct fsverity_info {
+ struct rhash_head rhash_head;
struct merkle_tree_params tree_params;
u8 root_hash[FS_VERITY_MAX_DIGEST_SIZE];
u8 file_digest[FS_VERITY_MAX_DIGEST_SIZE];
@@ -127,9 +129,8 @@ int fsverity_init_merkle_tree_params(struct merkle_tree_params *params,
struct fsverity_info *fsverity_create_info(struct inode *inode,
struct fsverity_descriptor *desc);
-void fsverity_set_info(struct inode *inode, struct fsverity_info *vi);
-
-void fsverity_free_info(struct fsverity_info *vi);
+int fsverity_set_info(struct fsverity_info *vi);
+void fsverity_remove_info(struct fsverity_info *vi);
int fsverity_get_descriptor(struct inode *inode,
struct fsverity_descriptor **desc_ret);
diff --git a/fs/verity/open.c b/fs/verity/open.c
index 128502cf0a23..1bde8fe79b3f 100644
--- a/fs/verity/open.c
+++ b/fs/verity/open.c
@@ -12,6 +12,14 @@
#include <linux/slab.h>
static struct kmem_cache *fsverity_info_cachep;
+static struct rhashtable fsverity_info_hash;
+
+static const struct rhashtable_params fsverity_info_hash_params = {
+ .key_len = sizeof_field(struct fsverity_info, inode),
+ .key_offset = offsetof(struct fsverity_info, inode),
+ .head_offset = offsetof(struct fsverity_info, rhash_head),
+ .automatic_shrinking = true,
+};
/**
* fsverity_init_merkle_tree_params() - initialize Merkle tree parameters
@@ -170,6 +178,13 @@ static void compute_file_digest(const struct fsverity_hash_alg *hash_alg,
desc->sig_size = sig_size;
}
+static void fsverity_free_info(struct fsverity_info *vi)
+{
+ kfree(vi->tree_params.hashstate);
+ kvfree(vi->hash_block_verified);
+ kmem_cache_free(fsverity_info_cachep, vi);
+}
+
/*
* Create a new fsverity_info from the given fsverity_descriptor (with optional
* appended builtin signature), and check the signature if present. The
@@ -241,33 +256,18 @@ struct fsverity_info *fsverity_create_info(struct inode *inode,
return ERR_PTR(err);
}
-void fsverity_set_info(struct inode *inode, struct fsverity_info *vi)
+int fsverity_set_info(struct fsverity_info *vi)
{
- /*
- * Multiple tasks may race to set the inode's verity info pointer, so
- * use cmpxchg_release(). This pairs with the smp_load_acquire() in
- * fsverity_get_info(). I.e., publish the pointer with a RELEASE
- * barrier so that other tasks can ACQUIRE it.
- */
- if (cmpxchg_release(fsverity_info_addr(inode), NULL, vi) != NULL) {
- /* Lost the race, so free the verity info we allocated. */
- fsverity_free_info(vi);
- /*
- * Afterwards, the caller may access the inode's verity info
- * directly, so make sure to ACQUIRE the winning verity info.
- */
- (void)fsverity_get_info(inode);
- }
+ return rhashtable_lookup_insert_fast(&fsverity_info_hash,
+ &vi->rhash_head, fsverity_info_hash_params);
}
-void fsverity_free_info(struct fsverity_info *vi)
+struct fsverity_info *__fsverity_get_info(const struct inode *inode)
{
- if (!vi)
- return;
- kfree(vi->tree_params.hashstate);
- kvfree(vi->hash_block_verified);
- kmem_cache_free(fsverity_info_cachep, vi);
+ return rhashtable_lookup_fast(&fsverity_info_hash, &inode,
+ fsverity_info_hash_params);
}
+EXPORT_SYMBOL_GPL(__fsverity_get_info);
static bool validate_fsverity_descriptor(struct inode *inode,
const struct fsverity_descriptor *desc,
@@ -352,7 +352,7 @@ int fsverity_get_descriptor(struct inode *inode,
static int ensure_verity_info(struct inode *inode)
{
- struct fsverity_info *vi = fsverity_get_info(inode);
+ struct fsverity_info *vi = fsverity_get_info(inode), *found;
struct fsverity_descriptor *desc;
int err;
@@ -369,8 +369,18 @@ static int ensure_verity_info(struct inode *inode)
goto out_free_desc;
}
- fsverity_set_info(inode, vi);
- err = 0;
+ /*
+ * Multiple tasks may race to set the inode's verity info, in which case
+ * we might find an existing fsverity_info in the hash table.
+ */
+ found = rhashtable_lookup_get_insert_fast(&fsverity_info_hash,
+ &vi->rhash_head, fsverity_info_hash_params);
+ if (found) {
+ fsverity_free_info(vi);
+ if (IS_ERR(found))
+ err = PTR_ERR(found);
+ }
+
out_free_desc:
kfree(desc);
return err;
@@ -384,16 +394,25 @@ int __fsverity_file_open(struct inode *inode, struct file *filp)
}
EXPORT_SYMBOL_GPL(__fsverity_file_open);
+void fsverity_remove_info(struct fsverity_info *vi)
+{
+ rhashtable_remove_fast(&fsverity_info_hash, &vi->rhash_head,
+ fsverity_info_hash_params);
+ fsverity_free_info(vi);
+}
+
void fsverity_cleanup_inode(struct inode *inode)
{
- struct fsverity_info **vi_addr = fsverity_info_addr(inode);
+ struct fsverity_info *vi = fsverity_get_info(inode);
- fsverity_free_info(*vi_addr);
- *vi_addr = NULL;
+ if (vi)
+ fsverity_remove_info(vi);
}
void __init fsverity_init_info_cache(void)
{
+ if (rhashtable_init(&fsverity_info_hash, &fsverity_info_hash_params))
+ panic("failed to initialize fsverity hash\n");
fsverity_info_cachep = KMEM_CACHE_USERCOPY(
fsverity_info,
SLAB_RECLAIM_ACCOUNT | SLAB_PANIC,
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 98685bbb21f6..fe132a19a877 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -320,7 +320,7 @@ fsverity_init_verification_context(struct fsverity_verification_context *ctx,
ctx->inode = vi->inode;
ctx->vi = vi;
ctx->num_pending = 0;
- if (vi->tree_params.hash_alg->algo_id == HASH_ALGO_SHA256 &&
+ if (ctx->vi->tree_params.hash_alg->algo_id == HASH_ALGO_SHA256 &&
sha256_finup_2x_is_optimized())
ctx->max_pending = 2;
else
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 1d70b270e90a..dc2d7b229844 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -30,13 +30,6 @@ struct fsverity_info;
/* Verity operations for filesystems */
struct fsverity_operations {
- /**
- * The offset of the pointer to struct fsverity_info in the
- * filesystem-specific part of the inode, relative to the beginning of
- * the common part of the inode (the 'struct inode').
- */
- ptrdiff_t inode_info_offs;
-
/**
* Begin enabling verity on the given file.
*
@@ -142,38 +135,43 @@ struct fsverity_operations {
};
#ifdef CONFIG_FS_VERITY
-
-/*
- * Returns the address of the verity info pointer within the filesystem-specific
- * part of the inode. (To save memory on filesystems that don't support
- * fsverity, a field in 'struct inode' itself is no longer used.)
+/**
+ * fsverity_active() - do reads from the inode need to go through fs-verity?
+ * @inode: inode to check
+ *
+ * This checks whether the inode's verity info has been set, and reads need
+ * to verify the file data.
+ *
+ * Return: true if reads need to go through fs-verity, otherwise false
*/
-static inline struct fsverity_info **
-fsverity_info_addr(const struct inode *inode)
+static inline bool fsverity_active(const struct inode *inode)
{
- VFS_WARN_ON_ONCE(inode->i_sb->s_vop->inode_info_offs == 0);
- return (void *)inode + inode->i_sb->s_vop->inode_info_offs;
+ if (IS_VERITY(inode)) {
+ /*
+ * This pairs with the try_cmpxchg in set_mask_bits()
+ * used to set the S_VERITY bit in i_flags.
+ */
+ smp_mb();
+ return true;
+ }
+
+ return false;
}
+/**
+ * fsverity_get_info() - get fsverity information for an inode
+ * @inode: inode to operate on.
+ *
+ * This gets the fsverity_info for @inode if it exists. Safe to call without
+ * knowing that an fsverity_info exists for @inode, including on file systems that
+ * do not support fsverity.
+ */
+struct fsverity_info *__fsverity_get_info(const struct inode *inode);
static inline struct fsverity_info *fsverity_get_info(const struct inode *inode)
{
- /*
- * Since this function can be called on inodes belonging to filesystems
- * that don't support fsverity at all, and fsverity_info_addr() doesn't
- * work on such filesystems, we have to start with an IS_VERITY() check.
- * Checking IS_VERITY() here is also useful to minimize the overhead of
- * fsverity_active() on non-verity files.
- */
- if (!IS_VERITY(inode))
+ if (!fsverity_active(inode))
return NULL;
-
- /*
- * Pairs with the cmpxchg_release() in fsverity_set_info(). I.e.,
- * another task may publish the inode's verity info concurrently,
- * executing a RELEASE barrier. Use smp_load_acquire() here to safely
- * ACQUIRE the memory the other task published.
- */
- return smp_load_acquire(fsverity_info_addr(inode));
+ return __fsverity_get_info(inode);
}
/* enable.c */
@@ -204,12 +202,10 @@ void fsverity_enqueue_verify_work(struct work_struct *work);
#else /* !CONFIG_FS_VERITY */
-/*
- * Provide a stub to allow code using this to compile. All callsites should be
- * guarded by compiler dead code elimination, and this forces a link error if
- * not.
- */
-struct fsverity_info **fsverity_info_addr(const struct inode *inode);
+static inline bool fsverity_active(const struct inode *inode)
+{
+ return false;
+}
static inline struct fsverity_info *fsverity_get_info(const struct inode *inode)
{
@@ -292,24 +288,6 @@ static inline bool fsverity_verify_page(struct fsverity_info *vi,
return fsverity_verify_blocks(vi, page_folio(page), PAGE_SIZE, 0);
}
-/**
- * fsverity_active() - do reads from the inode need to go through fs-verity?
- * @inode: inode to check
- *
- * This checks whether the inode's verity info has been set.
- *
- * Filesystems call this from ->readahead() to check whether the pages need to
- * be verified or not. Don't use IS_VERITY() for this purpose; it's subject to
- * a race condition where the file is being read concurrently with
- * FS_IOC_ENABLE_VERITY completing. (S_VERITY is set before the verity info.)
- *
- * Return: true if reads need to go through fs-verity, otherwise false
- */
-static inline bool fsverity_active(const struct inode *inode)
-{
- return fsverity_get_info(inode) != NULL;
-}
-
/**
* fsverity_file_open() - prepare to open a verity file
* @inode: the inode being opened
--
2.47.3
* Re: fsverity cleanups, speedup and memory usage optimization v4
2026-01-28 15:26 fsverity cleanups, speedup and memory usage optimization v4 Christoph Hellwig
` (14 preceding siblings ...)
2026-01-28 15:26 ` [PATCH 15/15] fsverity: use a hashtable to find the fsverity_info Christoph Hellwig
@ 2026-01-29 0:07 ` Eric Biggers
15 siblings, 0 replies; 27+ messages in thread
From: Eric Biggers @ 2026-01-29 0:07 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Al Viro, Christian Brauner, Jan Kara, David Sterba,
Theodore Ts'o, Jaegeuk Kim, Chao Yu, Andrey Albershteyn,
Matthew Wilcox, linux-fsdevel, linux-btrfs, linux-ext4,
linux-f2fs-devel, fsverity
On Wed, Jan 28, 2026 at 04:26:12PM +0100, Christoph Hellwig wrote:
> Hi all,
>
> this series has a hodge podge of fsverity enhances that I looked into as
> part of the review of the xfs fsverity support series.
>
> The first part calls fsverity code from VFS code instead of requiring
> boilerplate in the file systems.
>
> The first patch fixes a bug in btrfs as part of that, as btrfs was missing
> a check. An xfstests test case for this was submitted already.
> Can we expedite this fix?
>
> The middle part optimizes the fsverity read path by kicking off readahead
> for the fsverity hashes from the data read submission context, which in my
> simple testing showed huge benefits for sequential reads using dd.
> I haven't been able to get fio to run on a preallocated fio file, but
> I expect random read benefits would be significantly better than that
> still.
To get things going, I've applied patches 1-6 to
https://git.kernel.org/pub/scm/fs/fsverity/linux.git/log/?h=for-next
I couldn't go further, due to the bugs in patches 7 and 8.
- Eric