From: "Darrick J. Wong" <djwong@kernel.org>
To: Eric Biggers <ebiggers@kernel.org>
Cc: Andrey Albershteyn <aalbersh@redhat.com>,
fsverity@lists.linux.dev, linux-xfs@vger.kernel.org,
linux-fsdevel@vger.kernel.org, chandan.babu@oracle.com
Subject: Re: [PATCH v5 07/24] fsverity: support block-based Merkle tree caching
Date: Sat, 9 Mar 2024 08:24:41 -0800 [thread overview]
Message-ID: <20240309162441.GP1927156@frogsfrogsfrogs> (raw)
In-Reply-To: <20240308035036.GL1927156@frogsfrogsfrogs>
On Thu, Mar 07, 2024 at 07:50:36PM -0800, Darrick J. Wong wrote:
> On Thu, Mar 07, 2024 at 02:49:03PM -0800, Eric Biggers wrote:
> > On Thu, Mar 07, 2024 at 01:54:01PM -0800, Darrick J. Wong wrote:
> > > On Tue, Mar 05, 2024 at 07:56:22PM -0800, Eric Biggers wrote:
> > > > On Mon, Mar 04, 2024 at 08:10:30PM +0100, Andrey Albershteyn wrote:
> > > > > diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
> > > > > index b3506f56e180..dad33e6ff0d6 100644
> > > > > --- a/fs/verity/fsverity_private.h
> > > > > +++ b/fs/verity/fsverity_private.h
> > > > > @@ -154,4 +154,12 @@ static inline void fsverity_init_signature(void)
> > > > >
> > > > > void __init fsverity_init_workqueue(void);
> > > > >
> > > > > +/*
> > > > > + * Drop 'block' obtained with ->read_merkle_tree_block(). Calls out back to
> > > > > + * filesystem if ->drop_block() is set, otherwise, drop the reference in the
> > > > > + * block->context.
> > > > > + */
> > > > > +void fsverity_drop_block(struct inode *inode,
> > > > > + struct fsverity_blockbuf *block);
> > > > > +
> > > > > #endif /* _FSVERITY_PRIVATE_H */
> > > >
> > > > This should be paired with a helper function that reads a Merkle tree block by
> > > > calling ->read_merkle_tree_block or ->read_merkle_tree_page as needed. Besides
> > > > being consistent with having a helper function for drop, this would prevent code
> > > > duplication between verify_data_block() and fsverity_read_merkle_tree().
> > > >
> > > > I recommend that it look like this:
> > > >
> > > > int fsverity_read_merkle_tree_block(struct inode *inode, u64 pos,
> > > > unsigned long ra_bytes,
> > > > struct fsverity_blockbuf *block);
> > > >
> > > > 'pos' would be the byte position of the block in the Merkle tree, and 'ra_bytes'
> > > > would be the number of bytes for the filesystem to (optionally) readahead if the
> > > > block is not yet cached. I think that things work out simpler if these values
> > > > are measured in bytes, not blocks. 'block' would be at the end because it's an
> > > > output, and it can be confusing to interleave inputs and outputs in parameters.
> > >
> > > FWIW I don't really like 'pos' here because that's usually short for
> > > "file position", which is a byte, and this looks a lot more like a
> > > merkle tree block number.
> > >
> > > u64 blkno?
> > >
> > > Or better yet use a typedef ("merkle_blkno_t") to make it really clear
> > > when we're dealing with a tree block number. Ignore checkpatch
> > > complaining about typeedefs. :)
> >
> > My suggestion is for 'pos' to be a byte position, in alignment with the
> > pagecache naming convention as well as ->write_merkle_tree_block which currently
> > uses a 'u64 pos' byte position too.
> >
> > It would also work for it to be a block index, in which case it should be named
> > 'index' or 'blkno' and have type unsigned long. I *think* that things work out
> > a bit cleaner if it's a byte position, but I could be convinced that it's
> > actually better for it to be a block index.
>
> It's probably cleaner (or at least willy says so) to measure everything
> in bytes. The only place that gets messy is if we want to cache merkle
> tree blocks in an xarray or something, in which case we'd want to >> by
> the block_shift to avoid leaving gaps.
...and now having just done that, yes, I would like to retain passing
log_blocksize to the ->read_merkle_tree_block function. I cleaned it up
a bit for my own purposes:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/commit/?h=fsverity-cleanups-6.9&id=4d36c90be4ac143f3e31989317bde7cd6c109c6e
Though I saw you had other feedback to Andrey about that. :)
--D
> > > > How about changing the prototype to:
> > > >
> > > > void fsverity_invalidate_merkle_tree_block(struct inode *inode, u64 pos);
> > > >
> > > > Also, is it a kernel bug for the pos to be beyond the end of the Merkle tree, or
> > > > can it happen in cases like filesystem corruption? If it can only happen due to
> > > > kernel bugs, WARN_ON_ONCE() might be more appropriate than an error message.
> > >
> > > I think XFS only passes to _invalidate_* the same pos that was passed to
> > > ->read_merkle_tree_block, so this is a kernel bug, not a fs corruption
> > > problem.
> > >
> > > Perhaps this function ought to note that @pos is supposed to be the same
> > > value that was given to ->read_merkle_tree_block?
> > >
> > > Or: make the implementations return 1 for "reloaded from disk", 0 for
> > > "still in cache", or a negative error code. Then fsverity can call
> > > the invalidation routine itself and XFS doesn't have to worry about this
> > > part.
> > >
> > > (I think? I have questions about the xfs_invalidate_blocks function.)
> >
> > It looks like XFS can invalidate blocks other than the one being read by
> > ->read_merkle_tree_block.
> >
> > If it really was only a matter of the single block being read, then it would
> > indeed be simpler to just make it a piece of information returned from
> > ->read_merkle_tree_block.
> >
> > If the generic invalidation function is needed, it needs to be clearly
> > documented when filesystems are expected to invalidate blocks.
>
> I /think/ the generic invalidation is only necessary with the weird
> DOUBLE_ALLOC thing. If the other two implementations (ext4/f2fs)
> haven't needed a "nuke from orbit" function then xfs shouldn't require
> one too.
>
> > > > > +
> > > > > + /**
> > > > > + * Release the reference to a Merkle tree block
> > > > > + *
> > > > > + * @block: the block to release
> > > > > + *
> > > > > + * This is called when fs-verity is done with a block obtained with
> > > > > + * ->read_merkle_tree_block().
> > > > > + */
> > > > > + void (*drop_block)(struct fsverity_blockbuf *block);
> > > >
> > > > drop_merkle_tree_block, so that it's clearly paired with read_merkle_tree_block
> > >
> > > Yep. I noticed that xfs_verity.c doesn't put them together, which made
> > > me wonder if the write_merkle_tree_block path made use of that. It
> > > doesn't, AFAICT.
> > >
> > > And I think the reason is that when we're setting up the merkle tree,
> > > we want to stream the contents straight to disk instead of ending up
> > > with a huge cache that might not all be necessary?
> >
> > In the current patchset, fsverity_blockbuf isn't used for writes (despite the
> > comment saying it is). I think that's fine and that it keeps things simpler.
> > Filesystems can still cache the blocks that are passed to
> > ->write_merkle_tree_block if they want to. That already happens on ext4 and
> > f2fs; FS_IOC_ENABLE_VERITY results in the Merkle tree being in the pagecache.
>
> Oh! Maybe it should do that then. At least the root block?
>
> --D
>
> > - Eric
> >
>
next prev parent reply other threads:[~2024-03-09 16:24 UTC|newest]
Thread overview: 94+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-04 19:10 [PATCH v5 00/24] fs-verity support for XFS Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 01/24] fsverity: remove hash page spin lock Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 02/24] xfs: add parent pointer support to attribute code Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 03/24] xfs: define parent pointer ondisk extended attribute format Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 04/24] xfs: add parent pointer validator functions Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 05/24] fs: add FS_XFLAG_VERITY for verity files Andrey Albershteyn
2024-03-04 22:35 ` Eric Biggers
2024-03-07 21:39 ` Darrick J. Wong
2024-03-07 22:06 ` Eric Biggers
2024-03-04 19:10 ` [PATCH v5 06/24] fsverity: pass tree_blocksize to end_enable_verity() Andrey Albershteyn
2024-03-05 0:52 ` Eric Biggers
2024-03-06 16:30 ` Darrick J. Wong
2024-03-07 22:02 ` Eric Biggers
2024-03-08 3:46 ` Darrick J. Wong
2024-03-08 4:40 ` Eric Biggers
2024-03-11 22:38 ` Darrick J. Wong
2024-03-12 15:13 ` David Hildenbrand
2024-03-12 15:33 ` David Hildenbrand
2024-03-12 16:44 ` Darrick J. Wong
2024-03-13 12:29 ` David Hildenbrand
2024-03-13 17:19 ` Darrick J. Wong
2024-03-13 19:10 ` David Hildenbrand
2024-03-13 21:03 ` David Hildenbrand
2024-03-08 21:34 ` Dave Chinner
2024-03-09 16:19 ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 07/24] fsverity: support block-based Merkle tree caching Andrey Albershteyn
2024-03-06 3:56 ` Eric Biggers
2024-03-07 21:54 ` Darrick J. Wong
2024-03-07 22:49 ` Eric Biggers
2024-03-08 3:50 ` Darrick J. Wong
2024-03-09 16:24 ` Darrick J. Wong [this message]
2024-03-11 23:22 ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 08/24] fsverity: add per-sb workqueue for post read processing Andrey Albershteyn
2024-03-05 1:08 ` Eric Biggers
2024-03-07 21:58 ` Darrick J. Wong
2024-03-07 22:26 ` Eric Biggers
2024-03-08 3:53 ` Darrick J. Wong
2024-03-07 22:55 ` Dave Chinner
2024-03-04 19:10 ` [PATCH v5 09/24] fsverity: add tracepoints Andrey Albershteyn
2024-03-05 0:33 ` Eric Biggers
2024-03-04 19:10 ` [PATCH v5 10/24] iomap: integrate fs-verity verification into iomap's read path Andrey Albershteyn
2024-03-04 23:39 ` Eric Biggers
2024-03-07 22:06 ` Darrick J. Wong
2024-03-07 22:19 ` Eric Biggers
2024-03-07 23:38 ` Dave Chinner
2024-03-07 23:45 ` Darrick J. Wong
2024-03-08 0:47 ` Dave Chinner
2024-03-07 23:59 ` Eric Biggers
2024-03-08 1:20 ` Dave Chinner
2024-03-08 3:16 ` Eric Biggers
2024-03-08 3:57 ` Darrick J. Wong
2024-03-08 3:22 ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 11/24] xfs: add XBF_VERITY_SEEN xfs_buf flag Andrey Albershteyn
2024-03-07 22:46 ` Darrick J. Wong
2024-03-08 1:59 ` Dave Chinner
2024-03-08 3:31 ` Darrick J. Wong
2024-03-09 16:28 ` Darrick J. Wong
2024-03-11 0:26 ` Dave Chinner
2024-03-11 15:25 ` Darrick J. Wong
2024-03-12 2:43 ` Eric Biggers
2024-03-12 4:15 ` Darrick J. Wong
2024-03-12 2:45 ` Darrick J. Wong
2024-03-12 7:01 ` Dave Chinner
2024-03-12 20:04 ` Darrick J. Wong
2024-03-12 21:45 ` Dave Chinner
2024-03-04 19:10 ` [PATCH v5 12/24] xfs: add XFS_DA_OP_BUFFER to make xfs_attr_get() return buffer Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 13/24] xfs: add attribute type for fs-verity Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 14/24] xfs: make xfs_buf_get() to take XBF_* flags Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 15/24] xfs: add XBF_DOUBLE_ALLOC to increase size of the buffer Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 16/24] xfs: add fs-verity ro-compat flag Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 17/24] xfs: add inode on-disk VERITY flag Andrey Albershteyn
2024-03-07 22:06 ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 18/24] xfs: initialize fs-verity on file open and cleanup on inode destruction Andrey Albershteyn
2024-03-07 22:09 ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 19/24] xfs: don't allow to enable DAX on fs-verity sealsed inode Andrey Albershteyn
2024-03-07 22:09 ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 20/24] xfs: disable direct read path for fs-verity files Andrey Albershteyn
2024-03-07 22:11 ` Darrick J. Wong
2024-03-12 12:02 ` Andrey Albershteyn
2024-03-12 16:36 ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 21/24] xfs: add fs-verity support Andrey Albershteyn
2024-03-06 4:55 ` Eric Biggers
2024-03-06 5:01 ` Eric Biggers
2024-03-07 23:10 ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 22/24] xfs: make scrub aware of verity dinode flag Andrey Albershteyn
2024-03-07 22:18 ` Darrick J. Wong
2024-03-12 12:10 ` Andrey Albershteyn
2024-03-12 16:38 ` Darrick J. Wong
2024-03-13 1:35 ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 23/24] xfs: add fs-verity ioctls Andrey Albershteyn
2024-03-07 22:14 ` Darrick J. Wong
2024-03-12 12:42 ` Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 24/24] xfs: enable ro-compat fs-verity flag Andrey Albershteyn
2024-03-07 22:16 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240309162441.GP1927156@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=aalbersh@redhat.com \
--cc=chandan.babu@oracle.com \
--cc=ebiggers@kernel.org \
--cc=fsverity@lists.linux.dev \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox