public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Eric Biggers <ebiggers@kernel.org>
Cc: Andrey Albershteyn <aalbersh@redhat.com>,
	fsverity@lists.linux.dev, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, chandan.babu@oracle.com
Subject: Re: [PATCH v5 07/24] fsverity: support block-based Merkle tree caching
Date: Thu, 7 Mar 2024 19:50:36 -0800	[thread overview]
Message-ID: <20240308035036.GL1927156@frogsfrogsfrogs> (raw)
In-Reply-To: <20240307224903.GE1799@sol.localdomain>

On Thu, Mar 07, 2024 at 02:49:03PM -0800, Eric Biggers wrote:
> On Thu, Mar 07, 2024 at 01:54:01PM -0800, Darrick J. Wong wrote:
> > On Tue, Mar 05, 2024 at 07:56:22PM -0800, Eric Biggers wrote:
> > > On Mon, Mar 04, 2024 at 08:10:30PM +0100, Andrey Albershteyn wrote:
> > > > diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
> > > > index b3506f56e180..dad33e6ff0d6 100644
> > > > --- a/fs/verity/fsverity_private.h
> > > > +++ b/fs/verity/fsverity_private.h
> > > > @@ -154,4 +154,12 @@ static inline void fsverity_init_signature(void)
> > > >  
> > > >  void __init fsverity_init_workqueue(void);
> > > >  
> > > > +/*
> > > > + * Drop 'block' obtained with ->read_merkle_tree_block(). Calls out back to
> > > > + * filesystem if ->drop_block() is set, otherwise, drop the reference in the
> > > > + * block->context.
> > > > + */
> > > > +void fsverity_drop_block(struct inode *inode,
> > > > +			 struct fsverity_blockbuf *block);
> > > > +
> > > >  #endif /* _FSVERITY_PRIVATE_H */
> > > 
> > > This should be paired with a helper function that reads a Merkle tree block by
> > > calling ->read_merkle_tree_block or ->read_merkle_tree_page as needed.  Besides
> > > being consistent with having a helper function for drop, this would prevent code
> > > duplication between verify_data_block() and fsverity_read_merkle_tree().
> > > 
> > > I recommend that it look like this:
> > > 
> > > int fsverity_read_merkle_tree_block(struct inode *inode, u64 pos,
> > > 				    unsigned long ra_bytes,
> > > 				    struct fsverity_blockbuf *block);
> > > 
> > > 'pos' would be the byte position of the block in the Merkle tree, and 'ra_bytes'
> > > would be the number of bytes for the filesystem to (optionally) readahead if the
> > > block is not yet cached.  I think that things work out simpler if these values
> > > are measured in bytes, not blocks.  'block' would be at the end because it's an
> > > output, and it can be confusing to interleave inputs and outputs in parameters.
> > 
> > FWIW I don't really like 'pos' here because that's usually short for
> > "file position", which is a byte, and this looks a lot more like a
> > merkle tree block number.
> > 
> > u64 blkno?
> > 
> > Or better yet use a typedef ("merkle_blkno_t") to make it really clear
> > when we're dealing with a tree block number.  Ignore checkpatch
> > complaining about typeedefs. :)
> 
> My suggestion is for 'pos' to be a byte position, in alignment with the
> pagecache naming convention as well as ->write_merkle_tree_block which currently
> uses a 'u64 pos' byte position too.
> 
> It would also work for it to be a block index, in which case it should be named
> 'index' or 'blkno' and have type unsigned long.  I *think* that things work out
> a bit cleaner if it's a byte position, but I could be convinced that it's
> actually better for it to be a block index.

It's probably cleaner (or at least willy says so) to measure everything
in bytes.  The only place that gets messy is if we want to cache merkle
tree blocks in an xarray or something, in which case we'd want to >> by
the block_shift to avoid leaving gaps.

> > > How about changing the prototype to:
> > > 
> > > void fsverity_invalidate_merkle_tree_block(struct inode *inode, u64 pos);
> > >
> > > Also, is it a kernel bug for the pos to be beyond the end of the Merkle tree, or
> > > can it happen in cases like filesystem corruption?  If it can only happen due to
> > > kernel bugs, WARN_ON_ONCE() might be more appropriate than an error message.
> > 
> > I think XFS only passes to _invalidate_* the same pos that was passed to
> > ->read_merkle_tree_block, so this is a kernel bug, not a fs corruption
> > problem.
> > 
> > Perhaps this function ought to note that @pos is supposed to be the same
> > value that was given to ->read_merkle_tree_block?
> > 
> > Or: make the implementations return 1 for "reloaded from disk", 0 for
> > "still in cache", or a negative error code.  Then fsverity can call
> > the invalidation routine itself and XFS doesn't have to worry about this
> > part.
> > 
> > (I think?  I have questions about the xfs_invalidate_blocks function.)
> 
> It looks like XFS can invalidate blocks other than the one being read by
> ->read_merkle_tree_block.
> 
> If it really was only a matter of the single block being read, then it would
> indeed be simpler to just make it a piece of information returned from
> ->read_merkle_tree_block.
> 
> If the generic invalidation function is needed, it needs to be clearly
> documented when filesystems are expected to invalidate blocks.

I /think/ the generic invalidation is only necessary with the weird
DOUBLE_ALLOC thing.  If the other two implementations (ext4/f2fs)
haven't needed a "nuke from orbit" function then xfs shouldn't require
one too.

> > > > +
> > > > +	/**
> > > > +	 * Release the reference to a Merkle tree block
> > > > +	 *
> > > > +	 * @block: the block to release
> > > > +	 *
> > > > +	 * This is called when fs-verity is done with a block obtained with
> > > > +	 * ->read_merkle_tree_block().
> > > > +	 */
> > > > +	void (*drop_block)(struct fsverity_blockbuf *block);
> > > 
> > > drop_merkle_tree_block, so that it's clearly paired with read_merkle_tree_block
> > 
> > Yep.  I noticed that xfs_verity.c doesn't put them together, which made
> > me wonder if the write_merkle_tree_block path made use of that.  It
> > doesn't, AFAICT.
> > 
> > And I think the reason is that when we're setting up the merkle tree,
> > we want to stream the contents straight to disk instead of ending up
> > with a huge cache that might not all be necessary?
> 
> In the current patchset, fsverity_blockbuf isn't used for writes (despite the
> comment saying it is).  I think that's fine and that it keeps things simpler.
> Filesystems can still cache the blocks that are passed to
> ->write_merkle_tree_block if they want to.  That already happens on ext4 and
> f2fs; FS_IOC_ENABLE_VERITY results in the Merkle tree being in the pagecache.

Oh!  Maybe it should do that then.  At least the root block?

--D

> - Eric
> 

  reply	other threads:[~2024-03-08  3:50 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-04 19:10 [PATCH v5 00/24] fs-verity support for XFS Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 01/24] fsverity: remove hash page spin lock Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 02/24] xfs: add parent pointer support to attribute code Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 03/24] xfs: define parent pointer ondisk extended attribute format Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 04/24] xfs: add parent pointer validator functions Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 05/24] fs: add FS_XFLAG_VERITY for verity files Andrey Albershteyn
2024-03-04 22:35   ` Eric Biggers
2024-03-07 21:39     ` Darrick J. Wong
2024-03-07 22:06       ` Eric Biggers
2024-03-04 19:10 ` [PATCH v5 06/24] fsverity: pass tree_blocksize to end_enable_verity() Andrey Albershteyn
2024-03-05  0:52   ` Eric Biggers
2024-03-06 16:30     ` Darrick J. Wong
2024-03-07 22:02       ` Eric Biggers
2024-03-08  3:46         ` Darrick J. Wong
2024-03-08  4:40           ` Eric Biggers
2024-03-11 22:38             ` Darrick J. Wong
2024-03-12 15:13               ` David Hildenbrand
2024-03-12 15:33                 ` David Hildenbrand
2024-03-12 16:44                   ` Darrick J. Wong
2024-03-13 12:29                     ` David Hildenbrand
2024-03-13 17:19                       ` Darrick J. Wong
2024-03-13 19:10                         ` David Hildenbrand
2024-03-13 21:03                           ` David Hildenbrand
2024-03-08 21:34           ` Dave Chinner
2024-03-09 16:19             ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 07/24] fsverity: support block-based Merkle tree caching Andrey Albershteyn
2024-03-06  3:56   ` Eric Biggers
2024-03-07 21:54     ` Darrick J. Wong
2024-03-07 22:49       ` Eric Biggers
2024-03-08  3:50         ` Darrick J. Wong [this message]
2024-03-09 16:24           ` Darrick J. Wong
2024-03-11 23:22   ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 08/24] fsverity: add per-sb workqueue for post read processing Andrey Albershteyn
2024-03-05  1:08   ` Eric Biggers
2024-03-07 21:58     ` Darrick J. Wong
2024-03-07 22:26       ` Eric Biggers
2024-03-08  3:53         ` Darrick J. Wong
2024-03-07 22:55       ` Dave Chinner
2024-03-04 19:10 ` [PATCH v5 09/24] fsverity: add tracepoints Andrey Albershteyn
2024-03-05  0:33   ` Eric Biggers
2024-03-04 19:10 ` [PATCH v5 10/24] iomap: integrate fs-verity verification into iomap's read path Andrey Albershteyn
2024-03-04 23:39   ` Eric Biggers
2024-03-07 22:06     ` Darrick J. Wong
2024-03-07 22:19       ` Eric Biggers
2024-03-07 23:38     ` Dave Chinner
2024-03-07 23:45       ` Darrick J. Wong
2024-03-08  0:47         ` Dave Chinner
2024-03-07 23:59       ` Eric Biggers
2024-03-08  1:20         ` Dave Chinner
2024-03-08  3:16           ` Eric Biggers
2024-03-08  3:57             ` Darrick J. Wong
2024-03-08  3:22           ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 11/24] xfs: add XBF_VERITY_SEEN xfs_buf flag Andrey Albershteyn
2024-03-07 22:46   ` Darrick J. Wong
2024-03-08  1:59     ` Dave Chinner
2024-03-08  3:31       ` Darrick J. Wong
2024-03-09 16:28         ` Darrick J. Wong
2024-03-11  0:26           ` Dave Chinner
2024-03-11 15:25             ` Darrick J. Wong
2024-03-12  2:43               ` Eric Biggers
2024-03-12  4:15                 ` Darrick J. Wong
2024-03-12  2:45               ` Darrick J. Wong
2024-03-12  7:01                 ` Dave Chinner
2024-03-12 20:04                   ` Darrick J. Wong
2024-03-12 21:45                     ` Dave Chinner
2024-03-04 19:10 ` [PATCH v5 12/24] xfs: add XFS_DA_OP_BUFFER to make xfs_attr_get() return buffer Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 13/24] xfs: add attribute type for fs-verity Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 14/24] xfs: make xfs_buf_get() to take XBF_* flags Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 15/24] xfs: add XBF_DOUBLE_ALLOC to increase size of the buffer Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 16/24] xfs: add fs-verity ro-compat flag Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 17/24] xfs: add inode on-disk VERITY flag Andrey Albershteyn
2024-03-07 22:06   ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 18/24] xfs: initialize fs-verity on file open and cleanup on inode destruction Andrey Albershteyn
2024-03-07 22:09   ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 19/24] xfs: don't allow to enable DAX on fs-verity sealsed inode Andrey Albershteyn
2024-03-07 22:09   ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 20/24] xfs: disable direct read path for fs-verity files Andrey Albershteyn
2024-03-07 22:11   ` Darrick J. Wong
2024-03-12 12:02     ` Andrey Albershteyn
2024-03-12 16:36       ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 21/24] xfs: add fs-verity support Andrey Albershteyn
2024-03-06  4:55   ` Eric Biggers
2024-03-06  5:01     ` Eric Biggers
2024-03-07 23:10   ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 22/24] xfs: make scrub aware of verity dinode flag Andrey Albershteyn
2024-03-07 22:18   ` Darrick J. Wong
2024-03-12 12:10     ` Andrey Albershteyn
2024-03-12 16:38       ` Darrick J. Wong
2024-03-13  1:35         ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 23/24] xfs: add fs-verity ioctls Andrey Albershteyn
2024-03-07 22:14   ` Darrick J. Wong
2024-03-12 12:42     ` Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 24/24] xfs: enable ro-compat fs-verity flag Andrey Albershteyn
2024-03-07 22:16   ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240308035036.GL1927156@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=aalbersh@redhat.com \
    --cc=chandan.babu@oracle.com \
    --cc=ebiggers@kernel.org \
    --cc=fsverity@lists.linux.dev \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox