linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Biggers <ebiggers@kernel.org>
To: Andrey Albershteyn <aalbersh@redhat.com>
Cc: fsverity@lists.linux.dev, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, chandan.babu@oracle.com,
	djwong@kernel.org
Subject: Re: [PATCH v4 07/25] fsverity: support block-based Merkle tree caching
Date: Thu, 22 Feb 2024 21:24:59 -0800	[thread overview]
Message-ID: <20240223052459.GC25631@sol.localdomain> (raw)
In-Reply-To: <20240212165821.1901300-8-aalbersh@redhat.com>

On Mon, Feb 12, 2024 at 05:58:04PM +0100, Andrey Albershteyn wrote:
> diff --git a/fs/verity/read_metadata.c b/fs/verity/read_metadata.c
> index f58432772d9e..7e153356e7bc 100644
> --- a/fs/verity/read_metadata.c
> +++ b/fs/verity/read_metadata.c
[...]
>  	/*
> -	 * Iterate through each Merkle tree page in the requested range and copy
> -	 * the requested portion to userspace.  Note that the Merkle tree block
> -	 * size isn't important here, as we are returning a byte stream; i.e.,
> -	 * we can just work with pages even if the tree block size != PAGE_SIZE.
> +	 * Iterate through each Merkle tree block in the requested range and
> +	 * copy the requested portion to userspace. Note that we are returning
> +	 * a byte stream, so PAGE_SIZE & block_size are not important here.

The block size *is* important here now, because this code is now working with
the data in blocks.  Maybe just delete the last sentence from the comment.

> +		fsverity_drop_block(inode, &block);
> +		block.kaddr = NULL;

Either the 'block.kaddr = NULL' should not be here, or it should be done
automatically in fsverity_drop_block().

> +		num_ra_pages = level == 0 ?
> +			min(max_ra_pages, params->tree_pages - hpage_idx) : 0;
> +		err = fsverity_read_merkle_tree_block(
> +			inode, hblock_idx << params->log_blocksize, block,
> +			params->log_blocksize, num_ra_pages);

'hblock_idx << params->log_blocksize' needs to be
'(u64)hblock_idx << params->log_blocksize'

>  	for (; level > 0; level--) {
> -		kunmap_local(hblocks[level - 1].addr);
> -		put_page(hblocks[level - 1].page);
> +		fsverity_drop_block(inode, &hblocks[level - 1].block);
>  	}

Braces should be removed above

> +/**
> + * fsverity_invalidate_range() - invalidate range of Merkle tree blocks
> + * @inode: inode to which this Merkle tree blocks belong
> + * @offset: offset into the Merkle tree
> + * @size: number of bytes to invalidate starting from @offset

Maybe use @pos instead of @offset, to make it clear that it's in bytes.

But, what happens if the region passed is not Merkle tree block aligned?
Perhaps this function should operate on blocks, to avoid that case?

> + * Note! As this function clears fs-verity bitmap and can be run from multiple
> + * threads simultaneously, filesystem has to take care of operation ordering
> + * while invalidating Merkle tree and caching it. See fsverity_invalidate_page()
> + * as reference.

I'm not sure what this means.  What specifically does the filesystem have to do?

> +/* fsverity_invalidate_page() - invalidate Merkle tree blocks in the page

Is this intended to be kerneldoc?  Kerneldoc comments start with /**

Also, this function is only used within fs/verity/verify.c itself.  So it should
be static, and it shouldn't be declared in include/linux/fsverity.h.

> + * @inode: inode to which this Merkle tree blocks belong
> + * @page: page which contains blocks which need to be invalidated
> + * @index: index of the first Merkle tree block in the page

The only value that is assigned to index is 'pos >> PAGE_SHIFT', which implies
it is in units of pages, not Merkle tree blocks.  Which is correct?

> + *
> + * This function invalidates "verified" state of all Merkle tree blocks within
> + * the 'page'.
> + *
> + * When the Merkle tree block size and page size are the same, then the
> + * ->hash_block_verified bitmap isn't allocated, and we use PG_checked
> + * to directly indicate whether the page's block has been verified. This
> + * function does nothing in this case as page is invalidated by evicting from
> + * the memory.
> + *
> + * Using PG_checked also guarantees that we re-verify hash pages that
> + * get evicted and re-instantiated from the backing storage, as new
> + * pages always start out with PG_checked cleared.

This comment duplicates information from the comment in the function itself.

> +void fsverity_drop_block(struct inode *inode,
> +		struct fsverity_blockbuf *block)
> +{
> +	if (inode->i_sb->s_vop->drop_block)
> +		inode->i_sb->s_vop->drop_block(block);
> +	else {
> +		struct page *page = (struct page *)block->context;
> +
> +		/* Merkle tree block size == PAGE_SIZE; */
> +		if (block->verified)
> +			SetPageChecked(page);
> +
> +		kunmap_local(block->kaddr);
> +		put_page(page);
> +	}
> +}

I don't think this is the logical place for the call to SetPageChecked().
verity_data_block() currently does:

        if (vi->hash_block_verified)
                set_bit(hblock_idx, vi->hash_block_verified);
        else
                SetPageChecked(page);

You're proposing moving the SetPageChecked() to fsverity_drop_block().  Why?  We
should try to do things in a consistent place.

Similarly, I don't see why is_hash_block_verified() shouldn't keep the
PageChecked().

If we just keep PG_checked be get and set in the same places it currently is,
then adding fsverity_blockbuf::verified wouldn't be necessary.

Maybe you intended to move the awareness of PG_checked out of fs/verity/ and
into the filesystems?  Your change in how PG_checked is get and set is sort of a
step towards that, but it doesn't complete it.  It doesn't make sense to leave
in this half-finished state.  IMO, keeping fs/verity/ aware of PG_checked is
fine for now.  It avoids the need for some indirect calls, which is nice.

> +/**
> + * struct fsverity_blockbuf - Merkle Tree block
> + * @kaddr: virtual address of the block's data
> + * @size: buffer size

Is "buffer size" different from block size?

> + * @verified: true if block is verified against Merkle tree

This field has confusing semantics, as it's not used by the filesystems but only
by fs/verity/ internally.  As per my feedback above, I don't think this field is
necessary.

> + * Buffer containing single Merkle Tree block. These buffers are passed
> + *  - to filesystem, when fs-verity is building/writing merkel tree,
> + *  - from filesystem, when fs-verity is reading merkle tree from a disk.
> + * Filesystems sets kaddr together with size to point to a memory which contains
> + * Merkle tree block. Same is done by fs-verity when Merkle tree is need to be
> + * written down to disk.

Writes actually still use fsverity_operations::write_merkle_tree_block(), which
does not use this struct.

> + * For Merkle tree block == PAGE_SIZE, fs-verity sets verified flag to true if
> + * block in the buffer was verified.

Again, I think we can do without this field.

> +	/**
> +	 * Read a Merkle tree block of the given inode.
> +	 * @inode: the inode
> +	 * @pos: byte offset of the block within the Merkle tree
> +	 * @block: block buffer for filesystem to point it to the block
> +	 * @log_blocksize: size of the expected block

Presumably @log_blocksize is the log2 of the size of the block?

> +	 * @num_ra_pages: The number of pages with blocks that should be
> +	 *		  prefetched starting at @index if the page at @index
> +	 *		  isn't already cached.  Implementations may ignore this
> +	 *		  argument; it's only a performance optimization.

There's no parameter named @index.

> +	 * As filesystem does caching of the blocks, this functions needs to tell
> +	 * fsverity which blocks are not valid anymore (were evicted from memory)
> +	 * by calling fsverity_invalidate_range().

This function only reads a single block, so what does this mean by "blocks"?

Since there's only one block being read, why isn't the validation status just
conveyed through a bool in fsverity_blockbuf?

> +	/**
> +	 * Release the reference to a Merkle tree block
> +	 *
> +	 * @page: the block to release

@block, not @page

- Eric

  reply	other threads:[~2024-02-23  5:25 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-12 16:57 [PATCH v4 00/25] fs-verity support for XFS Andrey Albershteyn
2024-02-12 16:57 ` [PATCH v4 01/25] fsverity: remove hash page spin lock Andrey Albershteyn
2024-02-12 16:57 ` [PATCH v4 02/25] xfs: add parent pointer support to attribute code Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 03/25] xfs: define parent pointer ondisk extended attribute format Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 04/25] xfs: add parent pointer validator functions Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 05/25] fs: add FS_XFLAG_VERITY for verity files Andrey Albershteyn
2024-02-23  4:23   ` Eric Biggers
2024-02-23 12:55     ` Andrey Albershteyn
2024-02-23 17:59       ` Eric Biggers
2024-02-12 16:58 ` [PATCH v4 06/25] fsverity: pass log_blocksize to end_enable_verity() Andrey Albershteyn
2024-02-15 21:45   ` Dave Chinner
2024-02-16 16:18     ` Andrey Albershteyn
2024-02-23  4:26   ` Eric Biggers
2024-02-23 13:02     ` Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 07/25] fsverity: support block-based Merkle tree caching Andrey Albershteyn
2024-02-23  5:24   ` Eric Biggers [this message]
2024-02-23 16:02     ` Andrey Albershteyn
2024-02-23 18:07       ` Eric Biggers
2024-02-24 14:10         ` Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 08/25] fsverity: calculate readahead in bytes instead of pages Andrey Albershteyn
2024-02-23  5:29   ` Eric Biggers
2024-02-12 16:58 ` [PATCH v4 09/25] fsverity: add tracepoints Andrey Albershteyn
2024-02-23  5:31   ` Eric Biggers
2024-02-23 13:23     ` Andrey Albershteyn
2024-02-23 18:27       ` Eric Biggers
2024-02-26  2:24         ` Dave Chinner
2024-02-12 16:58 ` [PATCH v4 10/25] iomap: integrate fsverity verification into iomap's read path Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 11/25] xfs: add XBF_VERITY_SEEN xfs_buf flag Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 12/25] xfs: add XFS_DA_OP_BUFFER to make xfs_attr_get() return buffer Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 13/25] xfs: introduce workqueue for post read IO work Andrey Albershteyn
2024-02-15 22:11   ` Dave Chinner
2024-02-16 16:29     ` Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 14/25] xfs: add attribute type for fs-verity Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 15/25] xfs: make xfs_buf_get() to take XBF_* flags Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 16/25] xfs: add XBF_DOUBLE_ALLOC to increase size of the buffer Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 17/25] xfs: add fs-verity ro-compat flag Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 18/25] xfs: add inode on-disk VERITY flag Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 19/25] xfs: initialize fs-verity on file open and cleanup on inode destruction Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 20/25] xfs: don't allow to enable DAX on fs-verity sealsed inode Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 21/25] xfs: disable direct read path for fs-verity files Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 22/25] xfs: add fs-verity support Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 23/25] xfs: make scrub aware of verity dinode flag Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 24/25] xfs: add fs-verity ioctls Andrey Albershteyn
2024-02-12 16:58 ` [PATCH v4 25/25] xfs: enable ro-compat fs-verity flag Andrey Albershteyn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240223052459.GC25631@sol.localdomain \
    --to=ebiggers@kernel.org \
    --cc=aalbersh@redhat.com \
    --cc=chandan.babu@oracle.com \
    --cc=djwong@kernel.org \
    --cc=fsverity@lists.linux.dev \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).