Linux fsverity development list
* [PATCH] fsverity: remove hash page spin lock
@ 2024-01-31 18:20 Andrey Albershteyn
  2024-02-01  5:37 ` Eric Biggers
  0 siblings, 1 reply; 2+ messages in thread
From: Andrey Albershteyn @ 2024-01-31 18:20 UTC (permalink / raw)
  To: fsverity, ebiggers; +Cc: Andrey Albershteyn

The spin lock is not necessary here, as it can be replaced with a
memory barrier, which should perform better.

When the Merkle tree block size differs from the page size,
is_hash_block_verified() modifies two things during the check: a
bitmap and the page's PG_checked flag.

Each bit in the bitmap represents the verification status of one
Merkle tree block. The PG_checked flag tells whether the page was
just re-instantiated or was already in the pagecache. Both of these
states are shared between verification threads. A page which was
just re-instantiated cannot have already-verified blocks (bits set
in the bitmap).

The spin lock was used to allow only one thread to modify both of
these states and to keep the operations ordered. The only requirement
here is that PG_checked is set strictly after the bitmap is updated.
That way, other threads which see PG_checked=1 (page cached) know
that the bitmap is up-to-date. Otherwise, if PG_checked were set
before the bitmap is cleared, another thread could see bit=1 and
therefore skip verification of that Merkle tree block.

However, there is still the case where one thread sets a bit in
verify_data_block() while another thread clears it in
is_hash_block_verified(). This can happen if two threads reach the
!PageChecked branch and one of them is rescheduled before resetting
the bitmap. This is fine, as at worst the blocks are re-verified in
each thread.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
---

Notes:
    I tried removing the spinlock/barrier and triggering the case where a
    thread gets rescheduled so that verification is missed, but without
    success. The test was a trimmed generic/574 reading a file with a
    single corrupted block and a single Merkle tree block, with stress-ng
    allocating a lot of memory.
    
    The '-s ext4_4k -s ext4_1k -g verity' run passed without problems.
    generic/579 seems to perform the same.

 fs/verity/fsverity_private.h |  1 -
 fs/verity/open.c             |  1 -
 fs/verity/verify.c           | 21 +++++++++++++--------
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index d071a6e32581..9611eeae3527 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -69,7 +69,6 @@ struct fsverity_info {
 	u8 file_digest[FS_VERITY_MAX_DIGEST_SIZE];
 	const struct inode *inode;
 	unsigned long *hash_block_verified;
-	spinlock_t hash_page_init_lock;
 };
 
 #define FS_VERITY_MAX_SIGNATURE_SIZE	(FS_VERITY_MAX_DESCRIPTOR_SIZE - \
diff --git a/fs/verity/open.c b/fs/verity/open.c
index 6c31a871b84b..fdeb95eca3af 100644
--- a/fs/verity/open.c
+++ b/fs/verity/open.c
@@ -239,7 +239,6 @@ struct fsverity_info *fsverity_create_info(const struct inode *inode,
 			err = -ENOMEM;
 			goto fail;
 		}
-		spin_lock_init(&vi->hash_page_init_lock);
 	}
 
 	return vi;
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 904ccd7e8e16..7759f6dbd16f 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -46,20 +46,25 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
 	 * the hash page is newly instantiated or not.
 	 *
 	 * The first thread that sees PG_checked=0 must clear the corresponding
-	 * bitmap bits, then set PG_checked=1.  This requires a spinlock.  To
-	 * avoid having to take this spinlock in the common case of
-	 * PG_checked=1, we start with an opportunistic lockless read.
+	 * bitmap bits, then set PG_checked=1.  The requirement is that
+	 * PG_checked=1 is set only after the bits are cleared.  Otherwise,
+	 * another thread could observe PG_checked=1 together with a stale
+	 * bit still set to 1, conclude the block was already verified, and
+	 * wrongly skip its verification.  A memory barrier is used to
+	 * enforce this ordering.
+	 *
+	 * However, two threads may both enter the PG_checked=0 section.
+	 * This is fine, as both of them will clear the bits and do the
+	 * same verification work.  This is not expected to happen often.
 	 */
 	if (PageChecked(hpage)) {
 		/*
 		 * A read memory barrier is needed here to give ACQUIRE
 		 * semantics to the above PageChecked() test.
+		 *
+	 * This pairs with the smp_wmb() in the PG_checked=0 path.
 		 */
 		smp_rmb();
-		return test_bit(hblock_idx, vi->hash_block_verified);
-	}
-	spin_lock(&vi->hash_page_init_lock);
-	if (PageChecked(hpage)) {
 		verified = test_bit(hblock_idx, vi->hash_block_verified);
 	} else {
 		blocks_per_page = vi->tree_params.blocks_per_page;
@@ -74,7 +79,7 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
 		SetPageChecked(hpage);
 		verified = false;
 	}
-	spin_unlock(&vi->hash_page_init_lock);
+
 	return verified;
 }
 
-- 
2.40.1



* Re: [PATCH] fsverity: remove hash page spin lock
  2024-01-31 18:20 [PATCH] fsverity: remove hash page spin lock Andrey Albershteyn
@ 2024-02-01  5:37 ` Eric Biggers
  0 siblings, 0 replies; 2+ messages in thread
From: Eric Biggers @ 2024-02-01  5:37 UTC (permalink / raw)
  To: Andrey Albershteyn; +Cc: fsverity

On Wed, Jan 31, 2024 at 07:20:23PM +0100, Andrey Albershteyn wrote:
> diff --git a/fs/verity/verify.c b/fs/verity/verify.c
> index 904ccd7e8e16..7759f6dbd16f 100644
> --- a/fs/verity/verify.c
> +++ b/fs/verity/verify.c
> @@ -46,20 +46,25 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
>  	 * the hash page is newly instantiated or not.
>  	 *
>  	 * The first thread that sees PG_checked=0 must clear the corresponding
> -	 * bitmap bits, then set PG_checked=1.  This requires a spinlock.  To
> -	 * avoid having to take this spinlock in the common case of
> -	 * PG_checked=1, we start with an opportunistic lockless read.
> +	 * bitmap bits, then set PG_checked=1.  The requirement is that
> +	 * PG_checked=1 is set only after the bits are cleared.  Otherwise,
> +	 * another thread could observe PG_checked=1 together with a stale
> +	 * bit still set to 1, conclude the block was already verified, and
> +	 * wrongly skip its verification.  A memory barrier is used to
> +	 * enforce this ordering.
> +	 *
> +	 * However, two threads may both enter the PG_checked=0 section.
> +	 * This is fine, as both of them will clear the bits and do the
> +	 * same verification work.  This is not expected to happen often.
>  	 */
>  	if (PageChecked(hpage)) {
>  		/*
>  		 * A read memory barrier is needed here to give ACQUIRE
>  		 * semantics to the above PageChecked() test.
> +		 *
> +	 * This pairs with the smp_wmb() in the PG_checked=0 path.
>  		 */
>  		smp_rmb();
> -		return test_bit(hblock_idx, vi->hash_block_verified);
> -	}
> -	spin_lock(&vi->hash_page_init_lock);
> -	if (PageChecked(hpage)) {
>  		verified = test_bit(hblock_idx, vi->hash_block_verified);
>  	} else {
>  		blocks_per_page = vi->tree_params.blocks_per_page;
> @@ -74,7 +79,7 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
>  		SetPageChecked(hpage);
>  		verified = false;
>  	}
> -	spin_unlock(&vi->hash_page_init_lock);
> +
>  	return verified;
>  }

Thanks!  I made some changes to the comment to make the explanation more
complete and hopefully easier to understand, and removed the 'verified' variable
which is no longer necessary.  Sent out v2 at
https://lore.kernel.org/fsverity/20240201052813.68380-1-ebiggers@kernel.org.
Let me know if it looks good.

- Eric

