* [PATCH 1/7] dm-verity: move hash algorithm setup into its own function
2024-07-02 6:48 [PATCH 0/7] dm-verity cleanups and optimizations Eric Biggers
@ 2024-07-02 6:48 ` Eric Biggers
2024-07-02 6:48 ` [PATCH 2/7] dm-verity: move data hash mismatch handling " Eric Biggers
` (6 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Eric Biggers @ 2024-07-02 6:48 UTC (permalink / raw)
To: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka
Cc: linux-kernel, Ard Biesheuvel, Sami Tolvanen, Bart Van Assche
From: Eric Biggers <ebiggers@google.com>
Move the code that sets up the hash transformation into its own
function. No change in behavior.
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
drivers/md/dm-verity-target.c | 70 +++++++++++++++++++----------------
1 file changed, 39 insertions(+), 31 deletions(-)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index bb5da66da4c1..88d2a49dca43 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -1224,10 +1224,47 @@ static int verity_parse_opt_args(struct dm_arg_set *as, struct dm_verity *v,
} while (argc && !r);
return r;
}
+static int verity_setup_hash_alg(struct dm_verity *v, const char *alg_name)
+{
+ struct dm_target *ti = v->ti;
+ struct crypto_ahash *ahash;
+
+ v->alg_name = kstrdup(alg_name, GFP_KERNEL);
+ if (!v->alg_name) {
+ ti->error = "Cannot allocate algorithm name";
+ return -ENOMEM;
+ }
+
+ ahash = crypto_alloc_ahash(alg_name, 0,
+ v->use_bh_wq ? CRYPTO_ALG_ASYNC : 0);
+ if (IS_ERR(ahash)) {
+ ti->error = "Cannot initialize hash function";
+ return PTR_ERR(ahash);
+ }
+ v->tfm = ahash;
+
+ /*
+ * dm-verity performance can vary greatly depending on which hash
+ * algorithm implementation is used. Help people debug performance
+ * problems by logging the ->cra_driver_name.
+ */
+ DMINFO("%s using implementation \"%s\"", alg_name,
+ crypto_hash_alg_common(ahash)->base.cra_driver_name);
+
+ v->digest_size = crypto_ahash_digestsize(ahash);
+ if ((1 << v->hash_dev_block_bits) < v->digest_size * 2) {
+ ti->error = "Digest size too big";
+ return -EINVAL;
+ }
+ v->ahash_reqsize = sizeof(struct ahash_request) +
+ crypto_ahash_reqsize(ahash);
+ return 0;
+}
+
/*
* Target parameters:
* <version> The current format is version 1.
* Vsn 0 is compatible with original Chromium OS releases.
* <data device>
@@ -1348,42 +1385,13 @@ static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
r = -EINVAL;
goto bad;
}
v->hash_start = num_ll;
- v->alg_name = kstrdup(argv[7], GFP_KERNEL);
- if (!v->alg_name) {
- ti->error = "Cannot allocate algorithm name";
- r = -ENOMEM;
- goto bad;
- }
-
- v->tfm = crypto_alloc_ahash(v->alg_name, 0,
- v->use_bh_wq ? CRYPTO_ALG_ASYNC : 0);
- if (IS_ERR(v->tfm)) {
- ti->error = "Cannot initialize hash function";
- r = PTR_ERR(v->tfm);
- v->tfm = NULL;
- goto bad;
- }
-
- /*
- * dm-verity performance can vary greatly depending on which hash
- * algorithm implementation is used. Help people debug performance
- * problems by logging the ->cra_driver_name.
- */
- DMINFO("%s using implementation \"%s\"", v->alg_name,
- crypto_hash_alg_common(v->tfm)->base.cra_driver_name);
-
- v->digest_size = crypto_ahash_digestsize(v->tfm);
- if ((1 << v->hash_dev_block_bits) < v->digest_size * 2) {
- ti->error = "Digest size too big";
- r = -EINVAL;
+ r = verity_setup_hash_alg(v, argv[7]);
+ if (r)
goto bad;
- }
- v->ahash_reqsize = sizeof(struct ahash_request) +
- crypto_ahash_reqsize(v->tfm);
v->root_digest = kmalloc(v->digest_size, GFP_KERNEL);
if (!v->root_digest) {
ti->error = "Cannot allocate root digest";
r = -ENOMEM;
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH 2/7] dm-verity: move data hash mismatch handling into its own function
2024-07-02 6:48 [PATCH 0/7] dm-verity cleanups and optimizations Eric Biggers
2024-07-02 6:48 ` [PATCH 1/7] dm-verity: move hash algorithm setup into its own function Eric Biggers
@ 2024-07-02 6:48 ` Eric Biggers
2024-07-02 6:48 ` [PATCH 3/7] dm-verity: make real_digest and want_digest fixed-length Eric Biggers
` (5 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Eric Biggers @ 2024-07-02 6:48 UTC (permalink / raw)
To: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka
Cc: linux-kernel, Ard Biesheuvel, Sami Tolvanen, Bart Van Assche
From: Eric Biggers <ebiggers@google.com>
Move the code that handles mismatches of data block hashes into its own
function so that it doesn't clutter up verity_verify_io().
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
drivers/md/dm-verity-target.c | 64 ++++++++++++++++++++---------------
1 file changed, 36 insertions(+), 28 deletions(-)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 88d2a49dca43..796d85526696 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -540,10 +540,42 @@ static noinline int verity_recheck(struct dm_verity *v, struct dm_verity_io *io,
mempool_free(page, &v->recheck_pool);
return r;
}
+static int verity_handle_data_hash_mismatch(struct dm_verity *v,
+ struct dm_verity_io *io,
+ struct bio *bio, sector_t blkno,
+ struct bvec_iter *start)
+{
+ if (static_branch_unlikely(&use_bh_wq_enabled) && io->in_bh) {
+ /*
+ * Error handling code (FEC included) cannot be run in the
+ * BH workqueue, so fallback to a standard workqueue.
+ */
+ return -EAGAIN;
+ }
+ if (verity_recheck(v, io, *start, blkno) == 0) {
+ if (v->validated_blocks)
+ set_bit(blkno, v->validated_blocks);
+ return 0;
+ }
+#if defined(CONFIG_DM_VERITY_FEC)
+ if (verity_fec_decode(v, io, DM_VERITY_BLOCK_TYPE_DATA, blkno,
+ NULL, start) == 0)
+ return 0;
+#endif
+ if (bio->bi_status)
+ return -EIO; /* Error correction failed; Just return error */
+
+ if (verity_handle_err(v, DM_VERITY_BLOCK_TYPE_DATA, blkno)) {
+ dm_audit_log_bio(DM_MSG_PREFIX, "verify-data", bio, blkno, 0);
+ return -EIO;
+ }
+ return 0;
+}
+
static int verity_bv_zero(struct dm_verity *v, struct dm_verity_io *io,
u8 *data, size_t len)
{
memset(data, 0, len);
return 0;
@@ -632,39 +664,15 @@ static int verity_verify_io(struct dm_verity_io *io)
if (likely(memcmp(verity_io_real_digest(v, io),
verity_io_want_digest(v, io), v->digest_size) == 0)) {
if (v->validated_blocks)
set_bit(cur_block, v->validated_blocks);
continue;
- } else if (static_branch_unlikely(&use_bh_wq_enabled) && io->in_bh) {
- /*
- * Error handling code (FEC included) cannot be run in a
- * tasklet since it may sleep, so fallback to work-queue.
- */
- return -EAGAIN;
- } else if (verity_recheck(v, io, start, cur_block) == 0) {
- if (v->validated_blocks)
- set_bit(cur_block, v->validated_blocks);
- continue;
-#if defined(CONFIG_DM_VERITY_FEC)
- } else if (verity_fec_decode(v, io, DM_VERITY_BLOCK_TYPE_DATA,
- cur_block, NULL, &start) == 0) {
- continue;
-#endif
- } else {
- if (bio->bi_status) {
- /*
- * Error correction failed; Just return error
- */
- return -EIO;
- }
- if (verity_handle_err(v, DM_VERITY_BLOCK_TYPE_DATA,
- cur_block)) {
- dm_audit_log_bio(DM_MSG_PREFIX, "verify-data",
- bio, cur_block, 0);
- return -EIO;
- }
}
+ r = verity_handle_data_hash_mismatch(v, io, bio, cur_block,
+ &start);
+ if (unlikely(r))
+ return r;
}
return 0;
}
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH 3/7] dm-verity: make real_digest and want_digest fixed-length
2024-07-02 6:48 [PATCH 0/7] dm-verity cleanups and optimizations Eric Biggers
2024-07-02 6:48 ` [PATCH 1/7] dm-verity: move hash algorithm setup into its own function Eric Biggers
2024-07-02 6:48 ` [PATCH 2/7] dm-verity: move data hash mismatch handling " Eric Biggers
@ 2024-07-02 6:48 ` Eric Biggers
2024-07-02 6:48 ` [PATCH 4/7] dm-verity: provide dma_alignment limit in io_hints Eric Biggers
` (4 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Eric Biggers @ 2024-07-02 6:48 UTC (permalink / raw)
To: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka
Cc: linux-kernel, Ard Biesheuvel, Sami Tolvanen, Bart Van Assche
From: Eric Biggers <ebiggers@google.com>
Change the digest fields in struct dm_verity_io from variable-length to
fixed-length, since their maximum length is fixed at
HASH_MAX_DIGESTSIZE, i.e. 64 bytes, which is not too big. This is
simpler and makes the fields a bit faster to access.
(HASH_MAX_DIGESTSIZE did not exist when this code was written, which may
explain why it wasn't used.)
This makes the verity_io_real_digest() and verity_io_want_digest()
functions trivial, but this patch leaves them in place temporarily since
most of their callers will go away in a later patch anyway.
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
drivers/md/dm-verity-target.c | 3 +--
drivers/md/dm-verity.h | 17 +++++++----------
2 files changed, 8 insertions(+), 12 deletions(-)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 796d85526696..4ef814a7faf4 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -1527,12 +1527,11 @@ static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
ti->error = "Cannot allocate workqueue";
r = -ENOMEM;
goto bad;
}
- ti->per_io_data_size = sizeof(struct dm_verity_io) +
- v->ahash_reqsize + v->digest_size * 2;
+ ti->per_io_data_size = sizeof(struct dm_verity_io) + v->ahash_reqsize;
r = verity_fec_ctr(v);
if (r)
goto bad;
diff --git a/drivers/md/dm-verity.h b/drivers/md/dm-verity.h
index 20b1bcf03474..5d3da9f5fc95 100644
--- a/drivers/md/dm-verity.h
+++ b/drivers/md/dm-verity.h
@@ -89,19 +89,16 @@ struct dm_verity_io {
struct work_struct work;
struct work_struct bh_work;
char *recheck_buffer;
+ u8 real_digest[HASH_MAX_DIGESTSIZE];
+ u8 want_digest[HASH_MAX_DIGESTSIZE];
+
/*
- * Three variably-size fields follow this struct:
- *
- * u8 hash_req[v->ahash_reqsize];
- * u8 real_digest[v->digest_size];
- * u8 want_digest[v->digest_size];
- *
- * To access them use: verity_io_hash_req(), verity_io_real_digest()
- * and verity_io_want_digest().
+ * This struct is followed by a variable-sized struct ahash_request of
+ * size v->ahash_reqsize. To access it, use verity_io_hash_req().
*/
};
static inline struct ahash_request *verity_io_hash_req(struct dm_verity *v,
struct dm_verity_io *io)
@@ -110,17 +107,17 @@ static inline struct ahash_request *verity_io_hash_req(struct dm_verity *v,
}
static inline u8 *verity_io_real_digest(struct dm_verity *v,
struct dm_verity_io *io)
{
- return (u8 *)(io + 1) + v->ahash_reqsize;
+ return io->real_digest;
}
static inline u8 *verity_io_want_digest(struct dm_verity *v,
struct dm_verity_io *io)
{
- return (u8 *)(io + 1) + v->ahash_reqsize + v->digest_size;
+ return io->want_digest;
}
extern int verity_for_bv_block(struct dm_verity *v, struct dm_verity_io *io,
struct bvec_iter *iter,
int (*process)(struct dm_verity *v,
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH 4/7] dm-verity: provide dma_alignment limit in io_hints
2024-07-02 6:48 [PATCH 0/7] dm-verity cleanups and optimizations Eric Biggers
` (2 preceding siblings ...)
2024-07-02 6:48 ` [PATCH 3/7] dm-verity: make real_digest and want_digest fixed-length Eric Biggers
@ 2024-07-02 6:48 ` Eric Biggers
2024-07-02 7:16 ` Christoph Hellwig
2024-07-02 6:48 ` [PATCH 5/7] dm-verity: always "map" the data blocks Eric Biggers
` (3 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Eric Biggers @ 2024-07-02 6:48 UTC (permalink / raw)
To: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka
Cc: linux-kernel, Ard Biesheuvel, Sami Tolvanen, Bart Van Assche
From: Eric Biggers <ebiggers@google.com>
Since Linux v6.1, some filesystems support submitting direct I/O that is
aligned to only dma_alignment instead of the logical_block_size
alignment that was required before. I/O that is not aligned to the
logical_block_size is difficult to handle in device-mapper targets that
do cryptographic processing of data, as it makes the units of data that
are hashed or encrypted possibly be split across pages, creating rarely
used and rarely tested edge cases.
As such, dm-crypt and dm-integrity have already opted out of this by
setting dma_alignment to 'logical_block_size - 1'.
Although dm-verity does have code that handles these cases (or at least
is intended to do so), supporting direct I/O with such a low amount of
alignment is not really useful on dm-verity devices. So, opt dm-verity
out of it too so that it's not necessary to handle these edge cases.
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
drivers/md/dm-verity-target.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 4ef814a7faf4..c6a0e3280e39 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -1021,10 +1021,12 @@ static void verity_io_hints(struct dm_target *ti, struct queue_limits *limits)
if (limits->physical_block_size < 1 << v->data_dev_block_bits)
limits->physical_block_size = 1 << v->data_dev_block_bits;
blk_limits_io_min(limits, limits->logical_block_size);
+
+ limits->dma_alignment = limits->logical_block_size - 1;
}
static void verity_dtr(struct dm_target *ti)
{
struct dm_verity *v = ti->private;
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH 5/7] dm-verity: always "map" the data blocks
2024-07-02 6:48 [PATCH 0/7] dm-verity cleanups and optimizations Eric Biggers
` (3 preceding siblings ...)
2024-07-02 6:48 ` [PATCH 4/7] dm-verity: provide dma_alignment limit in io_hints Eric Biggers
@ 2024-07-02 6:48 ` Eric Biggers
2024-07-02 6:48 ` [PATCH 6/7] dm-verity: make verity_hash() take dm_verity_io instead of ahash_request Eric Biggers
` (2 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Eric Biggers @ 2024-07-02 6:48 UTC (permalink / raw)
To: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka
Cc: linux-kernel, Ard Biesheuvel, Sami Tolvanen, Bart Van Assche
From: Eric Biggers <ebiggers@google.com>
dm-verity needs to access data blocks by virtual address in three
different cases (zeroization, recheck, and forward error correction),
and one more case (shash support) is coming. Since it's guaranteed that
dm-verity data blocks never cross pages, and kmap_local_page and
kunmap_local are no-ops on modern platforms anyway, just unconditionally
"map" every data block's page and work with the virtual buffer directly.
This simplifies the code and eliminates unnecessary overhead.
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
drivers/md/dm-verity-fec.c | 26 +----
drivers/md/dm-verity-fec.h | 6 +-
drivers/md/dm-verity-target.c | 182 +++++++---------------------------
drivers/md/dm-verity.h | 8 --
4 files changed, 42 insertions(+), 180 deletions(-)
diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index e46aee6f932e..b838d21183b5 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -402,28 +402,13 @@ static int fec_decode_rsb(struct dm_verity *v, struct dm_verity_io *io,
}
return 0;
}
-static int fec_bv_copy(struct dm_verity *v, struct dm_verity_io *io, u8 *data,
- size_t len)
-{
- struct dm_verity_fec_io *fio = fec_io(io);
-
- memcpy(data, &fio->output[fio->output_pos], len);
- fio->output_pos += len;
-
- return 0;
-}
-
-/*
- * Correct errors in a block. Copies corrected block to dest if non-NULL,
- * otherwise to a bio_vec starting from iter.
- */
+/* Correct errors in a block. Copies corrected block to dest. */
int verity_fec_decode(struct dm_verity *v, struct dm_verity_io *io,
- enum verity_block_type type, sector_t block, u8 *dest,
- struct bvec_iter *iter)
+ enum verity_block_type type, sector_t block, u8 *dest)
{
int r;
struct dm_verity_fec_io *fio = fec_io(io);
u64 offset, res, rsb;
@@ -469,16 +454,11 @@ int verity_fec_decode(struct dm_verity *v, struct dm_verity_io *io,
r = fec_decode_rsb(v, io, fio, rsb, offset, true);
if (r < 0)
goto done;
}
- if (dest)
- memcpy(dest, fio->output, 1 << v->data_dev_block_bits);
- else if (iter) {
- fio->output_pos = 0;
- r = verity_for_bv_block(v, io, iter, fec_bv_copy);
- }
+ memcpy(dest, fio->output, 1 << v->data_dev_block_bits);
done:
fio->level--;
return r;
}
diff --git a/drivers/md/dm-verity-fec.h b/drivers/md/dm-verity-fec.h
index 8454070d2824..09123a612953 100644
--- a/drivers/md/dm-verity-fec.h
+++ b/drivers/md/dm-verity-fec.h
@@ -55,11 +55,10 @@ struct dm_verity_fec_io {
struct rs_control *rs; /* Reed-Solomon state */
int erasures[DM_VERITY_FEC_MAX_RSN]; /* erasures for decode_rs8 */
u8 *bufs[DM_VERITY_FEC_BUF_MAX]; /* bufs for deinterleaving */
unsigned int nbufs; /* number of buffers allocated */
u8 *output; /* buffer for corrected output */
- size_t output_pos;
unsigned int level; /* recursion level */
};
#ifdef CONFIG_DM_VERITY_FEC
@@ -68,11 +67,11 @@ struct dm_verity_fec_io {
extern bool verity_fec_is_enabled(struct dm_verity *v);
extern int verity_fec_decode(struct dm_verity *v, struct dm_verity_io *io,
enum verity_block_type type, sector_t block,
- u8 *dest, struct bvec_iter *iter);
+ u8 *dest);
extern unsigned int verity_fec_status_table(struct dm_verity *v, unsigned int sz,
char *result, unsigned int maxlen);
extern void verity_fec_finish_io(struct dm_verity_io *io);
@@ -98,12 +97,11 @@ static inline bool verity_fec_is_enabled(struct dm_verity *v)
}
static inline int verity_fec_decode(struct dm_verity *v,
struct dm_verity_io *io,
enum verity_block_type type,
- sector_t block, u8 *dest,
- struct bvec_iter *iter)
+ sector_t block, u8 *dest)
{
return -EOPNOTSUPP;
}
static inline unsigned int verity_fec_status_table(struct dm_verity *v,
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index c6a0e3280e39..3e2e4f41714c 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -340,11 +340,11 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
* tasklet since it may sleep, so fallback to work-queue.
*/
r = -EAGAIN;
goto release_ret_r;
} else if (verity_fec_decode(v, io, DM_VERITY_BLOCK_TYPE_METADATA,
- hash_block, data, NULL) == 0)
+ hash_block, data) == 0)
aux->hash_verified = 1;
else if (verity_handle_err(v,
DM_VERITY_BLOCK_TYPE_METADATA,
hash_block)) {
struct bio *bio =
@@ -402,102 +402,12 @@ int verity_hash_for_block(struct dm_verity *v, struct dm_verity_io *io,
*is_zero = false;
return r;
}
-/*
- * Calculates the digest for the given bio
- */
-static int verity_for_io_block(struct dm_verity *v, struct dm_verity_io *io,
- struct bvec_iter *iter, struct crypto_wait *wait)
-{
- unsigned int todo = 1 << v->data_dev_block_bits;
- struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);
- struct scatterlist sg;
- struct ahash_request *req = verity_io_hash_req(v, io);
-
- do {
- int r;
- unsigned int len;
- struct bio_vec bv = bio_iter_iovec(bio, *iter);
-
- sg_init_table(&sg, 1);
-
- len = bv.bv_len;
-
- if (likely(len >= todo))
- len = todo;
- /*
- * Operating on a single page at a time looks suboptimal
- * until you consider the typical block size is 4,096B.
- * Going through this loops twice should be very rare.
- */
- sg_set_page(&sg, bv.bv_page, len, bv.bv_offset);
- ahash_request_set_crypt(req, &sg, NULL, len);
- r = crypto_wait_req(crypto_ahash_update(req), wait);
-
- if (unlikely(r < 0)) {
- DMERR("%s crypto op failed: %d", __func__, r);
- return r;
- }
-
- bio_advance_iter(bio, iter, len);
- todo -= len;
- } while (todo);
-
- return 0;
-}
-
-/*
- * Calls function process for 1 << v->data_dev_block_bits bytes in the bio_vec
- * starting from iter.
- */
-int verity_for_bv_block(struct dm_verity *v, struct dm_verity_io *io,
- struct bvec_iter *iter,
- int (*process)(struct dm_verity *v,
- struct dm_verity_io *io, u8 *data,
- size_t len))
-{
- unsigned int todo = 1 << v->data_dev_block_bits;
- struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);
-
- do {
- int r;
- u8 *page;
- unsigned int len;
- struct bio_vec bv = bio_iter_iovec(bio, *iter);
-
- page = bvec_kmap_local(&bv);
- len = bv.bv_len;
-
- if (likely(len >= todo))
- len = todo;
-
- r = process(v, io, page, len);
- kunmap_local(page);
-
- if (r < 0)
- return r;
-
- bio_advance_iter(bio, iter, len);
- todo -= len;
- } while (todo);
-
- return 0;
-}
-
-static int verity_recheck_copy(struct dm_verity *v, struct dm_verity_io *io,
- u8 *data, size_t len)
-{
- memcpy(data, io->recheck_buffer, len);
- io->recheck_buffer += len;
-
- return 0;
-}
-
static noinline int verity_recheck(struct dm_verity *v, struct dm_verity_io *io,
- struct bvec_iter start, sector_t cur_block)
+ sector_t cur_block, u8 *dest)
{
struct page *page;
void *buffer;
int r;
struct dm_io_request io_req;
@@ -528,42 +438,38 @@ static noinline int verity_recheck(struct dm_verity *v, struct dm_verity_io *io,
verity_io_want_digest(v, io), v->digest_size)) {
r = -EIO;
goto free_ret;
}
- io->recheck_buffer = buffer;
- r = verity_for_bv_block(v, io, &start, verity_recheck_copy);
- if (unlikely(r))
- goto free_ret;
-
+ memcpy(dest, buffer, 1 << v->data_dev_block_bits);
r = 0;
free_ret:
mempool_free(page, &v->recheck_pool);
return r;
}
static int verity_handle_data_hash_mismatch(struct dm_verity *v,
struct dm_verity_io *io,
struct bio *bio, sector_t blkno,
- struct bvec_iter *start)
+ u8 *data)
{
if (static_branch_unlikely(&use_bh_wq_enabled) && io->in_bh) {
/*
* Error handling code (FEC included) cannot be run in the
* BH workqueue, so fallback to a standard workqueue.
*/
return -EAGAIN;
}
- if (verity_recheck(v, io, *start, blkno) == 0) {
+ if (verity_recheck(v, io, blkno, data) == 0) {
if (v->validated_blocks)
set_bit(blkno, v->validated_blocks);
return 0;
}
#if defined(CONFIG_DM_VERITY_FEC)
if (verity_fec_decode(v, io, DM_VERITY_BLOCK_TYPE_DATA, blkno,
- NULL, start) == 0)
+ data) == 0)
return 0;
#endif
if (bio->bi_status)
return -EIO; /* Error correction failed; Just return error */
@@ -572,40 +478,19 @@ static int verity_handle_data_hash_mismatch(struct dm_verity *v,
return -EIO;
}
return 0;
}
-static int verity_bv_zero(struct dm_verity *v, struct dm_verity_io *io,
- u8 *data, size_t len)
-{
- memset(data, 0, len);
- return 0;
-}
-
-/*
- * Moves the bio iter one data block forward.
- */
-static inline void verity_bv_skip_block(struct dm_verity *v,
- struct dm_verity_io *io,
- struct bvec_iter *iter)
-{
- struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);
-
- bio_advance_iter(bio, iter, 1 << v->data_dev_block_bits);
-}
-
/*
* Verify one "dm_verity_io" structure.
*/
static int verity_verify_io(struct dm_verity_io *io)
{
- bool is_zero;
struct dm_verity *v = io->v;
- struct bvec_iter start;
+ const unsigned int block_size = 1 << v->data_dev_block_bits;
struct bvec_iter iter_copy;
struct bvec_iter *iter;
- struct crypto_wait wait;
struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);
unsigned int b;
if (static_branch_unlikely(&use_bh_wq_enabled) && io->in_bh) {
/*
@@ -615,62 +500,69 @@ static int verity_verify_io(struct dm_verity_io *io)
iter_copy = io->iter;
iter = &iter_copy;
} else
iter = &io->iter;
- for (b = 0; b < io->n_blocks; b++) {
+ for (b = 0; b < io->n_blocks;
+ b++, bio_advance_iter(bio, iter, block_size)) {
int r;
sector_t cur_block = io->block + b;
- struct ahash_request *req = verity_io_hash_req(v, io);
+ bool is_zero;
+ struct bio_vec bv;
+ void *data;
if (v->validated_blocks && bio->bi_status == BLK_STS_OK &&
- likely(test_bit(cur_block, v->validated_blocks))) {
- verity_bv_skip_block(v, io, iter);
+ likely(test_bit(cur_block, v->validated_blocks)))
continue;
- }
r = verity_hash_for_block(v, io, cur_block,
verity_io_want_digest(v, io),
&is_zero);
if (unlikely(r < 0))
return r;
+ bv = bio_iter_iovec(bio, *iter);
+ if (unlikely(bv.bv_len < block_size)) {
+ /*
+ * Data block spans pages. This should not happen,
+ * since dm-verity sets dma_alignment to the data block
+ * size minus 1, and dm-verity also doesn't allow the
+ * data block size to be greater than PAGE_SIZE.
+ */
+ DMERR_LIMIT("unaligned io (data block spans pages)");
+ return -EIO;
+ }
+
+ data = bvec_kmap_local(&bv);
+
if (is_zero) {
/*
* If we expect a zero block, don't validate, just
* return zeros.
*/
- r = verity_for_bv_block(v, io, iter,
- verity_bv_zero);
- if (unlikely(r < 0))
- return r;
-
+ memset(data, 0, block_size);
+ kunmap_local(data);
continue;
}
- r = verity_hash_init(v, req, &wait, !io->in_bh);
- if (unlikely(r < 0))
- return r;
-
- start = *iter;
- r = verity_for_io_block(v, io, iter, &wait);
- if (unlikely(r < 0))
- return r;
-
- r = verity_hash_final(v, req, verity_io_real_digest(v, io),
- &wait);
- if (unlikely(r < 0))
+ r = verity_hash(v, verity_io_hash_req(v, io), data, block_size,
+ verity_io_real_digest(v, io), !io->in_bh);
+ if (unlikely(r < 0)) {
+ kunmap_local(data);
return r;
+ }
if (likely(memcmp(verity_io_real_digest(v, io),
verity_io_want_digest(v, io), v->digest_size) == 0)) {
if (v->validated_blocks)
set_bit(cur_block, v->validated_blocks);
+ kunmap_local(data);
continue;
}
r = verity_handle_data_hash_mismatch(v, io, bio, cur_block,
- &start);
+ data);
+ kunmap_local(data);
if (unlikely(r))
return r;
}
return 0;
diff --git a/drivers/md/dm-verity.h b/drivers/md/dm-verity.h
index 5d3da9f5fc95..bd461c28b710 100644
--- a/drivers/md/dm-verity.h
+++ b/drivers/md/dm-verity.h
@@ -87,12 +87,10 @@ struct dm_verity_io {
bool in_bh;
struct work_struct work;
struct work_struct bh_work;
- char *recheck_buffer;
-
u8 real_digest[HASH_MAX_DIGESTSIZE];
u8 want_digest[HASH_MAX_DIGESTSIZE];
/*
* This struct is followed by a variable-sized struct ahash_request of
@@ -116,16 +114,10 @@ static inline u8 *verity_io_want_digest(struct dm_verity *v,
struct dm_verity_io *io)
{
return io->want_digest;
}
-extern int verity_for_bv_block(struct dm_verity *v, struct dm_verity_io *io,
- struct bvec_iter *iter,
- int (*process)(struct dm_verity *v,
- struct dm_verity_io *io,
- u8 *data, size_t len));
-
extern int verity_hash(struct dm_verity *v, struct ahash_request *req,
const u8 *data, size_t len, u8 *digest, bool may_sleep);
extern int verity_hash_for_block(struct dm_verity *v, struct dm_verity_io *io,
sector_t block, u8 *digest, bool *is_zero);
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH 6/7] dm-verity: make verity_hash() take dm_verity_io instead of ahash_request
2024-07-02 6:48 [PATCH 0/7] dm-verity cleanups and optimizations Eric Biggers
` (4 preceding siblings ...)
2024-07-02 6:48 ` [PATCH 5/7] dm-verity: always "map" the data blocks Eric Biggers
@ 2024-07-02 6:48 ` Eric Biggers
2024-07-02 6:48 ` [PATCH 7/7] dm-verity: hash blocks with shash import+finup when possible Eric Biggers
2024-07-03 19:15 ` [PATCH 0/7] dm-verity cleanups and optimizations Eric Biggers
7 siblings, 0 replies; 14+ messages in thread
From: Eric Biggers @ 2024-07-02 6:48 UTC (permalink / raw)
To: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka
Cc: linux-kernel, Ard Biesheuvel, Sami Tolvanen, Bart Van Assche
From: Eric Biggers <ebiggers@google.com>
In preparation for adding shash support to dm-verity, change
verity_hash() to take a pointer to a struct dm_verity_io instead of a
pointer to the ahash_request embedded inside it.
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
drivers/md/dm-verity-fec.c | 6 ++----
drivers/md/dm-verity-target.c | 21 ++++++++++-----------
drivers/md/dm-verity.h | 2 +-
3 files changed, 13 insertions(+), 16 deletions(-)
diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index b838d21183b5..62b1a44b8dd2 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -184,12 +184,11 @@ static int fec_decode_bufs(struct dm_verity *v, struct dm_verity_io *io,
* Locate data block erasures using verity hashes.
*/
static int fec_is_erasure(struct dm_verity *v, struct dm_verity_io *io,
u8 *want_digest, u8 *data)
{
- if (unlikely(verity_hash(v, verity_io_hash_req(v, io),
- data, 1 << v->data_dev_block_bits,
+ if (unlikely(verity_hash(v, io, data, 1 << v->data_dev_block_bits,
verity_io_real_digest(v, io), true)))
return 0;
return memcmp(verity_io_real_digest(v, io), want_digest,
v->digest_size) != 0;
@@ -386,12 +385,11 @@ static int fec_decode_rsb(struct dm_verity *v, struct dm_verity_io *io,
pos += fio->nbufs << DM_VERITY_FEC_BUF_RS_BITS;
}
/* Always re-validate the corrected block against the expected hash */
- r = verity_hash(v, verity_io_hash_req(v, io), fio->output,
- 1 << v->data_dev_block_bits,
+ r = verity_hash(v, io, fio->output, 1 << v->data_dev_block_bits,
verity_io_real_digest(v, io), true);
if (unlikely(r < 0))
return r;
if (memcmp(verity_io_real_digest(v, io), verity_io_want_digest(v, io),
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 3e2e4f41714c..4aa140751166 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -178,13 +178,14 @@ static int verity_hash_final(struct dm_verity *v, struct ahash_request *req,
r = crypto_wait_req(crypto_ahash_final(req), wait);
out:
return r;
}
-int verity_hash(struct dm_verity *v, struct ahash_request *req,
+int verity_hash(struct dm_verity *v, struct dm_verity_io *io,
const u8 *data, size_t len, u8 *digest, bool may_sleep)
{
+ struct ahash_request *req = verity_io_hash_req(v, io);
int r;
struct crypto_wait wait;
r = verity_hash_init(v, req, &wait, may_sleep);
if (unlikely(r < 0))
@@ -323,12 +324,11 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
if (skip_unverified) {
r = 1;
goto release_ret_r;
}
- r = verity_hash(v, verity_io_hash_req(v, io),
- data, 1 << v->hash_dev_block_bits,
+ r = verity_hash(v, io, data, 1 << v->hash_dev_block_bits,
verity_io_real_digest(v, io), !io->in_bh);
if (unlikely(r < 0))
goto release_ret_r;
if (likely(memcmp(verity_io_real_digest(v, io), want_digest,
@@ -426,12 +426,11 @@ static noinline int verity_recheck(struct dm_verity *v, struct dm_verity_io *io,
io_loc.count = 1 << (v->data_dev_block_bits - SECTOR_SHIFT);
r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
goto free_ret;
- r = verity_hash(v, verity_io_hash_req(v, io), buffer,
- 1 << v->data_dev_block_bits,
+ r = verity_hash(v, io, buffer, 1 << v->data_dev_block_bits,
verity_io_real_digest(v, io), true);
if (unlikely(r))
goto free_ret;
if (memcmp(verity_io_real_digest(v, io),
@@ -542,11 +541,11 @@ static int verity_verify_io(struct dm_verity_io *io)
memset(data, 0, block_size);
kunmap_local(data);
continue;
}
- r = verity_hash(v, verity_io_hash_req(v, io), data, block_size,
+ r = verity_hash(v, io, data, block_size,
verity_io_real_digest(v, io), !io->in_bh);
if (unlikely(r < 0)) {
kunmap_local(data);
return r;
}
@@ -983,33 +982,33 @@ static int verity_alloc_most_once(struct dm_verity *v)
}
static int verity_alloc_zero_digest(struct dm_verity *v)
{
int r = -ENOMEM;
- struct ahash_request *req;
+ struct dm_verity_io *io;
u8 *zero_data;
v->zero_digest = kmalloc(v->digest_size, GFP_KERNEL);
if (!v->zero_digest)
return r;
- req = kmalloc(v->ahash_reqsize, GFP_KERNEL);
+ io = kmalloc(sizeof(*io) + v->ahash_reqsize, GFP_KERNEL);
- if (!req)
+ if (!io)
return r; /* verity_dtr will free zero_digest */
zero_data = kzalloc(1 << v->data_dev_block_bits, GFP_KERNEL);
if (!zero_data)
goto out;
- r = verity_hash(v, req, zero_data, 1 << v->data_dev_block_bits,
+ r = verity_hash(v, io, zero_data, 1 << v->data_dev_block_bits,
v->zero_digest, true);
out:
- kfree(req);
+ kfree(io);
kfree(zero_data);
return r;
}
diff --git a/drivers/md/dm-verity.h b/drivers/md/dm-verity.h
index bd461c28b710..0e1dd02a916f 100644
--- a/drivers/md/dm-verity.h
+++ b/drivers/md/dm-verity.h
@@ -114,11 +114,11 @@ static inline u8 *verity_io_want_digest(struct dm_verity *v,
struct dm_verity_io *io)
{
return io->want_digest;
}
-extern int verity_hash(struct dm_verity *v, struct ahash_request *req,
+extern int verity_hash(struct dm_verity *v, struct dm_verity_io *io,
const u8 *data, size_t len, u8 *digest, bool may_sleep);
extern int verity_hash_for_block(struct dm_verity *v, struct dm_verity_io *io,
sector_t block, u8 *digest, bool *is_zero);
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH 7/7] dm-verity: hash blocks with shash import+finup when possible
2024-07-02 6:48 [PATCH 0/7] dm-verity cleanups and optimizations Eric Biggers
` (5 preceding siblings ...)
2024-07-02 6:48 ` [PATCH 6/7] dm-verity: make verity_hash() take dm_verity_io instead of ahash_request Eric Biggers
@ 2024-07-02 6:48 ` Eric Biggers
2024-07-02 7:41 ` Ard Biesheuvel
2024-07-03 19:15 ` [PATCH 0/7] dm-verity cleanups and optimizations Eric Biggers
7 siblings, 1 reply; 14+ messages in thread
From: Eric Biggers @ 2024-07-02 6:48 UTC (permalink / raw)
To: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka
Cc: linux-kernel, Ard Biesheuvel, Sami Tolvanen, Bart Van Assche
From: Eric Biggers <ebiggers@google.com>
Currently dm-verity computes the hash of each block by using multiple
calls to the "ahash" crypto API. While the exact sequence depends on
the chosen dm-verity settings, in the vast majority of cases it is:
1. crypto_ahash_init()
2. crypto_ahash_update() [salt]
3. crypto_ahash_update() [data]
4. crypto_ahash_final()
This is inefficient for two main reasons:
- It makes multiple indirect calls, which is expensive on modern CPUs
especially when mitigations for CPU vulnerabilities are enabled.
Since the salt is the same across all blocks on a given dm-verity
device, a much more efficient sequence would be to do an import of the
pre-salted state, then a finup.
- It uses the ahash (asynchronous hash) API, despite the fact that
CPU-based hashing is almost always used in practice, and therefore it
experiences the overhead of the ahash-based wrapper for shash.
Because dm-verity was intentionally converted to ahash to support
off-CPU crypto accelerators, a full reversion to shash might not be
acceptable. Yet, we should still provide a fast path for shash with
the most common dm-verity settings.
Another reason for shash over ahash is that the upcoming multibuffer
hashing support, which is specific to CPU-based hashing, is much
better suited for shash than for ahash. Supporting it via ahash would
add significant complexity and overhead. And it's not possible for
the "same" code to properly support both multibuffer hashing and HW
accelerators at the same time anyway, given the different computation
models. Unfortunately there will always be code specific to each
model needed (for users who want to support both).
Therefore, this patch adds a new shash import+finup based fast path to
dm-verity. It is used automatically when appropriate. This makes
dm-verity optimized for what the vast majority of users want: CPU-based
hashing with the most common settings, while still retaining support for
rarer settings and off-CPU crypto accelerators.
In benchmarks with veritysetup's default parameters (SHA-256, 4K data
and hash block sizes, 32-byte salt), which also match the parameters
that Android currently uses, this patch improves block hashing
performance by about 15% on x86_64 using the SHA-NI instructions, or by
about 5% on arm64 using the ARMv8 SHA2 instructions. On x86_64 roughly
two-thirds of the improvement comes from the use of import and finup,
while the remaining third comes from the switch from ahash to shash.
Note that another benefit of using "import" to handle the salt is that
if the salt size is equal to the input size of the hash algorithm's
compression function, e.g. 64 bytes for SHA-256, then the performance is
exactly the same as no salt. This doesn't seem to be much better than
veritysetup's current default of 32-byte salts, due to the way SHA-256's
finalization padding works, but it should be marginally better.
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
drivers/md/dm-verity-target.c | 169 ++++++++++++++++++++++++----------
drivers/md/dm-verity.h | 18 ++--
2 files changed, 130 insertions(+), 57 deletions(-)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 4aa140751166..d16c51958465 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -46,10 +46,13 @@ static unsigned int dm_verity_prefetch_cluster = DM_VERITY_DEFAULT_PREFETCH_SIZE
module_param_named(prefetch_cluster, dm_verity_prefetch_cluster, uint, 0644);
static DEFINE_STATIC_KEY_FALSE(use_bh_wq_enabled);
+/* Is at least one dm-verity instance using ahash_tfm instead of shash_tfm? */
+static DEFINE_STATIC_KEY_FALSE(ahash_enabled);
+
struct dm_verity_prefetch_work {
struct work_struct work;
struct dm_verity *v;
unsigned short ioprio;
sector_t block;
@@ -100,11 +103,11 @@ static sector_t verity_position_at_level(struct dm_verity *v, sector_t block,
int level)
{
return block >> (level * v->hash_per_block_bits);
}
-static int verity_hash_update(struct dm_verity *v, struct ahash_request *req,
+static int verity_ahash_update(struct dm_verity *v, struct ahash_request *req,
const u8 *data, size_t len,
struct crypto_wait *wait)
{
struct scatterlist sg;
@@ -133,16 +136,16 @@ static int verity_hash_update(struct dm_verity *v, struct ahash_request *req,
}
/*
* Wrapper for crypto_ahash_init, which handles verity salting.
*/
-static int verity_hash_init(struct dm_verity *v, struct ahash_request *req,
+static int verity_ahash_init(struct dm_verity *v, struct ahash_request *req,
struct crypto_wait *wait, bool may_sleep)
{
int r;
- ahash_request_set_tfm(req, v->tfm);
+ ahash_request_set_tfm(req, v->ahash_tfm);
ahash_request_set_callback(req,
may_sleep ? CRYPTO_TFM_REQ_MAY_SLEEP | CRYPTO_TFM_REQ_MAY_BACKLOG : 0,
crypto_req_done, (void *)wait);
crypto_init_wait(wait);
@@ -153,22 +156,22 @@ static int verity_hash_init(struct dm_verity *v, struct ahash_request *req,
DMERR("crypto_ahash_init failed: %d", r);
return r;
}
if (likely(v->salt_size && (v->version >= 1)))
- r = verity_hash_update(v, req, v->salt, v->salt_size, wait);
+ r = verity_ahash_update(v, req, v->salt, v->salt_size, wait);
return r;
}
-static int verity_hash_final(struct dm_verity *v, struct ahash_request *req,
- u8 *digest, struct crypto_wait *wait)
+static int verity_ahash_final(struct dm_verity *v, struct ahash_request *req,
+ u8 *digest, struct crypto_wait *wait)
{
int r;
if (unlikely(v->salt_size && (!v->version))) {
- r = verity_hash_update(v, req, v->salt, v->salt_size, wait);
+ r = verity_ahash_update(v, req, v->salt, v->salt_size, wait);
if (r < 0) {
DMERR("%s failed updating salt: %d", __func__, r);
goto out;
}
@@ -181,25 +184,28 @@ static int verity_hash_final(struct dm_verity *v, struct ahash_request *req,
}
int verity_hash(struct dm_verity *v, struct dm_verity_io *io,
const u8 *data, size_t len, u8 *digest, bool may_sleep)
{
- struct ahash_request *req = verity_io_hash_req(v, io);
int r;
- struct crypto_wait wait;
-
- r = verity_hash_init(v, req, &wait, may_sleep);
- if (unlikely(r < 0))
- goto out;
- r = verity_hash_update(v, req, data, len, &wait);
- if (unlikely(r < 0))
- goto out;
+ if (static_branch_unlikely(&ahash_enabled) && !v->shash_tfm) {
+ struct ahash_request *req = verity_io_hash_req(v, io);
+ struct crypto_wait wait;
- r = verity_hash_final(v, req, digest, &wait);
+ r = verity_ahash_init(v, req, &wait, may_sleep) ?:
+ verity_ahash_update(v, req, data, len, &wait) ?:
+ verity_ahash_final(v, req, digest, &wait);
+ } else {
+ struct shash_desc *desc = verity_io_hash_req(v, io);
-out:
+ desc->tfm = v->shash_tfm;
+ r = crypto_shash_import(desc, v->initial_hashstate) ?:
+ crypto_shash_finup(desc, data, len, digest);
+ }
+ if (unlikely(r))
+ DMERR("Error hashing block: %d", r);
return r;
}
static void verity_hash_at_level(struct dm_verity *v, sector_t block, int level,
sector_t *hash_block, unsigned int *offset)
@@ -932,15 +938,20 @@ static void verity_dtr(struct dm_target *ti)
if (v->bufio)
dm_bufio_client_destroy(v->bufio);
kvfree(v->validated_blocks);
kfree(v->salt);
+ kfree(v->initial_hashstate);
kfree(v->root_digest);
kfree(v->zero_digest);
- if (v->tfm)
- crypto_free_ahash(v->tfm);
+ if (v->ahash_tfm) {
+ static_branch_dec(&ahash_enabled);
+ crypto_free_ahash(v->ahash_tfm);
+ } else {
+ crypto_free_shash(v->shash_tfm);
+ }
kfree(v->alg_name);
if (v->hash_dev)
dm_put_device(ti, v->hash_dev);
@@ -990,11 +1001,11 @@ static int verity_alloc_zero_digest(struct dm_verity *v)
v->zero_digest = kmalloc(v->digest_size, GFP_KERNEL);
if (!v->zero_digest)
return r;
- io = kmalloc(sizeof(*io) + v->ahash_reqsize, GFP_KERNEL);
+ io = kmalloc(sizeof(*io) + v->hash_reqsize, GFP_KERNEL);
if (!io)
return r; /* verity_dtr will free zero_digest */
zero_data = kzalloc(1 << v->data_dev_block_bits, GFP_KERNEL);
@@ -1129,40 +1140,110 @@ static int verity_parse_opt_args(struct dm_arg_set *as, struct dm_verity *v,
static int verity_setup_hash_alg(struct dm_verity *v, const char *alg_name)
{
struct dm_target *ti = v->ti;
struct crypto_ahash *ahash;
+ struct crypto_shash *shash = NULL;
+ const char *driver_name;
v->alg_name = kstrdup(alg_name, GFP_KERNEL);
if (!v->alg_name) {
ti->error = "Cannot allocate algorithm name";
return -ENOMEM;
}
+ /*
+ * Allocate the hash transformation object that this dm-verity instance
+ * will use. The vast majority of dm-verity users use CPU-based
+ * hashing, so when possible use the shash API to minimize the crypto
+ * API overhead. If the ahash API resolves to a different driver
+ * (likely an off-CPU hardware offload), use ahash instead. Also use
+ * ahash if the obsolete dm-verity format with the appended salt is
+ * being used, so that quirk only needs to be handled in one place.
+ */
ahash = crypto_alloc_ahash(alg_name, 0,
v->use_bh_wq ? CRYPTO_ALG_ASYNC : 0);
if (IS_ERR(ahash)) {
ti->error = "Cannot initialize hash function";
return PTR_ERR(ahash);
}
- v->tfm = ahash;
-
- /*
- * dm-verity performance can vary greatly depending on which hash
- * algorithm implementation is used. Help people debug performance
- * problems by logging the ->cra_driver_name.
- */
- DMINFO("%s using implementation \"%s\"", alg_name,
- crypto_hash_alg_common(ahash)->base.cra_driver_name);
-
- v->digest_size = crypto_ahash_digestsize(ahash);
+ driver_name = crypto_ahash_driver_name(ahash);
+ if (v->version >= 1 /* salt prepended, not appended? */) {
+ shash = crypto_alloc_shash(alg_name, 0, 0);
+ if (!IS_ERR(shash) &&
+ strcmp(crypto_shash_driver_name(shash), driver_name) != 0) {
+ /*
+ * ahash gave a different driver than shash, so probably
+ * this is a case of real hardware offload. Use ahash.
+ */
+ crypto_free_shash(shash);
+ shash = NULL;
+ }
+ }
+ if (!IS_ERR_OR_NULL(shash)) {
+ crypto_free_ahash(ahash);
+ ahash = NULL;
+ v->shash_tfm = shash;
+ v->digest_size = crypto_shash_digestsize(shash);
+ v->hash_reqsize = sizeof(struct shash_desc) +
+ crypto_shash_descsize(shash);
+ DMINFO("%s using shash \"%s\"", alg_name, driver_name);
+ } else {
+ v->ahash_tfm = ahash;
+ static_branch_inc(&ahash_enabled);
+ v->digest_size = crypto_ahash_digestsize(ahash);
+ v->hash_reqsize = sizeof(struct ahash_request) +
+ crypto_ahash_reqsize(ahash);
+ DMINFO("%s using ahash \"%s\"", alg_name, driver_name);
+ }
if ((1 << v->hash_dev_block_bits) < v->digest_size * 2) {
ti->error = "Digest size too big";
return -EINVAL;
}
- v->ahash_reqsize = sizeof(struct ahash_request) +
- crypto_ahash_reqsize(ahash);
+ return 0;
+}
+
+static int verity_setup_salt_and_hashstate(struct dm_verity *v, const char *arg)
+{
+ struct dm_target *ti = v->ti;
+
+ if (strcmp(arg, "-") != 0) {
+ v->salt_size = strlen(arg) / 2;
+ v->salt = kmalloc(v->salt_size, GFP_KERNEL);
+ if (!v->salt) {
+ ti->error = "Cannot allocate salt";
+ return -ENOMEM;
+ }
+ if (strlen(arg) != v->salt_size * 2 ||
+ hex2bin(v->salt, arg, v->salt_size)) {
+ ti->error = "Invalid salt";
+ return -EINVAL;
+ }
+ }
+ if (v->shash_tfm) {
+ SHASH_DESC_ON_STACK(desc, v->shash_tfm);
+ int r;
+
+ /*
+ * Compute the pre-salted hash state that can be passed to
+ * crypto_shash_import() for each block later.
+ */
+ v->initial_hashstate = kmalloc(
+ crypto_shash_statesize(v->shash_tfm), GFP_KERNEL);
+ if (!v->initial_hashstate) {
+ ti->error = "Cannot allocate initial hash state";
+ return -ENOMEM;
+ }
+ desc->tfm = v->shash_tfm;
+ r = crypto_shash_init(desc) ?:
+ crypto_shash_update(desc, v->salt, v->salt_size) ?:
+ crypto_shash_export(desc, v->initial_hashstate);
+ if (r) {
+ ti->error = "Cannot set up initial hash state";
+ return r;
+ }
+ }
return 0;
}
/*
* Target parameters:
@@ -1304,25 +1385,13 @@ static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
r = -EINVAL;
goto bad;
}
root_hash_digest_to_validate = argv[8];
- if (strcmp(argv[9], "-")) {
- v->salt_size = strlen(argv[9]) / 2;
- v->salt = kmalloc(v->salt_size, GFP_KERNEL);
- if (!v->salt) {
- ti->error = "Cannot allocate salt";
- r = -ENOMEM;
- goto bad;
- }
- if (strlen(argv[9]) != v->salt_size * 2 ||
- hex2bin(v->salt, argv[9], v->salt_size)) {
- ti->error = "Invalid salt";
- r = -EINVAL;
- goto bad;
- }
- }
+ r = verity_setup_salt_and_hashstate(v, argv[9]);
+ if (r)
+ goto bad;
argv += 10;
argc -= 10;
/* Optional parameters */
@@ -1420,11 +1489,11 @@ static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
ti->error = "Cannot allocate workqueue";
r = -ENOMEM;
goto bad;
}
- ti->per_io_data_size = sizeof(struct dm_verity_io) + v->ahash_reqsize;
+ ti->per_io_data_size = sizeof(struct dm_verity_io) + v->hash_reqsize;
r = verity_fec_ctr(v);
if (r)
goto bad;
diff --git a/drivers/md/dm-verity.h b/drivers/md/dm-verity.h
index 0e1dd02a916f..aac3a1b1d94a 100644
--- a/drivers/md/dm-verity.h
+++ b/drivers/md/dm-verity.h
@@ -37,13 +37,15 @@ struct dm_verity {
struct dm_dev *data_dev;
struct dm_dev *hash_dev;
struct dm_target *ti;
struct dm_bufio_client *bufio;
char *alg_name;
- struct crypto_ahash *tfm;
+ struct crypto_ahash *ahash_tfm; /* either this or shash_tfm is set */
+ struct crypto_shash *shash_tfm; /* either this or ahash_tfm is set */
u8 *root_digest; /* digest of the root block */
u8 *salt; /* salt: its size is salt_size */
+ u8 *initial_hashstate; /* salted initial state, if shash_tfm is set */
u8 *zero_digest; /* digest for a zero block */
unsigned int salt_size;
sector_t data_start; /* data offset in 512-byte sectors */
sector_t hash_start; /* hash start in blocks */
sector_t data_blocks; /* the number of data blocks */
@@ -54,11 +56,11 @@ struct dm_verity {
unsigned char levels; /* the number of tree levels */
unsigned char version;
bool hash_failed:1; /* set if hash of any block failed */
bool use_bh_wq:1; /* try to verify in BH wq before normal work-queue */
unsigned int digest_size; /* digest size for the current hash algorithm */
- unsigned int ahash_reqsize;/* the size of temporary space for crypto */
+ unsigned int hash_reqsize; /* the size of temporary space for crypto */
enum verity_mode mode; /* mode for handling verification errors */
unsigned int corrupted_errs;/* Number of errors for corrupted blocks */
struct workqueue_struct *verify_wq;
@@ -91,19 +93,21 @@ struct dm_verity_io {
u8 real_digest[HASH_MAX_DIGESTSIZE];
u8 want_digest[HASH_MAX_DIGESTSIZE];
/*
- * This struct is followed by a variable-sized struct ahash_request of
- * size v->ahash_reqsize. To access it, use verity_io_hash_req().
+ * This struct is followed by a variable-sized hash request of size
+ * v->hash_reqsize, either a struct ahash_request or a struct shash_desc
+ * (depending on whether ahash_tfm or shash_tfm is being used). To
+ * access it, use verity_io_hash_req().
*/
};
-static inline struct ahash_request *verity_io_hash_req(struct dm_verity *v,
- struct dm_verity_io *io)
+static inline void *verity_io_hash_req(struct dm_verity *v,
+ struct dm_verity_io *io)
{
- return (struct ahash_request *)(io + 1);
+ return io + 1;
}
static inline u8 *verity_io_real_digest(struct dm_verity *v,
struct dm_verity_io *io)
{
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread* Re: [PATCH 7/7] dm-verity: hash blocks with shash import+finup when possible
2024-07-02 6:48 ` [PATCH 7/7] dm-verity: hash blocks with shash import+finup when possible Eric Biggers
@ 2024-07-02 7:41 ` Ard Biesheuvel
2024-07-02 17:16 ` Eric Biggers
0 siblings, 1 reply; 14+ messages in thread
From: Ard Biesheuvel @ 2024-07-02 7:41 UTC (permalink / raw)
To: Eric Biggers
Cc: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka,
linux-kernel, Sami Tolvanen, Bart Van Assche
Hi Eric,
One question below.
On Tue, 2 Jul 2024 at 08:49, Eric Biggers <ebiggers@kernel.org> wrote:
>
> From: Eric Biggers <ebiggers@google.com>
>
> Currently dm-verity computes the hash of each block by using multiple
> calls to the "ahash" crypto API. While the exact sequence depends on
> the chosen dm-verity settings, in the vast majority of cases it is:
>
> 1. crypto_ahash_init()
> 2. crypto_ahash_update() [salt]
> 3. crypto_ahash_update() [data]
> 4. crypto_ahash_final()
>
> This is inefficient for two main reasons:
>
> - It makes multiple indirect calls, which is expensive on modern CPUs
> especially when mitigations for CPU vulnerabilities are enabled.
>
> Since the salt is the same across all blocks on a given dm-verity
> device, a much more efficient sequence would be to do an import of the
> pre-salted state, then a finup.
>
> - It uses the ahash (asynchronous hash) API, despite the fact that
> CPU-based hashing is almost always used in practice, and therefore it
> experiences the overhead of the ahash-based wrapper for shash.
>
> Because dm-verity was intentionally converted to ahash to support
> off-CPU crypto accelerators, a full reversion to shash might not be
> acceptable. Yet, we should still provide a fast path for shash with
> the most common dm-verity settings.
>
> Another reason for shash over ahash is that the upcoming multibuffer
> hashing support, which is specific to CPU-based hashing, is much
> better suited for shash than for ahash. Supporting it via ahash would
> add significant complexity and overhead. And it's not possible for
> the "same" code to properly support both multibuffer hashing and HW
> accelerators at the same time anyway, given the different computation
> models. Unfortunately there will always be code specific to each
> model needed (for users who want to support both).
>
> Therefore, this patch adds a new shash import+finup based fast path to
> dm-verity. It is used automatically when appropriate. This makes
> dm-verity optimized for what the vast majority of users want: CPU-based
> hashing with the most common settings, while still retaining support for
> rarer settings and off-CPU crypto accelerators.
>
> In benchmarks with veritysetup's default parameters (SHA-256, 4K data
> and hash block sizes, 32-byte salt), which also match the parameters
> that Android currently uses, this patch improves block hashing
> performance by about 15% on x86_64 using the SHA-NI instructions, or by
> about 5% on arm64 using the ARMv8 SHA2 instructions. On x86_64 roughly
> two-thirds of the improvement comes from the use of import and finup,
> while the remaining third comes from the switch from ahash to shash.
>
> Note that another benefit of using "import" to handle the salt is that
> if the salt size is equal to the input size of the hash algorithm's
> compression function, e.g. 64 bytes for SHA-256, then the performance is
> exactly the same as no salt. This doesn't seem to be much better than
> veritysetup's current default of 32-byte salts, due to the way SHA-256's
> finalization padding works, but it should be marginally better.
>
> Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
> Acked-by: Ard Biesheuvel <ardb@kernel.org>
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
> drivers/md/dm-verity-target.c | 169 ++++++++++++++++++++++++----------
> drivers/md/dm-verity.h | 18 ++--
> 2 files changed, 130 insertions(+), 57 deletions(-)
>
> diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
> index 4aa140751166..d16c51958465 100644
> --- a/drivers/md/dm-verity-target.c
> +++ b/drivers/md/dm-verity-target.c
> @@ -46,10 +46,13 @@ static unsigned int dm_verity_prefetch_cluster = DM_VERITY_DEFAULT_PREFETCH_SIZE
>
> module_param_named(prefetch_cluster, dm_verity_prefetch_cluster, uint, 0644);
>
> static DEFINE_STATIC_KEY_FALSE(use_bh_wq_enabled);
>
> +/* Is at least one dm-verity instance using ahash_tfm instead of shash_tfm? */
> +static DEFINE_STATIC_KEY_FALSE(ahash_enabled);
> +
> struct dm_verity_prefetch_work {
> struct work_struct work;
> struct dm_verity *v;
> unsigned short ioprio;
> sector_t block;
> @@ -100,11 +103,11 @@ static sector_t verity_position_at_level(struct dm_verity *v, sector_t block,
> int level)
> {
> return block >> (level * v->hash_per_block_bits);
> }
>
> -static int verity_hash_update(struct dm_verity *v, struct ahash_request *req,
> +static int verity_ahash_update(struct dm_verity *v, struct ahash_request *req,
> const u8 *data, size_t len,
> struct crypto_wait *wait)
> {
> struct scatterlist sg;
>
> @@ -133,16 +136,16 @@ static int verity_hash_update(struct dm_verity *v, struct ahash_request *req,
> }
>
> /*
> * Wrapper for crypto_ahash_init, which handles verity salting.
> */
> -static int verity_hash_init(struct dm_verity *v, struct ahash_request *req,
> +static int verity_ahash_init(struct dm_verity *v, struct ahash_request *req,
> struct crypto_wait *wait, bool may_sleep)
> {
> int r;
>
> - ahash_request_set_tfm(req, v->tfm);
> + ahash_request_set_tfm(req, v->ahash_tfm);
> ahash_request_set_callback(req,
> may_sleep ? CRYPTO_TFM_REQ_MAY_SLEEP | CRYPTO_TFM_REQ_MAY_BACKLOG : 0,
> crypto_req_done, (void *)wait);
> crypto_init_wait(wait);
>
> @@ -153,22 +156,22 @@ static int verity_hash_init(struct dm_verity *v, struct ahash_request *req,
> DMERR("crypto_ahash_init failed: %d", r);
> return r;
> }
>
> if (likely(v->salt_size && (v->version >= 1)))
> - r = verity_hash_update(v, req, v->salt, v->salt_size, wait);
> + r = verity_ahash_update(v, req, v->salt, v->salt_size, wait);
>
> return r;
> }
>
> -static int verity_hash_final(struct dm_verity *v, struct ahash_request *req,
> - u8 *digest, struct crypto_wait *wait)
> +static int verity_ahash_final(struct dm_verity *v, struct ahash_request *req,
> + u8 *digest, struct crypto_wait *wait)
> {
> int r;
>
> if (unlikely(v->salt_size && (!v->version))) {
> - r = verity_hash_update(v, req, v->salt, v->salt_size, wait);
> + r = verity_ahash_update(v, req, v->salt, v->salt_size, wait);
>
> if (r < 0) {
> DMERR("%s failed updating salt: %d", __func__, r);
> goto out;
> }
> @@ -181,25 +184,28 @@ static int verity_hash_final(struct dm_verity *v, struct ahash_request *req,
> }
>
> int verity_hash(struct dm_verity *v, struct dm_verity_io *io,
> const u8 *data, size_t len, u8 *digest, bool may_sleep)
> {
> - struct ahash_request *req = verity_io_hash_req(v, io);
> int r;
> - struct crypto_wait wait;
> -
> - r = verity_hash_init(v, req, &wait, may_sleep);
> - if (unlikely(r < 0))
> - goto out;
>
> - r = verity_hash_update(v, req, data, len, &wait);
> - if (unlikely(r < 0))
> - goto out;
> + if (static_branch_unlikely(&ahash_enabled) && !v->shash_tfm) {
Is the static key really worth the hassle? Couldn't this just be
if (unlikely(!v->shash_tfm)) {
so that the ahash logic is moved to the cold path? We need to
dereference v->shash_tfm right away in any case, and if it is never
NULL, the branch predictor should be able to remember that.
> + struct ahash_request *req = verity_io_hash_req(v, io);
> + struct crypto_wait wait;
>
> - r = verity_hash_final(v, req, digest, &wait);
> + r = verity_ahash_init(v, req, &wait, may_sleep) ?:
> + verity_ahash_update(v, req, data, len, &wait) ?:
> + verity_ahash_final(v, req, digest, &wait);
> + } else {
> + struct shash_desc *desc = verity_io_hash_req(v, io);
>
> -out:
> + desc->tfm = v->shash_tfm;
> + r = crypto_shash_import(desc, v->initial_hashstate) ?:
> + crypto_shash_finup(desc, data, len, digest);
> + }
> + if (unlikely(r))
> + DMERR("Error hashing block: %d", r);
> return r;
> }
>
> static void verity_hash_at_level(struct dm_verity *v, sector_t block, int level,
> sector_t *hash_block, unsigned int *offset)
> @@ -932,15 +938,20 @@ static void verity_dtr(struct dm_target *ti)
> if (v->bufio)
> dm_bufio_client_destroy(v->bufio);
>
> kvfree(v->validated_blocks);
> kfree(v->salt);
> + kfree(v->initial_hashstate);
> kfree(v->root_digest);
> kfree(v->zero_digest);
>
> - if (v->tfm)
> - crypto_free_ahash(v->tfm);
> + if (v->ahash_tfm) {
> + static_branch_dec(&ahash_enabled);
> + crypto_free_ahash(v->ahash_tfm);
> + } else {
> + crypto_free_shash(v->shash_tfm);
> + }
>
> kfree(v->alg_name);
>
> if (v->hash_dev)
> dm_put_device(ti, v->hash_dev);
> @@ -990,11 +1001,11 @@ static int verity_alloc_zero_digest(struct dm_verity *v)
> v->zero_digest = kmalloc(v->digest_size, GFP_KERNEL);
>
> if (!v->zero_digest)
> return r;
>
> - io = kmalloc(sizeof(*io) + v->ahash_reqsize, GFP_KERNEL);
> + io = kmalloc(sizeof(*io) + v->hash_reqsize, GFP_KERNEL);
>
> if (!io)
> return r; /* verity_dtr will free zero_digest */
>
> zero_data = kzalloc(1 << v->data_dev_block_bits, GFP_KERNEL);
> @@ -1129,40 +1140,110 @@ static int verity_parse_opt_args(struct dm_arg_set *as, struct dm_verity *v,
>
> static int verity_setup_hash_alg(struct dm_verity *v, const char *alg_name)
> {
> struct dm_target *ti = v->ti;
> struct crypto_ahash *ahash;
> + struct crypto_shash *shash = NULL;
> + const char *driver_name;
>
> v->alg_name = kstrdup(alg_name, GFP_KERNEL);
> if (!v->alg_name) {
> ti->error = "Cannot allocate algorithm name";
> return -ENOMEM;
> }
>
> + /*
> + * Allocate the hash transformation object that this dm-verity instance
> + * will use. The vast majority of dm-verity users use CPU-based
> + * hashing, so when possible use the shash API to minimize the crypto
> + * API overhead. If the ahash API resolves to a different driver
> + * (likely an off-CPU hardware offload), use ahash instead. Also use
> + * ahash if the obsolete dm-verity format with the appended salt is
> + * being used, so that quirk only needs to be handled in one place.
> + */
> ahash = crypto_alloc_ahash(alg_name, 0,
> v->use_bh_wq ? CRYPTO_ALG_ASYNC : 0);
> if (IS_ERR(ahash)) {
> ti->error = "Cannot initialize hash function";
> return PTR_ERR(ahash);
> }
> - v->tfm = ahash;
> -
> - /*
> - * dm-verity performance can vary greatly depending on which hash
> - * algorithm implementation is used. Help people debug performance
> - * problems by logging the ->cra_driver_name.
> - */
> - DMINFO("%s using implementation \"%s\"", alg_name,
> - crypto_hash_alg_common(ahash)->base.cra_driver_name);
> -
> - v->digest_size = crypto_ahash_digestsize(ahash);
> + driver_name = crypto_ahash_driver_name(ahash);
> + if (v->version >= 1 /* salt prepended, not appended? */) {
> + shash = crypto_alloc_shash(alg_name, 0, 0);
> + if (!IS_ERR(shash) &&
> + strcmp(crypto_shash_driver_name(shash), driver_name) != 0) {
> + /*
> + * ahash gave a different driver than shash, so probably
> + * this is a case of real hardware offload. Use ahash.
> + */
> + crypto_free_shash(shash);
> + shash = NULL;
> + }
> + }
> + if (!IS_ERR_OR_NULL(shash)) {
> + crypto_free_ahash(ahash);
> + ahash = NULL;
> + v->shash_tfm = shash;
> + v->digest_size = crypto_shash_digestsize(shash);
> + v->hash_reqsize = sizeof(struct shash_desc) +
> + crypto_shash_descsize(shash);
> + DMINFO("%s using shash \"%s\"", alg_name, driver_name);
> + } else {
> + v->ahash_tfm = ahash;
> + static_branch_inc(&ahash_enabled);
> + v->digest_size = crypto_ahash_digestsize(ahash);
> + v->hash_reqsize = sizeof(struct ahash_request) +
> + crypto_ahash_reqsize(ahash);
> + DMINFO("%s using ahash \"%s\"", alg_name, driver_name);
> + }
> if ((1 << v->hash_dev_block_bits) < v->digest_size * 2) {
> ti->error = "Digest size too big";
> return -EINVAL;
> }
> - v->ahash_reqsize = sizeof(struct ahash_request) +
> - crypto_ahash_reqsize(ahash);
> + return 0;
> +}
> +
> +static int verity_setup_salt_and_hashstate(struct dm_verity *v, const char *arg)
> +{
> + struct dm_target *ti = v->ti;
> +
> + if (strcmp(arg, "-") != 0) {
> + v->salt_size = strlen(arg) / 2;
> + v->salt = kmalloc(v->salt_size, GFP_KERNEL);
> + if (!v->salt) {
> + ti->error = "Cannot allocate salt";
> + return -ENOMEM;
> + }
> + if (strlen(arg) != v->salt_size * 2 ||
> + hex2bin(v->salt, arg, v->salt_size)) {
> + ti->error = "Invalid salt";
> + return -EINVAL;
> + }
> + }
> + if (v->shash_tfm) {
> + SHASH_DESC_ON_STACK(desc, v->shash_tfm);
> + int r;
> +
> + /*
> + * Compute the pre-salted hash state that can be passed to
> + * crypto_shash_import() for each block later.
> + */
> + v->initial_hashstate = kmalloc(
> + crypto_shash_statesize(v->shash_tfm), GFP_KERNEL);
> + if (!v->initial_hashstate) {
> + ti->error = "Cannot allocate initial hash state";
> + return -ENOMEM;
> + }
> + desc->tfm = v->shash_tfm;
> + r = crypto_shash_init(desc) ?:
> + crypto_shash_update(desc, v->salt, v->salt_size) ?:
> + crypto_shash_export(desc, v->initial_hashstate);
> + if (r) {
> + ti->error = "Cannot set up initial hash state";
> + return r;
> + }
> + }
> return 0;
> }
>
> /*
> * Target parameters:
> @@ -1304,25 +1385,13 @@ static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
> r = -EINVAL;
> goto bad;
> }
> root_hash_digest_to_validate = argv[8];
>
> - if (strcmp(argv[9], "-")) {
> - v->salt_size = strlen(argv[9]) / 2;
> - v->salt = kmalloc(v->salt_size, GFP_KERNEL);
> - if (!v->salt) {
> - ti->error = "Cannot allocate salt";
> - r = -ENOMEM;
> - goto bad;
> - }
> - if (strlen(argv[9]) != v->salt_size * 2 ||
> - hex2bin(v->salt, argv[9], v->salt_size)) {
> - ti->error = "Invalid salt";
> - r = -EINVAL;
> - goto bad;
> - }
> - }
> + r = verity_setup_salt_and_hashstate(v, argv[9]);
> + if (r)
> + goto bad;
>
> argv += 10;
> argc -= 10;
>
> /* Optional parameters */
> @@ -1420,11 +1489,11 @@ static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
> ti->error = "Cannot allocate workqueue";
> r = -ENOMEM;
> goto bad;
> }
>
> - ti->per_io_data_size = sizeof(struct dm_verity_io) + v->ahash_reqsize;
> + ti->per_io_data_size = sizeof(struct dm_verity_io) + v->hash_reqsize;
>
> r = verity_fec_ctr(v);
> if (r)
> goto bad;
>
> diff --git a/drivers/md/dm-verity.h b/drivers/md/dm-verity.h
> index 0e1dd02a916f..aac3a1b1d94a 100644
> --- a/drivers/md/dm-verity.h
> +++ b/drivers/md/dm-verity.h
> @@ -37,13 +37,15 @@ struct dm_verity {
> struct dm_dev *data_dev;
> struct dm_dev *hash_dev;
> struct dm_target *ti;
> struct dm_bufio_client *bufio;
> char *alg_name;
> - struct crypto_ahash *tfm;
> + struct crypto_ahash *ahash_tfm; /* either this or shash_tfm is set */
> + struct crypto_shash *shash_tfm; /* either this or ahash_tfm is set */
> u8 *root_digest; /* digest of the root block */
> u8 *salt; /* salt: its size is salt_size */
> + u8 *initial_hashstate; /* salted initial state, if shash_tfm is set */
> u8 *zero_digest; /* digest for a zero block */
> unsigned int salt_size;
> sector_t data_start; /* data offset in 512-byte sectors */
> sector_t hash_start; /* hash start in blocks */
> sector_t data_blocks; /* the number of data blocks */
> @@ -54,11 +56,11 @@ struct dm_verity {
> unsigned char levels; /* the number of tree levels */
> unsigned char version;
> bool hash_failed:1; /* set if hash of any block failed */
> bool use_bh_wq:1; /* try to verify in BH wq before normal work-queue */
> unsigned int digest_size; /* digest size for the current hash algorithm */
> - unsigned int ahash_reqsize;/* the size of temporary space for crypto */
> + unsigned int hash_reqsize; /* the size of temporary space for crypto */
> enum verity_mode mode; /* mode for handling verification errors */
> unsigned int corrupted_errs;/* Number of errors for corrupted blocks */
>
> struct workqueue_struct *verify_wq;
>
> @@ -91,19 +93,21 @@ struct dm_verity_io {
>
> u8 real_digest[HASH_MAX_DIGESTSIZE];
> u8 want_digest[HASH_MAX_DIGESTSIZE];
>
> /*
> - * This struct is followed by a variable-sized struct ahash_request of
> - * size v->ahash_reqsize. To access it, use verity_io_hash_req().
> + * This struct is followed by a variable-sized hash request of size
> + * v->hash_reqsize, either a struct ahash_request or a struct shash_desc
> + * (depending on whether ahash_tfm or shash_tfm is being used). To
> + * access it, use verity_io_hash_req().
> */
> };
>
> -static inline struct ahash_request *verity_io_hash_req(struct dm_verity *v,
> - struct dm_verity_io *io)
> +static inline void *verity_io_hash_req(struct dm_verity *v,
> + struct dm_verity_io *io)
> {
> - return (struct ahash_request *)(io + 1);
> + return io + 1;
> }
>
> static inline u8 *verity_io_real_digest(struct dm_verity *v,
> struct dm_verity_io *io)
> {
> --
> 2.45.2
>
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH 7/7] dm-verity: hash blocks with shash import+finup when possible
2024-07-02 7:41 ` Ard Biesheuvel
@ 2024-07-02 17:16 ` Eric Biggers
2024-07-03 13:17 ` Ard Biesheuvel
0 siblings, 1 reply; 14+ messages in thread
From: Eric Biggers @ 2024-07-02 17:16 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka,
linux-kernel, Sami Tolvanen, Bart Van Assche
Hi Ard,
On Tue, Jul 02, 2024 at 09:41:19AM +0200, Ard Biesheuvel wrote:
> > int verity_hash(struct dm_verity *v, struct dm_verity_io *io,
> > const u8 *data, size_t len, u8 *digest, bool may_sleep)
> > {
> > - struct ahash_request *req = verity_io_hash_req(v, io);
> > int r;
> > - struct crypto_wait wait;
> > -
> > - r = verity_hash_init(v, req, &wait, may_sleep);
> > - if (unlikely(r < 0))
> > - goto out;
> >
> > - r = verity_hash_update(v, req, data, len, &wait);
> > - if (unlikely(r < 0))
> > - goto out;
> > + if (static_branch_unlikely(&ahash_enabled) && !v->shash_tfm) {
>
> Is the static key really worth the hassle? Couldn't this just be
>
> if (unlikely(!v->shash_tfm)) {
>
> so that the ahash logic is moved to the cold path? We need to
> dereference v->shash_tfm right away in any case, and if it is never
> NULL, the branch predictor should be able to remember that.
The value of the static key is indeed marginal. I included it because of the
precedent of dm-verity's existing use_bh_wq_enabled static key, which exists for
a similar purpose. As long as we're going through the trouble of doing that, I
think it makes sense to use the same pattern for ahash too. It's another rarely
needed option that can be patched in in the very rare case that it's needed.
- Eric
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH 7/7] dm-verity: hash blocks with shash import+finup when possible
2024-07-02 17:16 ` Eric Biggers
@ 2024-07-03 13:17 ` Ard Biesheuvel
0 siblings, 0 replies; 14+ messages in thread
From: Ard Biesheuvel @ 2024-07-03 13:17 UTC (permalink / raw)
To: Eric Biggers
Cc: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka,
linux-kernel, Sami Tolvanen, Bart Van Assche
On Tue, 2 Jul 2024 at 19:16, Eric Biggers <ebiggers@kernel.org> wrote:
>
> Hi Ard,
>
> On Tue, Jul 02, 2024 at 09:41:19AM +0200, Ard Biesheuvel wrote:
> > > int verity_hash(struct dm_verity *v, struct dm_verity_io *io,
> > > const u8 *data, size_t len, u8 *digest, bool may_sleep)
> > > {
> > > - struct ahash_request *req = verity_io_hash_req(v, io);
> > > int r;
> > > - struct crypto_wait wait;
> > > -
> > > - r = verity_hash_init(v, req, &wait, may_sleep);
> > > - if (unlikely(r < 0))
> > > - goto out;
> > >
> > > - r = verity_hash_update(v, req, data, len, &wait);
> > > - if (unlikely(r < 0))
> > > - goto out;
> > > + if (static_branch_unlikely(&ahash_enabled) && !v->shash_tfm) {
> >
> > Is the static key really worth the hassle? Couldn't this just be
> >
> > if (unlikely(!v->shash_tfm)) {
> >
> > so that the ahash logic is moved to the cold path? We need to
> > dereference v->shash_tfm right away in any case, and if it is never
> > NULL, the branch predictor should be able to remember that.
>
> The value of the static key is indeed marginal. I included it because of the
> precedent of dm-verity's existing use_bh_wq_enabled static key, which exists for
> a similar purpose. As long as we're going through the trouble of doing that, I
> think it makes sense to use the same pattern for ahash too. It's another rarely
> needed option that can be patched in in the very rare case that it's needed.
>
If it's an existing pattern, fair enough.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/7] dm-verity cleanups and optimizations
2024-07-02 6:48 [PATCH 0/7] dm-verity cleanups and optimizations Eric Biggers
` (6 preceding siblings ...)
2024-07-02 6:48 ` [PATCH 7/7] dm-verity: hash blocks with shash import+finup when possible Eric Biggers
@ 2024-07-03 19:15 ` Eric Biggers
2024-07-04 15:05 ` Mikulas Patocka
7 siblings, 1 reply; 14+ messages in thread
From: Eric Biggers @ 2024-07-03 19:15 UTC (permalink / raw)
To: Mikulas Patocka
Cc: linux-kernel, Ard Biesheuvel, Sami Tolvanen, Bart Van Assche,
dm-devel, Alasdair Kergon, Mike Snitzer
On Mon, Jul 01, 2024 at 11:48:28PM -0700, Eric Biggers wrote:
> This series contains some cleanups and optimizations for dm-verity that
> I've split out from my multibuffer hashing series
> https://lore.kernel.org/linux-crypto/20240621165922.77672-1-ebiggers@kernel.org/.
>
> This series does not depend on any crypto API changes, so it can be
> applied right away.
>
> Eric Biggers (7):
> dm-verity: move hash algorithm setup into its own function
> dm-verity: move data hash mismatch handling into its own function
> dm-verity: make real_digest and want_digest fixed-length
> dm-verity: provide dma_alignment limit in io_hints
> dm-verity: always "map" the data blocks
> dm-verity: make verity_hash() take dm_verity_io instead of
> ahash_request
> dm-verity: hash blocks with shash import+finup when possible
>
> drivers/md/dm-verity-fec.c | 32 +--
> drivers/md/dm-verity-fec.h | 6 +-
> drivers/md/dm-verity-target.c | 461 ++++++++++++++++------------------
> drivers/md/dm-verity.h | 39 ++-
> 4 files changed, 242 insertions(+), 296 deletions(-)
Thanks for applying this, Mikulas!
I sent out a new version of "dm-verity: provide dma_alignment limit in io_hints"
to address Christoph's comment.
Also, I noticed that when you applied the patches, patch 2 somehow ended up with
the same commit message as patch 1. Can you fix that? Thanks!
- Eric
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH 0/7] dm-verity cleanups and optimizations
2024-07-03 19:15 ` [PATCH 0/7] dm-verity cleanups and optimizations Eric Biggers
@ 2024-07-04 15:05 ` Mikulas Patocka
0 siblings, 0 replies; 14+ messages in thread
From: Mikulas Patocka @ 2024-07-04 15:05 UTC (permalink / raw)
To: Eric Biggers
Cc: linux-kernel, Ard Biesheuvel, Sami Tolvanen, Bart Van Assche,
dm-devel, Alasdair Kergon, Mike Snitzer
On Wed, 3 Jul 2024, Eric Biggers wrote:
> On Mon, Jul 01, 2024 at 11:48:28PM -0700, Eric Biggers wrote:
> > This series contains some cleanups and optimizations for dm-verity that
> > I've split out from my multibuffer hashing series
> > https://lore.kernel.org/linux-crypto/20240621165922.77672-1-ebiggers@kernel.org/.
> >
> > This series does not depend on any crypto API changes, so it can be
> > applied right away.
> >
> > Eric Biggers (7):
> > dm-verity: move hash algorithm setup into its own function
> > dm-verity: move data hash mismatch handling into its own function
> > dm-verity: make real_digest and want_digest fixed-length
> > dm-verity: provide dma_alignment limit in io_hints
> > dm-verity: always "map" the data blocks
> > dm-verity: make verity_hash() take dm_verity_io instead of
> > ahash_request
> > dm-verity: hash blocks with shash import+finup when possible
> >
> > drivers/md/dm-verity-fec.c | 32 +--
> > drivers/md/dm-verity-fec.h | 6 +-
> > drivers/md/dm-verity-target.c | 461 ++++++++++++++++------------------
> > drivers/md/dm-verity.h | 39 ++-
> > 4 files changed, 242 insertions(+), 296 deletions(-)
>
> Thanks for applying this, Mikulas!
>
> I sent out a new version of "dm-verity: provide dma_alignment limit in io_hints"
> to address Christoph's comment.
>
> Also, I noticed that when you applied the patches, patch 2 somehow ended up with
> the same commit message as patch 1. Can you fix that? Thanks!
>
> - Eric
Hi
It should be fixed now.
Mikulas
^ permalink raw reply [flat|nested] 14+ messages in thread