* [PATCH 0/4] Optimize dm-verity using 2-way interleaved SHA-256 hashing
@ 2025-10-14 21:16 Eric Biggers
2025-10-14 21:16 ` [PATCH 1/4] dm-verity: remove log message with shash driver name Eric Biggers
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Eric Biggers @ 2025-10-14 21:16 UTC (permalink / raw)
To: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka
Cc: linux-crypto, linux-kernel, Eric Biggers
This series makes dm-verity take advantage of the 2-way interleaved
SHA-256 hashing code that was added in 6.18. fsverity already does
this; this brings the same optimization to dm-verity. This increases
SHA-256 hashing throughput by up to 98% on x86_64 and 70% on arm64.
To make this possible, this series makes dm-verity use the SHA-256
library when the hash algorithm is SHA-256 (which it usually is). That
actually provides a ~2% performance improvement by itself, in addition
to the larger improvement from interleaved hashing on x86_64 and arm64.
As usual, I verified that 'verity-compat-test' still passes.
This series is targeting linux-dm/for-next.
Eric Biggers (4):
dm-verity: remove log message with shash driver name
dm-verity: use SHA-256 library for SHA-256
dm-verity: reduce scope of real and wanted digests
dm-verity: use 2-way interleaved SHA-256 hashing when supported
drivers/md/Kconfig | 1 +
drivers/md/dm-verity-fec.c | 21 ++--
drivers/md/dm-verity-fec.h | 5 +-
drivers/md/dm-verity-target.c | 203 +++++++++++++++++++++++++---------
drivers/md/dm-verity.h | 52 +++++----
5 files changed, 196 insertions(+), 86 deletions(-)
base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
--
2.51.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/4] dm-verity: remove log message with shash driver name
2025-10-14 21:16 [PATCH 0/4] Optimize dm-verity using 2-way interleaved SHA-256 hashing Eric Biggers
@ 2025-10-14 21:16 ` Eric Biggers
2025-10-14 21:16 ` [PATCH 2/4] dm-verity: use SHA-256 library for SHA-256 Eric Biggers
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Eric Biggers @ 2025-10-14 21:16 UTC (permalink / raw)
To: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka
Cc: linux-crypto, linux-kernel, Eric Biggers
I added this log message in commit bbf6a566920e ("dm verity: log the
hash algorithm implementation"), to help people debug issues where they
forgot to enable the architecture-optimized SHA-256 code in their
kconfig or accidentally enabled a slow hardware offload driver (such as
QCE) that overrode the faster CPU-accelerated code. However:
- The crypto layer now always enables the architecture-optimized SHA-1,
SHA-256, and SHA-512 code. Moreover, for simplicity the driver name
is now fixed at "sha1-lib", "sha256-lib", etc.
- dm-verity now uses crypto_shash instead of crypto_ahash, preventing
the mistake of accidentally using a slow driver such as QCE.
Therefore, this log message generally no longer provides useful
information. Remove it.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
drivers/md/dm-verity-target.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 66a00a8ccb398..20ddf560d22e3 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -1250,11 +1250,10 @@ static int verity_setup_hash_alg(struct dm_verity *v, const char *alg_name)
ti->error = "Cannot initialize hash function";
return PTR_ERR(shash);
}
v->shash_tfm = shash;
v->digest_size = crypto_shash_digestsize(shash);
- DMINFO("%s using \"%s\"", alg_name, crypto_shash_driver_name(shash));
if ((1 << v->hash_dev_block_bits) < v->digest_size * 2) {
ti->error = "Digest size too big";
return -EINVAL;
}
return 0;
--
2.51.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/4] dm-verity: use SHA-256 library for SHA-256
2025-10-14 21:16 [PATCH 0/4] Optimize dm-verity using 2-way interleaved SHA-256 hashing Eric Biggers
2025-10-14 21:16 ` [PATCH 1/4] dm-verity: remove log message with shash driver name Eric Biggers
@ 2025-10-14 21:16 ` Eric Biggers
2025-10-14 21:16 ` [PATCH 3/4] dm-verity: reduce scope of real and wanted digests Eric Biggers
2025-10-14 21:16 ` [PATCH 4/4] dm-verity: use 2-way interleaved SHA-256 hashing when supported Eric Biggers
3 siblings, 0 replies; 5+ messages in thread
From: Eric Biggers @ 2025-10-14 21:16 UTC (permalink / raw)
To: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka
Cc: linux-crypto, linux-kernel, Eric Biggers
When the hash algorithm is SHA-256 and the verity version is not 0, use
the SHA-256 library instead of crypto_shash.
This is a prerequisite for making dm-verity interleave the computation
of SHA-256 hashes for increased performance. That optimization is
available in the SHA-256 library but not in crypto_shash.
Even without interleaved hashing, switching to the library also slightly
improves performance by itself because it avoids the overhead of
crypto_shash, including indirect calls and other API overhead.
(Benchmark on x86_64, AMD Zen 5: hashing 4K blocks gets 2.1% faster.)
SHA-256 is by far the most common hash algorithm used with dm-verity.
It makes sense to optimize for the common case and fall back to the
generic crypto layer for uncommon cases, as suggested by Linus:
https://lore.kernel.org/r/CAHk-=wgp-fOSsZsYrbyzqCAfEvrt5jQs1jL-97Wc4seMNTUyng@mail.gmail.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
drivers/md/Kconfig | 1 +
drivers/md/dm-verity-target.c | 61 +++++++++++++++++++++++++++--------
drivers/md/dm-verity.h | 20 +++++++++---
3 files changed, 64 insertions(+), 18 deletions(-)
diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index 104aa53550905..cac4926fc3401 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -544,10 +544,11 @@ config DM_FLAKEY
config DM_VERITY
tristate "Verity target support"
depends on BLK_DEV_DM
select CRYPTO
select CRYPTO_HASH
+ select CRYPTO_LIB_SHA256
select DM_BUFIO
help
This device-mapper target creates a read-only device that
transparently validates the data on one underlying device against
a pre-generated tree of cryptographic checksums stored on a second
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 20ddf560d22e3..bba9810805631 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -115,23 +115,37 @@ static sector_t verity_position_at_level(struct dm_verity *v, sector_t block,
}
int verity_hash(struct dm_verity *v, struct dm_verity_io *io,
const u8 *data, size_t len, u8 *digest)
{
- struct shash_desc *desc = &io->hash_desc;
+ struct shash_desc *desc;
int r;
+ if (likely(v->use_sha256_lib)) {
+ struct sha256_ctx *ctx = &io->hash_ctx.sha256;
+
+ /*
+ * Fast path using SHA-256 library. This is enabled only for
+ * verity version 1, where the salt is at the beginning.
+ */
+ *ctx = *v->initial_hashstate.sha256;
+ sha256_update(ctx, data, len);
+ sha256_final(ctx, digest);
+ return 0;
+ }
+
+ desc = &io->hash_ctx.shash;
desc->tfm = v->shash_tfm;
- if (unlikely(v->initial_hashstate == NULL)) {
+ if (unlikely(v->initial_hashstate.shash == NULL)) {
/* Version 0: salt at end */
r = crypto_shash_init(desc) ?:
crypto_shash_update(desc, data, len) ?:
crypto_shash_update(desc, v->salt, v->salt_size) ?:
crypto_shash_final(desc, digest);
} else {
/* Version 1: salt at beginning */
- r = crypto_shash_import(desc, v->initial_hashstate) ?:
+ r = crypto_shash_import(desc, v->initial_hashstate.shash) ?:
crypto_shash_finup(desc, data, len, digest);
}
if (unlikely(r))
DMERR("Error hashing block: %d", r);
return r;
@@ -1002,11 +1016,11 @@ static void verity_dtr(struct dm_target *ti)
if (v->bufio)
dm_bufio_client_destroy(v->bufio);
kvfree(v->validated_blocks);
kfree(v->salt);
- kfree(v->initial_hashstate);
+ kfree(v->initial_hashstate.shash);
kfree(v->root_digest);
kfree(v->zero_digest);
verity_free_sig(v);
crypto_free_shash(v->shash_tfm);
@@ -1067,12 +1081,11 @@ static int verity_alloc_zero_digest(struct dm_verity *v)
v->zero_digest = kmalloc(v->digest_size, GFP_KERNEL);
if (!v->zero_digest)
return r;
- io = kmalloc(sizeof(*io) + crypto_shash_descsize(v->shash_tfm),
- GFP_KERNEL);
+ io = kmalloc(v->ti->per_io_data_size, GFP_KERNEL);
if (!io)
return r; /* verity_dtr will free zero_digest */
zero_data = kzalloc(1 << v->data_dev_block_bits, GFP_KERNEL);
@@ -1254,10 +1267,24 @@ static int verity_setup_hash_alg(struct dm_verity *v, const char *alg_name)
v->digest_size = crypto_shash_digestsize(shash);
if ((1 << v->hash_dev_block_bits) < v->digest_size * 2) {
ti->error = "Digest size too big";
return -EINVAL;
}
+ if (likely(v->version && strcmp(alg_name, "sha256") == 0)) {
+ /*
+ * Fast path: use the library API for reduced overhead and
+ * interleaved hashing support.
+ */
+ v->use_sha256_lib = true;
+ ti->per_io_data_size =
+ offsetofend(struct dm_verity_io, hash_ctx.sha256);
+ } else {
+ /* Fallback case: use the generic crypto API. */
+ ti->per_io_data_size =
+ offsetofend(struct dm_verity_io, hash_ctx.shash) +
+ crypto_shash_descsize(shash);
+ }
return 0;
}
static int verity_setup_salt_and_hashstate(struct dm_verity *v, const char *arg)
{
@@ -1274,28 +1301,39 @@ static int verity_setup_salt_and_hashstate(struct dm_verity *v, const char *arg)
hex2bin(v->salt, arg, v->salt_size)) {
ti->error = "Invalid salt";
return -EINVAL;
}
}
- if (v->version) { /* Version 1: salt at beginning */
+ if (likely(v->use_sha256_lib)) {
+ /* Implies version 1: salt at beginning */
+ v->initial_hashstate.sha256 =
+ kmalloc(sizeof(struct sha256_ctx), GFP_KERNEL);
+ if (!v->initial_hashstate.sha256) {
+ ti->error = "Cannot allocate initial hash state";
+ return -ENOMEM;
+ }
+ sha256_init(v->initial_hashstate.sha256);
+ sha256_update(v->initial_hashstate.sha256,
+ v->salt, v->salt_size);
+ } else if (v->version) { /* Version 1: salt at beginning */
SHASH_DESC_ON_STACK(desc, v->shash_tfm);
int r;
/*
* Compute the pre-salted hash state that can be passed to
* crypto_shash_import() for each block later.
*/
- v->initial_hashstate = kmalloc(
+ v->initial_hashstate.shash = kmalloc(
crypto_shash_statesize(v->shash_tfm), GFP_KERNEL);
- if (!v->initial_hashstate) {
+ if (!v->initial_hashstate.shash) {
ti->error = "Cannot allocate initial hash state";
return -ENOMEM;
}
desc->tfm = v->shash_tfm;
r = crypto_shash_init(desc) ?:
crypto_shash_update(desc, v->salt, v->salt_size) ?:
- crypto_shash_export(desc, v->initial_hashstate);
+ crypto_shash_export(desc, v->initial_hashstate.shash);
if (r) {
ti->error = "Cannot set up initial hash state";
return r;
}
}
@@ -1553,13 +1591,10 @@ static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
ti->error = "Cannot allocate workqueue";
r = -ENOMEM;
goto bad;
}
- ti->per_io_data_size = sizeof(struct dm_verity_io) +
- crypto_shash_descsize(v->shash_tfm);
-
r = verity_fec_ctr(v);
if (r)
goto bad;
ti->per_io_data_size = roundup(ti->per_io_data_size,
diff --git a/drivers/md/dm-verity.h b/drivers/md/dm-verity.h
index 6d141abd965c7..cdcee68a4bc0a 100644
--- a/drivers/md/dm-verity.h
+++ b/drivers/md/dm-verity.h
@@ -14,10 +14,11 @@
#include <linux/dm-io.h>
#include <linux/dm-bufio.h>
#include <linux/device-mapper.h>
#include <linux/interrupt.h>
#include <crypto/hash.h>
+#include <crypto/sha2.h>
#define DM_VERITY_MAX_LEVELS 63
enum verity_mode {
DM_VERITY_MODE_EIO,
@@ -40,11 +41,14 @@ struct dm_verity {
struct dm_bufio_client *bufio;
char *alg_name;
struct crypto_shash *shash_tfm;
u8 *root_digest; /* digest of the root block */
u8 *salt; /* salt: its size is salt_size */
- u8 *initial_hashstate; /* salted initial state, if version >= 1 */
+ union {
+ struct sha256_ctx *sha256; /* for use_sha256_lib=1 */
+ u8 *shash; /* for use_sha256_lib=0 */
+ } initial_hashstate; /* salted initial state, if version >= 1 */
u8 *zero_digest; /* digest for a zero block */
#ifdef CONFIG_SECURITY
u8 *root_digest_sig; /* signature of the root digest */
unsigned int sig_size; /* root digest signature size */
#endif /* CONFIG_SECURITY */
@@ -57,10 +61,11 @@ struct dm_verity {
unsigned char hash_per_block_bits; /* log2(hashes in hash block) */
unsigned char levels; /* the number of tree levels */
unsigned char version;
bool hash_failed:1; /* set if hash of any block failed */
bool use_bh_wq:1; /* try to verify in BH wq before normal work-queue */
+ bool use_sha256_lib:1; /* use SHA-256 library instead of generic crypto API */
unsigned int digest_size; /* digest size for the current hash algorithm */
enum verity_mode mode; /* mode for handling verification errors */
enum verity_mode error_mode;/* mode for handling I/O errors */
unsigned int corrupted_errs;/* Number of errors for corrupted blocks */
@@ -96,15 +101,20 @@ struct dm_verity_io {
u8 real_digest[HASH_MAX_DIGESTSIZE];
u8 want_digest[HASH_MAX_DIGESTSIZE];
/*
- * Temporary space for hashing. This is variable-length and must be at
- * the end of the struct. struct shash_desc is just the fixed part;
- * it's followed by a context of size crypto_shash_descsize(shash_tfm).
+ * Temporary space for hashing. Either sha256 or shash is used,
+ * depending on the value of use_sha256_lib. If shash is used,
+ * then this field is variable-length, with total size
+ * sizeof(struct shash_desc) + crypto_shash_descsize(shash_tfm).
+ * For this reason, this field must be the end of the struct.
*/
- struct shash_desc hash_desc;
+ union {
+ struct sha256_ctx sha256;
+ struct shash_desc shash;
+ } hash_ctx;
};
static inline u8 *verity_io_real_digest(struct dm_verity *v,
struct dm_verity_io *io)
{
--
2.51.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 3/4] dm-verity: reduce scope of real and wanted digests
2025-10-14 21:16 [PATCH 0/4] Optimize dm-verity using 2-way interleaved SHA-256 hashing Eric Biggers
2025-10-14 21:16 ` [PATCH 1/4] dm-verity: remove log message with shash driver name Eric Biggers
2025-10-14 21:16 ` [PATCH 2/4] dm-verity: use SHA-256 library for SHA-256 Eric Biggers
@ 2025-10-14 21:16 ` Eric Biggers
2025-10-14 21:16 ` [PATCH 4/4] dm-verity: use 2-way interleaved SHA-256 hashing when supported Eric Biggers
3 siblings, 0 replies; 5+ messages in thread
From: Eric Biggers @ 2025-10-14 21:16 UTC (permalink / raw)
To: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka
Cc: linux-crypto, linux-kernel, Eric Biggers
In preparation for supporting interleaved hashing where dm-verity will
need to keep track of the real and wanted digests for multiple data
blocks simultaneously, stop using the want_digest and real_digest fields
of struct dm_verity_io from so many different places. Specifically:
- Make various functions take want_digest as a parameter rather than
having it be implicitly passed via the struct dm_verity_io.
- Add a new tmp_digest field, and use this instead of real_digest when
computing a digest solely for the purpose of immediately checking it.
The result is that real_digest and want_digest are used only by
verity_verify_io().
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
drivers/md/dm-verity-fec.c | 21 +++++++++----------
drivers/md/dm-verity-fec.h | 5 +++--
drivers/md/dm-verity-target.c | 38 ++++++++++++++++++-----------------
drivers/md/dm-verity.h | 1 +
4 files changed, 34 insertions(+), 31 deletions(-)
diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index d382a390d39ab..301a9c01bf865 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -186,18 +186,17 @@ static int fec_decode_bufs(struct dm_verity *v, struct dm_verity_io *io,
/*
* Locate data block erasures using verity hashes.
*/
static int fec_is_erasure(struct dm_verity *v, struct dm_verity_io *io,
- u8 *want_digest, u8 *data)
+ const u8 *want_digest, const u8 *data)
{
if (unlikely(verity_hash(v, io, data, 1 << v->data_dev_block_bits,
- verity_io_real_digest(v, io))))
+ io->tmp_digest)))
return 0;
- return memcmp(verity_io_real_digest(v, io), want_digest,
- v->digest_size) != 0;
+ return memcmp(io->tmp_digest, want_digest, v->digest_size) != 0;
}
/*
* Read data blocks that are part of the RS block and deinterleave as much as
* fits into buffers. Check for erasure locations if @neras is non-NULL.
@@ -364,11 +363,11 @@ static void fec_init_bufs(struct dm_verity *v, struct dm_verity_fec_io *fio)
* (indicated by @offset) in fio->output. If @use_erasures is non-zero, uses
* hashes to locate erasures.
*/
static int fec_decode_rsb(struct dm_verity *v, struct dm_verity_io *io,
struct dm_verity_fec_io *fio, u64 rsb, u64 offset,
- bool use_erasures)
+ const u8 *want_digest, bool use_erasures)
{
int r, neras = 0;
unsigned int pos;
r = fec_alloc_bufs(v, fio);
@@ -390,27 +389,27 @@ static int fec_decode_rsb(struct dm_verity *v, struct dm_verity_io *io,
pos += fio->nbufs << DM_VERITY_FEC_BUF_RS_BITS;
}
/* Always re-validate the corrected block against the expected hash */
r = verity_hash(v, io, fio->output, 1 << v->data_dev_block_bits,
- verity_io_real_digest(v, io));
+ io->tmp_digest);
if (unlikely(r < 0))
return r;
- if (memcmp(verity_io_real_digest(v, io), verity_io_want_digest(v, io),
- v->digest_size)) {
+ if (memcmp(io->tmp_digest, want_digest, v->digest_size)) {
DMERR_LIMIT("%s: FEC %llu: failed to correct (%d erasures)",
v->data_dev->name, (unsigned long long)rsb, neras);
return -EILSEQ;
}
return 0;
}
/* Correct errors in a block. Copies corrected block to dest. */
int verity_fec_decode(struct dm_verity *v, struct dm_verity_io *io,
- enum verity_block_type type, sector_t block, u8 *dest)
+ enum verity_block_type type, const u8 *want_digest,
+ sector_t block, u8 *dest)
{
int r;
struct dm_verity_fec_io *fio = fec_io(io);
u64 offset, res, rsb;
@@ -449,13 +448,13 @@ int verity_fec_decode(struct dm_verity *v, struct dm_verity_io *io,
/*
* Locating erasures is slow, so attempt to recover the block without
* them first. Do a second attempt with erasures if the corruption is
* bad enough.
*/
- r = fec_decode_rsb(v, io, fio, rsb, offset, false);
+ r = fec_decode_rsb(v, io, fio, rsb, offset, want_digest, false);
if (r < 0) {
- r = fec_decode_rsb(v, io, fio, rsb, offset, true);
+ r = fec_decode_rsb(v, io, fio, rsb, offset, want_digest, true);
if (r < 0)
goto done;
}
memcpy(dest, fio->output, 1 << v->data_dev_block_bits);
diff --git a/drivers/md/dm-verity-fec.h b/drivers/md/dm-verity-fec.h
index 09123a6129538..a6689cdc489db 100644
--- a/drivers/md/dm-verity-fec.h
+++ b/drivers/md/dm-verity-fec.h
@@ -66,12 +66,12 @@ struct dm_verity_fec_io {
#define DM_VERITY_OPTS_FEC 8
extern bool verity_fec_is_enabled(struct dm_verity *v);
extern int verity_fec_decode(struct dm_verity *v, struct dm_verity_io *io,
- enum verity_block_type type, sector_t block,
- u8 *dest);
+ enum verity_block_type type, const u8 *want_digest,
+ sector_t block, u8 *dest);
extern unsigned int verity_fec_status_table(struct dm_verity *v, unsigned int sz,
char *result, unsigned int maxlen);
extern void verity_fec_finish_io(struct dm_verity_io *io);
@@ -97,10 +97,11 @@ static inline bool verity_fec_is_enabled(struct dm_verity *v)
}
static inline int verity_fec_decode(struct dm_verity *v,
struct dm_verity_io *io,
enum verity_block_type type,
+ const u8 *want_digest,
sector_t block, u8 *dest)
{
return -EOPNOTSUPP;
}
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index bba9810805631..af9f1544af3ea 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -227,16 +227,16 @@ static int verity_handle_err(struct dm_verity *v, enum verity_block_type type,
/*
* Verify hash of a metadata block pertaining to the specified data block
* ("block" argument) at a specified level ("level" argument).
*
- * On successful return, verity_io_want_digest(v, io) contains the hash value
- * for a lower tree level or for the data block (if we're at the lowest level).
+ * On successful return, want_digest contains the hash value for a lower tree
+ * level or for the data block (if we're at the lowest level).
*
* If "skip_unverified" is true, unverified buffer is skipped and 1 is returned.
* If "skip_unverified" is false, unverified buffer is hashed and verified
- * against current value of verity_io_want_digest(v, io).
+ * against current value of want_digest.
*/
static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
sector_t block, int level, bool skip_unverified,
u8 *want_digest)
{
@@ -271,11 +271,11 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
r = PTR_ERR(data);
data = dm_bufio_new(v->bufio, hash_block, &buf);
if (IS_ERR(data))
return r;
if (verity_fec_decode(v, io, DM_VERITY_BLOCK_TYPE_METADATA,
- hash_block, data) == 0) {
+ want_digest, hash_block, data) == 0) {
aux = dm_bufio_get_aux_data(buf);
aux->hash_verified = 1;
goto release_ok;
} else {
dm_bufio_release(buf);
@@ -291,26 +291,26 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
r = 1;
goto release_ret_r;
}
r = verity_hash(v, io, data, 1 << v->hash_dev_block_bits,
- verity_io_real_digest(v, io));
+ io->tmp_digest);
if (unlikely(r < 0))
goto release_ret_r;
- if (likely(memcmp(verity_io_real_digest(v, io), want_digest,
+ if (likely(memcmp(io->tmp_digest, want_digest,
v->digest_size) == 0))
aux->hash_verified = 1;
else if (static_branch_unlikely(&use_bh_wq_enabled) && io->in_bh) {
/*
* Error handling code (FEC included) cannot be run in a
* tasklet since it may sleep, so fallback to work-queue.
*/
r = -EAGAIN;
goto release_ret_r;
} else if (verity_fec_decode(v, io, DM_VERITY_BLOCK_TYPE_METADATA,
- hash_block, data) == 0)
+ want_digest, hash_block, data) == 0)
aux->hash_verified = 1;
else if (verity_handle_err(v,
DM_VERITY_BLOCK_TYPE_METADATA,
hash_block)) {
struct bio *bio;
@@ -370,11 +370,12 @@ int verity_hash_for_block(struct dm_verity *v, struct dm_verity_io *io,
return r;
}
static noinline int verity_recheck(struct dm_verity *v, struct dm_verity_io *io,
- sector_t cur_block, u8 *dest)
+ const u8 *want_digest, sector_t cur_block,
+ u8 *dest)
{
struct page *page;
void *buffer;
int r;
struct dm_io_request io_req;
@@ -394,16 +395,15 @@ static noinline int verity_recheck(struct dm_verity *v, struct dm_verity_io *io,
r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
goto free_ret;
r = verity_hash(v, io, buffer, 1 << v->data_dev_block_bits,
- verity_io_real_digest(v, io));
+ io->tmp_digest);
if (unlikely(r))
goto free_ret;
- if (memcmp(verity_io_real_digest(v, io),
- verity_io_want_digest(v, io), v->digest_size)) {
+ if (memcmp(io->tmp_digest, want_digest, v->digest_size)) {
r = -EIO;
goto free_ret;
}
memcpy(dest, buffer, 1 << v->data_dev_block_bits);
@@ -414,28 +414,29 @@ static noinline int verity_recheck(struct dm_verity *v, struct dm_verity_io *io,
return r;
}
static int verity_handle_data_hash_mismatch(struct dm_verity *v,
struct dm_verity_io *io,
- struct bio *bio, sector_t blkno,
- u8 *data)
+ struct bio *bio,
+ const u8 *want_digest,
+ sector_t blkno, u8 *data)
{
if (static_branch_unlikely(&use_bh_wq_enabled) && io->in_bh) {
/*
* Error handling code (FEC included) cannot be run in the
* BH workqueue, so fallback to a standard workqueue.
*/
return -EAGAIN;
}
- if (verity_recheck(v, io, blkno, data) == 0) {
+ if (verity_recheck(v, io, want_digest, blkno, data) == 0) {
if (v->validated_blocks)
set_bit(blkno, v->validated_blocks);
return 0;
}
#if defined(CONFIG_DM_VERITY_FEC)
- if (verity_fec_decode(v, io, DM_VERITY_BLOCK_TYPE_DATA, blkno,
- data) == 0)
+ if (verity_fec_decode(v, io, DM_VERITY_BLOCK_TYPE_DATA, want_digest,
+ blkno, data) == 0)
return 0;
#endif
if (bio->bi_status)
return -EIO; /* Error correction failed; Just return error */
@@ -523,12 +524,13 @@ static int verity_verify_io(struct dm_verity_io *io)
if (v->validated_blocks)
set_bit(cur_block, v->validated_blocks);
kunmap_local(data);
continue;
}
- r = verity_handle_data_hash_mismatch(v, io, bio, cur_block,
- data);
+ r = verity_handle_data_hash_mismatch(v, io, bio,
+ verity_io_want_digest(v, io),
+ cur_block, data);
kunmap_local(data);
if (unlikely(r))
return r;
}
diff --git a/drivers/md/dm-verity.h b/drivers/md/dm-verity.h
index cdcee68a4bc0a..cf7973ed30596 100644
--- a/drivers/md/dm-verity.h
+++ b/drivers/md/dm-verity.h
@@ -97,10 +97,11 @@ struct dm_verity_io {
bool had_mismatch;
struct work_struct work;
struct work_struct bh_work;
+ u8 tmp_digest[HASH_MAX_DIGESTSIZE];
u8 real_digest[HASH_MAX_DIGESTSIZE];
u8 want_digest[HASH_MAX_DIGESTSIZE];
/*
* Temporary space for hashing. Either sha256 or shash is used,
--
2.51.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 4/4] dm-verity: use 2-way interleaved SHA-256 hashing when supported
2025-10-14 21:16 [PATCH 0/4] Optimize dm-verity using 2-way interleaved SHA-256 hashing Eric Biggers
` (2 preceding siblings ...)
2025-10-14 21:16 ` [PATCH 3/4] dm-verity: reduce scope of real and wanted digests Eric Biggers
@ 2025-10-14 21:16 ` Eric Biggers
3 siblings, 0 replies; 5+ messages in thread
From: Eric Biggers @ 2025-10-14 21:16 UTC (permalink / raw)
To: dm-devel, Alasdair Kergon, Mike Snitzer, Mikulas Patocka
Cc: linux-crypto, linux-kernel, Eric Biggers
When the crypto library provides an optimized implementation of
sha256_finup_2x(), use it to interleave the hashing of pairs of data
blocks. On some CPUs this nearly doubles hashing performance. The
increase in overall throughput of cold-cache dm-verity reads that I'm
seeing on arm64 and x86_64 is roughly 35% (though this metric is hard to
measure as it jumps around a lot).
For now this is done only on data blocks, not Merkle tree blocks. We
could use sha256_finup_2x() on Merkle tree blocks too, but that is less
important as there aren't as many Merkle tree blocks as data blocks, and
that would require some additional code restructuring.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
drivers/md/dm-verity-target.c | 113 ++++++++++++++++++++++++++--------
drivers/md/dm-verity.h | 31 +++++-----
2 files changed, 103 insertions(+), 41 deletions(-)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index af9f1544af3ea..bf0aee73b074c 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -415,13 +415,16 @@ static noinline int verity_recheck(struct dm_verity *v, struct dm_verity_io *io,
}
static int verity_handle_data_hash_mismatch(struct dm_verity *v,
struct dm_verity_io *io,
struct bio *bio,
- const u8 *want_digest,
- sector_t blkno, u8 *data)
+ struct pending_block *block)
{
+ const u8 *want_digest = block->want_digest;
+ sector_t blkno = block->blkno;
+ u8 *data = block->data;
+
if (static_branch_unlikely(&use_bh_wq_enabled) && io->in_bh) {
/*
* Error handling code (FEC included) cannot be run in the
* BH workqueue, so fallback to a standard workqueue.
*/
@@ -446,21 +449,77 @@ static int verity_handle_data_hash_mismatch(struct dm_verity *v,
return -EIO;
}
return 0;
}
+static void verity_clear_pending_blocks(struct dm_verity_io *io)
+{
+ int i;
+
+ for (i = io->num_pending - 1; i >= 0; i--) {
+ kunmap_local(io->pending_blocks[i].data);
+ io->pending_blocks[i].data = NULL;
+ }
+ io->num_pending = 0;
+}
+
+static int verity_verify_pending_blocks(struct dm_verity *v,
+ struct dm_verity_io *io,
+ struct bio *bio)
+{
+ const unsigned int block_size = 1 << v->data_dev_block_bits;
+ int i, r;
+
+ if (io->num_pending == 2) {
+ /* num_pending == 2 implies that the algorithm is SHA-256 */
+ sha256_finup_2x(v->initial_hashstate.sha256,
+ io->pending_blocks[0].data,
+ io->pending_blocks[1].data, block_size,
+ io->pending_blocks[0].real_digest,
+ io->pending_blocks[1].real_digest);
+ } else {
+ for (i = 0; i < io->num_pending; i++) {
+ r = verity_hash(v, io, io->pending_blocks[i].data,
+ block_size,
+ io->pending_blocks[i].real_digest);
+ if (unlikely(r))
+ return r;
+ }
+ }
+
+ for (i = 0; i < io->num_pending; i++) {
+ struct pending_block *block = &io->pending_blocks[i];
+
+ if (likely(memcmp(block->real_digest, block->want_digest,
+ v->digest_size) == 0)) {
+ if (v->validated_blocks)
+ set_bit(block->blkno, v->validated_blocks);
+ } else {
+ r = verity_handle_data_hash_mismatch(v, io, bio, block);
+ if (unlikely(r))
+ return r;
+ }
+ }
+ verity_clear_pending_blocks(io);
+ return 0;
+}
+
/*
* Verify one "dm_verity_io" structure.
*/
static int verity_verify_io(struct dm_verity_io *io)
{
struct dm_verity *v = io->v;
const unsigned int block_size = 1 << v->data_dev_block_bits;
+ const int max_pending = v->use_sha256_finup_2x ? 2 : 1;
struct bvec_iter iter_copy;
struct bvec_iter *iter;
struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);
unsigned int b;
+ int r;
+
+ io->num_pending = 0;
if (static_branch_unlikely(&use_bh_wq_enabled) && io->in_bh) {
/*
* Copy the iterator in case we need to restart
* verification in a work-queue.
@@ -470,36 +529,38 @@ static int verity_verify_io(struct dm_verity_io *io)
} else
iter = &io->iter;
for (b = 0; b < io->n_blocks;
b++, bio_advance_iter(bio, iter, block_size)) {
- int r;
- sector_t cur_block = io->block + b;
+ sector_t blkno = io->block + b;
+ struct pending_block *block;
bool is_zero;
struct bio_vec bv;
void *data;
if (v->validated_blocks && bio->bi_status == BLK_STS_OK &&
- likely(test_bit(cur_block, v->validated_blocks)))
+ likely(test_bit(blkno, v->validated_blocks)))
continue;
- r = verity_hash_for_block(v, io, cur_block,
- verity_io_want_digest(v, io),
+ block = &io->pending_blocks[io->num_pending];
+
+ r = verity_hash_for_block(v, io, blkno, block->want_digest,
&is_zero);
if (unlikely(r < 0))
- return r;
+ goto error;
bv = bio_iter_iovec(bio, *iter);
if (unlikely(bv.bv_len < block_size)) {
/*
* Data block spans pages. This should not happen,
* since dm-verity sets dma_alignment to the data block
* size minus 1, and dm-verity also doesn't allow the
* data block size to be greater than PAGE_SIZE.
*/
DMERR_LIMIT("unaligned io (data block spans pages)");
- return -EIO;
+ r = -EIO;
+ goto error;
}
data = bvec_kmap_local(&bv);
if (is_zero) {
@@ -509,34 +570,30 @@ static int verity_verify_io(struct dm_verity_io *io)
*/
memset(data, 0, block_size);
kunmap_local(data);
continue;
}
-
- r = verity_hash(v, io, data, block_size,
- verity_io_real_digest(v, io));
- if (unlikely(r < 0)) {
- kunmap_local(data);
- return r;
+ block->data = data;
+ block->blkno = blkno;
+ if (++io->num_pending == max_pending) {
+ r = verity_verify_pending_blocks(v, io, bio);
+ if (unlikely(r))
+ goto error;
}
+ }
- if (likely(memcmp(verity_io_real_digest(v, io),
- verity_io_want_digest(v, io), v->digest_size) == 0)) {
- if (v->validated_blocks)
- set_bit(cur_block, v->validated_blocks);
- kunmap_local(data);
- continue;
- }
- r = verity_handle_data_hash_mismatch(v, io, bio,
- verity_io_want_digest(v, io),
- cur_block, data);
- kunmap_local(data);
+ if (io->num_pending) {
+ r = verity_verify_pending_blocks(v, io, bio);
if (unlikely(r))
- return r;
+ goto error;
}
return 0;
+
+error:
+ verity_clear_pending_blocks(io);
+ return r;
}
/*
* Skip verity work in response to I/O error when system is shutting down.
*/
@@ -1275,10 +1332,12 @@ static int verity_setup_hash_alg(struct dm_verity *v, const char *alg_name)
/*
* Fast path: use the library API for reduced overhead and
* interleaved hashing support.
*/
v->use_sha256_lib = true;
+ if (sha256_finup_2x_is_optimized())
+ v->use_sha256_finup_2x = true;
ti->per_io_data_size =
offsetofend(struct dm_verity_io, hash_ctx.sha256);
} else {
/* Fallback case: use the generic crypto API. */
ti->per_io_data_size =
diff --git a/drivers/md/dm-verity.h b/drivers/md/dm-verity.h
index cf7973ed30596..f975a9e5c5d6b 100644
--- a/drivers/md/dm-verity.h
+++ b/drivers/md/dm-verity.h
@@ -62,10 +62,11 @@ struct dm_verity {
unsigned char levels; /* the number of tree levels */
unsigned char version;
bool hash_failed:1; /* set if hash of any block failed */
bool use_bh_wq:1; /* try to verify in BH wq before normal work-queue */
bool use_sha256_lib:1; /* use SHA-256 library instead of generic crypto API */
+ bool use_sha256_finup_2x:1; /* use interleaved hashing optimization */
unsigned int digest_size; /* digest size for the current hash algorithm */
enum verity_mode mode; /* mode for handling verification errors */
enum verity_mode error_mode;/* mode for handling I/O errors */
unsigned int corrupted_errs;/* Number of errors for corrupted blocks */
@@ -81,10 +82,17 @@ struct dm_verity {
struct dm_io_client *io;
mempool_t recheck_pool;
};
+struct pending_block {
+ void *data;
+ sector_t blkno;
+ u8 want_digest[HASH_MAX_DIGESTSIZE];
+ u8 real_digest[HASH_MAX_DIGESTSIZE];
+};
+
struct dm_verity_io {
struct dm_verity *v;
/* original value of bio->bi_end_io */
bio_end_io_t *orig_bi_end_io;
@@ -98,12 +106,19 @@ struct dm_verity_io {
struct work_struct work;
struct work_struct bh_work;
u8 tmp_digest[HASH_MAX_DIGESTSIZE];
- u8 real_digest[HASH_MAX_DIGESTSIZE];
- u8 want_digest[HASH_MAX_DIGESTSIZE];
+
+ /*
+ * This is the queue of data blocks that are pending verification. When
+ * the crypto layer supports interleaved hashing, we allow multiple
+ * blocks to be queued up in order to utilize it. This can improve
+ * performance significantly vs. sequential hashing of each block.
+ */
+ int num_pending;
+ struct pending_block pending_blocks[2];
/*
* Temporary space for hashing. Either sha256 or shash is used,
* depending on the value of use_sha256_lib. If shash is used,
* then this field is variable-length, with total size
@@ -114,22 +129,10 @@ struct dm_verity_io {
struct sha256_ctx sha256;
struct shash_desc shash;
} hash_ctx;
};
-static inline u8 *verity_io_real_digest(struct dm_verity *v,
- struct dm_verity_io *io)
-{
- return io->real_digest;
-}
-
-static inline u8 *verity_io_want_digest(struct dm_verity *v,
- struct dm_verity_io *io)
-{
- return io->want_digest;
-}
-
extern int verity_hash(struct dm_verity *v, struct dm_verity_io *io,
const u8 *data, size_t len, u8 *digest);
extern int verity_hash_for_block(struct dm_verity *v, struct dm_verity_io *io,
sector_t block, u8 *digest, bool *is_zero);
--
2.51.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-10-14 21:17 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-14 21:16 [PATCH 0/4] Optimize dm-verity using 2-way interleaved SHA-256 hashing Eric Biggers
2025-10-14 21:16 ` [PATCH 1/4] dm-verity: remove log message with shash driver name Eric Biggers
2025-10-14 21:16 ` [PATCH 2/4] dm-verity: use SHA-256 library for SHA-256 Eric Biggers
2025-10-14 21:16 ` [PATCH 3/4] dm-verity: reduce scope of real and wanted digests Eric Biggers
2025-10-14 21:16 ` [PATCH 4/4] dm-verity: use 2-way interleaved SHA-256 hashing when supported Eric Biggers
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.