* [v2 PATCH 00/11] Multibuffer hashing take two
From: Herbert Xu @ 2025-02-16 3:07 UTC
To: Linux Crypto Mailing List
Cc: Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
This patch series introduces two additions to the ahash interface.
First, request chaining is added so that an arbitrary number of
requests can be submitted in one go. Incidentally, this also
amortises the cost of indirect calls across the chained requests.
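For callers, the chaining API ends up looking roughly like this (a
minimal sketch distilled from the tcrypt and testmgr conversions later
in the series; the request, SG and buffer names are illustrative):

    /*
     * Sketch only: submit two digest requests in one go.  Assumes
     * req0 and req1 were allocated with ahash_request_alloc() on the
     * same tfm and that a DECLARE_CRYPTO_WAIT(wait) exists.
     */
    ahash_request_set_callback(req0, 0, crypto_req_done, &wait);
    ahash_request_set_crypt(req0, sg0, result0, len0);

    ahash_request_set_callback(req1, 0, NULL, NULL);
    ahash_request_set_crypt(req1, sg1, result1, len1);
    ahash_request_chain(req1, req0);        /* link req1 behind req0 */

    err = crypto_wait_req(crypto_ahash_digest(req0), &wait);
    if (!err)
        err = ahash_request_err(req1);      /* per-request status */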
It then adds virtual address support to ahash. This allows the
user to supply a virtual address as the input instead of an SG
list. The address is assumed not to be DMA-capable, so the data is
always copied before it is passed to an existing ahash driver. New
drivers can elect to take virtual addresses directly. Existing
shash algorithms can, of course, take virtual addresses without any
copying.

The next patch resurrects the old SHA2 AVX2 multibuffer code as
a proof of concept that this API works. The result shows that with
a full complement of 8 requests, this API achieves parity with
the more modern but single-threaded SHA-NI code. It passes the
multibuffer fuzz tests.

Finally, introduce a sync hash interface similar to the sync
skcipher interface. This will replace the shash interface for users.
Use it in fsverity and enable multibuffer hashing.
Eric Biggers (1):
fsverity: improve performance by using multibuffer hashing
Herbert Xu (10):
crypto: ahash - Only save callback and data in ahash_save_req
crypto: x86/ghash - Use proper helpers to clone request
crypto: hash - Add request chaining API
crypto: tcrypt - Restore multibuffer ahash tests
crypto: ahash - Add virtual address support
crypto: ahash - Set default reqsize from ahash_alg
crypto: testmgr - Add multibuffer hash testing
crypto: x86/sha2 - Restore multibuffer AVX2 support
crypto: hash - Add sync hash interface
fsverity: Use sync hash instead of shash
arch/x86/crypto/Makefile | 2 +-
arch/x86/crypto/ghash-clmulni-intel_glue.c | 23 +-
arch/x86/crypto/sha256_mb_mgr_datastruct.S | 304 +++++++++++
arch/x86/crypto/sha256_ssse3_glue.c | 540 ++++++++++++++++--
arch/x86/crypto/sha256_x8_avx2.S | 598 ++++++++++++++++++++
crypto/ahash.c | 605 ++++++++++++++++++---
crypto/algapi.c | 2 +-
crypto/tcrypt.c | 231 ++++++++
crypto/testmgr.c | 132 ++++-
fs/verity/fsverity_private.h | 4 +-
fs/verity/hash_algs.c | 41 +-
fs/verity/verify.c | 179 +++++-
include/crypto/algapi.h | 11 +
include/crypto/hash.h | 172 +++++-
include/crypto/internal/hash.h | 17 +-
include/linux/crypto.h | 24 +
16 files changed, 2659 insertions(+), 226 deletions(-)
create mode 100644 arch/x86/crypto/sha256_mb_mgr_datastruct.S
create mode 100644 arch/x86/crypto/sha256_x8_avx2.S
--
2.39.5
* [v2 PATCH 01/11] crypto: ahash - Only save callback and data in ahash_save_req
From: Herbert Xu @ 2025-02-16 3:07 UTC
To: Linux Crypto Mailing List
Cc: Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
As unaligned operations are supported by the underlying algorithm,
ahash_save_req and ahash_restore_req can be greatly simplified to
preserve only the callback and data.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
crypto/ahash.c | 97 ++++++++++++++++---------------------------
include/crypto/hash.h | 3 --
2 files changed, 35 insertions(+), 65 deletions(-)
diff --git a/crypto/ahash.c b/crypto/ahash.c
index bcd9de009a91..c8e7327c6949 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -27,6 +27,12 @@
#define CRYPTO_ALG_TYPE_AHASH_MASK 0x0000000e
+struct ahash_save_req_state {
+ struct ahash_request *req;
+ crypto_completion_t compl;
+ void *data;
+};
+
/*
* For an ahash tfm that is using an shash algorithm (instead of an ahash
* algorithm), this returns the underlying shash tfm.
@@ -262,67 +268,34 @@ int crypto_ahash_init(struct ahash_request *req)
}
EXPORT_SYMBOL_GPL(crypto_ahash_init);
-static int ahash_save_req(struct ahash_request *req, crypto_completion_t cplt,
- bool has_state)
+static int ahash_save_req(struct ahash_request *req, crypto_completion_t cplt)
{
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
- unsigned int ds = crypto_ahash_digestsize(tfm);
- struct ahash_request *subreq;
- unsigned int subreq_size;
- unsigned int reqsize;
- u8 *result;
+ struct ahash_save_req_state *state;
gfp_t gfp;
u32 flags;
- subreq_size = sizeof(*subreq);
- reqsize = crypto_ahash_reqsize(tfm);
- reqsize = ALIGN(reqsize, crypto_tfm_ctx_alignment());
- subreq_size += reqsize;
- subreq_size += ds;
-
flags = ahash_request_flags(req);
gfp = (flags & CRYPTO_TFM_REQ_MAY_SLEEP) ? GFP_KERNEL : GFP_ATOMIC;
- subreq = kmalloc(subreq_size, gfp);
- if (!subreq)
+ state = kmalloc(sizeof(*state), gfp);
+ if (!state)
return -ENOMEM;
- ahash_request_set_tfm(subreq, tfm);
- ahash_request_set_callback(subreq, flags, cplt, req);
-
- result = (u8 *)(subreq + 1) + reqsize;
-
- ahash_request_set_crypt(subreq, req->src, result, req->nbytes);
-
- if (has_state) {
- void *state;
-
- state = kmalloc(crypto_ahash_statesize(tfm), gfp);
- if (!state) {
- kfree(subreq);
- return -ENOMEM;
- }
-
- crypto_ahash_export(req, state);
- crypto_ahash_import(subreq, state);
- kfree_sensitive(state);
- }
-
- req->priv = subreq;
+ state->compl = req->base.complete;
+ state->data = req->base.data;
+ req->base.complete = cplt;
+ req->base.data = state;
+ state->req = req;
return 0;
}
-static void ahash_restore_req(struct ahash_request *req, int err)
+static void ahash_restore_req(struct ahash_request *req)
{
- struct ahash_request *subreq = req->priv;
+ struct ahash_save_req_state *state = req->base.data;
- if (!err)
- memcpy(req->result, subreq->result,
- crypto_ahash_digestsize(crypto_ahash_reqtfm(req)));
-
- req->priv = NULL;
-
- kfree_sensitive(subreq);
+ req->base.complete = state->compl;
+ req->base.data = state->data;
+ kfree(state);
}
int crypto_ahash_update(struct ahash_request *req)
@@ -374,51 +347,51 @@ EXPORT_SYMBOL_GPL(crypto_ahash_digest);
static void ahash_def_finup_done2(void *data, int err)
{
- struct ahash_request *areq = data;
+ struct ahash_save_req_state *state = data;
+ struct ahash_request *areq = state->req;
if (err == -EINPROGRESS)
return;
- ahash_restore_req(areq, err);
-
+ ahash_restore_req(areq);
ahash_request_complete(areq, err);
}
static int ahash_def_finup_finish1(struct ahash_request *req, int err)
{
- struct ahash_request *subreq = req->priv;
-
if (err)
goto out;
- subreq->base.complete = ahash_def_finup_done2;
+ req->base.complete = ahash_def_finup_done2;
- err = crypto_ahash_alg(crypto_ahash_reqtfm(req))->final(subreq);
+ err = crypto_ahash_alg(crypto_ahash_reqtfm(req))->final(req);
if (err == -EINPROGRESS || err == -EBUSY)
return err;
out:
- ahash_restore_req(req, err);
+ ahash_restore_req(req);
return err;
}
static void ahash_def_finup_done1(void *data, int err)
{
- struct ahash_request *areq = data;
- struct ahash_request *subreq;
+ struct ahash_save_req_state *state0 = data;
+ struct ahash_save_req_state state;
+ struct ahash_request *areq;
+ state = *state0;
+ areq = state.req;
if (err == -EINPROGRESS)
goto out;
- subreq = areq->priv;
- subreq->base.flags &= CRYPTO_TFM_REQ_MAY_BACKLOG;
+ areq->base.flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
err = ahash_def_finup_finish1(areq, err);
if (err == -EINPROGRESS || err == -EBUSY)
return;
out:
- ahash_request_complete(areq, err);
+ state.compl(state.data, err);
}
static int ahash_def_finup(struct ahash_request *req)
@@ -426,11 +399,11 @@ static int ahash_def_finup(struct ahash_request *req)
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
int err;
- err = ahash_save_req(req, ahash_def_finup_done1, true);
+ err = ahash_save_req(req, ahash_def_finup_done1);
if (err)
return err;
- err = crypto_ahash_alg(tfm)->update(req->priv);
+ err = crypto_ahash_alg(tfm)->update(req);
if (err == -EINPROGRESS || err == -EBUSY)
return err;
diff --git a/include/crypto/hash.h b/include/crypto/hash.h
index 2d5ea9f9ff43..9c1f8ca59a77 100644
--- a/include/crypto/hash.h
+++ b/include/crypto/hash.h
@@ -55,9 +55,6 @@ struct ahash_request {
struct scatterlist *src;
u8 *result;
- /* This field may only be used by the ahash API code. */
- void *priv;
-
void *__ctx[] CRYPTO_MINALIGN_ATTR;
};
--
2.39.5
* [v2 PATCH 02/11] crypto: x86/ghash - Use proper helpers to clone request
From: Herbert Xu @ 2025-02-16 3:07 UTC
To: Linux Crypto Mailing List
Cc: Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
Rather than copying a request by hand with memcpy, use the correct
API helpers to set up the new request. This will matter once the
API helpers start setting up chained requests, as a simple memcpy
would break chaining.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
arch/x86/crypto/ghash-clmulni-intel_glue.c | 23 ++++++++++++++++------
1 file changed, 17 insertions(+), 6 deletions(-)
diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index 41bc02e48916..c759ec808bf1 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -189,6 +189,20 @@ static int ghash_async_init(struct ahash_request *req)
return crypto_shash_init(desc);
}
+static void ghash_init_cryptd_req(struct ahash_request *req)
+{
+ struct ahash_request *cryptd_req = ahash_request_ctx(req);
+ struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+ struct ghash_async_ctx *ctx = crypto_ahash_ctx(tfm);
+ struct cryptd_ahash *cryptd_tfm = ctx->cryptd_tfm;
+
+ ahash_request_set_tfm(cryptd_req, &cryptd_tfm->base);
+ ahash_request_set_callback(cryptd_req, req->base.flags,
+ req->base.complete, req->base.data);
+ ahash_request_set_crypt(cryptd_req, req->src, req->result,
+ req->nbytes);
+}
+
static int ghash_async_update(struct ahash_request *req)
{
struct ahash_request *cryptd_req = ahash_request_ctx(req);
@@ -198,8 +212,7 @@ static int ghash_async_update(struct ahash_request *req)
if (!crypto_simd_usable() ||
(in_atomic() && cryptd_ahash_queued(cryptd_tfm))) {
- memcpy(cryptd_req, req, sizeof(*req));
- ahash_request_set_tfm(cryptd_req, &cryptd_tfm->base);
+ ghash_init_cryptd_req(req);
return crypto_ahash_update(cryptd_req);
} else {
struct shash_desc *desc = cryptd_shash_desc(cryptd_req);
@@ -216,8 +229,7 @@ static int ghash_async_final(struct ahash_request *req)
if (!crypto_simd_usable() ||
(in_atomic() && cryptd_ahash_queued(cryptd_tfm))) {
- memcpy(cryptd_req, req, sizeof(*req));
- ahash_request_set_tfm(cryptd_req, &cryptd_tfm->base);
+ ghash_init_cryptd_req(req);
return crypto_ahash_final(cryptd_req);
} else {
struct shash_desc *desc = cryptd_shash_desc(cryptd_req);
@@ -257,8 +269,7 @@ static int ghash_async_digest(struct ahash_request *req)
if (!crypto_simd_usable() ||
(in_atomic() && cryptd_ahash_queued(cryptd_tfm))) {
- memcpy(cryptd_req, req, sizeof(*req));
- ahash_request_set_tfm(cryptd_req, &cryptd_tfm->base);
+ ghash_init_cryptd_req(req);
return crypto_ahash_digest(cryptd_req);
} else {
struct shash_desc *desc = cryptd_shash_desc(cryptd_req);
--
2.39.5
* [v2 PATCH 03/11] crypto: hash - Add request chaining API
From: Herbert Xu @ 2025-02-16 3:07 UTC
To: Linux Crypto Mailing List
Cc: Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
This adds request chaining to the ahash interface. Request chaining
allows multiple requests to be submitted in one shot. An algorithm
can elect to receive chained requests by setting the flag
CRYPTO_ALG_REQ_CHAIN. If this bit is not set, the API will break
up chained requests and submit them one by one.

A new err field is added to struct crypto_async_request to record
the return value for each individual request.
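For illustration, a driver that wants to see whole chains would
advertise the flag roughly as follows (a sketch only; the example_*
hooks and driver name are invented, with sha256 constants standing in
for a real algorithm):

    static struct ahash_alg example_mb_alg = {
        .init           = example_init,
        .update         = example_update,
        .final          = example_final,
        .finup          = example_finup,
        .digest         = example_digest,
        .halg = {
            .digestsize = SHA256_DIGEST_SIZE,
            .statesize  = sizeof(struct sha256_state),
            .base = {
                .cra_name        = "sha256",
                .cra_driver_name = "sha256-example-mb",
                .cra_priority    = 150,
                /* Opt in: ->update et al. may now receive a
                 * request with further requests linked behind
                 * it.  Without the flag the API unrolls the
                 * chain and calls the driver once per request.
                 */
                .cra_flags       = CRYPTO_ALG_ASYNC |
                                   CRYPTO_ALG_REQ_CHAIN,
                .cra_blocksize   = SHA256_BLOCK_SIZE,
                .cra_module      = THIS_MODULE,
            },
        },
    };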
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
crypto/ahash.c | 261 +++++++++++++++++++++++++++++----
crypto/algapi.c | 2 +-
include/crypto/algapi.h | 11 ++
include/crypto/hash.h | 28 ++--
include/crypto/internal/hash.h | 10 ++
include/linux/crypto.h | 24 +++
6 files changed, 299 insertions(+), 37 deletions(-)
diff --git a/crypto/ahash.c b/crypto/ahash.c
index c8e7327c6949..0546835f7304 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -28,11 +28,19 @@
#define CRYPTO_ALG_TYPE_AHASH_MASK 0x0000000e
struct ahash_save_req_state {
- struct ahash_request *req;
+ struct list_head head;
+ struct ahash_request *req0;
+ struct ahash_request *cur;
+ int (*op)(struct ahash_request *req);
crypto_completion_t compl;
void *data;
};
+static void ahash_reqchain_done(void *data, int err);
+static int ahash_save_req(struct ahash_request *req, crypto_completion_t cplt);
+static void ahash_restore_req(struct ahash_request *req);
+static int ahash_def_finup(struct ahash_request *req);
+
/*
* For an ahash tfm that is using an shash algorithm (instead of an ahash
* algorithm), this returns the underlying shash tfm.
@@ -256,24 +264,145 @@ int crypto_ahash_setkey(struct crypto_ahash *tfm, const u8 *key,
}
EXPORT_SYMBOL_GPL(crypto_ahash_setkey);
+static int ahash_reqchain_finish(struct ahash_save_req_state *state,
+ int err, u32 mask)
+{
+ struct ahash_request *req0 = state->req0;
+ struct ahash_request *req = state->cur;
+ struct ahash_request *n;
+
+ req->base.err = err;
+
+ if (req != req0)
+ list_add_tail(&req->base.list, &req0->base.list);
+
+ list_for_each_entry_safe(req, n, &state->head, base.list) {
+ list_del_init(&req->base.list);
+
+ req->base.flags &= mask;
+ req->base.complete = ahash_reqchain_done;
+ req->base.data = state;
+ state->cur = req;
+ err = state->op(req);
+
+ if (err == -EINPROGRESS) {
+ if (!list_empty(&state->head))
+ err = -EBUSY;
+ goto out;
+ }
+
+ if (err == -EBUSY)
+ goto out;
+
+ req->base.err = err;
+ list_add_tail(&req->base.list, &req0->base.list);
+ }
+
+ ahash_restore_req(req0);
+
+out:
+ return err;
+}
+
+static void ahash_reqchain_done(void *data, int err)
+{
+ struct ahash_save_req_state *state = data;
+ crypto_completion_t compl = state->compl;
+
+ data = state->data;
+
+ if (err == -EINPROGRESS) {
+ if (!list_empty(&state->head))
+ return;
+ goto notify;
+ }
+
+ err = ahash_reqchain_finish(state, err, CRYPTO_TFM_REQ_MAY_BACKLOG);
+ if (err == -EBUSY)
+ return;
+
+notify:
+ compl(data, err);
+}
+
+static int ahash_do_req_chain(struct ahash_request *req,
+ int (*op)(struct ahash_request *req))
+{
+ struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+ struct ahash_save_req_state *state;
+ struct ahash_save_req_state state0;
+ int err;
+
+ if (!ahash_request_chained(req) || crypto_ahash_req_chain(tfm))
+ return op(req);
+
+ state = &state0;
+
+ if (ahash_is_async(tfm)) {
+ err = ahash_save_req(req, ahash_reqchain_done);
+ if (err) {
+ struct ahash_request *r2;
+
+ req->base.err = err;
+ list_for_each_entry(r2, &req->base.list, base.list)
+ r2->base.err = err;
+
+ return err;
+ }
+
+ state = req->base.data;
+ }
+
+ state->op = op;
+ state->cur = req;
+ INIT_LIST_HEAD(&state->head);
+ list_splice_init(&req->base.list, &state->head);
+
+ err = op(req);
+ if (err == -EBUSY || err == -EINPROGRESS)
+ return -EBUSY;
+
+ return ahash_reqchain_finish(state, err, ~0);
+}
+
int crypto_ahash_init(struct ahash_request *req)
{
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
- if (likely(tfm->using_shash))
- return crypto_shash_init(prepare_shash_desc(req, tfm));
+ if (likely(tfm->using_shash)) {
+ struct ahash_request *r2;
+ int err;
+
+ err = crypto_shash_init(prepare_shash_desc(req, tfm));
+ req->base.err = err;
+
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ struct shash_desc *desc;
+
+ desc = prepare_shash_desc(r2, tfm);
+ r2->base.err = crypto_shash_init(desc);
+ }
+
+ return err;
+ }
+
if (crypto_ahash_get_flags(tfm) & CRYPTO_TFM_NEED_KEY)
return -ENOKEY;
- return crypto_ahash_alg(tfm)->init(req);
+
+ return ahash_do_req_chain(req, crypto_ahash_alg(tfm)->init);
}
EXPORT_SYMBOL_GPL(crypto_ahash_init);
static int ahash_save_req(struct ahash_request *req, crypto_completion_t cplt)
{
+ struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
struct ahash_save_req_state *state;
gfp_t gfp;
u32 flags;
+ if (!ahash_is_async(tfm))
+ return 0;
+
flags = ahash_request_flags(req);
gfp = (flags & CRYPTO_TFM_REQ_MAY_SLEEP) ? GFP_KERNEL : GFP_ATOMIC;
state = kmalloc(sizeof(*state), gfp);
@@ -284,14 +413,20 @@ static int ahash_save_req(struct ahash_request *req, crypto_completion_t cplt)
state->data = req->base.data;
req->base.complete = cplt;
req->base.data = state;
- state->req = req;
+ state->req0 = req;
return 0;
}
static void ahash_restore_req(struct ahash_request *req)
{
- struct ahash_save_req_state *state = req->base.data;
+ struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+ struct ahash_save_req_state *state;
+
+ if (!ahash_is_async(tfm))
+ return;
+
+ state = req->base.data;
req->base.complete = state->compl;
req->base.data = state->data;
@@ -302,10 +437,24 @@ int crypto_ahash_update(struct ahash_request *req)
{
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
- if (likely(tfm->using_shash))
- return shash_ahash_update(req, ahash_request_ctx(req));
+ if (likely(tfm->using_shash)) {
+ struct ahash_request *r2;
+ int err;
- return crypto_ahash_alg(tfm)->update(req);
+ err = shash_ahash_update(req, ahash_request_ctx(req));
+ req->base.err = err;
+
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ struct shash_desc *desc;
+
+ desc = ahash_request_ctx(r2);
+ r2->base.err = shash_ahash_update(r2, desc);
+ }
+
+ return err;
+ }
+
+ return ahash_do_req_chain(req, crypto_ahash_alg(tfm)->update);
}
EXPORT_SYMBOL_GPL(crypto_ahash_update);
@@ -313,10 +462,24 @@ int crypto_ahash_final(struct ahash_request *req)
{
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
- if (likely(tfm->using_shash))
- return crypto_shash_final(ahash_request_ctx(req), req->result);
+ if (likely(tfm->using_shash)) {
+ struct ahash_request *r2;
+ int err;
- return crypto_ahash_alg(tfm)->final(req);
+ err = crypto_shash_final(ahash_request_ctx(req), req->result);
+ req->base.err = err;
+
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ struct shash_desc *desc;
+
+ desc = ahash_request_ctx(r2);
+ r2->base.err = crypto_shash_final(desc, r2->result);
+ }
+
+ return err;
+ }
+
+ return ahash_do_req_chain(req, crypto_ahash_alg(tfm)->final);
}
EXPORT_SYMBOL_GPL(crypto_ahash_final);
@@ -324,10 +487,27 @@ int crypto_ahash_finup(struct ahash_request *req)
{
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
- if (likely(tfm->using_shash))
- return shash_ahash_finup(req, ahash_request_ctx(req));
+ if (likely(tfm->using_shash)) {
+ struct ahash_request *r2;
+ int err;
- return crypto_ahash_alg(tfm)->finup(req);
+ err = shash_ahash_finup(req, ahash_request_ctx(req));
+ req->base.err = err;
+
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ struct shash_desc *desc;
+
+ desc = ahash_request_ctx(r2);
+ r2->base.err = shash_ahash_finup(r2, desc);
+ }
+
+ return err;
+ }
+
+ if (!crypto_ahash_alg(tfm)->finup)
+ return ahash_def_finup(req);
+
+ return ahash_do_req_chain(req, crypto_ahash_alg(tfm)->finup);
}
EXPORT_SYMBOL_GPL(crypto_ahash_finup);
@@ -335,20 +515,34 @@ int crypto_ahash_digest(struct ahash_request *req)
{
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
- if (likely(tfm->using_shash))
- return shash_ahash_digest(req, prepare_shash_desc(req, tfm));
+ if (likely(tfm->using_shash)) {
+ struct ahash_request *r2;
+ int err;
+
+ err = shash_ahash_digest(req, prepare_shash_desc(req, tfm));
+ req->base.err = err;
+
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ struct shash_desc *desc;
+
+ desc = prepare_shash_desc(r2, tfm);
+ r2->base.err = shash_ahash_digest(r2, desc);
+ }
+
+ return err;
+ }
if (crypto_ahash_get_flags(tfm) & CRYPTO_TFM_NEED_KEY)
return -ENOKEY;
- return crypto_ahash_alg(tfm)->digest(req);
+ return ahash_do_req_chain(req, crypto_ahash_alg(tfm)->digest);
}
EXPORT_SYMBOL_GPL(crypto_ahash_digest);
static void ahash_def_finup_done2(void *data, int err)
{
struct ahash_save_req_state *state = data;
- struct ahash_request *areq = state->req;
+ struct ahash_request *areq = state->req0;
if (err == -EINPROGRESS)
return;
@@ -359,12 +553,15 @@ static void ahash_def_finup_done2(void *data, int err)
static int ahash_def_finup_finish1(struct ahash_request *req, int err)
{
+ struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+
if (err)
goto out;
- req->base.complete = ahash_def_finup_done2;
+ if (ahash_is_async(tfm))
+ req->base.complete = ahash_def_finup_done2;
- err = crypto_ahash_alg(crypto_ahash_reqtfm(req))->final(req);
+ err = crypto_ahash_final(req);
if (err == -EINPROGRESS || err == -EBUSY)
return err;
@@ -380,7 +577,7 @@ static void ahash_def_finup_done1(void *data, int err)
struct ahash_request *areq;
state = *state0;
- areq = state.req;
+ areq = state.req0;
if (err == -EINPROGRESS)
goto out;
@@ -396,14 +593,13 @@ static void ahash_def_finup_done1(void *data, int err)
static int ahash_def_finup(struct ahash_request *req)
{
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
int err;
err = ahash_save_req(req, ahash_def_finup_done1);
if (err)
return err;
- err = crypto_ahash_alg(tfm)->update(req);
+ err = crypto_ahash_update(req);
if (err == -EINPROGRESS || err == -EBUSY)
return err;
@@ -618,8 +814,6 @@ static int ahash_prepare_alg(struct ahash_alg *alg)
base->cra_type = &crypto_ahash_type;
base->cra_flags |= CRYPTO_ALG_TYPE_AHASH;
- if (!alg->finup)
- alg->finup = ahash_def_finup;
if (!alg->setkey)
alg->setkey = ahash_nosetkey;
@@ -690,5 +884,20 @@ int ahash_register_instance(struct crypto_template *tmpl,
}
EXPORT_SYMBOL_GPL(ahash_register_instance);
+void ahash_request_free(struct ahash_request *req)
+{
+ struct ahash_request *tmp;
+ struct ahash_request *r2;
+
+ if (unlikely(!req))
+ return;
+
+ list_for_each_entry_safe(r2, tmp, &req->base.list, base.list)
+ kfree_sensitive(r2);
+
+ kfree_sensitive(req);
+}
+EXPORT_SYMBOL_GPL(ahash_request_free);
+
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Asynchronous cryptographic hash type");
diff --git a/crypto/algapi.c b/crypto/algapi.c
index 5318c214debb..e7a9a2ada2cf 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -955,7 +955,7 @@ struct crypto_async_request *crypto_dequeue_request(struct crypto_queue *queue)
queue->backlog = queue->backlog->next;
request = queue->list.next;
- list_del(request);
+ list_del_init(request);
return list_entry(request, struct crypto_async_request, list);
}
diff --git a/include/crypto/algapi.h b/include/crypto/algapi.h
index 156de41ca760..11065978d360 100644
--- a/include/crypto/algapi.h
+++ b/include/crypto/algapi.h
@@ -11,6 +11,7 @@
#include <linux/align.h>
#include <linux/cache.h>
#include <linux/crypto.h>
+#include <linux/list.h>
#include <linux/types.h>
#include <linux/workqueue.h>
@@ -271,4 +272,14 @@ static inline u32 crypto_tfm_alg_type(struct crypto_tfm *tfm)
return tfm->__crt_alg->cra_flags & CRYPTO_ALG_TYPE_MASK;
}
+static inline bool crypto_request_chained(struct crypto_async_request *req)
+{
+ return !list_empty(&req->list);
+}
+
+static inline bool crypto_tfm_req_chain(struct crypto_tfm *tfm)
+{
+ return tfm->__crt_alg->cra_flags & CRYPTO_ALG_REQ_CHAIN;
+}
+
#endif /* _CRYPTO_ALGAPI_H */
diff --git a/include/crypto/hash.h b/include/crypto/hash.h
index 9c1f8ca59a77..0a6f744ce4a1 100644
--- a/include/crypto/hash.h
+++ b/include/crypto/hash.h
@@ -572,16 +572,7 @@ static inline struct ahash_request *ahash_request_alloc_noprof(
* ahash_request_free() - zeroize and free the request data structure
* @req: request data structure cipher handle to be freed
*/
-static inline void ahash_request_free(struct ahash_request *req)
-{
- kfree_sensitive(req);
-}
-
-static inline void ahash_request_zero(struct ahash_request *req)
-{
- memzero_explicit(req, sizeof(*req) +
- crypto_ahash_reqsize(crypto_ahash_reqtfm(req)));
-}
+void ahash_request_free(struct ahash_request *req);
static inline struct ahash_request *ahash_request_cast(
struct crypto_async_request *req)
@@ -622,6 +613,7 @@ static inline void ahash_request_set_callback(struct ahash_request *req,
req->base.complete = compl;
req->base.data = data;
req->base.flags = flags;
+ crypto_reqchain_init(&req->base);
}
/**
@@ -646,6 +638,12 @@ static inline void ahash_request_set_crypt(struct ahash_request *req,
req->result = result;
}
+static inline void ahash_request_chain(struct ahash_request *req,
+ struct ahash_request *head)
+{
+ crypto_request_chain(&req->base, &head->base);
+}
+
/**
* DOC: Synchronous Message Digest API
*
@@ -947,4 +945,14 @@ static inline void shash_desc_zero(struct shash_desc *desc)
sizeof(*desc) + crypto_shash_descsize(desc->tfm));
}
+static inline int ahash_request_err(struct ahash_request *req)
+{
+ return req->base.err;
+}
+
+static inline bool ahash_is_async(struct crypto_ahash *tfm)
+{
+ return crypto_tfm_is_async(&tfm->base);
+}
+
#endif /* _CRYPTO_HASH_H */
diff --git a/include/crypto/internal/hash.h b/include/crypto/internal/hash.h
index 58967593b6b4..81542a48587e 100644
--- a/include/crypto/internal/hash.h
+++ b/include/crypto/internal/hash.h
@@ -270,5 +270,15 @@ static inline struct crypto_shash *__crypto_shash_cast(struct crypto_tfm *tfm)
return container_of(tfm, struct crypto_shash, base);
}
+static inline bool ahash_request_chained(struct ahash_request *req)
+{
+ return crypto_request_chained(&req->base);
+}
+
+static inline bool crypto_ahash_req_chain(struct crypto_ahash *tfm)
+{
+ return crypto_tfm_req_chain(&tfm->base);
+}
+
#endif /* _CRYPTO_INTERNAL_HASH_H */
diff --git a/include/linux/crypto.h b/include/linux/crypto.h
index b164da5e129e..1d2a6c515d58 100644
--- a/include/linux/crypto.h
+++ b/include/linux/crypto.h
@@ -13,6 +13,8 @@
#define _LINUX_CRYPTO_H
#include <linux/completion.h>
+#include <linux/errno.h>
+#include <linux/list.h>
#include <linux/refcount.h>
#include <linux/slab.h>
#include <linux/types.h>
@@ -124,6 +126,9 @@
*/
#define CRYPTO_ALG_FIPS_INTERNAL 0x00020000
+/* Set if the algorithm supports request chains. */
+#define CRYPTO_ALG_REQ_CHAIN 0x00040000
+
/*
* Transform masks and values (for crt_flags).
*/
@@ -174,6 +179,7 @@ struct crypto_async_request {
struct crypto_tfm *tfm;
u32 flags;
+ int err;
};
/**
@@ -540,5 +546,23 @@ int crypto_comp_decompress(struct crypto_comp *tfm,
const u8 *src, unsigned int slen,
u8 *dst, unsigned int *dlen);
+static inline void crypto_reqchain_init(struct crypto_async_request *req)
+{
+ req->err = -EINPROGRESS;
+ INIT_LIST_HEAD(&req->list);
+}
+
+static inline void crypto_request_chain(struct crypto_async_request *req,
+ struct crypto_async_request *head)
+{
+ req->err = -EINPROGRESS;
+ list_add_tail(&req->list, &head->list);
+}
+
+static inline bool crypto_tfm_is_async(struct crypto_tfm *tfm)
+{
+ return tfm->__crt_alg->cra_flags & CRYPTO_ALG_ASYNC;
+}
+
#endif /* _LINUX_CRYPTO_H */
--
2.39.5
* [v2 PATCH 04/11] crypto: tcrypt - Restore multibuffer ahash tests
From: Herbert Xu @ 2025-02-16 3:07 UTC
To: Linux Crypto Mailing List
Cc: Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
This patch is a revert of commit 388ac25efc8ce3bf9768ce7bf24268d6fac285d5.
As multibuffer ahash is coming back in the form of request chaining,
restore the multibuffer ahash tests using the new interface.
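With the tests restored, they can be driven in the usual tcrypt way,
e.g. "modprobe tcrypt mode=451 num_mb=8" for the sha256 case added
below (num_mb being the pre-existing tcrypt module parameter that
sets the number of chained requests).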
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
crypto/tcrypt.c | 231 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 231 insertions(+)
diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index e1a74cb2cfbe..f618f61c5615 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -716,6 +716,207 @@ static inline int do_one_ahash_op(struct ahash_request *req, int ret)
return crypto_wait_req(ret, wait);
}
+struct test_mb_ahash_data {
+ struct scatterlist sg[XBUFSIZE];
+ char result[64];
+ struct ahash_request *req;
+ struct crypto_wait wait;
+ char *xbuf[XBUFSIZE];
+};
+
+static inline int do_mult_ahash_op(struct test_mb_ahash_data *data, u32 num_mb,
+ int *rc)
+{
+ int i, err;
+
+ /* Fire up a bunch of concurrent requests */
+ err = crypto_ahash_digest(data[0].req);
+
+ /* Wait for all requests to finish */
+ err = crypto_wait_req(err, &data[0].wait);
+ if (num_mb < 2)
+ return err;
+
+ for (i = 0; i < num_mb; i++) {
+ rc[i] = ahash_request_err(data[i].req);
+ if (rc[i]) {
+ pr_info("concurrent request %d error %d\n", i, rc[i]);
+ err = rc[i];
+ }
+ }
+
+ return err;
+}
+
+static int test_mb_ahash_jiffies(struct test_mb_ahash_data *data, int blen,
+ int secs, u32 num_mb)
+{
+ unsigned long start, end;
+ int bcount;
+ int ret = 0;
+ int *rc;
+
+ rc = kcalloc(num_mb, sizeof(*rc), GFP_KERNEL);
+ if (!rc)
+ return -ENOMEM;
+
+ for (start = jiffies, end = start + secs * HZ, bcount = 0;
+ time_before(jiffies, end); bcount++) {
+ ret = do_mult_ahash_op(data, num_mb, rc);
+ if (ret)
+ goto out;
+ }
+
+ pr_cont("%d operations in %d seconds (%llu bytes)\n",
+ bcount * num_mb, secs, (u64)bcount * blen * num_mb);
+
+out:
+ kfree(rc);
+ return ret;
+}
+
+static int test_mb_ahash_cycles(struct test_mb_ahash_data *data, int blen,
+ u32 num_mb)
+{
+ unsigned long cycles = 0;
+ int ret = 0;
+ int i;
+ int *rc;
+
+ rc = kcalloc(num_mb, sizeof(*rc), GFP_KERNEL);
+ if (!rc)
+ return -ENOMEM;
+
+ /* Warm-up run. */
+ for (i = 0; i < 4; i++) {
+ ret = do_mult_ahash_op(data, num_mb, rc);
+ if (ret)
+ goto out;
+ }
+
+ /* The real thing. */
+ for (i = 0; i < 8; i++) {
+ cycles_t start, end;
+
+ start = get_cycles();
+ ret = do_mult_ahash_op(data, num_mb, rc);
+ end = get_cycles();
+
+ if (ret)
+ goto out;
+
+ cycles += end - start;
+ }
+
+ pr_cont("1 operation in %lu cycles (%d bytes)\n",
+ (cycles + 4) / (8 * num_mb), blen);
+
+out:
+ kfree(rc);
+ return ret;
+}
+
+static void test_mb_ahash_speed(const char *algo, unsigned int secs,
+ struct hash_speed *speed, u32 num_mb)
+{
+ struct test_mb_ahash_data *data;
+ struct crypto_ahash *tfm;
+ unsigned int i, j, k;
+ int ret;
+
+ data = kcalloc(num_mb, sizeof(*data), GFP_KERNEL);
+ if (!data)
+ return;
+
+ tfm = crypto_alloc_ahash(algo, 0, 0);
+ if (IS_ERR(tfm)) {
+ pr_err("failed to load transform for %s: %ld\n",
+ algo, PTR_ERR(tfm));
+ goto free_data;
+ }
+
+ for (i = 0; i < num_mb; ++i) {
+ if (testmgr_alloc_buf(data[i].xbuf))
+ goto out;
+
+ crypto_init_wait(&data[i].wait);
+
+ data[i].req = ahash_request_alloc(tfm, GFP_KERNEL);
+ if (!data[i].req) {
+ pr_err("alg: hash: Failed to allocate request for %s\n",
+ algo);
+ goto out;
+ }
+
+
+ if (i) {
+ ahash_request_set_callback(data[i].req, 0, NULL, NULL);
+ ahash_request_chain(data[i].req, data[0].req);
+ } else
+ ahash_request_set_callback(data[0].req, 0,
+ crypto_req_done,
+ &data[0].wait);
+
+ sg_init_table(data[i].sg, XBUFSIZE);
+ for (j = 0; j < XBUFSIZE; j++) {
+ sg_set_buf(data[i].sg + j, data[i].xbuf[j], PAGE_SIZE);
+ memset(data[i].xbuf[j], 0xff, PAGE_SIZE);
+ }
+ }
+
+ pr_info("\ntesting speed of multibuffer %s (%s)\n", algo,
+ get_driver_name(crypto_ahash, tfm));
+
+ for (i = 0; speed[i].blen != 0; i++) {
+ /* For some reason this only tests digests. */
+ if (speed[i].blen != speed[i].plen)
+ continue;
+
+ if (speed[i].blen > XBUFSIZE * PAGE_SIZE) {
+ pr_err("template (%u) too big for tvmem (%lu)\n",
+ speed[i].blen, XBUFSIZE * PAGE_SIZE);
+ goto out;
+ }
+
+ if (klen)
+ crypto_ahash_setkey(tfm, tvmem[0], klen);
+
+ for (k = 0; k < num_mb; k++)
+ ahash_request_set_crypt(data[k].req, data[k].sg,
+ data[k].result, speed[i].blen);
+
+ pr_info("test%3u "
+ "(%5u byte blocks,%5u bytes per update,%4u updates): ",
+ i, speed[i].blen, speed[i].plen,
+ speed[i].blen / speed[i].plen);
+
+ if (secs) {
+ ret = test_mb_ahash_jiffies(data, speed[i].blen, secs,
+ num_mb);
+ cond_resched();
+ } else {
+ ret = test_mb_ahash_cycles(data, speed[i].blen, num_mb);
+ }
+
+
+ if (ret) {
+ pr_err("At least one hashing failed ret=%d\n", ret);
+ break;
+ }
+ }
+
+out:
+ ahash_request_free(data[0].req);
+
+ for (k = 0; k < num_mb; ++k)
+ testmgr_free_buf(data[k].xbuf);
+
+ crypto_free_ahash(tfm);
+
+free_data:
+ kfree(data);
+}
+
static int test_ahash_jiffies_digest(struct ahash_request *req, int blen,
char *out, int secs)
{
@@ -2391,6 +2592,36 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
test_ahash_speed("sm3", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
+ case 450:
+ test_mb_ahash_speed("sha1", sec, generic_hash_speed_template,
+ num_mb);
+ if (mode > 400 && mode < 500) break;
+ fallthrough;
+ case 451:
+ test_mb_ahash_speed("sha256", sec, generic_hash_speed_template,
+ num_mb);
+ if (mode > 400 && mode < 500) break;
+ fallthrough;
+ case 452:
+ test_mb_ahash_speed("sha512", sec, generic_hash_speed_template,
+ num_mb);
+ if (mode > 400 && mode < 500) break;
+ fallthrough;
+ case 453:
+ test_mb_ahash_speed("sm3", sec, generic_hash_speed_template,
+ num_mb);
+ if (mode > 400 && mode < 500) break;
+ fallthrough;
+ case 454:
+ test_mb_ahash_speed("streebog256", sec,
+ generic_hash_speed_template, num_mb);
+ if (mode > 400 && mode < 500) break;
+ fallthrough;
+ case 455:
+ test_mb_ahash_speed("streebog512", sec,
+ generic_hash_speed_template, num_mb);
+ if (mode > 400 && mode < 500) break;
+ fallthrough;
case 499:
break;
--
2.39.5
* [v2 PATCH 05/11] crypto: ahash - Add virtual address support
From: Herbert Xu @ 2025-02-16 3:07 UTC
To: Linux Crypto Mailing List
Cc: Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
This patch adds virtual address support to ahash. Virtual addresses
were previously only supported through shash. The user may choose
to use virtual addresses with ahash by calling ahash_request_set_virt
instead of ahash_request_set_crypt.
The API will take care of translating this to an SG list if necessary,
unless the algorithm declares that it supports chaining. Therefore
in order for an ahash algorithm to support chaining, it must also
support virtual addresses directly.
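For example, hashing a kernel virtual buffer no longer requires
building a scatterlist (a sketch; buf, buflen and the wait setup are
illustrative):

    u8 digest[SHA256_DIGEST_SIZE];

    ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP,
                               crypto_req_done, &wait);
    /* Point the request straight at the buffer; the API falls back
     * to copying through a bounce page for drivers that cannot take
     * virtual addresses.
     */
    ahash_request_set_virt(req, buf, digest, buflen);
    err = crypto_wait_req(crypto_ahash_digest(req), &wait);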
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
crypto/ahash.c | 280 +++++++++++++++++++++++++++++----
include/crypto/hash.h | 38 ++++-
include/crypto/internal/hash.h | 7 +-
include/linux/crypto.h | 2 +-
4 files changed, 293 insertions(+), 34 deletions(-)
diff --git a/crypto/ahash.c b/crypto/ahash.c
index 0546835f7304..40ccaf4c0cd6 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -34,11 +34,17 @@ struct ahash_save_req_state {
int (*op)(struct ahash_request *req);
crypto_completion_t compl;
void *data;
+ struct scatterlist sg;
+ const u8 *src;
+ u8 *page;
+ unsigned int offset;
+ unsigned int nbytes;
};
static void ahash_reqchain_done(void *data, int err);
static int ahash_save_req(struct ahash_request *req, crypto_completion_t cplt);
-static void ahash_restore_req(struct ahash_request *req);
+static void ahash_restore_req(struct ahash_save_req_state *state);
+static void ahash_def_finup_done1(void *data, int err);
static int ahash_def_finup(struct ahash_request *req);
/*
@@ -100,6 +106,10 @@ int shash_ahash_digest(struct ahash_request *req, struct shash_desc *desc)
unsigned int offset;
int err;
+ if (ahash_request_isvirt(req))
+ return crypto_shash_digest(desc, req->svirt, nbytes,
+ req->result);
+
if (nbytes &&
(sg = req->src, offset = sg->offset,
nbytes <= min(sg->length, ((unsigned int)(PAGE_SIZE)) - offset))) {
@@ -182,6 +192,9 @@ static int hash_walk_new_entry(struct crypto_hash_walk *walk)
int crypto_hash_walk_done(struct crypto_hash_walk *walk, int err)
{
+ if ((walk->flags & CRYPTO_AHASH_REQ_VIRT))
+ return err;
+
walk->data -= walk->offset;
kunmap_local(walk->data);
@@ -209,14 +222,20 @@ int crypto_hash_walk_first(struct ahash_request *req,
struct crypto_hash_walk *walk)
{
walk->total = req->nbytes;
+ walk->entrylen = 0;
- if (!walk->total) {
- walk->entrylen = 0;
+ if (!walk->total)
return 0;
+
+ walk->flags = req->base.flags;
+
+ if (ahash_request_isvirt(req)) {
+ walk->data = req->svirt;
+ walk->total = 0;
+ return req->nbytes;
}
walk->sg = req->src;
- walk->flags = req->base.flags;
return hash_walk_new_entry(walk);
}
@@ -264,18 +283,82 @@ int crypto_ahash_setkey(struct crypto_ahash *tfm, const u8 *key,
}
EXPORT_SYMBOL_GPL(crypto_ahash_setkey);
+static bool ahash_request_hasvirt(struct ahash_request *req)
+{
+ struct ahash_request *r2;
+
+ if (ahash_request_isvirt(req))
+ return true;
+
+ list_for_each_entry(r2, &req->base.list, base.list)
+ if (ahash_request_isvirt(r2))
+ return true;
+
+ return false;
+}
+
+static int ahash_reqchain_virt(struct ahash_save_req_state *state,
+ int err, u32 mask)
+{
+ struct ahash_request *req = state->cur;
+
+ for (;;) {
+ unsigned len = state->nbytes;
+
+ req->base.err = err;
+
+ if (!state->offset)
+ break;
+
+ if (state->offset == len || err) {
+ u8 *result = req->result;
+
+ ahash_request_set_virt(req, state->src, result, len);
+ state->offset = 0;
+ break;
+ }
+
+ len -= state->offset;
+
+ len = min(PAGE_SIZE, len);
+ memcpy(state->page, state->src + state->offset, len);
+ state->offset += len;
+ req->nbytes = len;
+
+ err = state->op(req);
+ if (err == -EINPROGRESS) {
+ if (!list_empty(&state->head) ||
+ state->offset < state->nbytes)
+ err = -EBUSY;
+ break;
+ }
+
+ if (err == -EBUSY)
+ break;
+ }
+
+ return err;
+}
+
static int ahash_reqchain_finish(struct ahash_save_req_state *state,
int err, u32 mask)
{
struct ahash_request *req0 = state->req0;
struct ahash_request *req = state->cur;
+ struct crypto_ahash *tfm;
struct ahash_request *n;
+ bool update;
- req->base.err = err;
+ err = ahash_reqchain_virt(state, err, mask);
+ if (err == -EINPROGRESS || err == -EBUSY)
+ goto out;
if (req != req0)
list_add_tail(&req->base.list, &req0->base.list);
+ tfm = crypto_ahash_reqtfm(req);
+ update = state->op == crypto_ahash_alg(tfm)->update;
+
list_for_each_entry_safe(req, n, &state->head, base.list) {
list_del_init(&req->base.list);
@@ -283,10 +366,27 @@ static int ahash_reqchain_finish(struct ahash_save_req_state *state,
req->base.complete = ahash_reqchain_done;
req->base.data = state;
state->cur = req;
+
+ if (update && ahash_request_isvirt(req) && req->nbytes) {
+ unsigned len = req->nbytes;
+ u8 *result = req->result;
+
+ state->src = req->svirt;
+ state->nbytes = len;
+
+ len = min(PAGE_SIZE, len);
+
+ memcpy(state->page, req->svirt, len);
+ state->offset = len;
+
+ ahash_request_set_crypt(req, &state->sg, result, len);
+ }
+
err = state->op(req);
if (err == -EINPROGRESS) {
- if (!list_empty(&state->head))
+ if (!list_empty(&state->head) ||
+ state->offset < state->nbytes)
err = -EBUSY;
goto out;
}
@@ -294,11 +394,14 @@ static int ahash_reqchain_finish(struct ahash_save_req_state *state,
if (err == -EBUSY)
goto out;
- req->base.err = err;
+ err = ahash_reqchain_virt(state, err, mask);
+ if (err == -EINPROGRESS || err == -EBUSY)
+ goto out;
+
list_add_tail(&req->base.list, &req0->base.list);
}
- ahash_restore_req(req0);
+ ahash_restore_req(state);
out:
return err;
@@ -312,7 +415,7 @@ static void ahash_reqchain_done(void *data, int err)
data = state->data;
if (err == -EINPROGRESS) {
- if (!list_empty(&state->head))
+ if (!list_empty(&state->head) || state->offset < state->nbytes)
return;
goto notify;
}
@@ -329,40 +432,84 @@ static int ahash_do_req_chain(struct ahash_request *req,
int (*op)(struct ahash_request *req))
{
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+ bool update = op == crypto_ahash_alg(tfm)->update;
struct ahash_save_req_state *state;
struct ahash_save_req_state state0;
+ struct ahash_request *r2;
+ u8 *page = NULL;
int err;
- if (!ahash_request_chained(req) || crypto_ahash_req_chain(tfm))
+ if (crypto_ahash_req_chain(tfm) ||
+ (!ahash_request_chained(req) &&
+ (!update || !ahash_request_isvirt(req))))
return op(req);
- state = &state0;
+ if (update && ahash_request_hasvirt(req)) {
+ gfp_t gfp;
+ u32 flags;
+ flags = ahash_request_flags(req);
+ gfp = (flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
+ GFP_KERNEL : GFP_ATOMIC;
+ page = (void *)__get_free_page(gfp);
+ err = -ENOMEM;
+ if (!page)
+ goto out_set_chain;
+ }
+
+ state = &state0;
if (ahash_is_async(tfm)) {
err = ahash_save_req(req, ahash_reqchain_done);
- if (err) {
- struct ahash_request *r2;
-
- req->base.err = err;
- list_for_each_entry(r2, &req->base.list, base.list)
- r2->base.err = err;
-
- return err;
- }
+ if (err)
+ goto out_free_page;
state = req->base.data;
}
state->op = op;
state->cur = req;
+ state->page = page;
+ state->offset = 0;
+ state->nbytes = 0;
INIT_LIST_HEAD(&state->head);
list_splice_init(&req->base.list, &state->head);
+ if (page)
+ sg_init_one(&state->sg, page, PAGE_SIZE);
+
+ if (update && ahash_request_isvirt(req) && req->nbytes) {
+ unsigned len = req->nbytes;
+ u8 *result = req->result;
+
+ state->src = req->svirt;
+ state->nbytes = len;
+
+ len = min(PAGE_SIZE, len);
+
+ memcpy(page, req->svirt, len);
+ state->offset = len;
+
+ ahash_request_set_crypt(req, &state->sg, result, len);
+ }
+
err = op(req);
if (err == -EBUSY || err == -EINPROGRESS)
return -EBUSY;
return ahash_reqchain_finish(state, err, ~0);
+
+out_free_page:
+ if (page) {
+ memset(page, 0, PAGE_SIZE);
+ free_page((unsigned long)page);
+ }
+
+out_set_chain:
+ req->base.err = err;
+ list_for_each_entry(r2, &req->base.list, base.list)
+ r2->base.err = err;
+
+ return err;
}
int crypto_ahash_init(struct ahash_request *req)
@@ -414,15 +561,19 @@ static int ahash_save_req(struct ahash_request *req, crypto_completion_t cplt)
req->base.complete = cplt;
req->base.data = state;
state->req0 = req;
+ state->page = NULL;
return 0;
}
-static void ahash_restore_req(struct ahash_request *req)
+static void ahash_restore_req(struct ahash_save_req_state *state)
{
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
- struct ahash_save_req_state *state;
+ struct ahash_request *req = state->req0;
+ struct crypto_ahash *tfm;
+ free_page((unsigned long)state->page);
+
+ tfm = crypto_ahash_reqtfm(req);
if (!ahash_is_async(tfm))
return;
@@ -504,13 +655,74 @@ int crypto_ahash_finup(struct ahash_request *req)
return err;
}
- if (!crypto_ahash_alg(tfm)->finup)
+ if (!crypto_ahash_alg(tfm)->finup ||
+ (!crypto_ahash_req_chain(tfm) && ahash_request_hasvirt(req)))
return ahash_def_finup(req);
return ahash_do_req_chain(req, crypto_ahash_alg(tfm)->finup);
}
EXPORT_SYMBOL_GPL(crypto_ahash_finup);
+static int ahash_def_digest_finish(struct ahash_save_req_state *state, int err)
+{
+ struct ahash_request *req = state->req0;
+ struct crypto_ahash *tfm;
+
+ if (err)
+ goto out;
+
+ tfm = crypto_ahash_reqtfm(req);
+ if (ahash_is_async(tfm))
+ req->base.complete = ahash_def_finup_done1;
+
+ err = crypto_ahash_update(req);
+ if (err == -EINPROGRESS || err == -EBUSY)
+ return err;
+
+out:
+ ahash_restore_req(state);
+ return err;
+}
+
+static void ahash_def_digest_done(void *data, int err)
+{
+ struct ahash_save_req_state *state0 = data;
+ struct ahash_save_req_state state;
+ struct ahash_request *areq;
+
+ state = *state0;
+ areq = state.req0;
+ if (err == -EINPROGRESS)
+ goto out;
+
+ areq->base.flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
+
+ err = ahash_def_digest_finish(state0, err);
+ if (err == -EINPROGRESS || err == -EBUSY)
+ return;
+
+out:
+ state.compl(state.data, err);
+}
+
+static int ahash_def_digest(struct ahash_request *req)
+{
+ struct ahash_save_req_state *state;
+ int err;
+
+ err = ahash_save_req(req, ahash_def_digest_done);
+ if (err)
+ return err;
+
+ state = req->base.data;
+
+ err = crypto_ahash_init(req);
+ if (err == -EINPROGRESS || err == -EBUSY)
+ return err;
+
+ return ahash_def_digest_finish(state, err);
+}
+
int crypto_ahash_digest(struct ahash_request *req)
{
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
@@ -532,6 +744,9 @@ int crypto_ahash_digest(struct ahash_request *req)
return err;
}
+ if (!crypto_ahash_req_chain(tfm) && ahash_request_hasvirt(req))
+ return ahash_def_digest(req);
+
if (crypto_ahash_get_flags(tfm) & CRYPTO_TFM_NEED_KEY)
return -ENOKEY;
@@ -547,17 +762,19 @@ static void ahash_def_finup_done2(void *data, int err)
if (err == -EINPROGRESS)
return;
- ahash_restore_req(areq);
+ ahash_restore_req(state);
ahash_request_complete(areq, err);
}
-static int ahash_def_finup_finish1(struct ahash_request *req, int err)
+static int ahash_def_finup_finish1(struct ahash_save_req_state *state, int err)
{
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+ struct ahash_request *req = state->req0;
+ struct crypto_ahash *tfm;
if (err)
goto out;
+ tfm = crypto_ahash_reqtfm(req);
if (ahash_is_async(tfm))
req->base.complete = ahash_def_finup_done2;
@@ -566,7 +783,7 @@ static int ahash_def_finup_finish1(struct ahash_request *req, int err)
return err;
out:
- ahash_restore_req(req);
+ ahash_restore_req(state);
return err;
}
@@ -583,7 +800,7 @@ static void ahash_def_finup_done1(void *data, int err)
areq->base.flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
- err = ahash_def_finup_finish1(areq, err);
+ err = ahash_def_finup_finish1(state0, err);
if (err == -EINPROGRESS || err == -EBUSY)
return;
@@ -593,17 +810,20 @@ static void ahash_def_finup_done1(void *data, int err)
static int ahash_def_finup(struct ahash_request *req)
{
+ struct ahash_save_req_state *state;
int err;
err = ahash_save_req(req, ahash_def_finup_done1);
if (err)
return err;
+ state = req->base.data;
+
err = crypto_ahash_update(req);
if (err == -EINPROGRESS || err == -EBUSY)
return err;
- return ahash_def_finup_finish1(req, err);
+ return ahash_def_finup_finish1(state, err);
}
int crypto_ahash_export(struct ahash_request *req, void *out)
diff --git a/include/crypto/hash.h b/include/crypto/hash.h
index 0a6f744ce4a1..4e87e39679cb 100644
--- a/include/crypto/hash.h
+++ b/include/crypto/hash.h
@@ -12,6 +12,9 @@
#include <linux/crypto.h>
#include <linux/string.h>
+/* Set this bit for virtual address instead of SG list. */
+#define CRYPTO_AHASH_REQ_VIRT 0x00000001
+
struct crypto_ahash;
/**
@@ -52,7 +55,10 @@ struct ahash_request {
struct crypto_async_request base;
unsigned int nbytes;
- struct scatterlist *src;
+ union {
+ struct scatterlist *src;
+ const u8 *svirt;
+ };
u8 *result;
void *__ctx[] CRYPTO_MINALIGN_ATTR;
@@ -610,9 +616,13 @@ static inline void ahash_request_set_callback(struct ahash_request *req,
crypto_completion_t compl,
void *data)
{
+ u32 keep = CRYPTO_AHASH_REQ_VIRT;
+
req->base.complete = compl;
req->base.data = data;
- req->base.flags = flags;
+ flags &= ~keep;
+ req->base.flags &= keep;
+ req->base.flags |= flags;
crypto_reqchain_init(&req->base);
}
@@ -636,6 +646,30 @@ static inline void ahash_request_set_crypt(struct ahash_request *req,
req->src = src;
req->nbytes = nbytes;
req->result = result;
+ req->base.flags &= ~CRYPTO_AHASH_REQ_VIRT;
+}
+
+/**
+ * ahash_request_set_virt() - set virtual address data buffers
+ * @req: ahash_request handle to be updated
+ * @src: source virtual address
+ * @result: buffer that is filled with the message digest -- the caller must
+ * ensure that the buffer has sufficient space by, for example, calling
+ * crypto_ahash_digestsize()
+ * @nbytes: number of bytes to process from the source virtual address
+ *
+ * By using this call, the caller references the source virtual address.
+ * The source virtual address points to the data the message digest is to
+ * be calculated for.
+ */
+static inline void ahash_request_set_virt(struct ahash_request *req,
+ const u8 *src, u8 *result,
+ unsigned int nbytes)
+{
+ req->svirt = src;
+ req->nbytes = nbytes;
+ req->result = result;
+ req->base.flags |= CRYPTO_AHASH_REQ_VIRT;
}
static inline void ahash_request_chain(struct ahash_request *req,
diff --git a/include/crypto/internal/hash.h b/include/crypto/internal/hash.h
index 81542a48587e..195d6aeeede3 100644
--- a/include/crypto/internal/hash.h
+++ b/include/crypto/internal/hash.h
@@ -15,7 +15,7 @@ struct ahash_request;
struct scatterlist;
struct crypto_hash_walk {
- char *data;
+ const char *data;
unsigned int offset;
unsigned int flags;
@@ -275,6 +275,11 @@ static inline bool ahash_request_chained(struct ahash_request *req)
return crypto_request_chained(&req->base);
}
+static inline bool ahash_request_isvirt(struct ahash_request *req)
+{
+ return req->base.flags & CRYPTO_AHASH_REQ_VIRT;
+}
+
static inline bool crypto_ahash_req_chain(struct crypto_ahash *tfm)
{
return crypto_tfm_req_chain(&tfm->base);
diff --git a/include/linux/crypto.h b/include/linux/crypto.h
index 1d2a6c515d58..61ac11226638 100644
--- a/include/linux/crypto.h
+++ b/include/linux/crypto.h
@@ -126,7 +126,7 @@
*/
#define CRYPTO_ALG_FIPS_INTERNAL 0x00020000
-/* Set if the algorithm supports request chains. */
+/* Set if the algorithm supports request chains and virtual addresses. */
#define CRYPTO_ALG_REQ_CHAIN 0x00040000
/*
--
2.39.5
* [v2 PATCH 06/11] crypto: ahash - Set default reqsize from ahash_alg
From: Herbert Xu @ 2025-02-16 3:07 UTC
To: Linux Crypto Mailing List
Cc: Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
Add a reqsize field to struct ahash_alg and use it to set the
default reqsize so that algorithms with a static reqsize are
not forced to create an init_tfm function.
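As a sketch (the example_* names are invented), an algorithm with a
static request context can now declare its size directly:

    static struct ahash_alg example_alg = {
        .init    = example_init,
        .update  = example_update,
        .finup   = example_finup,
        .digest  = example_digest,
        /* Set once here instead of calling crypto_ahash_set_reqsize()
         * from an init_tfm hook.  Note the new check below: a
         * non-zero reqsize must be at least halg.statesize.
         */
        .reqsize = sizeof(struct example_reqctx),
        .halg = {
            .digestsize = SHA256_DIGEST_SIZE,
            .statesize  = sizeof(struct sha256_state),
        },
    };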
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
crypto/ahash.c | 4 ++++
include/crypto/hash.h | 3 +++
2 files changed, 7 insertions(+)
diff --git a/crypto/ahash.c b/crypto/ahash.c
index 40ccaf4c0cd6..6b19fa6fc628 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -862,6 +862,7 @@ static int crypto_ahash_init_tfm(struct crypto_tfm *tfm)
struct ahash_alg *alg = crypto_ahash_alg(hash);
crypto_ahash_set_statesize(hash, alg->halg.statesize);
+ crypto_ahash_set_reqsize(hash, alg->reqsize);
if (tfm->__crt_alg->cra_type == &crypto_shash_type)
return crypto_init_ahash_using_shash(tfm);
@@ -1027,6 +1028,9 @@ static int ahash_prepare_alg(struct ahash_alg *alg)
if (alg->halg.statesize == 0)
return -EINVAL;
+ if (alg->reqsize && alg->reqsize < alg->halg.statesize)
+ return -EINVAL;
+
err = hash_prepare_alg(&alg->halg);
if (err)
return err;
diff --git a/include/crypto/hash.h b/include/crypto/hash.h
index 4e87e39679cb..2aa83ee0ec98 100644
--- a/include/crypto/hash.h
+++ b/include/crypto/hash.h
@@ -135,6 +135,7 @@ struct ahash_request {
* This is a counterpart to @init_tfm, used to remove
* various changes set in @init_tfm.
* @clone_tfm: Copy transform into new object, may allocate memory.
+ * @reqsize: Size of the request context.
* @halg: see struct hash_alg_common
*/
struct ahash_alg {
@@ -151,6 +152,8 @@ struct ahash_alg {
void (*exit_tfm)(struct crypto_ahash *tfm);
int (*clone_tfm)(struct crypto_ahash *dst, struct crypto_ahash *src);
+ unsigned int reqsize;
+
struct hash_alg_common halg;
};
--
2.39.5
* [v2 PATCH 07/11] crypto: testmgr - Add multibuffer hash testing
From: Herbert Xu @ 2025-02-16 3:07 UTC
To: Linux Crypto Mailing List
Cc: Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
This is based on a patch by Eric Biggers <ebiggers@google.com>.

Add a limited self-test for the multibuffer hash code path. This
tests only a single request in a chain of random length. The other
requests are all of the same length as the one being tested.

Potential extensions include testing all requests rather than just
a single one, and varying the length of each request.
Link: https://lore.kernel.org/all/20241001153718.111665-3-ebiggers@kernel.org/
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
crypto/testmgr.c | 132 ++++++++++++++++++++++++++++++++++++-----------
1 file changed, 103 insertions(+), 29 deletions(-)
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index b69877db3f33..9717b5c0f3c6 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -58,6 +58,9 @@ module_param(fuzz_iterations, uint, 0644);
MODULE_PARM_DESC(fuzz_iterations, "number of fuzz test iterations");
#endif
+/* Multibuffer hashing is unlimited. Set arbitrary limit for testing. */
+#define HASH_TEST_MAX_MB_MSGS 16
+
#ifdef CONFIG_CRYPTO_MANAGER_DISABLE_TESTS
/* a perfect nop */
@@ -299,6 +302,11 @@ struct test_sg_division {
* @key_offset_relative_to_alignmask: if true, add the algorithm's alignmask to
* the @key_offset
* @finalization_type: what finalization function to use for hashes
+ * @multibuffer: test with multibuffer
+ * @multibuffer_index: random number used to generate the message index to use
+ * for multibuffer.
+ * @multibuffer_count: random number used to generate the num_msgs parameter
+ * for multibuffer
* @nosimd: execute with SIMD disabled? Requires !CRYPTO_TFM_REQ_MAY_SLEEP.
* This applies to the parts of the operation that aren't controlled
* individually by @nosimd_setkey or @src_divs[].nosimd.
@@ -318,6 +326,9 @@ struct testvec_config {
enum finalization_type finalization_type;
bool nosimd;
bool nosimd_setkey;
+ bool multibuffer;
+ unsigned int multibuffer_index;
+ unsigned int multibuffer_count;
};
#define TESTVEC_CONFIG_NAMELEN 192
@@ -1146,6 +1157,13 @@ static void generate_random_testvec_config(struct rnd_state *rng,
break;
}
+ if (prandom_bool(rng)) {
+ cfg->multibuffer = true;
+ cfg->multibuffer_index = prandom_u32_state(rng);
+ cfg->multibuffer_count = prandom_u32_state(rng);
+ p += scnprintf(p, end - p, " multibuffer");
+ }
+
if (!(cfg->req_flags & CRYPTO_TFM_REQ_MAY_SLEEP)) {
if (prandom_bool(rng)) {
cfg->nosimd = true;
@@ -1446,16 +1464,61 @@ static int test_shash_vec_cfg(const struct hash_testvec *vec,
driver, cfg);
}
-static int do_ahash_op(int (*op)(struct ahash_request *req),
- struct ahash_request *req,
- struct crypto_wait *wait, bool nosimd)
+static int do_ahash_op_multibuffer(
+ int (*op)(struct ahash_request *req),
+ struct ahash_request *reqs[HASH_TEST_MAX_MB_MSGS],
+ struct crypto_wait *wait,
+ const struct testvec_config *cfg)
{
+ struct ahash_request *req = reqs[0];
+ u8 trash[HASH_MAX_DIGESTSIZE];
+ unsigned int num_msgs;
+ unsigned int msg_idx;
+ int err;
+ int i;
+
+ num_msgs = 1 + (cfg->multibuffer_count % HASH_TEST_MAX_MB_MSGS);
+ if (num_msgs == 1)
+ return op(req);
+
+ msg_idx = cfg->multibuffer_index % num_msgs;
+ for (i = 1; i < num_msgs; i++) {
+ struct ahash_request *r2 = reqs[i];
+
+ ahash_request_set_callback(r2, req->base.flags, NULL, NULL);
+ ahash_request_set_crypt(r2, req->src, trash, req->nbytes);
+ ahash_request_chain(r2, req);
+ }
+
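+ /* Route the real result buffer to the randomly chosen request;
+ * every other chained request writes into the shared trash
+ * buffer. */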
+ if (msg_idx) {
+ reqs[msg_idx]->result = req->result;
+ req->result = trash;
+ }
+
+ err = op(req);
+
+ if (msg_idx)
+ req->result = reqs[msg_idx]->result;
+
+ return err;
+}
+
+static int do_ahash_op(int (*op)(struct ahash_request *req),
+ struct ahash_request *reqs[HASH_TEST_MAX_MB_MSGS],
+ struct crypto_wait *wait,
+ const struct testvec_config *cfg,
+ bool nosimd)
+{
+ struct ahash_request *req = reqs[0];
int err;
if (nosimd)
crypto_disable_simd_for_test();
- err = op(req);
+ if (cfg->multibuffer)
+ err = do_ahash_op_multibuffer(op, reqs, wait, cfg);
+ else
+ err = op(req);
if (nosimd)
crypto_reenable_simd_for_test();
@@ -1485,10 +1548,11 @@ static int check_nonfinal_ahash_op(const char *op, int err,
static int test_ahash_vec_cfg(const struct hash_testvec *vec,
const char *vec_name,
const struct testvec_config *cfg,
- struct ahash_request *req,
+ struct ahash_request *reqs[HASH_TEST_MAX_MB_MSGS],
struct test_sglist *tsgl,
u8 *hashstate)
{
+ struct ahash_request *req = reqs[0];
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
const unsigned int digestsize = crypto_ahash_digestsize(tfm);
const unsigned int statesize = crypto_ahash_statesize(tfm);
@@ -1540,7 +1604,7 @@ static int test_ahash_vec_cfg(const struct hash_testvec *vec,
ahash_request_set_callback(req, req_flags, crypto_req_done,
&wait);
ahash_request_set_crypt(req, tsgl->sgl, result, vec->psize);
- err = do_ahash_op(crypto_ahash_digest, req, &wait, cfg->nosimd);
+ err = do_ahash_op(crypto_ahash_digest, reqs, &wait, cfg, cfg->nosimd);
if (err) {
if (err == vec->digest_error)
return 0;
@@ -1561,7 +1625,7 @@ static int test_ahash_vec_cfg(const struct hash_testvec *vec,
ahash_request_set_callback(req, req_flags, crypto_req_done, &wait);
ahash_request_set_crypt(req, NULL, result, 0);
- err = do_ahash_op(crypto_ahash_init, req, &wait, cfg->nosimd);
+ err = do_ahash_op(crypto_ahash_init, reqs, &wait, cfg, cfg->nosimd);
err = check_nonfinal_ahash_op("init", err, result, digestsize,
driver, vec_name, cfg);
if (err)
@@ -1577,8 +1641,8 @@ static int test_ahash_vec_cfg(const struct hash_testvec *vec,
crypto_req_done, &wait);
ahash_request_set_crypt(req, pending_sgl, result,
pending_len);
- err = do_ahash_op(crypto_ahash_update, req, &wait,
- divs[i]->nosimd);
+ err = do_ahash_op(crypto_ahash_update, reqs, &wait,
+ cfg, divs[i]->nosimd);
err = check_nonfinal_ahash_op("update", err,
result, digestsize,
driver, vec_name, cfg);
@@ -1621,12 +1685,13 @@ static int test_ahash_vec_cfg(const struct hash_testvec *vec,
ahash_request_set_crypt(req, pending_sgl, result, pending_len);
if (cfg->finalization_type == FINALIZATION_TYPE_FINAL) {
/* finish with update() and final() */
- err = do_ahash_op(crypto_ahash_update, req, &wait, cfg->nosimd);
+ err = do_ahash_op(crypto_ahash_update, reqs, &wait, cfg, cfg->nosimd);
err = check_nonfinal_ahash_op("update", err, result, digestsize,
driver, vec_name, cfg);
if (err)
return err;
- err = do_ahash_op(crypto_ahash_final, req, &wait, cfg->nosimd);
+ ahash_request_set_callback(req, req_flags, crypto_req_done, &wait);
+ err = do_ahash_op(crypto_ahash_final, reqs, &wait, cfg, cfg->nosimd);
if (err) {
pr_err("alg: ahash: %s final() failed with err %d on test vector %s, cfg=\"%s\"\n",
driver, err, vec_name, cfg->name);
@@ -1634,7 +1699,7 @@ static int test_ahash_vec_cfg(const struct hash_testvec *vec,
}
} else {
/* finish with finup() */
- err = do_ahash_op(crypto_ahash_finup, req, &wait, cfg->nosimd);
+ err = do_ahash_op(crypto_ahash_finup, reqs, &wait, cfg, cfg->nosimd);
if (err) {
pr_err("alg: ahash: %s finup() failed with err %d on test vector %s, cfg=\"%s\"\n",
driver, err, vec_name, cfg->name);
@@ -1650,7 +1715,7 @@ static int test_ahash_vec_cfg(const struct hash_testvec *vec,
static int test_hash_vec_cfg(const struct hash_testvec *vec,
const char *vec_name,
const struct testvec_config *cfg,
- struct ahash_request *req,
+ struct ahash_request *reqs[HASH_TEST_MAX_MB_MSGS],
struct shash_desc *desc,
struct test_sglist *tsgl,
u8 *hashstate)
@@ -1670,11 +1735,12 @@ static int test_hash_vec_cfg(const struct hash_testvec *vec,
return err;
}
- return test_ahash_vec_cfg(vec, vec_name, cfg, req, tsgl, hashstate);
+ return test_ahash_vec_cfg(vec, vec_name, cfg, reqs, tsgl, hashstate);
}
static int test_hash_vec(const struct hash_testvec *vec, unsigned int vec_num,
- struct ahash_request *req, struct shash_desc *desc,
+ struct ahash_request *reqs[HASH_TEST_MAX_MB_MSGS],
+ struct shash_desc *desc,
struct test_sglist *tsgl, u8 *hashstate)
{
char vec_name[16];
@@ -1686,7 +1752,7 @@ static int test_hash_vec(const struct hash_testvec *vec, unsigned int vec_num,
for (i = 0; i < ARRAY_SIZE(default_hash_testvec_configs); i++) {
err = test_hash_vec_cfg(vec, vec_name,
&default_hash_testvec_configs[i],
- req, desc, tsgl, hashstate);
+ reqs, desc, tsgl, hashstate);
if (err)
return err;
}
@@ -1703,7 +1769,7 @@ static int test_hash_vec(const struct hash_testvec *vec, unsigned int vec_num,
generate_random_testvec_config(&rng, &cfg, cfgname,
sizeof(cfgname));
err = test_hash_vec_cfg(vec, vec_name, &cfg,
- req, desc, tsgl, hashstate);
+ reqs, desc, tsgl, hashstate);
if (err)
return err;
cond_resched();
@@ -1762,11 +1828,12 @@ static void generate_random_hash_testvec(struct rnd_state *rng,
*/
static int test_hash_vs_generic_impl(const char *generic_driver,
unsigned int maxkeysize,
- struct ahash_request *req,
+ struct ahash_request *reqs[HASH_TEST_MAX_MB_MSGS],
struct shash_desc *desc,
struct test_sglist *tsgl,
u8 *hashstate)
{
+ struct ahash_request *req = reqs[0];
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
const unsigned int digestsize = crypto_ahash_digestsize(tfm);
const unsigned int blocksize = crypto_ahash_blocksize(tfm);
@@ -1864,7 +1931,7 @@ static int test_hash_vs_generic_impl(const char *generic_driver,
sizeof(cfgname));
err = test_hash_vec_cfg(&vec, vec_name, cfg,
- req, desc, tsgl, hashstate);
+ reqs, desc, tsgl, hashstate);
if (err)
goto out;
cond_resched();
@@ -1929,8 +1996,8 @@ static int __alg_test_hash(const struct hash_testvec *vecs,
u32 type, u32 mask,
const char *generic_driver, unsigned int maxkeysize)
{
+ struct ahash_request *reqs[HASH_TEST_MAX_MB_MSGS] = {};
struct crypto_ahash *atfm = NULL;
- struct ahash_request *req = NULL;
struct crypto_shash *stfm = NULL;
struct shash_desc *desc = NULL;
struct test_sglist *tsgl = NULL;
@@ -1954,12 +2021,14 @@ static int __alg_test_hash(const struct hash_testvec *vecs,
}
driver = crypto_ahash_driver_name(atfm);
- req = ahash_request_alloc(atfm, GFP_KERNEL);
- if (!req) {
- pr_err("alg: hash: failed to allocate request for %s\n",
- driver);
- err = -ENOMEM;
- goto out;
+ for (i = 0; i < HASH_TEST_MAX_MB_MSGS; i++) {
+ reqs[i] = ahash_request_alloc(atfm, GFP_KERNEL);
+ if (!reqs[i]) {
+ pr_err("alg: hash: failed to allocate request for %s\n",
+ driver);
+ err = -ENOMEM;
+ goto out;
+ }
}
/*
@@ -1995,12 +2064,12 @@ static int __alg_test_hash(const struct hash_testvec *vecs,
if (fips_enabled && vecs[i].fips_skip)
continue;
- err = test_hash_vec(&vecs[i], i, req, desc, tsgl, hashstate);
+ err = test_hash_vec(&vecs[i], i, reqs, desc, tsgl, hashstate);
if (err)
goto out;
cond_resched();
}
- err = test_hash_vs_generic_impl(generic_driver, maxkeysize, req,
+ err = test_hash_vs_generic_impl(generic_driver, maxkeysize, reqs,
desc, tsgl, hashstate);
out:
kfree(hashstate);
@@ -2010,7 +2079,12 @@ static int __alg_test_hash(const struct hash_testvec *vecs,
}
kfree(desc);
crypto_free_shash(stfm);
- ahash_request_free(req);
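+ /* The requests were allocated individually, but chaining them
+ * onto reqs[0] lets a single ahash_request_free() release the
+ * whole batch. */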
+ if (reqs[0]) {
+ ahash_request_set_callback(reqs[0], 0, NULL, NULL);
+ for (i = 1; i < HASH_TEST_MAX_MB_MSGS && reqs[i]; i++)
+ ahash_request_chain(reqs[i], reqs[0]);
+ ahash_request_free(reqs[0]);
+ }
crypto_free_ahash(atfm);
return err;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [v2 PATCH 08/11] crypto: x86/sha2 - Restore multibuffer AVX2 support
2025-02-16 3:07 [v2 PATCH 00/11] Multibuffer hashing take two Herbert Xu
` (6 preceding siblings ...)
2025-02-16 3:07 ` [v2 PATCH 07/11] crypto: testmgr - Add multibuffer hash testing Herbert Xu
@ 2025-02-16 3:07 ` Herbert Xu
2025-02-16 3:07 ` [v2 PATCH 09/11] crypto: hash - Add sync hash interface Herbert Xu
` (4 subsequent siblings)
12 siblings, 0 replies; 42+ messages in thread
From: Herbert Xu @ 2025-02-16 3:07 UTC (permalink / raw)
To: Linux Crypto Mailing List
Cc: Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
Resurrect the old multibuffer AVX2 code removed by commit ab8085c130ed
("crypto: x86 - remove SHA multibuffer routines and mcryptd") using
the new request chaining interface.
This is purely a proof of concept, meant to illustrate the utility
of the new API rather than to be a serious attempt at improving
performance.
However, it is interesting to note that with x8 multibuffer the
performance of AVX2 is on par with SHA-NI.
tcrypt: testing speed of multibuffer sha256 (sha256-avx2)
tcrypt: test 0 ( 16 byte blocks, 16 bytes per update, 1 updates): 1 operation in 184 cycles (16 bytes)
tcrypt: test 2 ( 64 byte blocks, 64 bytes per update, 1 updates): 1 operation in 165 cycles (64 bytes)
tcrypt: test 5 ( 256 byte blocks, 256 bytes per update, 1 updates): 1 operation in 444 cycles (256 bytes)
tcrypt: test 8 ( 1024 byte blocks, 1024 bytes per update, 1 updates): 1 operation in 1549 cycles (1024 bytes)
tcrypt: test 12 ( 2048 byte blocks, 2048 bytes per update, 1 updates): 1 operation in 3060 cycles (2048 bytes)
tcrypt: test 16 ( 4096 byte blocks, 4096 bytes per update, 1 updates): 1 operation in 5983 cycles (4096 bytes)
tcrypt: test 21 ( 8192 byte blocks, 8192 bytes per update, 1 updates): 1 operation in 11980 cycles (8192 bytes)
tcrypt: testing speed of async sha256 (sha256-avx2)
tcrypt: test 0 ( 16 byte blocks, 16 bytes per update, 1 updates): 475 cycles/operation, 29 cycles/byte
tcrypt: test 2 ( 64 byte blocks, 64 bytes per update, 1 updates): 780 cycles/operation, 12 cycles/byte
tcrypt: test 5 ( 256 byte blocks, 256 bytes per update, 1 updates): 1872 cycles/operation, 7 cycles/byte
tcrypt: test 8 ( 1024 byte blocks, 1024 bytes per update, 1 updates): 5416 cycles/operation, 5 cycles/byte
tcrypt: test 12 ( 2048 byte blocks, 2048 bytes per update, 1 updates): 10339 cycles/operation, 5 cycles/byte
tcrypt: test 16 ( 4096 byte blocks, 4096 bytes per update, 1 updates): 20214 cycles/operation, 4 cycles/byte
tcrypt: test 21 ( 8192 byte blocks, 8192 bytes per update, 1 updates): 40042 cycles/operation, 4 cycles/byte
tcrypt: testing speed of async sha256-ni (sha256-ni)
tcrypt: test 0 ( 16 byte blocks, 16 bytes per update, 1 updates): 207 cycles/operation, 12 cycles/byte
tcrypt: test 2 ( 64 byte blocks, 64 bytes per update, 1 updates): 299 cycles/operation, 4 cycles/byte
tcrypt: test 5 ( 256 byte blocks, 256 bytes per update, 1 updates): 543 cycles/operation, 2 cycles/byte
tcrypt: test 8 ( 1024 byte blocks, 1024 bytes per update, 1 updates): 1523 cycles/operation, 1 cycles/byte
tcrypt: test 12 ( 2048 byte blocks, 2048 bytes per update, 1 updates): 2835 cycles/operation, 1 cycles/byte
tcrypt: test 16 ( 4096 byte blocks, 4096 bytes per update, 1 updates): 5459 cycles/operation, 1 cycles/byte
tcrypt: test 21 ( 8192 byte blocks, 8192 bytes per update, 1 updates): 10724 cycles/operation, 1 cycles/byte
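The x8 kernel consumes a transposed digest layout: state[w][lane]
holds word w of each lane's SHA-256 state, so a single ymm register
updates the same word across all eight lanes at once. A rough sketch
of the gather/scatter the glue code performs around the asm call
(mirroring sha256_update_x8x1() below; states[] and blocks are
illustrative names):
	struct sha256_x8_mbctx mbctx;
	int w, lane;
	/* Gather: transpose each lane's digest into row-major form;
	 * idle lanes are zero-filled. */
	for (w = 0; w < 8; w++)
		for (lane = 0; lane < 8; lane++)
			mbctx.state[w][lane] =
				states[lane] ? states[lane]->state[w] : 0;
	sha256_x8_avx2(&mbctx, blocks);
	/* Scatter: transpose the updated words back into each
	 * per-request state. */
	for (lane = 0; lane < 8 && states[lane]; lane++)
		for (w = 0; w < 8; w++)
			states[lane]->state[w] = mbctx.state[w][lane];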
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
arch/x86/crypto/Makefile | 2 +-
arch/x86/crypto/sha256_mb_mgr_datastruct.S | 304 +++++++++++
arch/x86/crypto/sha256_ssse3_glue.c | 540 +++++++++++++++++--
arch/x86/crypto/sha256_x8_avx2.S | 598 +++++++++++++++++++++
4 files changed, 1401 insertions(+), 43 deletions(-)
create mode 100644 arch/x86/crypto/sha256_mb_mgr_datastruct.S
create mode 100644 arch/x86/crypto/sha256_x8_avx2.S
diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
index 07b00bfca64b..ab3fb2a9ebea 100644
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -60,7 +60,7 @@ sha1-ssse3-y := sha1_avx2_x86_64_asm.o sha1_ssse3_asm.o sha1_ssse3_glue.o
sha1-ssse3-$(CONFIG_AS_SHA1_NI) += sha1_ni_asm.o
obj-$(CONFIG_CRYPTO_SHA256_SSSE3) += sha256-ssse3.o
-sha256-ssse3-y := sha256-ssse3-asm.o sha256-avx-asm.o sha256-avx2-asm.o sha256_ssse3_glue.o
+sha256-ssse3-y := sha256-ssse3-asm.o sha256-avx-asm.o sha256-avx2-asm.o sha256_ssse3_glue.o sha256_x8_avx2.o
sha256-ssse3-$(CONFIG_AS_SHA256_NI) += sha256_ni_asm.o
obj-$(CONFIG_CRYPTO_SHA512_SSSE3) += sha512-ssse3.o
diff --git a/arch/x86/crypto/sha256_mb_mgr_datastruct.S b/arch/x86/crypto/sha256_mb_mgr_datastruct.S
new file mode 100644
index 000000000000..5c377bac21d0
--- /dev/null
+++ b/arch/x86/crypto/sha256_mb_mgr_datastruct.S
@@ -0,0 +1,304 @@
+/*
+ * Header file for multi buffer SHA256 algorithm data structure
+ *
+ * This file is provided under a dual BSD/GPLv2 license. When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Contact Information:
+ * Megha Dey <megha.dey@linux.intel.com>
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+# Macros for defining data structures
+
+# Usage example
+
+#START_FIELDS # JOB_AES
+### name size align
+#FIELD _plaintext, 8, 8 # pointer to plaintext
+#FIELD _ciphertext, 8, 8 # pointer to ciphertext
+#FIELD _IV, 16, 8 # IV
+#FIELD _keys, 8, 8 # pointer to keys
+#FIELD _len, 4, 4 # length in bytes
+#FIELD _status, 4, 4 # status enumeration
+#FIELD _user_data, 8, 8 # pointer to user data
+#UNION _union, size1, align1, \
+# size2, align2, \
+# size3, align3, \
+# ...
+#END_FIELDS
+#%assign _JOB_AES_size _FIELD_OFFSET
+#%assign _JOB_AES_align _STRUCT_ALIGN
+
+#########################################################################
+
+# Alternate "struc-like" syntax:
+# STRUCT job_aes2
+# RES_Q .plaintext, 1
+# RES_Q .ciphertext, 1
+# RES_DQ .IV, 1
+# RES_B .nested, _JOB_AES_SIZE, _JOB_AES_ALIGN
+# RES_U .union, size1, align1, \
+# size2, align2, \
+# ...
+# ENDSTRUCT
+# # Following only needed if nesting
+# %assign job_aes2_size _FIELD_OFFSET
+# %assign job_aes2_align _STRUCT_ALIGN
+#
+# RES_* macros take a name, a count and an optional alignment.
+# The count is in terms of the base size of the macro, and the
+# default alignment is the base size.
+# The macros are:
+# Macro Base size
+# RES_B 1
+# RES_W 2
+# RES_D 4
+# RES_Q 8
+# RES_DQ 16
+# RES_Y 32
+# RES_Z 64
+#
+# RES_U defines a union. Its arguments are a name and two or more
+# pairs of "size, alignment"
+#
+# The two assigns are only needed if this structure is being nested
+# within another. Even if the assigns are not done, one can still use
+# STRUCT_NAME_size as the size of the structure.
+#
+# Note that for nesting, you still need to assign to STRUCT_NAME_size.
+#
+# The differences between this and using "struc" directly are that each
+# type is implicitly aligned to its natural length (although this can be
+# over-ridden with an explicit third parameter), and that the structure
+# is padded at the end to its overall alignment.
+#
+
+#########################################################################
+
+#ifndef _DATASTRUCT_ASM_
+#define _DATASTRUCT_ASM_
+
+#define SZ8 8*SHA256_DIGEST_WORD_SIZE
+#define ROUNDS 64*SZ8
+#define PTR_SZ 8
+#define SHA256_DIGEST_WORD_SIZE 4
+#define MAX_SHA256_LANES 8
+#define SHA256_DIGEST_WORDS 8
+#define SHA256_DIGEST_ROW_SIZE (MAX_SHA256_LANES * SHA256_DIGEST_WORD_SIZE)
+#define SHA256_DIGEST_SIZE (SHA256_DIGEST_ROW_SIZE * SHA256_DIGEST_WORDS)
+#define SHA256_BLK_SZ 64
+
+# START_FIELDS
+.macro START_FIELDS
+ _FIELD_OFFSET = 0
+ _STRUCT_ALIGN = 0
+.endm
+
+# FIELD name size align
+.macro FIELD name size align
+ _FIELD_OFFSET = (_FIELD_OFFSET + (\align) - 1) & (~ ((\align)-1))
+ \name = _FIELD_OFFSET
+ _FIELD_OFFSET = _FIELD_OFFSET + (\size)
+.if (\align > _STRUCT_ALIGN)
+ _STRUCT_ALIGN = \align
+.endif
+.endm
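+# For example, after START_FIELDS, "FIELD _a, 4, 4" yields _a = 0 and
+# _FIELD_OFFSET = 4; a following "FIELD _b, 8, 8" first rounds the
+# offset up to 8, so _b = 8 and _FIELD_OFFSET = 16. END_FIELDS then
+# pads the total size to _STRUCT_ALIGN (8 here), leaving 16.
+# (_a and _b are illustrative names, not fields used below.)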
+
+# END_FIELDS
+.macro END_FIELDS
+ _FIELD_OFFSET = (_FIELD_OFFSET + _STRUCT_ALIGN-1) & (~ (_STRUCT_ALIGN-1))
+.endm
+
+########################################################################
+
+.macro STRUCT p1
+START_FIELDS
+.struc \p1
+.endm
+
+.macro ENDSTRUCT
+ tmp = _FIELD_OFFSET
+ END_FIELDS
+ tmp = (_FIELD_OFFSET - tmp)
+.if (tmp > 0)
+ .lcomm tmp
+.endif
+.endstruc
+.endm
+
+## RES_int name size align
+.macro RES_int p1 p2 p3
+ name = \p1
+ size = \p2
+ align = .\p3
+
+ _FIELD_OFFSET = (_FIELD_OFFSET + (align) - 1) & (~ ((align)-1))
+.align align
+.lcomm name size
+ _FIELD_OFFSET = _FIELD_OFFSET + (size)
+.if (align > _STRUCT_ALIGN)
+ _STRUCT_ALIGN = align
+.endif
+.endm
+
+# macro RES_B name, size [, align]
+.macro RES_B _name, _size, _align=1
+RES_int _name _size _align
+.endm
+
+# macro RES_W name, size [, align]
+.macro RES_W _name, _size, _align=2
+RES_int _name 2*(_size) _align
+.endm
+
+# macro RES_D name, size [, align]
+.macro RES_D _name, _size, _align=4
+RES_int _name 4*(_size) _align
+.endm
+
+# macro RES_Q name, size [, align]
+.macro RES_Q _name, _size, _align=8
+RES_int _name 8*(_size) _align
+.endm
+
+# macro RES_DQ name, size [, align]
+.macro RES_DQ _name, _size, _align=16
+RES_int _name 16*(_size) _align
+.endm
+
+# macro RES_Y name, size [, align]
+.macro RES_Y _name, _size, _align=32
+RES_int _name 32*(_size) _align
+.endm
+
+# macro RES_Z name, size [, align]
+.macro RES_Z _name, _size, _align=64
+RES_int _name 64*(_size) _align
+.endm
+
+#endif
+
+
+########################################################################
+#### Define SHA256 Out Of Order Data Structures
+########################################################################
+
+START_FIELDS # LANE_DATA
+### name size align
+FIELD _job_in_lane, 8, 8 # pointer to job object
+END_FIELDS
+
+ _LANE_DATA_size = _FIELD_OFFSET
+ _LANE_DATA_align = _STRUCT_ALIGN
+
+########################################################################
+
+START_FIELDS # SHA256_ARGS_X4
+### name size align
+FIELD _digest, 4*8*8, 4 # transposed digest
+FIELD _data_ptr, 8*8, 8 # array of pointers to data
+END_FIELDS
+
+ _SHA256_ARGS_X4_size = _FIELD_OFFSET
+ _SHA256_ARGS_X4_align = _STRUCT_ALIGN
+ _SHA256_ARGS_X8_size = _FIELD_OFFSET
+ _SHA256_ARGS_X8_align = _STRUCT_ALIGN
+
+#######################################################################
+
+START_FIELDS # MB_MGR
+### name size align
+FIELD _args, _SHA256_ARGS_X4_size, _SHA256_ARGS_X4_align
+FIELD _lens, 4*8, 8
+FIELD _unused_lanes, 8, 8
+FIELD _ldata, _LANE_DATA_size*8, _LANE_DATA_align
+END_FIELDS
+
+ _MB_MGR_size = _FIELD_OFFSET
+ _MB_MGR_align = _STRUCT_ALIGN
+
+_args_digest = _args + _digest
+_args_data_ptr = _args + _data_ptr
+
+#######################################################################
+
+START_FIELDS #STACK_FRAME
+### name size align
+FIELD _data, 16*SZ8, 1 # transposed message schedule
+FIELD _digest, 8*SZ8, 1 # saved digest
+FIELD _ytmp, 4*SZ8, 1
+FIELD _rsp, 8, 1
+END_FIELDS
+
+ _STACK_FRAME_size = _FIELD_OFFSET
+ _STACK_FRAME_align = _STRUCT_ALIGN
+
+#######################################################################
+
+########################################################################
+#### Define constants
+########################################################################
+
+#define STS_UNKNOWN 0
+#define STS_BEING_PROCESSED 1
+#define STS_COMPLETED 2
+
+########################################################################
+#### Define JOB_SHA256 structure
+########################################################################
+
+START_FIELDS # JOB_SHA256
+
+### name size align
+FIELD _buffer, 8, 8 # pointer to buffer
+FIELD _len, 8, 8 # length in bytes
+FIELD _result_digest, 8*4, 32 # Digest (output)
+FIELD _status, 4, 4
+FIELD _user_data, 8, 8
+END_FIELDS
+
+ _JOB_SHA256_size = _FIELD_OFFSET
+ _JOB_SHA256_align = _STRUCT_ALIGN
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index e04a43d9f7d5..130918d01930 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -41,8 +41,24 @@
#include <asm/cpu_device_id.h>
#include <asm/simd.h>
+struct sha256_x8_mbctx {
+ u32 state[8][8];
+ const u8 *input[8];
+};
+
+struct sha256_reqctx {
+ struct sha256_state state;
+ struct crypto_hash_walk walk;
+ const u8 *input;
+ int total;
+ unsigned int next;
+};
+
asmlinkage void sha256_transform_ssse3(struct sha256_state *state,
const u8 *data, int blocks);
+asmlinkage void sha256_transform_rorx(struct sha256_state *state,
+ const u8 *data, int blocks);
+asmlinkage void sha256_x8_avx2(struct sha256_x8_mbctx *mbctx, int blocks);
static const struct x86_cpu_id module_cpu_ids[] = {
#ifdef CONFIG_AS_SHA256_NI
@@ -55,14 +71,69 @@ static const struct x86_cpu_id module_cpu_ids[] = {
};
MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
-static int _sha256_update(struct shash_desc *desc, const u8 *data,
- unsigned int len, sha256_block_fn *sha256_xform)
+static int sha256_import(struct ahash_request *req, const void *in)
{
- struct sha256_state *sctx = shash_desc_ctx(desc);
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+ memcpy(&rctx->state, in, sizeof(rctx->state));
+ return 0;
+}
+
+static int sha256_export(struct ahash_request *req, void *out)
+{
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+
+ memcpy(out, &rctx->state, sizeof(rctx->state));
+ return 0;
+}
+
+static int sha256_ahash_init(struct ahash_request *req)
+{
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+ struct ahash_request *r2;
+
+ sha256_init(&rctx->state);
+
+ if (!ahash_request_chained(req))
+ return 0;
+
+ req->base.err = 0;
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ r2->base.err = 0;
+ rctx = ahash_request_ctx(r2);
+ sha256_init(&rctx->state);
+ }
+
+ return 0;
+}
+
+static int sha224_ahash_init(struct ahash_request *req)
+{
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+ struct ahash_request *r2;
+
+ sha224_init(&rctx->state);
+
+ if (!ahash_request_chained(req))
+ return 0;
+
+ req->base.err = 0;
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ r2->base.err = 0;
+ rctx = ahash_request_ctx(r2);
+ sha224_init(&rctx->state);
+ }
+
+ return 0;
+}
+
+static void __sha256_update(struct sha256_state *sctx, const u8 *data,
+ unsigned int len, sha256_block_fn *sha256_xform)
+{
if (!crypto_simd_usable() ||
- (sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE)
- return crypto_sha256_update(desc, data, len);
+ (sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE) {
+ sha256_update(sctx, data, len);
+ return;
+ }
/*
* Make sure struct sha256_state begins directly with the SHA256
@@ -71,25 +142,97 @@ static int _sha256_update(struct shash_desc *desc, const u8 *data,
BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
kernel_fpu_begin();
- sha256_base_do_update(desc, data, len, sha256_xform);
+ lib_sha256_base_do_update(sctx, data, len, sha256_xform);
+ kernel_fpu_end();
+}
+
+static int _sha256_update(struct shash_desc *desc, const u8 *data,
+ unsigned int len, sha256_block_fn *sha256_xform)
+{
+ __sha256_update(shash_desc_ctx(desc), data, len, sha256_xform);
+ return 0;
+}
+
+static int sha256_ahash_update(struct ahash_request *req,
+ sha256_block_fn *sha256_xform)
+{
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+ struct crypto_hash_walk *walk = &rctx->walk;
+ struct sha256_state *state = &rctx->state;
+ int nbytes;
+
+ /*
+ * Make sure struct sha256_state begins directly with the SHA256
+ * 256-bit internal state, as this is what the asm functions expect.
+ */
+ BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
+
+ for (nbytes = crypto_hash_walk_first(req, walk); nbytes > 0;
+ nbytes = crypto_hash_walk_done(walk, 0))
+ __sha256_update(state, walk->data, nbytes, sha256_xform);
+
+ return nbytes;
+}
+
+static void _sha256_finup(struct sha256_state *state, const u8 *data,
+ unsigned int len, u8 *out, unsigned int ds,
+ sha256_block_fn *sha256_xform)
+{
+ if (!crypto_simd_usable()) {
+ sha256_update(state, data, len);
+ if (ds == SHA224_DIGEST_SIZE)
+ sha224_final(state, out);
+ else
+ sha256_final(state, out);
+ return;
+ }
+
+ kernel_fpu_begin();
+ if (len)
+ lib_sha256_base_do_update(state, data, len, sha256_xform);
+ lib_sha256_base_do_finalize(state, sha256_xform);
kernel_fpu_end();
- return 0;
+ lib_sha256_base_finish(state, out, ds);
+}
+
+static int sha256_ahash_finup(struct ahash_request *req, bool nodata,
+ sha256_block_fn *sha256_xform)
+{
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+ struct crypto_hash_walk *walk = &rctx->walk;
+ struct sha256_state *state = &rctx->state;
+ unsigned int ds;
+ int nbytes;
+
+ ds = crypto_ahash_digestsize(crypto_ahash_reqtfm(req));
+ if (nodata || !req->nbytes) {
+ _sha256_finup(state, NULL, 0, req->result,
+ ds, sha256_xform);
+ return 0;
+ }
+
+ for (nbytes = crypto_hash_walk_first(req, walk); nbytes > 0;
+ nbytes = crypto_hash_walk_done(walk, 0)) {
+ if (crypto_hash_walk_last(walk)) {
+ _sha256_finup(state, walk->data, nbytes, req->result,
+ ds, sha256_xform);
+ continue;
+ }
+
+ __sha256_update(state, walk->data, nbytes, sha256_xform);
+ }
+
+ return nbytes;
}
static int sha256_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out, sha256_block_fn *sha256_xform)
{
- if (!crypto_simd_usable())
- return crypto_sha256_finup(desc, data, len, out);
+ unsigned int ds = crypto_shash_digestsize(desc->tfm);
- kernel_fpu_begin();
- if (len)
- sha256_base_do_update(desc, data, len, sha256_xform);
- sha256_base_do_finalize(desc, sha256_xform);
- kernel_fpu_end();
-
- return sha256_base_finish(desc, out);
+ _sha256_finup(shash_desc_ctx(desc), data, len, out, ds, sha256_xform);
+ return 0;
}
static int sha256_ssse3_update(struct shash_desc *desc, const u8 *data,
@@ -247,61 +390,374 @@ static void unregister_sha256_avx(void)
ARRAY_SIZE(sha256_avx_algs));
}
-asmlinkage void sha256_transform_rorx(struct sha256_state *state,
- const u8 *data, int blocks);
-
-static int sha256_avx2_update(struct shash_desc *desc, const u8 *data,
- unsigned int len)
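+/*
+ * rctx->total doubles as a small state machine (inferred from the
+ * code below): a positive value means walk data remains, zero means
+ * padding has begun but the final length block may still be pending,
+ * and -1 means the length block has already been emitted.
+ */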
+static int sha256_pad2(unsigned int partial, struct ahash_request *req)
{
- return _sha256_update(desc, data, len, sha256_transform_rorx);
+ const int bit_offset = SHA256_BLOCK_SIZE - sizeof(__be64);
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+ struct sha256_state *state = &rctx->state;
+ __be64 *bits;
+
+ if (rctx->total)
+ return 0;
+
+ rctx->total = -1;
+
+ memset(state->buf + partial, 0, bit_offset - partial);
+ bits = (__be64 *)(state->buf + bit_offset);
+ *bits = cpu_to_be64(state->count << 3);
+
+ return SHA256_BLOCK_SIZE;
}
-static int sha256_avx2_finup(struct shash_desc *desc, const u8 *data,
- unsigned int len, u8 *out)
+static int sha256_pad1(struct ahash_request *req, bool final)
{
- return sha256_finup(desc, data, len, out, sha256_transform_rorx);
+ const int bit_offset = SHA256_BLOCK_SIZE - sizeof(__be64);
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+ struct sha256_state *state = &rctx->state;
+ unsigned int partial = state->count;
+
+ if (!final)
+ return 0;
+
+ rctx->total = 0;
+ rctx->input = state->buf;
+
+ partial %= SHA256_BLOCK_SIZE;
+ state->buf[partial++] = 0x80;
+
+ if (partial > bit_offset) {
+ memset(state->buf + partial, 0, SHA256_BLOCK_SIZE - partial);
+ return SHA256_BLOCK_SIZE;
+ }
+
+ return sha256_pad2(partial, req);
}
-static int sha256_avx2_final(struct shash_desc *desc, u8 *out)
+static int sha256_mb_fill(struct ahash_request *req, bool final)
{
- return sha256_avx2_finup(desc, NULL, 0, out);
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+ struct sha256_state *state = &rctx->state;
+ int nbytes = rctx->total;
+ unsigned int partial;
+
+ partial = state->count % SHA256_BLOCK_SIZE;
+ while (partial + nbytes < SHA256_BLOCK_SIZE) {
+ memcpy(state->buf + partial, rctx->input, nbytes);
+ state->count += nbytes;
+ partial += nbytes;
+
+ nbytes = crypto_hash_walk_done(&rctx->walk, 0);
+ if (!nbytes)
+ return sha256_pad1(req, final);
+
+ rctx->input = rctx->walk.data;
+ rctx->total = nbytes;
+ }
+
+ if (partial) {
+ unsigned int offset = SHA256_BLOCK_SIZE - partial;
+
+ memcpy(state->buf + partial, rctx->input, offset);
+ rctx->input = state->buf;
+
+ return SHA256_BLOCK_SIZE;
+ }
+
+ return nbytes;
}
-static int sha256_avx2_digest(struct shash_desc *desc, const u8 *data,
- unsigned int len, u8 *out)
+static int sha256_mb_start(struct ahash_request *req, bool nodata, bool final)
{
- return sha256_base_init(desc) ?:
- sha256_avx2_finup(desc, data, len, out);
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+ int nbytes;
+
+ nbytes = nodata ? 0 : crypto_hash_walk_first(req, &rctx->walk);
+ if (!nbytes)
+ return sha256_pad1(req, final);
+
+ rctx->input = rctx->walk.data;
+ rctx->total = nbytes;
+
+ return sha256_mb_fill(req, final);
}
-static struct shash_alg sha256_avx2_algs[] = { {
- .digestsize = SHA256_DIGEST_SIZE,
- .init = sha256_base_init,
+static int sha256_mb_next(struct ahash_request *req, unsigned int len,
+ bool final)
+{
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+ struct sha256_state *state = &rctx->state;
+
+ if (rctx->input == state->buf) {
+ if (rctx->total <= 0)
+ return sha256_pad2(0, req);
+ len = SHA256_BLOCK_SIZE - state->count % SHA256_BLOCK_SIZE;
+ rctx->input = rctx->walk.data;
+ }
+
+ rctx->input += len;
+ rctx->total -= len;
+ state->count += len;
+
+ return sha256_mb_fill(req, final);
+}
+
+static struct ahash_request *sha256_update_x8x1(
+ struct list_head *list, struct ahash_request *r2,
+ struct ahash_request *reqs[8], bool nodata, bool final)
+{
+ struct sha256_state *states[8];
+ struct sha256_x8_mbctx mbctx;
+ unsigned int len = 0;
+ int i = 0;
+
+ do {
+ struct sha256_reqctx *rctx = ahash_request_ctx(reqs[i]);
+ unsigned int nbytes;
+
+ nbytes = rctx->next;
+ if (!i || nbytes < len)
+ len = nbytes;
+
+ states[i] = &rctx->state;
+ mbctx.input[i] = rctx->input;
+ } while (++i < 8 && reqs[i]);
+
+ len &= ~(SHA256_BLOCK_SIZE - 1);
+
+ /* 3 is the break-even point for x8. */
+ if (i < 3) {
+ do {
+ i--;
+ sha256_transform_rorx(states[i], mbctx.input[i],
+ len / SHA256_BLOCK_SIZE);
+ } while (i);
+ goto done;
+ }
+
+ for (; i < 8; i++) {
+ mbctx.input[i] = mbctx.input[0];
+ states[i] = NULL;
+ }
+
+ for (i = 0; i < 8; i++) {
+ int j;
+
+ for (j = 0; j < 8; j++)
+ mbctx.state[i][j] = states[j] ? states[j]->state[i] : 0;
+ }
+
+ sha256_x8_avx2(&mbctx, len / SHA256_BLOCK_SIZE);
+
+ for (i = 0; i < 8 && states[i]; i++) {
+ int j;
+
+ for (j = 0; j < 8; j++)
+ states[i]->state[j] = mbctx.state[j][i];
+ }
+
+done:
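+ /* Advance every lane past the bytes just hashed; lanes whose
+ * request has completed are refilled with the next pending
+ * request from the chain. */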
+ i = 0;
+ do {
+ struct sha256_reqctx *rctx = ahash_request_ctx(reqs[i]);
+
+ rctx->next = sha256_mb_next(reqs[i], len, final);
+
+ if (rctx->next) {
+ if (++i >= 8)
+ break;
+ continue;
+ }
+
+ if (i < 7 && reqs[i + 1]) {
+ memmove(reqs + i, reqs + i + 1, sizeof(r2) * (7 - i));
+ reqs[7] = NULL;
+ continue;
+ }
+
+ reqs[i] = NULL;
+
+ do {
+ while (!list_is_last(&r2->base.list, list)) {
+ r2 = list_next_entry(r2, base.list);
+ r2->base.err = 0;
+
+ rctx = ahash_request_ctx(r2);
+ rctx->next = sha256_mb_start(r2, nodata, final);
+ if (rctx->next) {
+ reqs[i] = r2;
+ break;
+ }
+ }
+ } while (reqs[i] && ++i < 8);
+
+ break;
+ } while (reqs[i]);
+
+ return r2;
+}
+
+static void sha256_update_x8(struct list_head *list,
+ struct ahash_request *reqs[8], int i,
+ bool nodata, bool final)
+{
+ struct ahash_request *r2 = reqs[i - 1];
+
+ do {
+ r2 = sha256_update_x8x1(list, r2, reqs, nodata, final);
+ } while (reqs[0]);
+}
+
+static void sha256_chain(struct ahash_request *req, bool nodata, bool final)
+{
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+ struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+ unsigned int ds = crypto_ahash_digestsize(tfm);
+ struct ahash_request *reqs[8] = {};
+ struct ahash_request *r2;
+ int i;
+
+ req->base.err = 0;
+ reqs[0] = req;
+ rctx->next = sha256_mb_start(req, nodata, final);
+ i = !!rctx->next;
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ struct sha256_reqctx *r2ctx = ahash_request_ctx(r2);
+
+ r2->base.err = 0;
+ r2ctx->next = sha256_mb_start(r2, nodata, final);
+ if (!r2ctx->next)
+ continue;
+
+ reqs[i++] = r2;
+ if (i >= 8)
+ break;
+ }
+
+ if (i)
+ sha256_update_x8(&req->base.list, reqs, i, nodata, final);
+
+ if (!final)
+ return;
+
+ lib_sha256_base_finish(&rctx->state, req->result, ds);
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ struct sha256_reqctx *r2ctx = ahash_request_ctx(r2);
+
+ lib_sha256_base_finish(&r2ctx->state, r2->result, ds);
+ }
+}
+
+static int sha256_avx2_update(struct ahash_request *req)
+{
+ struct ahash_request *r2;
+ int err;
+
+ if (ahash_request_chained(req) && crypto_simd_usable()) {
+ sha256_chain(req, false, false);
+ return 0;
+ }
+
+ err = sha256_ahash_update(req, sha256_transform_rorx);
+ if (!ahash_request_chained(req))
+ return err;
+
+ req->base.err = err;
+
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ err = sha256_ahash_update(r2, sha256_transform_rorx);
+ r2->base.err = err;
+ }
+
+ return 0;
+}
+
+static int _sha256_avx2_finup(struct ahash_request *req, bool nodata)
+{
+ struct ahash_request *r2;
+ int err;
+
+ if (ahash_request_chained(req) && crypto_simd_usable()) {
+ sha256_chain(req, nodata, true);
+ return 0;
+ }
+
+ err = sha256_ahash_finup(req, nodata, sha256_transform_rorx);
+ if (!ahash_request_chained(req))
+ return err;
+
+ req->base.err = err;
+
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ err = sha256_ahash_finup(r2, nodata, sha256_transform_rorx);
+ r2->base.err = err;
+ }
+
+ return 0;
+}
+
+static int sha256_avx2_finup(struct ahash_request *req)
+{
+ return _sha256_avx2_finup(req, false);
+}
+
+static int sha256_avx2_final(struct ahash_request *req)
+{
+ return _sha256_avx2_finup(req, true);
+}
+
+static int sha256_avx2_digest(struct ahash_request *req)
+{
+ return sha256_ahash_init(req) ?:
+ sha256_avx2_finup(req);
+}
+
+static int sha224_avx2_digest(struct ahash_request *req)
+{
+ return sha224_ahash_init(req) ?:
+ sha256_avx2_finup(req);
+}
+
+static struct ahash_alg sha256_avx2_algs[] = { {
+ .halg.digestsize = SHA256_DIGEST_SIZE,
+ .halg.statesize = sizeof(struct sha256_state),
+ .reqsize = sizeof(struct sha256_reqctx),
+ .init = sha256_ahash_init,
.update = sha256_avx2_update,
.final = sha256_avx2_final,
.finup = sha256_avx2_finup,
.digest = sha256_avx2_digest,
- .descsize = sizeof(struct sha256_state),
- .base = {
+ .import = sha256_import,
+ .export = sha256_export,
+ .halg.base = {
.cra_name = "sha256",
.cra_driver_name = "sha256-avx2",
.cra_priority = 170,
.cra_blocksize = SHA256_BLOCK_SIZE,
.cra_module = THIS_MODULE,
+ .cra_flags = CRYPTO_ALG_REQ_CHAIN,
}
}, {
- .digestsize = SHA224_DIGEST_SIZE,
- .init = sha224_base_init,
+ .halg.digestsize = SHA224_DIGEST_SIZE,
+ .halg.statesize = sizeof(struct sha256_state),
+ .reqsize = sizeof(struct sha256_reqctx),
+ .init = sha224_ahash_init,
.update = sha256_avx2_update,
.final = sha256_avx2_final,
.finup = sha256_avx2_finup,
- .descsize = sizeof(struct sha256_state),
- .base = {
+ .digest = sha224_avx2_digest,
+ .import = sha256_import,
+ .export = sha256_export,
+ .halg.base = {
.cra_name = "sha224",
.cra_driver_name = "sha224-avx2",
.cra_priority = 170,
.cra_blocksize = SHA224_BLOCK_SIZE,
.cra_module = THIS_MODULE,
+ .cra_flags = CRYPTO_ALG_REQ_CHAIN,
}
} };
@@ -317,7 +773,7 @@ static bool avx2_usable(void)
static int register_sha256_avx2(void)
{
if (avx2_usable())
- return crypto_register_shashes(sha256_avx2_algs,
+ return crypto_register_ahashes(sha256_avx2_algs,
ARRAY_SIZE(sha256_avx2_algs));
return 0;
}
@@ -325,7 +781,7 @@ static int register_sha256_avx2(void)
static void unregister_sha256_avx2(void)
{
if (avx2_usable())
- crypto_unregister_shashes(sha256_avx2_algs,
+ crypto_unregister_ahashes(sha256_avx2_algs,
ARRAY_SIZE(sha256_avx2_algs));
}
diff --git a/arch/x86/crypto/sha256_x8_avx2.S b/arch/x86/crypto/sha256_x8_avx2.S
new file mode 100644
index 000000000000..ce74f8963236
--- /dev/null
+++ b/arch/x86/crypto/sha256_x8_avx2.S
@@ -0,0 +1,598 @@
+/*
+ * Multi-buffer SHA256 algorithm hash compute routine
+ *
+ * This file is provided under a dual BSD/GPLv2 license. When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Contact Information:
+ * Megha Dey <megha.dey@linux.intel.com>
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <asm/frame.h>
+#include <linux/cfi_types.h>
+#include <linux/linkage.h>
+#include "sha256_mb_mgr_datastruct.S"
+
+## code to compute oct SHA256 using AVX2
+## outer calling routine takes care of save and restore of XMM registers
+## Logic designed/laid out by JDG
+
+## Function clobbers: rax, rcx, rdx, rbx, rsi, rdi, r9-r15; %ymm0-15
+## Linux clobbers: rax rbx rcx rdx rsi r9 r10 r11 r12 r13 r14 r15
+## Linux preserves: rdi rbp r8
+##
+## clobbers %ymm0-15
+
+arg1 = %rdi
+arg2 = %rsi
+reg3 = %rcx
+reg4 = %rdx
+
+# Common definitions
+STATE = arg1
+INP_SIZE = arg2
+
+IDX = %rax
+ROUND = %rbx
+TBL = reg3
+
+inp0 = %r9
+inp1 = %r10
+inp2 = %r11
+inp3 = %r12
+inp4 = %r13
+inp5 = %r14
+inp6 = %r15
+inp7 = reg4
+
+a = %ymm0
+b = %ymm1
+c = %ymm2
+d = %ymm3
+e = %ymm4
+f = %ymm5
+g = %ymm6
+h = %ymm7
+
+T1 = %ymm8
+
+a0 = %ymm12
+a1 = %ymm13
+a2 = %ymm14
+TMP = %ymm15
+TMP0 = %ymm6
+TMP1 = %ymm7
+
+TT0 = %ymm8
+TT1 = %ymm9
+TT2 = %ymm10
+TT3 = %ymm11
+TT4 = %ymm12
+TT5 = %ymm13
+TT6 = %ymm14
+TT7 = %ymm15
+
+# Define stack usage
+
+# Assume stack aligned to 32 bytes before call
+# Therefore FRAMESZ mod 32 must be 32-8 = 24
+
+#define FRAMESZ 0x388
+
+#define VMOVPS vmovups
+
+# TRANSPOSE8 r0, r1, r2, r3, r4, r5, r6, r7, t0, t1
+# "transpose" data in {r0...r7} using temps {t0...t1}
+# Input looks like: {r0 r1 r2 r3 r4 r5 r6 r7}
+# r0 = {a7 a6 a5 a4 a3 a2 a1 a0}
+# r1 = {b7 b6 b5 b4 b3 b2 b1 b0}
+# r2 = {c7 c6 c5 c4 c3 c2 c1 c0}
+# r3 = {d7 d6 d5 d4 d3 d2 d1 d0}
+# r4 = {e7 e6 e5 e4 e3 e2 e1 e0}
+# r5 = {f7 f6 f5 f4 f3 f2 f1 f0}
+# r6 = {g7 g6 g5 g4 g3 g2 g1 g0}
+# r7 = {h7 h6 h5 h4 h3 h2 h1 h0}
+#
+# Output looks like: {r0 r1 r2 r3 r4 r5 r6 r7}
+# r0 = {h0 g0 f0 e0 d0 c0 b0 a0}
+# r1 = {h1 g1 f1 e1 d1 c1 b1 a1}
+# r2 = {h2 g2 f2 e2 d2 c2 b2 a2}
+# r3 = {h3 g3 f3 e3 d3 c3 b3 a3}
+# r4 = {h4 g4 f4 e4 d4 c4 b4 a4}
+# r5 = {h5 g5 f5 e5 d5 c5 b5 a5}
+# r6 = {h6 g6 f6 e6 d6 c6 b6 a6}
+# r7 = {h7 g7 f7 e7 d7 c7 b7 a7}
+#
+
+.macro TRANSPOSE8 r0 r1 r2 r3 r4 r5 r6 r7 t0 t1
+ # process top half (r0..r3) {a...d}
+ vshufps $0x44, \r1, \r0, \t0 # t0 = {b5 b4 a5 a4 b1 b0 a1 a0}
+ vshufps $0xEE, \r1, \r0, \r0 # r0 = {b7 b6 a7 a6 b3 b2 a3 a2}
+ vshufps $0x44, \r3, \r2, \t1 # t1 = {d5 d4 c5 c4 d1 d0 c1 c0}
+ vshufps $0xEE, \r3, \r2, \r2 # r2 = {d7 d6 c7 c6 d3 d2 c3 c2}
+ vshufps $0xDD, \t1, \t0, \r3 # r3 = {d5 c5 b5 a5 d1 c1 b1 a1}
+ vshufps $0x88, \r2, \r0, \r1 # r1 = {d6 c6 b6 a6 d2 c2 b2 a2}
+ vshufps $0xDD, \r2, \r0, \r0 # r0 = {d7 c7 b7 a7 d3 c3 b3 a3}
+ vshufps $0x88, \t1, \t0, \t0 # t0 = {d4 c4 b4 a4 d0 c0 b0 a0}
+
+ # use r2 in place of t0
+ # process bottom half (r4..r7) {e...h}
+ vshufps $0x44, \r5, \r4, \r2 # r2 = {f5 f4 e5 e4 f1 f0 e1 e0}
+ vshufps $0xEE, \r5, \r4, \r4 # r4 = {f7 f6 e7 e6 f3 f2 e3 e2}
+ vshufps $0x44, \r7, \r6, \t1 # t1 = {h5 h4 g5 g4 h1 h0 g1 g0}
+ vshufps $0xEE, \r7, \r6, \r6 # r6 = {h7 h6 g7 g6 h3 h2 g3 g2}
+ vshufps $0xDD, \t1, \r2, \r7 # r7 = {h5 g5 f5 e5 h1 g1 f1 e1}
+ vshufps $0x88, \r6, \r4, \r5 # r5 = {h6 g6 f6 e6 h2 g2 f2 e2}
+ vshufps $0xDD, \r6, \r4, \r4 # r4 = {h7 g7 f7 e7 h3 g3 f3 e3}
+ vshufps $0x88, \t1, \r2, \t1 # t1 = {h4 g4 f4 e4 h0 g0 f0 e0}
+
+ vperm2f128 $0x13, \r1, \r5, \r6 # h6...a6
+ vperm2f128 $0x02, \r1, \r5, \r2 # h2...a2
+ vperm2f128 $0x13, \r3, \r7, \r5 # h5...a5
+ vperm2f128 $0x02, \r3, \r7, \r1 # h1...a1
+ vperm2f128 $0x13, \r0, \r4, \r7 # h7...a7
+ vperm2f128 $0x02, \r0, \r4, \r3 # h3...a3
+ vperm2f128 $0x13, \t0, \t1, \r4 # h4...a4
+ vperm2f128 $0x02, \t0, \t1, \r0 # h0...a0
+
+.endm
+
+.macro ROTATE_ARGS
+TMP_ = h
+h = g
+g = f
+f = e
+e = d
+d = c
+c = b
+b = a
+a = TMP_
+.endm
+
+.macro _PRORD reg imm tmp
+ vpslld $(32-\imm),\reg,\tmp
+ vpsrld $\imm,\reg, \reg
+ vpor \tmp,\reg, \reg
+.endm
+
+# PRORD_nd reg, imm, tmp, src
+.macro _PRORD_nd reg imm tmp src
+ vpslld $(32-\imm), \src, \tmp
+ vpsrld $\imm, \src, \reg
+ vpor \tmp, \reg, \reg
+.endm
+
+# PRORD dst/src, amt
+.macro PRORD reg imm
+ _PRORD \reg,\imm,TMP
+.endm
+
+# PRORD_nd dst, src, amt
+.macro PRORD_nd reg tmp imm
+ _PRORD_nd \reg, \imm, TMP, \tmp
+.endm
+
+# arguments passed implicitly in preprocessor symbols i, a...h
+.macro ROUND_00_15 _T1 i
+ PRORD_nd a0,e,5 # sig1: a0 = (e >> 5)
+
+ vpxor g, f, a2 # ch: a2 = f^g
+ vpand e,a2, a2 # ch: a2 = (f^g)&e
+ vpxor g, a2, a2 # a2 = ch
+
+ PRORD_nd a1,e,25 # sig1: a1 = (e >> 25)
+
+ vmovdqu \_T1,(SZ8*(\i & 0xf))(%rsp)
+ vpaddd (TBL,ROUND,1), \_T1, \_T1 # T1 = W + K
+ vpxor e,a0, a0 # sig1: a0 = e ^ (e >> 5)
+ PRORD a0, 6 # sig1: a0 = (e >> 6) ^ (e >> 11)
+ vpaddd a2, h, h # h = h + ch
+ PRORD_nd a2,a,11 # sig0: a2 = (a >> 11)
+ vpaddd \_T1,h, h # h = h + ch + W + K
+ vpxor a1, a0, a0 # a0 = sigma1
+ PRORD_nd a1,a,22 # sig0: a1 = (a >> 22)
+ vpxor c, a, \_T1 # maj: T1 = a^c
+ add $SZ8, ROUND # ROUND++
+ vpand b, \_T1, \_T1 # maj: T1 = (a^c)&b
+ vpaddd a0, h, h
+ vpaddd h, d, d
+ vpxor a, a2, a2 # sig0: a2 = a ^ (a >> 11)
+ PRORD a2,2 # sig0: a2 = (a >> 2) ^ (a >> 13)
+ vpxor a1, a2, a2 # a2 = sig0
+ vpand c, a, a1 # maj: a1 = a&c
+ vpor \_T1, a1, a1 # a1 = maj
+ vpaddd a1, h, h # h = h + ch + W + K + maj
+ vpaddd a2, h, h # h = h + ch + W + K + maj + sigma0
+ ROTATE_ARGS
+.endm
+
+# arguments passed implicitly in preprocessor symbols i, a...h
+.macro ROUND_16_XX _T1 i
+ vmovdqu (SZ8*((\i-15)&0xf))(%rsp), \_T1
+ vmovdqu (SZ8*((\i-2)&0xf))(%rsp), a1
+ vmovdqu \_T1, a0
+ PRORD \_T1,11
+ vmovdqu a1, a2
+ PRORD a1,2
+ vpxor a0, \_T1, \_T1
+ PRORD \_T1, 7
+ vpxor a2, a1, a1
+ PRORD a1, 17
+ vpsrld $3, a0, a0
+ vpxor a0, \_T1, \_T1
+ vpsrld $10, a2, a2
+ vpxor a2, a1, a1
+ vpaddd (SZ8*((\i-16)&0xf))(%rsp), \_T1, \_T1
+ vpaddd (SZ8*((\i-7)&0xf))(%rsp), a1, a1
+ vpaddd a1, \_T1, \_T1
+
+ ROUND_00_15 \_T1,\i
+.endm
+
+# void sha256_x8_avx2(struct sha256_x8_mbctx *mbctx, int blocks);
+#
+# arg 1 : mbctx : pointer to the multibuffer context holding the
+# transposed digests and the eight input data pointers
+# arg 2 : blocks : size of input in blocks
+ # save rsp, allocate 32-byte aligned for local variables
+SYM_FUNC_START(sha256_x8_avx2)
+ # save callee-saved clobbered registers to comply with C function ABI
+ push %rbx
+ push %r12
+ push %r13
+ push %r14
+ push %r15
+
+ push %rbp
+ mov %rsp, %rbp
+
+ sub $FRAMESZ, %rsp
+ and $~0x1F, %rsp
+
+ # Load the pre-transposed incoming digest.
+ vmovdqu 0*SHA256_DIGEST_ROW_SIZE(STATE),a
+ vmovdqu 1*SHA256_DIGEST_ROW_SIZE(STATE),b
+ vmovdqu 2*SHA256_DIGEST_ROW_SIZE(STATE),c
+ vmovdqu 3*SHA256_DIGEST_ROW_SIZE(STATE),d
+ vmovdqu 4*SHA256_DIGEST_ROW_SIZE(STATE),e
+ vmovdqu 5*SHA256_DIGEST_ROW_SIZE(STATE),f
+ vmovdqu 6*SHA256_DIGEST_ROW_SIZE(STATE),g
+ vmovdqu 7*SHA256_DIGEST_ROW_SIZE(STATE),h
+
+ lea K256_8(%rip),TBL
+
+ # load the address of each of the 4 message lanes
+ # getting ready to transpose input onto stack
+ mov _args_data_ptr+0*PTR_SZ(STATE),inp0
+ mov _args_data_ptr+1*PTR_SZ(STATE),inp1
+ mov _args_data_ptr+2*PTR_SZ(STATE),inp2
+ mov _args_data_ptr+3*PTR_SZ(STATE),inp3
+ mov _args_data_ptr+4*PTR_SZ(STATE),inp4
+ mov _args_data_ptr+5*PTR_SZ(STATE),inp5
+ mov _args_data_ptr+6*PTR_SZ(STATE),inp6
+ mov _args_data_ptr+7*PTR_SZ(STATE),inp7
+
+ xor IDX, IDX
+lloop:
+ xor ROUND, ROUND
+
+ # save old digest
+ vmovdqu a, _digest(%rsp)
+ vmovdqu b, _digest+1*SZ8(%rsp)
+ vmovdqu c, _digest+2*SZ8(%rsp)
+ vmovdqu d, _digest+3*SZ8(%rsp)
+ vmovdqu e, _digest+4*SZ8(%rsp)
+ vmovdqu f, _digest+5*SZ8(%rsp)
+ vmovdqu g, _digest+6*SZ8(%rsp)
+ vmovdqu h, _digest+7*SZ8(%rsp)
+ i = 0
+.rep 2
+ VMOVPS i*32(inp0, IDX), TT0
+ VMOVPS i*32(inp1, IDX), TT1
+ VMOVPS i*32(inp2, IDX), TT2
+ VMOVPS i*32(inp3, IDX), TT3
+ VMOVPS i*32(inp4, IDX), TT4
+ VMOVPS i*32(inp5, IDX), TT5
+ VMOVPS i*32(inp6, IDX), TT6
+ VMOVPS i*32(inp7, IDX), TT7
+ vmovdqu g, _ytmp(%rsp)
+ vmovdqu h, _ytmp+1*SZ8(%rsp)
+ TRANSPOSE8 TT0, TT1, TT2, TT3, TT4, TT5, TT6, TT7, TMP0, TMP1
+ vmovdqu PSHUFFLE_BYTE_FLIP_MASK(%rip), TMP1
+ vmovdqu _ytmp(%rsp), g
+ vpshufb TMP1, TT0, TT0
+ vpshufb TMP1, TT1, TT1
+ vpshufb TMP1, TT2, TT2
+ vpshufb TMP1, TT3, TT3
+ vpshufb TMP1, TT4, TT4
+ vpshufb TMP1, TT5, TT5
+ vpshufb TMP1, TT6, TT6
+ vpshufb TMP1, TT7, TT7
+ vmovdqu _ytmp+1*SZ8(%rsp), h
+ vmovdqu TT4, _ytmp(%rsp)
+ vmovdqu TT5, _ytmp+1*SZ8(%rsp)
+ vmovdqu TT6, _ytmp+2*SZ8(%rsp)
+ vmovdqu TT7, _ytmp+3*SZ8(%rsp)
+ ROUND_00_15 TT0,(i*8+0)
+ vmovdqu _ytmp(%rsp), TT0
+ ROUND_00_15 TT1,(i*8+1)
+ vmovdqu _ytmp+1*SZ8(%rsp), TT1
+ ROUND_00_15 TT2,(i*8+2)
+ vmovdqu _ytmp+2*SZ8(%rsp), TT2
+ ROUND_00_15 TT3,(i*8+3)
+ vmovdqu _ytmp+3*SZ8(%rsp), TT3
+ ROUND_00_15 TT0,(i*8+4)
+ ROUND_00_15 TT1,(i*8+5)
+ ROUND_00_15 TT2,(i*8+6)
+ ROUND_00_15 TT3,(i*8+7)
+ i = (i+1)
+.endr
+ add $64, IDX
+ i = (i*8)
+
+ jmp Lrounds_16_xx
+.align 16
+Lrounds_16_xx:
+.rep 16
+ ROUND_16_XX T1, i
+ i = (i+1)
+.endr
+
+ cmp $ROUNDS,ROUND
+ jb Lrounds_16_xx
+
+ # add old digest
+ vpaddd _digest+0*SZ8(%rsp), a, a
+ vpaddd _digest+1*SZ8(%rsp), b, b
+ vpaddd _digest+2*SZ8(%rsp), c, c
+ vpaddd _digest+3*SZ8(%rsp), d, d
+ vpaddd _digest+4*SZ8(%rsp), e, e
+ vpaddd _digest+5*SZ8(%rsp), f, f
+ vpaddd _digest+6*SZ8(%rsp), g, g
+ vpaddd _digest+7*SZ8(%rsp), h, h
+
+ sub $1, INP_SIZE # unit is blocks
+ jne lloop
+
+ # write back to memory (state object) the transposed digest
+ vmovdqu a, 0*SHA256_DIGEST_ROW_SIZE(STATE)
+ vmovdqu b, 1*SHA256_DIGEST_ROW_SIZE(STATE)
+ vmovdqu c, 2*SHA256_DIGEST_ROW_SIZE(STATE)
+ vmovdqu d, 3*SHA256_DIGEST_ROW_SIZE(STATE)
+ vmovdqu e, 4*SHA256_DIGEST_ROW_SIZE(STATE)
+ vmovdqu f, 5*SHA256_DIGEST_ROW_SIZE(STATE)
+ vmovdqu g, 6*SHA256_DIGEST_ROW_SIZE(STATE)
+ vmovdqu h, 7*SHA256_DIGEST_ROW_SIZE(STATE)
+
+ # update input pointers
+ add IDX, inp0
+ mov inp0, _args_data_ptr+0*8(STATE)
+ add IDX, inp1
+ mov inp1, _args_data_ptr+1*8(STATE)
+ add IDX, inp2
+ mov inp2, _args_data_ptr+2*8(STATE)
+ add IDX, inp3
+ mov inp3, _args_data_ptr+3*8(STATE)
+ add IDX, inp4
+ mov inp4, _args_data_ptr+4*8(STATE)
+ add IDX, inp5
+ mov inp5, _args_data_ptr+5*8(STATE)
+ add IDX, inp6
+ mov inp6, _args_data_ptr+6*8(STATE)
+ add IDX, inp7
+ mov inp7, _args_data_ptr+7*8(STATE)
+
+ # Postamble
+ mov %rbp, %rsp
+ pop %rbp
+
+ # restore callee-saved clobbered registers
+ pop %r15
+ pop %r14
+ pop %r13
+ pop %r12
+ pop %rbx
+
+ RET
+SYM_FUNC_END(sha256_x8_avx2)
+
+.section .rodata.K256_8, "a", @progbits
+.align 64
+K256_8:
+ .octa 0x428a2f98428a2f98428a2f98428a2f98
+ .octa 0x428a2f98428a2f98428a2f98428a2f98
+ .octa 0x71374491713744917137449171374491
+ .octa 0x71374491713744917137449171374491
+ .octa 0xb5c0fbcfb5c0fbcfb5c0fbcfb5c0fbcf
+ .octa 0xb5c0fbcfb5c0fbcfb5c0fbcfb5c0fbcf
+ .octa 0xe9b5dba5e9b5dba5e9b5dba5e9b5dba5
+ .octa 0xe9b5dba5e9b5dba5e9b5dba5e9b5dba5
+ .octa 0x3956c25b3956c25b3956c25b3956c25b
+ .octa 0x3956c25b3956c25b3956c25b3956c25b
+ .octa 0x59f111f159f111f159f111f159f111f1
+ .octa 0x59f111f159f111f159f111f159f111f1
+ .octa 0x923f82a4923f82a4923f82a4923f82a4
+ .octa 0x923f82a4923f82a4923f82a4923f82a4
+ .octa 0xab1c5ed5ab1c5ed5ab1c5ed5ab1c5ed5
+ .octa 0xab1c5ed5ab1c5ed5ab1c5ed5ab1c5ed5
+ .octa 0xd807aa98d807aa98d807aa98d807aa98
+ .octa 0xd807aa98d807aa98d807aa98d807aa98
+ .octa 0x12835b0112835b0112835b0112835b01
+ .octa 0x12835b0112835b0112835b0112835b01
+ .octa 0x243185be243185be243185be243185be
+ .octa 0x243185be243185be243185be243185be
+ .octa 0x550c7dc3550c7dc3550c7dc3550c7dc3
+ .octa 0x550c7dc3550c7dc3550c7dc3550c7dc3
+ .octa 0x72be5d7472be5d7472be5d7472be5d74
+ .octa 0x72be5d7472be5d7472be5d7472be5d74
+ .octa 0x80deb1fe80deb1fe80deb1fe80deb1fe
+ .octa 0x80deb1fe80deb1fe80deb1fe80deb1fe
+ .octa 0x9bdc06a79bdc06a79bdc06a79bdc06a7
+ .octa 0x9bdc06a79bdc06a79bdc06a79bdc06a7
+ .octa 0xc19bf174c19bf174c19bf174c19bf174
+ .octa 0xc19bf174c19bf174c19bf174c19bf174
+ .octa 0xe49b69c1e49b69c1e49b69c1e49b69c1
+ .octa 0xe49b69c1e49b69c1e49b69c1e49b69c1
+ .octa 0xefbe4786efbe4786efbe4786efbe4786
+ .octa 0xefbe4786efbe4786efbe4786efbe4786
+ .octa 0x0fc19dc60fc19dc60fc19dc60fc19dc6
+ .octa 0x0fc19dc60fc19dc60fc19dc60fc19dc6
+ .octa 0x240ca1cc240ca1cc240ca1cc240ca1cc
+ .octa 0x240ca1cc240ca1cc240ca1cc240ca1cc
+ .octa 0x2de92c6f2de92c6f2de92c6f2de92c6f
+ .octa 0x2de92c6f2de92c6f2de92c6f2de92c6f
+ .octa 0x4a7484aa4a7484aa4a7484aa4a7484aa
+ .octa 0x4a7484aa4a7484aa4a7484aa4a7484aa
+ .octa 0x5cb0a9dc5cb0a9dc5cb0a9dc5cb0a9dc
+ .octa 0x5cb0a9dc5cb0a9dc5cb0a9dc5cb0a9dc
+ .octa 0x76f988da76f988da76f988da76f988da
+ .octa 0x76f988da76f988da76f988da76f988da
+ .octa 0x983e5152983e5152983e5152983e5152
+ .octa 0x983e5152983e5152983e5152983e5152
+ .octa 0xa831c66da831c66da831c66da831c66d
+ .octa 0xa831c66da831c66da831c66da831c66d
+ .octa 0xb00327c8b00327c8b00327c8b00327c8
+ .octa 0xb00327c8b00327c8b00327c8b00327c8
+ .octa 0xbf597fc7bf597fc7bf597fc7bf597fc7
+ .octa 0xbf597fc7bf597fc7bf597fc7bf597fc7
+ .octa 0xc6e00bf3c6e00bf3c6e00bf3c6e00bf3
+ .octa 0xc6e00bf3c6e00bf3c6e00bf3c6e00bf3
+ .octa 0xd5a79147d5a79147d5a79147d5a79147
+ .octa 0xd5a79147d5a79147d5a79147d5a79147
+ .octa 0x06ca635106ca635106ca635106ca6351
+ .octa 0x06ca635106ca635106ca635106ca6351
+ .octa 0x14292967142929671429296714292967
+ .octa 0x14292967142929671429296714292967
+ .octa 0x27b70a8527b70a8527b70a8527b70a85
+ .octa 0x27b70a8527b70a8527b70a8527b70a85
+ .octa 0x2e1b21382e1b21382e1b21382e1b2138
+ .octa 0x2e1b21382e1b21382e1b21382e1b2138
+ .octa 0x4d2c6dfc4d2c6dfc4d2c6dfc4d2c6dfc
+ .octa 0x4d2c6dfc4d2c6dfc4d2c6dfc4d2c6dfc
+ .octa 0x53380d1353380d1353380d1353380d13
+ .octa 0x53380d1353380d1353380d1353380d13
+ .octa 0x650a7354650a7354650a7354650a7354
+ .octa 0x650a7354650a7354650a7354650a7354
+ .octa 0x766a0abb766a0abb766a0abb766a0abb
+ .octa 0x766a0abb766a0abb766a0abb766a0abb
+ .octa 0x81c2c92e81c2c92e81c2c92e81c2c92e
+ .octa 0x81c2c92e81c2c92e81c2c92e81c2c92e
+ .octa 0x92722c8592722c8592722c8592722c85
+ .octa 0x92722c8592722c8592722c8592722c85
+ .octa 0xa2bfe8a1a2bfe8a1a2bfe8a1a2bfe8a1
+ .octa 0xa2bfe8a1a2bfe8a1a2bfe8a1a2bfe8a1
+ .octa 0xa81a664ba81a664ba81a664ba81a664b
+ .octa 0xa81a664ba81a664ba81a664ba81a664b
+ .octa 0xc24b8b70c24b8b70c24b8b70c24b8b70
+ .octa 0xc24b8b70c24b8b70c24b8b70c24b8b70
+ .octa 0xc76c51a3c76c51a3c76c51a3c76c51a3
+ .octa 0xc76c51a3c76c51a3c76c51a3c76c51a3
+ .octa 0xd192e819d192e819d192e819d192e819
+ .octa 0xd192e819d192e819d192e819d192e819
+ .octa 0xd6990624d6990624d6990624d6990624
+ .octa 0xd6990624d6990624d6990624d6990624
+ .octa 0xf40e3585f40e3585f40e3585f40e3585
+ .octa 0xf40e3585f40e3585f40e3585f40e3585
+ .octa 0x106aa070106aa070106aa070106aa070
+ .octa 0x106aa070106aa070106aa070106aa070
+ .octa 0x19a4c11619a4c11619a4c11619a4c116
+ .octa 0x19a4c11619a4c11619a4c11619a4c116
+ .octa 0x1e376c081e376c081e376c081e376c08
+ .octa 0x1e376c081e376c081e376c081e376c08
+ .octa 0x2748774c2748774c2748774c2748774c
+ .octa 0x2748774c2748774c2748774c2748774c
+ .octa 0x34b0bcb534b0bcb534b0bcb534b0bcb5
+ .octa 0x34b0bcb534b0bcb534b0bcb534b0bcb5
+ .octa 0x391c0cb3391c0cb3391c0cb3391c0cb3
+ .octa 0x391c0cb3391c0cb3391c0cb3391c0cb3
+ .octa 0x4ed8aa4a4ed8aa4a4ed8aa4a4ed8aa4a
+ .octa 0x4ed8aa4a4ed8aa4a4ed8aa4a4ed8aa4a
+ .octa 0x5b9cca4f5b9cca4f5b9cca4f5b9cca4f
+ .octa 0x5b9cca4f5b9cca4f5b9cca4f5b9cca4f
+ .octa 0x682e6ff3682e6ff3682e6ff3682e6ff3
+ .octa 0x682e6ff3682e6ff3682e6ff3682e6ff3
+ .octa 0x748f82ee748f82ee748f82ee748f82ee
+ .octa 0x748f82ee748f82ee748f82ee748f82ee
+ .octa 0x78a5636f78a5636f78a5636f78a5636f
+ .octa 0x78a5636f78a5636f78a5636f78a5636f
+ .octa 0x84c8781484c8781484c8781484c87814
+ .octa 0x84c8781484c8781484c8781484c87814
+ .octa 0x8cc702088cc702088cc702088cc70208
+ .octa 0x8cc702088cc702088cc702088cc70208
+ .octa 0x90befffa90befffa90befffa90befffa
+ .octa 0x90befffa90befffa90befffa90befffa
+ .octa 0xa4506ceba4506ceba4506ceba4506ceb
+ .octa 0xa4506ceba4506ceba4506ceba4506ceb
+ .octa 0xbef9a3f7bef9a3f7bef9a3f7bef9a3f7
+ .octa 0xbef9a3f7bef9a3f7bef9a3f7bef9a3f7
+ .octa 0xc67178f2c67178f2c67178f2c67178f2
+ .octa 0xc67178f2c67178f2c67178f2c67178f2
+
+.section .rodata.cst32.PSHUFFLE_BYTE_FLIP_MASK, "aM", @progbits, 32
+.align 32
+PSHUFFLE_BYTE_FLIP_MASK:
+.octa 0x0c0d0e0f08090a0b0405060700010203
+.octa 0x0c0d0e0f08090a0b0405060700010203
+
+.section .rodata.cst256.K256, "aM", @progbits, 256
+.align 64
+.global K256
+K256:
+ .int 0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5
+ .int 0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5
+ .int 0xd807aa98,0x12835b01,0x243185be,0x550c7dc3
+ .int 0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174
+ .int 0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc
+ .int 0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da
+ .int 0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7
+ .int 0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967
+ .int 0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13
+ .int 0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85
+ .int 0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3
+ .int 0xd192e819,0xd6990624,0xf40e3585,0x106aa070
+ .int 0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5
+ .int 0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3
+ .int 0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208
+ .int 0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
--
2.39.5
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [v2 PATCH 09/11] crypto: hash - Add sync hash interface
2025-02-16 3:07 [v2 PATCH 00/11] Multibuffer hashing take two Herbert Xu
` (7 preceding siblings ...)
2025-02-16 3:07 ` [v2 PATCH 08/11] crypto: x86/sha2 - Restore multibuffer AVX2 support Herbert Xu
@ 2025-02-16 3:07 ` Herbert Xu
2025-02-16 10:51 ` kernel test robot
2025-02-16 11:42 ` kernel test robot
2025-02-16 3:07 ` [v2 PATCH 10/11] fsverity: Use sync hash instead of shash Herbert Xu
` (3 subsequent siblings)
12 siblings, 2 replies; 42+ messages in thread
From: Herbert Xu @ 2025-02-16 3:07 UTC (permalink / raw)
To: Linux Crypto Mailing List
Cc: Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
Introduce a new sync hash interface based on ahash, similar to
sync skcipher.
It will replace shash for existing users.
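For illustration, a minimal caller of the new interface might look like
this (a sketch only: the algorithm name, data and len are placeholders,
and error handling is elided):

	struct crypto_sync_hash *tfm;
	u8 digest[32];	/* SHA-256 digest size */
	int err;

	tfm = crypto_alloc_sync_hash("sha256", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	/* One-shot digest over a linear buffer, no SG list needed. */
	err = crypto_sync_hash_digest(tfm, data, len, digest);

	crypto_free_sync_hash(tfm);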
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
crypto/ahash.c | 37 ++++++++++++++++
include/crypto/hash.h | 100 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 137 insertions(+)
diff --git a/crypto/ahash.c b/crypto/ahash.c
index 6b19fa6fc628..fafce2e47a78 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -949,6 +949,27 @@ struct crypto_ahash *crypto_alloc_ahash(const char *alg_name, u32 type,
}
EXPORT_SYMBOL_GPL(crypto_alloc_ahash);
+struct crypto_sync_hash *crypto_alloc_sync_hash(const char *alg_name,
+ u32 type, u32 mask)
+{
+ struct crypto_ahash *tfm;
+
+ /* Only sync algorithms allowed. */
+ mask |= CRYPTO_ALG_ASYNC;
+ type &= ~CRYPTO_ALG_ASYNC;
+
+ tfm = crypto_alloc_ahash(alg_name, type, mask);
+
+ if (!IS_ERR(tfm) && WARN_ON(crypto_ahash_reqsize(tfm) >
+ MAX_SYNC_HASH_REQSIZE)) {
+ crypto_free_ahash(tfm);
+ return ERR_PTR(-EINVAL);
+ }
+
+ return container_of(tfm, struct crypto_sync_hash, base);
+}
+EXPORT_SYMBOL_GPL(crypto_alloc_sync_hash);
+
int crypto_has_ahash(const char *alg_name, u32 type, u32 mask)
{
return crypto_type_has_alg(alg_name, &crypto_ahash_type, type, mask);
@@ -1123,5 +1144,21 @@ void ahash_request_free(struct ahash_request *req)
}
EXPORT_SYMBOL_GPL(ahash_request_free);
+int crypto_sync_hash_digest(struct crypto_sync_hash *tfm, const u8 *data,
+ unsigned int len, u8 *out)
+{
+ SYNC_HASH_REQUEST_ON_STACK(req, tfm);
+ int err;
+
+ ahash_request_set_callback(req, 0, NULL, NULL);
+ ahash_request_set_virt(req, data, out, len);
+ err = crypto_ahash_digest(req);
+
+ ahash_request_zero(req);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(crypto_sync_hash_digest);
+
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Asynchronous cryptographic hash type");
diff --git a/include/crypto/hash.h b/include/crypto/hash.h
index 2aa83ee0ec98..f6e0c44331a3 100644
--- a/include/crypto/hash.h
+++ b/include/crypto/hash.h
@@ -8,6 +8,7 @@
#ifndef _CRYPTO_HASH_H
#define _CRYPTO_HASH_H
+#include <linux/align.h>
#include <linux/atomic.h>
#include <linux/crypto.h>
#include <linux/string.h>
@@ -162,6 +163,8 @@ struct shash_desc {
void *__ctx[] __aligned(ARCH_SLAB_MINALIGN);
};
+struct sync_hash_requests;
+
#define HASH_MAX_DIGESTSIZE 64
/*
@@ -169,12 +172,30 @@ struct shash_desc {
* containing a 'struct sha3_state'.
*/
#define HASH_MAX_DESCSIZE (sizeof(struct shash_desc) + 360)
+#define MAX_SYNC_HASH_REQSIZE HASH_MAX_DESCSIZE
#define SHASH_DESC_ON_STACK(shash, ctx) \
char __##shash##_desc[sizeof(struct shash_desc) + HASH_MAX_DESCSIZE] \
__aligned(__alignof__(struct shash_desc)); \
struct shash_desc *shash = (struct shash_desc *)__##shash##_desc
+#define SYNC_HASH_REQUEST_ON_STACK(name, _tfm) \
+ char __##name##_req[sizeof(struct ahash_request) + \
+ MAX_SYNC_HASH_REQSIZE \
+ ] CRYPTO_MINALIGN_ATTR; \
+ struct ahash_request *name = \
+ (((struct ahash_request *)__##name##_req)->base.tfm = \
+ crypto_sync_hash_tfm((_tfm)), \
+ (void *)__##name##_req)
+
+#define SYNC_HASH_REQUESTS_ON_STACK(name, _n, _tfm) \
+ char __##name##_req[(_n) * ALIGN(sizeof(struct ahash_request) + \
+ MAX_SYNC_HASH_REQSIZE, \
+ CRYPTO_MINALIGN) \
+ ] CRYPTO_MINALIGN_ATTR; \
+ struct sync_hash_requests *name = sync_hash_requests_on_stack_init( \
+ __##name##_req, sizeof(__##name##_req), (_tfm))
+
/**
* struct shash_alg - synchronous message digest definition
* @init: see struct ahash_alg
@@ -241,6 +262,10 @@ struct crypto_shash {
struct crypto_tfm base;
};
+struct crypto_sync_hash {
+ struct crypto_ahash base;
+};
+
/**
* DOC: Asynchronous Message Digest API
*
@@ -273,6 +298,9 @@ static inline struct crypto_ahash *__crypto_ahash_cast(struct crypto_tfm *tfm)
struct crypto_ahash *crypto_alloc_ahash(const char *alg_name, u32 type,
u32 mask);
+struct crypto_sync_hash *crypto_alloc_sync_hash(const char *alg_name,
+ u32 type, u32 mask);
+
struct crypto_ahash *crypto_clone_ahash(struct crypto_ahash *tfm);
static inline struct crypto_tfm *crypto_ahash_tfm(struct crypto_ahash *tfm)
@@ -280,6 +308,12 @@ static inline struct crypto_tfm *crypto_ahash_tfm(struct crypto_ahash *tfm)
return &tfm->base;
}
+static inline struct crypto_tfm *crypto_sync_hash_tfm(
+ struct crypto_sync_hash *tfm)
+{
+ return crypto_ahash_tfm(&tfm->base);
+}
+
/**
* crypto_free_ahash() - zeroize and free the ahash handle
* @tfm: cipher handle to be freed
@@ -291,6 +325,11 @@ static inline void crypto_free_ahash(struct crypto_ahash *tfm)
crypto_destroy_tfm(tfm, crypto_ahash_tfm(tfm));
}
+static inline void crypto_free_sync_hash(struct crypto_sync_hash *tfm)
+{
+ crypto_free_ahash(&tfm->base);
+}
+
/**
* crypto_has_ahash() - Search for the availability of an ahash.
* @alg_name: is the cra_name / name or cra_driver_name / driver name of the
@@ -313,6 +352,12 @@ static inline const char *crypto_ahash_driver_name(struct crypto_ahash *tfm)
return crypto_tfm_alg_driver_name(crypto_ahash_tfm(tfm));
}
+static inline const char *crypto_sync_hash_driver_name(
+ struct crypto_sync_hash *tfm)
+{
+ return crypto_ahash_driver_name(&tfm->base);
+}
+
/**
* crypto_ahash_blocksize() - obtain block size for cipher
* @tfm: cipher handle
@@ -327,6 +372,12 @@ static inline unsigned int crypto_ahash_blocksize(struct crypto_ahash *tfm)
return crypto_tfm_alg_blocksize(crypto_ahash_tfm(tfm));
}
+static inline unsigned int crypto_sync_hash_blocksize(
+ struct crypto_sync_hash *tfm)
+{
+ return crypto_ahash_blocksize(&tfm->base);
+}
+
static inline struct hash_alg_common *__crypto_hash_alg_common(
struct crypto_alg *alg)
{
@@ -354,6 +405,12 @@ static inline unsigned int crypto_ahash_digestsize(struct crypto_ahash *tfm)
return crypto_hash_alg_common(tfm)->digestsize;
}
+static inline unsigned int crypto_sync_hash_digestsize(
+ struct crypto_sync_hash *tfm)
+{
+ return crypto_ahash_digestsize(&tfm->base);
+}
+
/**
* crypto_ahash_statesize() - obtain size of the ahash state
* @tfm: cipher handle
@@ -369,6 +426,12 @@ static inline unsigned int crypto_ahash_statesize(struct crypto_ahash *tfm)
return tfm->statesize;
}
+static inline unsigned int crypto_sync_hash_statesize(
+ struct crypto_sync_hash *tfm)
+{
+ return crypto_ahash_statesize(&tfm->base);
+}
+
static inline u32 crypto_ahash_get_flags(struct crypto_ahash *tfm)
{
return crypto_tfm_get_flags(crypto_ahash_tfm(tfm));
@@ -877,6 +940,9 @@ int crypto_shash_digest(struct shash_desc *desc, const u8 *data,
int crypto_shash_tfm_digest(struct crypto_shash *tfm, const u8 *data,
unsigned int len, u8 *out);
+int crypto_sync_hash_digest(struct crypto_sync_hash *tfm, const u8 *data,
+ unsigned int len, u8 *out);
+
/**
* crypto_shash_export() - extract operational state for message digest
* @desc: reference to the operational state handle whose state is exported
@@ -982,6 +1048,13 @@ static inline void shash_desc_zero(struct shash_desc *desc)
sizeof(*desc) + crypto_shash_descsize(desc->tfm));
}
+static inline void ahash_request_zero(struct ahash_request *req)
+{
+ struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+
+ memzero_explicit(req, sizeof(*req) + crypto_ahash_reqsize(tfm));
+}
+
static inline int ahash_request_err(struct ahash_request *req)
{
return req->base.err;
@@ -992,4 +1065,31 @@ static inline bool ahash_is_async(struct crypto_ahash *tfm)
return crypto_tfm_is_async(&tfm->base);
}
+static inline struct ahash_request *sync_hash_requests(
+ struct sync_hash_requests *reqs, int i)
+{
+ unsigned unit = sizeof(struct ahash_request) + MAX_SYNC_HASH_REQSIZE;
+ unsigned alunit = ALIGN(unit, CRYPTO_MINALIGN);
+
+ return (void *)((char *)reqs + i * alunit);
+}
+
+static inline struct sync_hash_requests *sync_hash_requests_on_stack_init(
+ char *buf, unsigned len, struct crypto_sync_hash *tfm)
+{
+ unsigned unit = sizeof(struct ahash_request) + MAX_SYNC_HASH_REQSIZE;
+ unsigned alunit = ALIGN(unit, CRYPTO_MINALIGN);
+ struct sync_hash_requests *reqs = (void *)buf;
+ int n = len / alunit;
+ int i;
+
+ for (i = 0; i < n; i++) {
+ struct ahash_request *req = sync_hash_requests(reqs, i);
+
+ req->base.tfm = crypto_sync_hash_tfm(tfm);
+ }
+
+ return reqs;
+}
+
#endif /* _CRYPTO_HASH_H */
--
2.39.5
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [v2 PATCH 10/11] fsverity: Use sync hash instead of shash
2025-02-16 3:07 [v2 PATCH 00/11] Multibuffer hashing take two Herbert Xu
` (8 preceding siblings ...)
2025-02-16 3:07 ` [v2 PATCH 09/11] crypto: hash - Add sync hash interface Herbert Xu
@ 2025-02-16 3:07 ` Herbert Xu
2025-02-16 3:07 ` [v2 PATCH 11/11] fsverity: improve performance by using multibuffer hashing Eric Biggers
` (2 subsequent siblings)
12 siblings, 0 replies; 42+ messages in thread
From: Herbert Xu @ 2025-02-16 3:07 UTC (permalink / raw)
To: Linux Crypto Mailing List
Cc: Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
Use the sync hash interface instead of shash.
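The conversion pattern, sketched here for the one-shot digest case (an
illustration, not part of the patch; variable names follow the diff
below):

	/* Before, with shash: */
	SHASH_DESC_ON_STACK(desc, alg->tfm);
	desc->tfm = alg->tfm;
	err = crypto_shash_digest(desc, data, size, out);

	/* After, with sync hash: */
	SYNC_HASH_REQUEST_ON_STACK(req, alg->tfm);
	ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
	ahash_request_set_virt(req, data, out, size);
	err = crypto_ahash_digest(req);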
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
fs/verity/fsverity_private.h | 2 +-
fs/verity/hash_algs.c | 41 +++++++++++++++++++-----------------
2 files changed, 23 insertions(+), 20 deletions(-)
diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index b3506f56e180..aecc221daf8b 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -20,7 +20,7 @@
/* A hash algorithm supported by fs-verity */
struct fsverity_hash_alg {
- struct crypto_shash *tfm; /* hash tfm, allocated on demand */
+ struct crypto_sync_hash *tfm; /* hash tfm, allocated on demand */
const char *name; /* crypto API name, e.g. sha256 */
unsigned int digest_size; /* digest size in bytes, e.g. 32 for SHA-256 */
unsigned int block_size; /* block size in bytes, e.g. 64 for SHA-256 */
diff --git a/fs/verity/hash_algs.c b/fs/verity/hash_algs.c
index 6b08b1d9a7d7..e088bcfe5ed1 100644
--- a/fs/verity/hash_algs.c
+++ b/fs/verity/hash_algs.c
@@ -43,7 +43,7 @@ const struct fsverity_hash_alg *fsverity_get_hash_alg(const struct inode *inode,
unsigned int num)
{
struct fsverity_hash_alg *alg;
- struct crypto_shash *tfm;
+ struct crypto_sync_hash *tfm;
int err;
if (num >= ARRAY_SIZE(fsverity_hash_algs) ||
@@ -62,7 +62,7 @@ const struct fsverity_hash_alg *fsverity_get_hash_alg(const struct inode *inode,
if (alg->tfm != NULL)
goto out_unlock;
- tfm = crypto_alloc_shash(alg->name, 0, 0);
+ tfm = crypto_alloc_sync_hash(alg->name, 0, 0);
if (IS_ERR(tfm)) {
if (PTR_ERR(tfm) == -ENOENT) {
fsverity_warn(inode,
@@ -79,20 +79,20 @@ const struct fsverity_hash_alg *fsverity_get_hash_alg(const struct inode *inode,
}
err = -EINVAL;
- if (WARN_ON_ONCE(alg->digest_size != crypto_shash_digestsize(tfm)))
+ if (WARN_ON_ONCE(alg->digest_size != crypto_sync_hash_digestsize(tfm)))
goto err_free_tfm;
- if (WARN_ON_ONCE(alg->block_size != crypto_shash_blocksize(tfm)))
+ if (WARN_ON_ONCE(alg->block_size != crypto_sync_hash_blocksize(tfm)))
goto err_free_tfm;
pr_info("%s using implementation \"%s\"\n",
- alg->name, crypto_shash_driver_name(tfm));
+ alg->name, crypto_sync_hash_driver_name(tfm));
/* pairs with smp_load_acquire() above */
smp_store_release(&alg->tfm, tfm);
goto out_unlock;
err_free_tfm:
- crypto_free_shash(tfm);
+ crypto_free_sync_hash(tfm);
alg = ERR_PTR(err);
out_unlock:
mutex_unlock(&fsverity_hash_alg_init_mutex);
@@ -112,17 +112,15 @@ const u8 *fsverity_prepare_hash_state(const struct fsverity_hash_alg *alg,
const u8 *salt, size_t salt_size)
{
u8 *hashstate = NULL;
- SHASH_DESC_ON_STACK(desc, alg->tfm);
+ SYNC_HASH_REQUEST_ON_STACK(req, alg->tfm);
u8 *padded_salt = NULL;
size_t padded_salt_size;
int err;
- desc->tfm = alg->tfm;
-
if (salt_size == 0)
return NULL;
- hashstate = kmalloc(crypto_shash_statesize(alg->tfm), GFP_KERNEL);
+ hashstate = kmalloc(crypto_sync_hash_statesize(alg->tfm), GFP_KERNEL);
if (!hashstate)
return ERR_PTR(-ENOMEM);
@@ -140,15 +138,19 @@ const u8 *fsverity_prepare_hash_state(const struct fsverity_hash_alg *alg,
goto err_free;
}
memcpy(padded_salt, salt, salt_size);
- err = crypto_shash_init(desc);
+
+ ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
+
+ err = crypto_ahash_init(req);
if (err)
goto err_free;
- err = crypto_shash_update(desc, padded_salt, padded_salt_size);
+ ahash_request_set_virt(req, padded_salt, NULL, padded_salt_size);
+ err = crypto_ahash_update(req);
if (err)
goto err_free;
- err = crypto_shash_export(desc, hashstate);
+ err = crypto_ahash_export(req, hashstate);
if (err)
goto err_free;
out:
@@ -176,21 +178,22 @@ const u8 *fsverity_prepare_hash_state(const struct fsverity_hash_alg *alg,
int fsverity_hash_block(const struct merkle_tree_params *params,
const struct inode *inode, const void *data, u8 *out)
{
- SHASH_DESC_ON_STACK(desc, params->hash_alg->tfm);
+ SYNC_HASH_REQUEST_ON_STACK(req, params->hash_alg->tfm);
int err;
- desc->tfm = params->hash_alg->tfm;
+ ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
+ ahash_request_set_virt(req, data, out, params->block_size);
if (params->hashstate) {
- err = crypto_shash_import(desc, params->hashstate);
+ err = crypto_ahash_import(req, params->hashstate);
if (err) {
fsverity_err(inode,
"Error %d importing hash state", err);
return err;
}
- err = crypto_shash_finup(desc, data, params->block_size, out);
+ err = crypto_ahash_finup(req);
} else {
- err = crypto_shash_digest(desc, data, params->block_size, out);
+ err = crypto_ahash_digest(req);
}
if (err)
fsverity_err(inode, "Error %d computing block hash", err);
@@ -209,7 +212,7 @@ int fsverity_hash_block(const struct merkle_tree_params *params,
int fsverity_hash_buffer(const struct fsverity_hash_alg *alg,
const void *data, size_t size, u8 *out)
{
- return crypto_shash_tfm_digest(alg->tfm, data, size, out);
+ return crypto_sync_hash_digest(alg->tfm, data, size, out);
}
void __init fsverity_check_hash_algs(void)
--
2.39.5
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [v2 PATCH 11/11] fsverity: improve performance by using multibuffer hashing
2025-02-16 3:07 [v2 PATCH 00/11] Multibuffer hashing take two Herbert Xu
` (9 preceding siblings ...)
2025-02-16 3:07 ` [v2 PATCH 10/11] fsverity: Use sync hash instead of shash Herbert Xu
@ 2025-02-16 3:07 ` Eric Biggers
2025-02-16 3:10 ` Herbert Xu
2025-02-16 3:38 ` [v2 PATCH 00/11] Multibuffer hashing take two Eric Biggers
12 siblings, 0 replies; 42+ messages in thread
From: Eric Biggers @ 2025-02-16 3:07 UTC (permalink / raw)
To: Linux Crypto Mailing List
Cc: Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
When supported by the hash algorithm, use request chaining to
interleave the hashing of pairs of data blocks. On some CPUs this
nearly doubles hashing performance. The increase in overall throughput
of cold-cache fsverity reads that I'm seeing on arm64 and x86_64 is
roughly 35% (though this metric is hard to measure as it jumps around a
lot).
For now this is only done on the verification path, and only for data
blocks, not Merkle tree blocks. We could use multibuffer hashing on
Merkle tree blocks too, but that is less important as there aren't as
many Merkle tree blocks as data blocks, and that would require some
additional code restructuring. We could also use multibuffer hashing
to accelerate building the Merkle tree, but verification performance
is more important.
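For reference, the core pattern is to chain a second request behind the
first and hash both with a single digest call (a sketch; tfm, data0,
data1, out0, out1 and block_size are placeholders):

	SYNC_HASH_REQUESTS_ON_STACK(reqs, 2, tfm);
	struct ahash_request *req0 = sync_hash_requests(reqs, 0);
	struct ahash_request *req1 = sync_hash_requests(reqs, 1);
	int err;

	ahash_request_set_callback(req0, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
	ahash_request_set_virt(req0, data0, out0, block_size);

	ahash_request_set_callback(req1, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
	ahash_request_set_virt(req1, data1, out1, block_size);
	ahash_request_chain(req1, req0);	/* queue req1 behind req0 */

	/* Hashes both blocks; the driver may interleave them. */
	err = crypto_ahash_digest(req0);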
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
fs/verity/fsverity_private.h | 2 +
fs/verity/verify.c | 179 +++++++++++++++++++++++++++++------
2 files changed, 151 insertions(+), 30 deletions(-)
diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index aecc221daf8b..3d03fb1e41f0 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -152,6 +152,8 @@ static inline void fsverity_init_signature(void)
/* verify.c */
+#define FS_VERITY_MAX_PENDING_DATA_BLOCKS 2
+
void __init fsverity_init_workqueue(void);
#endif /* _FSVERITY_PRIVATE_H */
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 4fcad0825a12..15bf0887a827 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -10,6 +10,27 @@
#include <crypto/hash.h>
#include <linux/bio.h>
+struct fsverity_pending_block {
+ const void *data;
+ u64 pos;
+ u8 real_hash[FS_VERITY_MAX_DIGEST_SIZE];
+};
+
+struct fsverity_verification_context {
+ struct inode *inode;
+ struct fsverity_info *vi;
+ unsigned long max_ra_pages;
+
+ /*
+ * This is the queue of data blocks that are pending verification. We
+ * allow multiple blocks to be queued up in order to support multibuffer
+ * hashing, i.e. interleaving the hashing of multiple messages. On many
+ * CPUs this improves performance significantly.
+ */
+ int num_pending;
+ struct fsverity_pending_block pending_blocks[FS_VERITY_MAX_PENDING_DATA_BLOCKS];
+};
+
static struct workqueue_struct *fsverity_read_workqueue;
/*
@@ -79,7 +100,7 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
}
/*
- * Verify a single data block against the file's Merkle tree.
+ * Verify the hash of a single data block against the file's Merkle tree.
*
* In principle, we need to verify the entire path to the root node. However,
* for efficiency the filesystem may cache the hash blocks. Therefore we need
@@ -90,8 +111,10 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
*/
static bool
verify_data_block(struct inode *inode, struct fsverity_info *vi,
- const void *data, u64 data_pos, unsigned long max_ra_pages)
+ const struct fsverity_pending_block *dblock,
+ unsigned long max_ra_pages)
{
+ const u64 data_pos = dblock->pos;
const struct merkle_tree_params *params = &vi->tree_params;
const unsigned int hsize = params->digest_size;
int level;
@@ -115,8 +138,12 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
*/
u64 hidx = data_pos >> params->log_blocksize;
- /* Up to 1 + FS_VERITY_MAX_LEVELS pages may be mapped at once */
- BUILD_BUG_ON(1 + FS_VERITY_MAX_LEVELS > KM_MAX_IDX);
+ /*
+ * Up to FS_VERITY_MAX_PENDING_DATA_BLOCKS + FS_VERITY_MAX_LEVELS pages
+ * may be mapped at once.
+ */
+ BUILD_BUG_ON(FS_VERITY_MAX_PENDING_DATA_BLOCKS +
+ FS_VERITY_MAX_LEVELS > KM_MAX_IDX);
if (unlikely(data_pos >= inode->i_size)) {
/*
@@ -127,7 +154,7 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
* any part past EOF should be all zeroes. Therefore, we need
* to verify that any data blocks fully past EOF are all zeroes.
*/
- if (memchr_inv(data, 0, params->block_size)) {
+ if (memchr_inv(dblock->data, 0, params->block_size)) {
fsverity_err(inode,
"FILE CORRUPTED! Data past EOF is not zeroed");
return false;
@@ -221,10 +248,8 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
put_page(hpage);
}
- /* Finally, verify the data block. */
- if (fsverity_hash_block(params, inode, data, real_hash) != 0)
- goto error;
- if (memcmp(want_hash, real_hash, hsize) != 0)
+ /* Finally, verify the hash of the data block. */
+ if (memcmp(want_hash, dblock->real_hash, hsize) != 0)
goto corrupted;
return true;
@@ -233,7 +258,8 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
"FILE CORRUPTED! pos=%llu, level=%d, want_hash=%s:%*phN, real_hash=%s:%*phN",
data_pos, level - 1,
params->hash_alg->name, hsize, want_hash,
- params->hash_alg->name, hsize, real_hash);
+ params->hash_alg->name, hsize,
+ level == 0 ? dblock->real_hash : real_hash);
error:
for (; level > 0; level--) {
kunmap_local(hblocks[level - 1].addr);
@@ -242,13 +268,91 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
return false;
}
-static bool
-verify_data_blocks(struct folio *data_folio, size_t len, size_t offset,
- unsigned long max_ra_pages)
+static void
+fsverity_init_verification_context(struct fsverity_verification_context *ctx,
+ struct inode *inode,
+ unsigned long max_ra_pages)
{
- struct inode *inode = data_folio->mapping->host;
- struct fsverity_info *vi = inode->i_verity_info;
- const unsigned int block_size = vi->tree_params.block_size;
+ ctx->inode = inode;
+ ctx->vi = inode->i_verity_info;
+ ctx->max_ra_pages = max_ra_pages;
+ ctx->num_pending = 0;
+}
+
+static void
+fsverity_clear_pending_blocks(struct fsverity_verification_context *ctx)
+{
+ int i;
+
+ for (i = ctx->num_pending - 1; i >= 0; i--) {
+ kunmap_local(ctx->pending_blocks[i].data);
+ ctx->pending_blocks[i].data = NULL;
+ }
+ ctx->num_pending = 0;
+}
+
+static bool
+fsverity_verify_pending_blocks(struct fsverity_verification_context *ctx)
+{
+ struct inode *inode = ctx->inode;
+ struct fsverity_info *vi = ctx->vi;
+ const struct merkle_tree_params *params = &vi->tree_params;
+ SYNC_HASH_REQUESTS_ON_STACK(reqs, FS_VERITY_MAX_PENDING_DATA_BLOCKS, params->hash_alg->tfm);
+ struct ahash_request *req;
+ int i;
+ int err;
+
+ if (ctx->num_pending == 0)
+ return true;
+
+ req = sync_hash_requests(reqs, 0);
+ for (i = 0; i < ctx->num_pending; i++) {
+ struct ahash_request *reqi = sync_hash_requests(reqs, i);
+
+ ahash_request_set_callback(reqi, CRYPTO_TFM_REQ_MAY_SLEEP,
+ NULL, NULL);
+ ahash_request_set_virt(reqi, ctx->pending_blocks[i].data,
+ ctx->pending_blocks[i].real_hash,
+ params->block_size);
+ if (i)
+ ahash_request_chain(reqi, req);
+ if (!params->hashstate)
+ continue;
+
+ err = crypto_ahash_import(reqi, params->hashstate);
+ if (err) {
+ fsverity_err(inode, "Error %d importing hash state", err);
+ return false;
+ }
+ }
+
+ if (params->hashstate)
+ err = crypto_ahash_finup(req);
+ else
+ err = crypto_ahash_digest(req);
+ if (err) {
+ fsverity_err(inode, "Error %d computing block hashes", err);
+ return false;
+ }
+
+ for (i = 0; i < ctx->num_pending; i++) {
+ if (!verify_data_block(inode, vi, &ctx->pending_blocks[i],
+ ctx->max_ra_pages))
+ return false;
+ }
+
+ fsverity_clear_pending_blocks(ctx);
+ return true;
+}
+
+static bool
+fsverity_add_data_blocks(struct fsverity_verification_context *ctx,
+ struct folio *data_folio, size_t len, size_t offset)
+{
+ struct fsverity_info *vi = ctx->vi;
+ const struct merkle_tree_params *params = &vi->tree_params;
+ const unsigned int block_size = params->block_size;
+ const int mb_max_msgs = FS_VERITY_MAX_PENDING_DATA_BLOCKS;
u64 pos = (u64)data_folio->index << PAGE_SHIFT;
if (WARN_ON_ONCE(len <= 0 || !IS_ALIGNED(len | offset, block_size)))
@@ -257,14 +361,11 @@ verify_data_blocks(struct folio *data_folio, size_t len, size_t offset,
folio_test_uptodate(data_folio)))
return false;
do {
- void *data;
- bool valid;
-
- data = kmap_local_folio(data_folio, offset);
- valid = verify_data_block(inode, vi, data, pos + offset,
- max_ra_pages);
- kunmap_local(data);
- if (!valid)
+ ctx->pending_blocks[ctx->num_pending].data =
+ kmap_local_folio(data_folio, offset);
+ ctx->pending_blocks[ctx->num_pending].pos = pos + offset;
+ if (++ctx->num_pending == mb_max_msgs &&
+ !fsverity_verify_pending_blocks(ctx))
return false;
offset += block_size;
len -= block_size;
@@ -286,7 +387,15 @@ verify_data_blocks(struct folio *data_folio, size_t len, size_t offset,
*/
bool fsverity_verify_blocks(struct folio *folio, size_t len, size_t offset)
{
- return verify_data_blocks(folio, len, offset, 0);
+ struct fsverity_verification_context ctx;
+
+ fsverity_init_verification_context(&ctx, folio->mapping->host, 0);
+
+ if (fsverity_add_data_blocks(&ctx, folio, len, offset) &&
+ fsverity_verify_pending_blocks(&ctx))
+ return true;
+ fsverity_clear_pending_blocks(&ctx);
+ return false;
}
EXPORT_SYMBOL_GPL(fsverity_verify_blocks);
@@ -307,6 +416,8 @@ EXPORT_SYMBOL_GPL(fsverity_verify_blocks);
*/
void fsverity_verify_bio(struct bio *bio)
{
+ struct inode *inode = bio_first_folio_all(bio)->mapping->host;
+ struct fsverity_verification_context ctx;
struct folio_iter fi;
unsigned long max_ra_pages = 0;
@@ -323,13 +434,21 @@ void fsverity_verify_bio(struct bio *bio)
max_ra_pages = bio->bi_iter.bi_size >> (PAGE_SHIFT + 2);
}
+ fsverity_init_verification_context(&ctx, inode, max_ra_pages);
+
bio_for_each_folio_all(fi, bio) {
- if (!verify_data_blocks(fi.folio, fi.length, fi.offset,
- max_ra_pages)) {
- bio->bi_status = BLK_STS_IOERR;
- break;
- }
+ if (!fsverity_add_data_blocks(&ctx, fi.folio, fi.length,
+ fi.offset))
+ goto ioerr;
}
+
+ if (!fsverity_verify_pending_blocks(&ctx))
+ goto ioerr;
+ return;
+
+ioerr:
+ fsverity_clear_pending_blocks(&ctx);
+ bio->bi_status = BLK_STS_IOERR;
}
EXPORT_SYMBOL_GPL(fsverity_verify_bio);
#endif /* CONFIG_BLOCK */
--
2.39.5
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [v2 PATCH 11/11] fsverity: improve performance by using multibuffer hashing
2025-02-16 3:07 [v2 PATCH 00/11] Multibuffer hashing take two Herbert Xu
` (10 preceding siblings ...)
2025-02-16 3:07 ` [v2 PATCH 11/11] fsverity: improve performance by using multibuffer hashing Eric Biggers
@ 2025-02-16 3:10 ` Herbert Xu
2025-02-16 3:38 ` [v2 PATCH 00/11] Multibuffer hashing take two Eric Biggers
12 siblings, 0 replies; 42+ messages in thread
From: Herbert Xu @ 2025-02-16 3:10 UTC (permalink / raw)
To: Linux Crypto Mailing List
Cc: Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
From: Eric Biggers <ebiggers@google.com>
When supported by the hash algorithm, use request chaining to
interleave the hashing of pairs of data blocks. On some CPUs this
nearly doubles hashing performance. The increase in overall throughput
of cold-cache fsverity reads that I'm seeing on arm64 and x86_64 is
roughly 35% (though this metric is hard to measure as it jumps around a
lot).
For now this is only done on the verification path, and only for data
blocks, not Merkle tree blocks. We could use multibuffer hashing on
Merkle tree blocks too, but that is less important as there aren't as
many Merkle tree blocks as data blocks, and that would require some
additional code restructuring. We could also use multibuffer hashing
to accelerate building the Merkle tree, but verification performance
is more important.
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
fs/verity/fsverity_private.h | 2 +
fs/verity/verify.c | 179 +++++++++++++++++++++++++++++------
2 files changed, 151 insertions(+), 30 deletions(-)
diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index aecc221daf8b..3d03fb1e41f0 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -152,6 +152,8 @@ static inline void fsverity_init_signature(void)
/* verify.c */
+#define FS_VERITY_MAX_PENDING_DATA_BLOCKS 2
+
void __init fsverity_init_workqueue(void);
#endif /* _FSVERITY_PRIVATE_H */
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 4fcad0825a12..15bf0887a827 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -10,6 +10,27 @@
#include <crypto/hash.h>
#include <linux/bio.h>
+struct fsverity_pending_block {
+ const void *data;
+ u64 pos;
+ u8 real_hash[FS_VERITY_MAX_DIGEST_SIZE];
+};
+
+struct fsverity_verification_context {
+ struct inode *inode;
+ struct fsverity_info *vi;
+ unsigned long max_ra_pages;
+
+ /*
+ * This is the queue of data blocks that are pending verification. We
+ * allow multiple blocks to be queued up in order to support multibuffer
+ * hashing, i.e. interleaving the hashing of multiple messages. On many
+ * CPUs this improves performance significantly.
+ */
+ int num_pending;
+ struct fsverity_pending_block pending_blocks[FS_VERITY_MAX_PENDING_DATA_BLOCKS];
+};
+
static struct workqueue_struct *fsverity_read_workqueue;
/*
@@ -79,7 +100,7 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
}
/*
- * Verify a single data block against the file's Merkle tree.
+ * Verify the hash of a single data block against the file's Merkle tree.
*
* In principle, we need to verify the entire path to the root node. However,
* for efficiency the filesystem may cache the hash blocks. Therefore we need
@@ -90,8 +111,10 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
*/
static bool
verify_data_block(struct inode *inode, struct fsverity_info *vi,
- const void *data, u64 data_pos, unsigned long max_ra_pages)
+ const struct fsverity_pending_block *dblock,
+ unsigned long max_ra_pages)
{
+ const u64 data_pos = dblock->pos;
const struct merkle_tree_params *params = &vi->tree_params;
const unsigned int hsize = params->digest_size;
int level;
@@ -115,8 +138,12 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
*/
u64 hidx = data_pos >> params->log_blocksize;
- /* Up to 1 + FS_VERITY_MAX_LEVELS pages may be mapped at once */
- BUILD_BUG_ON(1 + FS_VERITY_MAX_LEVELS > KM_MAX_IDX);
+ /*
+ * Up to FS_VERITY_MAX_PENDING_DATA_BLOCKS + FS_VERITY_MAX_LEVELS pages
+ * may be mapped at once.
+ */
+ BUILD_BUG_ON(FS_VERITY_MAX_PENDING_DATA_BLOCKS +
+ FS_VERITY_MAX_LEVELS > KM_MAX_IDX);
if (unlikely(data_pos >= inode->i_size)) {
/*
@@ -127,7 +154,7 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
* any part past EOF should be all zeroes. Therefore, we need
* to verify that any data blocks fully past EOF are all zeroes.
*/
- if (memchr_inv(data, 0, params->block_size)) {
+ if (memchr_inv(dblock->data, 0, params->block_size)) {
fsverity_err(inode,
"FILE CORRUPTED! Data past EOF is not zeroed");
return false;
@@ -221,10 +248,8 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
put_page(hpage);
}
- /* Finally, verify the data block. */
- if (fsverity_hash_block(params, inode, data, real_hash) != 0)
- goto error;
- if (memcmp(want_hash, real_hash, hsize) != 0)
+ /* Finally, verify the hash of the data block. */
+ if (memcmp(want_hash, dblock->real_hash, hsize) != 0)
goto corrupted;
return true;
@@ -233,7 +258,8 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
"FILE CORRUPTED! pos=%llu, level=%d, want_hash=%s:%*phN, real_hash=%s:%*phN",
data_pos, level - 1,
params->hash_alg->name, hsize, want_hash,
- params->hash_alg->name, hsize, real_hash);
+ params->hash_alg->name, hsize,
+ level == 0 ? dblock->real_hash : real_hash);
error:
for (; level > 0; level--) {
kunmap_local(hblocks[level - 1].addr);
@@ -242,13 +268,91 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
return false;
}
-static bool
-verify_data_blocks(struct folio *data_folio, size_t len, size_t offset,
- unsigned long max_ra_pages)
+static void
+fsverity_init_verification_context(struct fsverity_verification_context *ctx,
+ struct inode *inode,
+ unsigned long max_ra_pages)
{
- struct inode *inode = data_folio->mapping->host;
- struct fsverity_info *vi = inode->i_verity_info;
- const unsigned int block_size = vi->tree_params.block_size;
+ ctx->inode = inode;
+ ctx->vi = inode->i_verity_info;
+ ctx->max_ra_pages = max_ra_pages;
+ ctx->num_pending = 0;
+}
+
+static void
+fsverity_clear_pending_blocks(struct fsverity_verification_context *ctx)
+{
+ int i;
+
+ for (i = ctx->num_pending - 1; i >= 0; i--) {
+ kunmap_local(ctx->pending_blocks[i].data);
+ ctx->pending_blocks[i].data = NULL;
+ }
+ ctx->num_pending = 0;
+}
+
+static bool
+fsverity_verify_pending_blocks(struct fsverity_verification_context *ctx)
+{
+ struct inode *inode = ctx->inode;
+ struct fsverity_info *vi = ctx->vi;
+ const struct merkle_tree_params *params = &vi->tree_params;
+ SYNC_HASH_REQUESTS_ON_STACK(reqs, FS_VERITY_MAX_PENDING_DATA_BLOCKS, params->hash_alg->tfm);
+ struct ahash_request *req;
+ int i;
+ int err;
+
+ if (ctx->num_pending == 0)
+ return true;
+
+ req = sync_hash_requests(reqs, 0);
+ for (i = 0; i < ctx->num_pending; i++) {
+ struct ahash_request *reqi = sync_hash_requests(reqs, i);
+
+ ahash_request_set_callback(reqi, CRYPTO_TFM_REQ_MAY_SLEEP,
+ NULL, NULL);
+ ahash_request_set_virt(reqi, ctx->pending_blocks[i].data,
+ ctx->pending_blocks[i].real_hash,
+ params->block_size);
+ if (i)
+ ahash_request_chain(reqi, req);
+ if (!params->hashstate)
+ continue;
+
+ err = crypto_ahash_import(reqi, params->hashstate);
+ if (err) {
+ fsverity_err(inode, "Error %d importing hash state", err);
+ return false;
+ }
+ }
+
+ if (params->hashstate)
+ err = crypto_ahash_finup(req);
+ else
+ err = crypto_ahash_digest(req);
+ if (err) {
+ fsverity_err(inode, "Error %d computing block hashes", err);
+ return false;
+ }
+
+ for (i = 0; i < ctx->num_pending; i++) {
+ if (!verify_data_block(inode, vi, &ctx->pending_blocks[i],
+ ctx->max_ra_pages))
+ return false;
+ }
+
+ fsverity_clear_pending_blocks(ctx);
+ return true;
+}
+
+static bool
+fsverity_add_data_blocks(struct fsverity_verification_context *ctx,
+ struct folio *data_folio, size_t len, size_t offset)
+{
+ struct fsverity_info *vi = ctx->vi;
+ const struct merkle_tree_params *params = &vi->tree_params;
+ const unsigned int block_size = params->block_size;
+ const int mb_max_msgs = FS_VERITY_MAX_PENDING_DATA_BLOCKS;
u64 pos = (u64)data_folio->index << PAGE_SHIFT;
if (WARN_ON_ONCE(len <= 0 || !IS_ALIGNED(len | offset, block_size)))
@@ -257,14 +361,11 @@ verify_data_blocks(struct folio *data_folio, size_t len, size_t offset,
folio_test_uptodate(data_folio)))
return false;
do {
- void *data;
- bool valid;
-
- data = kmap_local_folio(data_folio, offset);
- valid = verify_data_block(inode, vi, data, pos + offset,
- max_ra_pages);
- kunmap_local(data);
- if (!valid)
+ ctx->pending_blocks[ctx->num_pending].data =
+ kmap_local_folio(data_folio, offset);
+ ctx->pending_blocks[ctx->num_pending].pos = pos + offset;
+ if (++ctx->num_pending == mb_max_msgs &&
+ !fsverity_verify_pending_blocks(ctx))
return false;
offset += block_size;
len -= block_size;
@@ -286,7 +387,15 @@ verify_data_blocks(struct folio *data_folio, size_t len, size_t offset,
*/
bool fsverity_verify_blocks(struct folio *folio, size_t len, size_t offset)
{
- return verify_data_blocks(folio, len, offset, 0);
+ struct fsverity_verification_context ctx;
+
+ fsverity_init_verification_context(&ctx, folio->mapping->host, 0);
+
+ if (fsverity_add_data_blocks(&ctx, folio, len, offset) &&
+ fsverity_verify_pending_blocks(&ctx))
+ return true;
+ fsverity_clear_pending_blocks(&ctx);
+ return false;
}
EXPORT_SYMBOL_GPL(fsverity_verify_blocks);
@@ -307,6 +416,8 @@ EXPORT_SYMBOL_GPL(fsverity_verify_blocks);
*/
void fsverity_verify_bio(struct bio *bio)
{
+ struct inode *inode = bio_first_folio_all(bio)->mapping->host;
+ struct fsverity_verification_context ctx;
struct folio_iter fi;
unsigned long max_ra_pages = 0;
@@ -323,13 +434,21 @@ void fsverity_verify_bio(struct bio *bio)
max_ra_pages = bio->bi_iter.bi_size >> (PAGE_SHIFT + 2);
}
+ fsverity_init_verification_context(&ctx, inode, max_ra_pages);
+
bio_for_each_folio_all(fi, bio) {
- if (!verify_data_blocks(fi.folio, fi.length, fi.offset,
- max_ra_pages)) {
- bio->bi_status = BLK_STS_IOERR;
- break;
- }
+ if (!fsverity_add_data_blocks(&ctx, fi.folio, fi.length,
+ fi.offset))
+ goto ioerr;
}
+
+ if (!fsverity_verify_pending_blocks(&ctx))
+ goto ioerr;
+ return;
+
+ioerr:
+ fsverity_clear_pending_blocks(&ctx);
+ bio->bi_status = BLK_STS_IOERR;
}
EXPORT_SYMBOL_GPL(fsverity_verify_bio);
#endif /* CONFIG_BLOCK */
--
2.39.5
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [v2 PATCH 00/11] Multibuffer hashing take two
2025-02-16 3:07 [v2 PATCH 00/11] Multibuffer hashing take two Herbert Xu
` (11 preceding siblings ...)
2025-02-16 3:10 ` Herbert Xu
@ 2025-02-16 3:38 ` Eric Biggers
2025-02-16 11:09 ` Herbert Xu
12 siblings, 1 reply; 42+ messages in thread
From: Eric Biggers @ 2025-02-16 3:38 UTC (permalink / raw)
To: Herbert Xu; +Cc: Linux Crypto Mailing List, Ard Biesheuvel, Megha Dey, Tim Chen
On Sun, Feb 16, 2025 at 11:07:10AM +0800, Herbert Xu wrote:
> This patch-set introduces two additions to the ahash interface.
> First of all request chaining is added so that an arbitrary number
> of requests can be submitted in one go. Incidentally this also
> reduces the cost of indirect calls by amortisation.
>
> It then adds virtual address support to ahash. This allows the
> user to supply a virtual address as the input instead of an SG
> list.
>
> This is assumed to be not DMA-capable so it is always copied
> before it's passed to an existing ahash driver. New drivers can
> elect to take virtual addresses directly. Of course existing shash
> algorithms are able to take virtual addresses without any copying.
>
> The next patch resurrects the old SHA2 AVX2 muiltibuffer code as
> a proof of concept that this API works. The result shows that with
> a full complement of 8 requests, this API is able to achieve parity
> with the more modern but single-threaded SHA-NI code. This passes
> the multibuffer fuzz tests.
>
> Finally introduce a sync hash interface that is similar to the sync
> skcipher interface. This will replace the shash interface for users.
> Use it in fsverity and enable multibuffer hashing.
>
> Eric Biggers (1):
> fsverity: improve performance by using multibuffer hashing
>
> Herbert Xu (10):
> crypto: ahash - Only save callback and data in ahash_save_req
> crypto: x86/ghash - Use proper helpers to clone request
> crypto: hash - Add request chaining API
> crypto: tcrypt - Restore multibuffer ahash tests
> crypto: ahash - Add virtual address support
> crypto: ahash - Set default reqsize from ahash_alg
> crypto: testmgr - Add multibuffer hash testing
> crypto: x86/sha2 - Restore multibuffer AVX2 support
> crypto: hash - Add sync hash interface
> fsverity: Use sync hash instead of shash
>
> arch/x86/crypto/Makefile | 2 +-
> arch/x86/crypto/ghash-clmulni-intel_glue.c | 23 +-
> arch/x86/crypto/sha256_mb_mgr_datastruct.S | 304 +++++++++++
> arch/x86/crypto/sha256_ssse3_glue.c | 540 ++++++++++++++++--
> arch/x86/crypto/sha256_x8_avx2.S | 598 ++++++++++++++++++++
> crypto/ahash.c | 605 ++++++++++++++++++---
> crypto/algapi.c | 2 +-
> crypto/tcrypt.c | 231 ++++++++
> crypto/testmgr.c | 132 ++++-
> fs/verity/fsverity_private.h | 4 +-
> fs/verity/hash_algs.c | 41 +-
> fs/verity/verify.c | 179 +++++-
> include/crypto/algapi.h | 11 +
> include/crypto/hash.h | 172 +++++-
> include/crypto/internal/hash.h | 17 +-
> include/linux/crypto.h | 24 +
> 16 files changed, 2659 insertions(+), 226 deletions(-)
> create mode 100644 arch/x86/crypto/sha256_mb_mgr_datastruct.S
> create mode 100644 arch/x86/crypto/sha256_x8_avx2.S
Nacked-by: Eric Biggers <ebiggers@kernel.org>
This new version hasn't fundamentally changed anything. It's still a much
worse, unnecessarily complex, and incomplete implementation compared to my
patchset, which has been ready to go for nearly a year already. Please refer
to all the previous feedback that I've given.
- Eric
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH 07/11] crypto: testmgr - Add multibuffer hash testing
2025-02-16 3:07 ` [v2 PATCH 07/11] crypto: testmgr - Add multibuffer hash testing Herbert Xu
@ 2025-02-16 9:18 ` kernel test robot
0 siblings, 0 replies; 42+ messages in thread
From: kernel test robot @ 2025-02-16 9:18 UTC (permalink / raw)
To: Herbert Xu, Linux Crypto Mailing List
Cc: oe-kbuild-all, Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
Hi Herbert,
kernel test robot noticed the following build errors:
[auto build test ERROR on herbert-cryptodev-2.6/master]
[also build test ERROR on next-20250214]
[cannot apply to herbert-crypto-2.6/master brauner-vfs/vfs.all linus/master v6.14-rc2]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Herbert-Xu/crypto-ahash-Only-save-callback-and-data-in-ahash_save_req/20250216-150941
base: https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
patch link: https://lore.kernel.org/r/bb74b3f54e24308bef2def8d25ed917d13590921.1739674648.git.herbert%40gondor.apana.org.au
patch subject: [v2 PATCH 07/11] crypto: testmgr - Add multibuffer hash testing
config: arc-randconfig-001-20250216 (https://download.01.org/0day-ci/archive/20250216/202502161754.b1Fy95ZS-lkp@intel.com/config)
compiler: arceb-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250216/202502161754.b1Fy95ZS-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502161754.b1Fy95ZS-lkp@intel.com/
All errors (new ones prefixed by >>):
crypto/testmgr.c: In function '__alg_test_hash':
>> crypto/testmgr.c:2072:69: error: passing argument 3 of 'test_hash_vs_generic_impl' from incompatible pointer type [-Werror=incompatible-pointer-types]
2072 | err = test_hash_vs_generic_impl(generic_driver, maxkeysize, reqs,
| ^~~~
| |
| struct ahash_request **
crypto/testmgr.c:1952:60: note: expected 'struct ahash_request *' but argument is of type 'struct ahash_request **'
1952 | struct ahash_request *req,
| ~~~~~~~~~~~~~~~~~~~~~~^~~
cc1: some warnings being treated as errors
vim +/test_hash_vs_generic_impl +2072 crypto/testmgr.c
1993
1994 static int __alg_test_hash(const struct hash_testvec *vecs,
1995 unsigned int num_vecs, const char *driver,
1996 u32 type, u32 mask,
1997 const char *generic_driver, unsigned int maxkeysize)
1998 {
1999 struct ahash_request *reqs[HASH_TEST_MAX_MB_MSGS] = {};
2000 struct crypto_ahash *atfm = NULL;
2001 struct crypto_shash *stfm = NULL;
2002 struct shash_desc *desc = NULL;
2003 struct test_sglist *tsgl = NULL;
2004 u8 *hashstate = NULL;
2005 unsigned int statesize;
2006 unsigned int i;
2007 int err;
2008
2009 /*
2010 * Always test the ahash API. This works regardless of whether the
2011 * algorithm is implemented as ahash or shash.
2012 */
2013
2014 atfm = crypto_alloc_ahash(driver, type, mask);
2015 if (IS_ERR(atfm)) {
2016 if (PTR_ERR(atfm) == -ENOENT)
2017 return 0;
2018 pr_err("alg: hash: failed to allocate transform for %s: %ld\n",
2019 driver, PTR_ERR(atfm));
2020 return PTR_ERR(atfm);
2021 }
2022 driver = crypto_ahash_driver_name(atfm);
2023
2024 for (i = 0; i < HASH_TEST_MAX_MB_MSGS; i++) {
2025 reqs[i] = ahash_request_alloc(atfm, GFP_KERNEL);
2026 if (!reqs[i]) {
2027 pr_err("alg: hash: failed to allocate request for %s\n",
2028 driver);
2029 err = -ENOMEM;
2030 goto out;
2031 }
2032 }
2033
2034 /*
2035 * If available also test the shash API, to cover corner cases that may
2036 * be missed by testing the ahash API only.
2037 */
2038 err = alloc_shash(driver, type, mask, &stfm, &desc);
2039 if (err)
2040 goto out;
2041
2042 tsgl = kmalloc(sizeof(*tsgl), GFP_KERNEL);
2043 if (!tsgl || init_test_sglist(tsgl) != 0) {
2044 pr_err("alg: hash: failed to allocate test buffers for %s\n",
2045 driver);
2046 kfree(tsgl);
2047 tsgl = NULL;
2048 err = -ENOMEM;
2049 goto out;
2050 }
2051
2052 statesize = crypto_ahash_statesize(atfm);
2053 if (stfm)
2054 statesize = max(statesize, crypto_shash_statesize(stfm));
2055 hashstate = kmalloc(statesize + TESTMGR_POISON_LEN, GFP_KERNEL);
2056 if (!hashstate) {
2057 pr_err("alg: hash: failed to allocate hash state buffer for %s\n",
2058 driver);
2059 err = -ENOMEM;
2060 goto out;
2061 }
2062
2063 for (i = 0; i < num_vecs; i++) {
2064 if (fips_enabled && vecs[i].fips_skip)
2065 continue;
2066
2067 err = test_hash_vec(&vecs[i], i, reqs, desc, tsgl, hashstate);
2068 if (err)
2069 goto out;
2070 cond_resched();
2071 }
> 2072 err = test_hash_vs_generic_impl(generic_driver, maxkeysize, reqs,
2073 desc, tsgl, hashstate);
2074 out:
2075 kfree(hashstate);
2076 if (tsgl) {
2077 destroy_test_sglist(tsgl);
2078 kfree(tsgl);
2079 }
2080 kfree(desc);
2081 crypto_free_shash(stfm);
2082 if (reqs[0]) {
2083 ahash_request_set_callback(reqs[0], 0, NULL, NULL);
2084 for (i = 1; i < HASH_TEST_MAX_MB_MSGS && reqs[i]; i++)
2085 ahash_request_chain(reqs[i], reqs[0]);
2086 ahash_request_free(reqs[0]);
2087 }
2088 crypto_free_ahash(atfm);
2089 return err;
2090 }
2091
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH 09/11] crypto: hash - Add sync hash interface
2025-02-16 3:07 ` [v2 PATCH 09/11] crypto: hash - Add sync hash interface Herbert Xu
@ 2025-02-16 10:51 ` kernel test robot
2025-02-16 11:42 ` kernel test robot
1 sibling, 0 replies; 42+ messages in thread
From: kernel test robot @ 2025-02-16 10:51 UTC (permalink / raw)
To: Herbert Xu, Linux Crypto Mailing List
Cc: oe-kbuild-all, Eric Biggers, Ard Biesheuvel, Megha Dey, Tim Chen
Hi Herbert,
kernel test robot noticed the following build warnings:
[auto build test WARNING on herbert-cryptodev-2.6/master]
[also build test WARNING on next-20250214]
[cannot apply to herbert-crypto-2.6/master brauner-vfs/vfs.all linus/master v6.14-rc2]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Herbert-Xu/crypto-ahash-Only-save-callback-and-data-in-ahash_save_req/20250216-150941
base: https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
patch link: https://lore.kernel.org/r/d6e10dbf172f0b7c791f5406d55e8f1c74492d57.1739674648.git.herbert%40gondor.apana.org.au
patch subject: [v2 PATCH 09/11] crypto: hash - Add sync hash interface
config: um-randconfig-002-20250216 (https://download.01.org/0day-ci/archive/20250216/202502161850.W7NEHTk3-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250216/202502161850.W7NEHTk3-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502161850.W7NEHTk3-lkp@intel.com/
All warnings (new ones prefixed by >>):
In file included from include/crypto/internal/hash.h:12,
from crypto/hash.h:10,
from crypto/ahash.c:26:
In function 'ahash_request_set_callback',
inlined from 'crypto_sync_hash_digest' at crypto/ahash.c:1153:2:
>> include/crypto/hash.h:690:18: warning: '*(struct ahash_request *)(&__req_req[0]).base.flags' is used uninitialized [-Wuninitialized]
690 | req->base.flags &= keep;
| ~~~~~~~~~^~~~~~
crypto/ahash.c: In function 'crypto_sync_hash_digest':
include/crypto/hash.h:183:14: note: '__req_req' declared here
183 | char __##name##_req[sizeof(struct ahash_request) + \
| ^~
crypto/ahash.c:1150:9: note: in expansion of macro 'SYNC_HASH_REQUEST_ON_STACK'
1150 | SYNC_HASH_REQUEST_ON_STACK(req, tfm);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
vim +690 include/crypto/hash.h
18e33e6d5cc049 Herbert Xu 2008-07-10 654
90240ffb127729 Stephan Mueller 2014-11-12 655 /**
90240ffb127729 Stephan Mueller 2014-11-12 656 * ahash_request_set_callback() - set asynchronous callback function
90240ffb127729 Stephan Mueller 2014-11-12 657 * @req: request handle
90240ffb127729 Stephan Mueller 2014-11-12 658 * @flags: specify zero or an ORing of the flags
90240ffb127729 Stephan Mueller 2014-11-12 659 * CRYPTO_TFM_REQ_MAY_BACKLOG the request queue may back log and
90240ffb127729 Stephan Mueller 2014-11-12 660 * increase the wait queue beyond the initial maximum size;
90240ffb127729 Stephan Mueller 2014-11-12 661 * CRYPTO_TFM_REQ_MAY_SLEEP the request processing may sleep
90240ffb127729 Stephan Mueller 2014-11-12 662 * @compl: callback function pointer to be registered with the request handle
90240ffb127729 Stephan Mueller 2014-11-12 663 * @data: The data pointer refers to memory that is not used by the kernel
90240ffb127729 Stephan Mueller 2014-11-12 664 * crypto API, but provided to the callback function for it to use. Here,
90240ffb127729 Stephan Mueller 2014-11-12 665 * the caller can provide a reference to memory the callback function can
90240ffb127729 Stephan Mueller 2014-11-12 666 * operate on. As the callback function is invoked asynchronously to the
90240ffb127729 Stephan Mueller 2014-11-12 667 * related functionality, it may need to access data structures of the
90240ffb127729 Stephan Mueller 2014-11-12 668 * related functionality which can be referenced using this pointer. The
90240ffb127729 Stephan Mueller 2014-11-12 669 * callback function can access the memory via the "data" field in the
90240ffb127729 Stephan Mueller 2014-11-12 670 * &crypto_async_request data structure provided to the callback function.
90240ffb127729 Stephan Mueller 2014-11-12 671 *
90240ffb127729 Stephan Mueller 2014-11-12 672 * This function allows setting the callback function that is triggered once
90240ffb127729 Stephan Mueller 2014-11-12 673 * the cipher operation completes.
90240ffb127729 Stephan Mueller 2014-11-12 674 *
90240ffb127729 Stephan Mueller 2014-11-12 675 * The callback function is registered with the &ahash_request handle and
0184cfe72d2f13 Stephan Mueller 2016-10-21 676 * must comply with the following template::
90240ffb127729 Stephan Mueller 2014-11-12 677 *
90240ffb127729 Stephan Mueller 2014-11-12 678 * void callback_function(struct crypto_async_request *req, int error)
90240ffb127729 Stephan Mueller 2014-11-12 679 */
18e33e6d5cc049 Herbert Xu 2008-07-10 680 static inline void ahash_request_set_callback(struct ahash_request *req,
18e33e6d5cc049 Herbert Xu 2008-07-10 681 u32 flags,
3e3dc25fe7d5e3 Mark Rustad 2014-07-25 682 crypto_completion_t compl,
18e33e6d5cc049 Herbert Xu 2008-07-10 683 void *data)
18e33e6d5cc049 Herbert Xu 2008-07-10 684 {
07b1948dc8bac5 Herbert Xu 2025-02-16 685 u32 keep = CRYPTO_AHASH_REQ_VIRT;
07b1948dc8bac5 Herbert Xu 2025-02-16 686
3e3dc25fe7d5e3 Mark Rustad 2014-07-25 687 req->base.complete = compl;
18e33e6d5cc049 Herbert Xu 2008-07-10 688 req->base.data = data;
07b1948dc8bac5 Herbert Xu 2025-02-16 689 flags &= ~keep;
07b1948dc8bac5 Herbert Xu 2025-02-16 @690 req->base.flags &= keep;
07b1948dc8bac5 Herbert Xu 2025-02-16 691 req->base.flags |= flags;
c0cd3e787da854 Herbert Xu 2025-02-16 692 crypto_reqchain_init(&req->base);
18e33e6d5cc049 Herbert Xu 2008-07-10 693 }
18e33e6d5cc049 Herbert Xu 2008-07-10 694
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH 00/11] Multibuffer hashing take two
2025-02-16 3:38 ` [v2 PATCH 00/11] Multibuffer hashing take two Eric Biggers
@ 2025-02-16 11:09 ` Herbert Xu
2025-02-16 19:51 ` Eric Biggers
0 siblings, 1 reply; 42+ messages in thread
From: Herbert Xu @ 2025-02-16 11:09 UTC (permalink / raw)
To: Eric Biggers
Cc: Linux Crypto Mailing List, Ard Biesheuvel, Megha Dey, Tim Chen
On Sat, Feb 15, 2025 at 07:38:16PM -0800, Eric Biggers wrote:
>
> This new version hasn't fundamentally changed anything. It's still a much
> worse, unnecessarily complex and still incomplete implementation compared to my
> patchset which has been ready to go for nearly a year already. Please refer to
> all the previous feedback that I've given.
FWIW, my interface is a lot simpler than yours to implement, since
it doesn't deal with the partial-buffer nonsense in assembly. In
fact that was a big mistake with the original API: the partial data
handling should've been moved to the API layer a long time ago.
Here is the result for sha256-ni-mb with my code:
testing speed of multibuffer sha256 (sha256-ni-mb)
[ 73.212300] tcrypt: test 0 ( 16 byte blocks, 16 bytes per update, 1 updates): 1 operation in 284 cycles (16 bytes)
[ 73.212805] tcrypt: test 2 ( 64 byte blocks, 64 bytes per update, 1 updates): 1 operation in 311 cycles (64 bytes)
[ 73.213256] tcrypt: test 5 ( 256 byte blocks, 256 bytes per update, 1 updates): 1 operation in 481 cycles (256 bytes)
[ 73.213715] tcrypt: test 8 ( 1024 byte blocks, 1024 bytes per update, 1 updates): 1 operation in 1209 cycles (1024 bytes)
[ 73.214181] tcrypt: test 12 ( 2048 byte blocks, 2048 bytes per update, 1 updates): 1 operation in 36107 cycles (2048 bytes)
[ 73.214904] tcrypt: test 16 ( 4096 byte blocks, 4096 bytes per update, 1 updates): 1 operation in 4097 cycles (4096 bytes)
[ 73.215416] tcrypt: test 21 ( 8192 byte blocks, 8192 bytes per update, 1 updates): 1 operation in 7991 cycles (8192 bytes)
[ 86.522453] tcrypt:
testing speed of multibuffer sha256 (sha256-ni)
[ 86.522737] tcrypt: test 0 ( 16 byte blocks, 16 bytes per update, 1 updates): 1 operation in 224 cycles (16 bytes)
[ 86.523164] tcrypt: test 2 ( 64 byte blocks, 64 bytes per update, 1 updates): 1 operation in 296 cycles (64 bytes)
[ 86.523586] tcrypt: test 5 ( 256 byte blocks, 256 bytes per update, 1 updates): 1 operation in 531 cycles (256 bytes)
[ 86.524012] tcrypt: test 8 ( 1024 byte blocks, 1024 bytes per update, 1 updates): 1 operation in 1466 cycles (1024 bytes)
[ 86.524602] tcrypt: test 12 ( 2048 byte blocks, 2048 bytes per update, 1 updates): 1 operation in 2911 cycles (2048 bytes)
[ 86.525199] tcrypt: test 16 ( 4096 byte blocks, 4096 bytes per update, 1 updates): 1 operation in 5566 cycles (4096 bytes)
[ 86.525806] tcrypt: test 21 ( 8192 byte blocks, 8192 bytes per update, 1 updates): 1 operation in 10901 cycles (8192 bytes)
So that's about a 36% jump in throughput at 4K (4097 vs 5566 cycles for
4096 bytes) on my aging Intel CPU.
Cheers,
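For readability, the context layout implied by the OFFSETOF_* constants
in the assembly below would be something like this (a hypothetical
reconstruction; the actual definition is not part of this diff):

	/* Hypothetical layout matching the OFFSETOF_STATEA/B and
	 * OFFSETOF_INPUT0/1 constants below. */
	struct sha256_x2_mbctx {
		u32 state[2][8];	/* offsets 0 and 32: states A and B */
		const u8 *input[2];	/* offsets 64 and 72: data pointers */
	};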
diff --git a/arch/x86/crypto/sha256_ni_asm.S b/arch/x86/crypto/sha256_ni_asm.S
index d515a55a3bc1..10771d957116 100644
--- a/arch/x86/crypto/sha256_ni_asm.S
+++ b/arch/x86/crypto/sha256_ni_asm.S
@@ -174,6 +174,216 @@ SYM_TYPED_FUNC_START(sha256_ni_transform)
RET
SYM_FUNC_END(sha256_ni_transform)
+#undef DIGEST_PTR
+#undef DATA_PTR
+#undef NUM_BLKS
+#undef SHA256CONSTANTS
+#undef MSG
+#undef STATE0
+#undef STATE1
+#undef MSG0
+#undef MSG1
+#undef MSG2
+#undef MSG3
+#undef TMP
+#undef SHUF_MASK
+#undef ABEF_SAVE
+#undef CDGH_SAVE
+
+// parameters for sha256_ni_x2()
+#define MBCTX %rdi
+#define BLOCKS %rsi
+#define DATA1 %rdx
+#define DATA2 %rcx
+
+// other scalar variables
+#define SHA256CONSTANTS %rax
+#define COUNT %r10
+#define COUNT32 %r10d
+
+// rbx is used as a temporary.
+
+#define MSG %xmm0 // sha256rnds2 implicit operand
+#define STATE0_A %xmm1
+#define STATE1_A %xmm2
+#define STATE0_B %xmm3
+#define STATE1_B %xmm4
+#define TMP_A %xmm5
+#define TMP_B %xmm6
+#define MSG0_A %xmm7
+#define MSG1_A %xmm8
+#define MSG2_A %xmm9
+#define MSG3_A %xmm10
+#define MSG0_B %xmm11
+#define MSG1_B %xmm12
+#define MSG2_B %xmm13
+#define MSG3_B %xmm14
+#define SHUF_MASK %xmm15
+
+#define OFFSETOF_STATEA 0 // offsetof(struct sha256_x2_mbctx, state[0])
+#define OFFSETOF_STATEB 32 // offsetof(struct sha256_x2_mbctx, state[1])
+#define OFFSETOF_INPUT0 64 // offsetof(struct sha256_x2_mbctx, input[0])
+#define OFFSETOF_INPUT1 72 // offsetof(struct sha256_x2_mbctx, input[1])
+
+// Do 4 rounds of SHA-256 for each of two messages (interleaved). m0_a and m0_b
+// contain the current 4 message schedule words for the first and second message
+// respectively.
+//
+// If not all the message schedule words have been computed yet, then this also
+// computes 4 more message schedule words for each message. m1_a-m3_a contain
+// the next 3 groups of 4 message schedule words for the first message, and
+// likewise m1_b-m3_b for the second. After consuming the current value of
+// m0_a, this macro computes the group after m3_a and writes it to m0_a, and
+// likewise for *_b. This means that the next (m0_a, m1_a, m2_a, m3_a) is the
+// current (m1_a, m2_a, m3_a, m0_a), and likewise for *_b, so the caller must
+// cycle through the registers accordingly.
+.macro do_4rounds_2x i, m0_a, m1_a, m2_a, m3_a, m0_b, m1_b, m2_b, m3_b
+ movdqa (\i-32)*4(SHA256CONSTANTS), TMP_A
+ movdqa TMP_A, TMP_B
+ paddd \m0_a, TMP_A
+ paddd \m0_b, TMP_B
+.if \i < 48
+ sha256msg1 \m1_a, \m0_a
+ sha256msg1 \m1_b, \m0_b
+.endif
+ movdqa TMP_A, MSG
+ sha256rnds2 STATE0_A, STATE1_A
+ movdqa TMP_B, MSG
+ sha256rnds2 STATE0_B, STATE1_B
+ pshufd $0x0E, TMP_A, MSG
+ sha256rnds2 STATE1_A, STATE0_A
+ pshufd $0x0E, TMP_B, MSG
+ sha256rnds2 STATE1_B, STATE0_B
+.if \i < 48
+ movdqa \m3_a, TMP_A
+ movdqa \m3_b, TMP_B
+ palignr $4, \m2_a, TMP_A
+ palignr $4, \m2_b, TMP_B
+ paddd TMP_A, \m0_a
+ paddd TMP_B, \m0_b
+ sha256msg2 \m3_a, \m0_a
+ sha256msg2 \m3_b, \m0_b
+.endif
+.endm
+
+//
+// void sha256_ni_x2(struct sha256_x2_mbctx *mbctx, int blocks)
+//
+// This function computes the SHA-256 digests of two messages that are
+// both |blocks| blocks long, starting from the individual initial states
+// in |mbctx|.
+//
+// The instructions for the two SHA-256 operations are interleaved. On many
+// CPUs, this is almost twice as fast as hashing each message individually due
+// to taking better advantage of the CPU's SHA-256 and SIMD throughput.
+//
+SYM_FUNC_START(sha256_ni_x2)
+ // Allocate 64 bytes of stack space, 16-byte aligned.
+ push %rbx
+ push %rbp
+ mov %rsp, %rbp
+ sub $64, %rsp
+ and $~15, %rsp
+
+ // Load the shuffle mask for swapping the endianness of 32-bit words.
+ movdqa PSHUFFLE_BYTE_FLIP_MASK(%rip), SHUF_MASK
+
+ // Set up pointer to the round constants.
+ lea K256+32*4(%rip), SHA256CONSTANTS
+
+ // Load the initial state from sctx->state.
+ movdqu OFFSETOF_STATEA+0*16(MBCTX), STATE0_A // DCBA
+ movdqu OFFSETOF_STATEA+1*16(MBCTX), STATE1_A // HGFE
+ movdqu OFFSETOF_STATEB+0*16(MBCTX), STATE0_B // DCBA
+ movdqu OFFSETOF_STATEB+1*16(MBCTX), STATE1_B // HGFE
+
+ movdqa STATE0_A, TMP_A
+ movdqa STATE0_B, TMP_B
+ punpcklqdq STATE1_A, STATE0_A // FEBA
+ punpcklqdq STATE1_B, STATE0_B // FEBA
+ punpckhqdq TMP_A, STATE1_A // DCHG
+ punpckhqdq TMP_B, STATE1_B // DCHG
+ pshufd $0x1B, STATE0_A, STATE0_A // ABEF
+ pshufd $0x1B, STATE0_B, STATE0_B // ABEF
+ pshufd $0xB1, STATE1_A, STATE1_A // CDGH
+ pshufd $0xB1, STATE1_B, STATE1_B // CDGH
+
+ mov OFFSETOF_INPUT0+0(MBCTX),DATA1
+ mov OFFSETOF_INPUT1+0(MBCTX),DATA2
+
+.Lfinup2x_loop:
+ // Load the next two data blocks.
+ movdqu 0*16(DATA1), MSG0_A
+ movdqu 0*16(DATA2), MSG0_B
+ movdqu 1*16(DATA1), MSG1_A
+ movdqu 1*16(DATA2), MSG1_B
+ movdqu 2*16(DATA1), MSG2_A
+ movdqu 2*16(DATA2), MSG2_B
+ movdqu 3*16(DATA1), MSG3_A
+ movdqu 3*16(DATA2), MSG3_B
+ add $64, DATA1
+ add $64, DATA2
+
+ // Convert the words of the data blocks from big endian.
+ pshufb SHUF_MASK, MSG0_A
+ pshufb SHUF_MASK, MSG0_B
+ pshufb SHUF_MASK, MSG1_A
+ pshufb SHUF_MASK, MSG1_B
+ pshufb SHUF_MASK, MSG2_A
+ pshufb SHUF_MASK, MSG2_B
+ pshufb SHUF_MASK, MSG3_A
+ pshufb SHUF_MASK, MSG3_B
+
+ // Save the original state for each block.
+ movdqa STATE0_A, 0*16(%rsp)
+ movdqa STATE0_B, 1*16(%rsp)
+ movdqa STATE1_A, 2*16(%rsp)
+ movdqa STATE1_B, 3*16(%rsp)
+
+ // Do the SHA-256 rounds on each block.
+.irp i, 0, 16, 32, 48
+ do_4rounds_2x (\i + 0), MSG0_A, MSG1_A, MSG2_A, MSG3_A, \
+ MSG0_B, MSG1_B, MSG2_B, MSG3_B
+ do_4rounds_2x (\i + 4), MSG1_A, MSG2_A, MSG3_A, MSG0_A, \
+ MSG1_B, MSG2_B, MSG3_B, MSG0_B
+ do_4rounds_2x (\i + 8), MSG2_A, MSG3_A, MSG0_A, MSG1_A, \
+ MSG2_B, MSG3_B, MSG0_B, MSG1_B
+ do_4rounds_2x (\i + 12), MSG3_A, MSG0_A, MSG1_A, MSG2_A, \
+ MSG3_B, MSG0_B, MSG1_B, MSG2_B
+.endr
+
+ // Add the original state for each block.
+ paddd 0*16(%rsp), STATE0_A
+ paddd 1*16(%rsp), STATE0_B
+ paddd 2*16(%rsp), STATE1_A
+ paddd 3*16(%rsp), STATE1_B
+
+ // Update BLOCKS and loop back if more blocks remain.
+ sub $1, BLOCKS
+ jne .Lfinup2x_loop
+
+ // Write the two digests with all bytes in the correct order.
+ movdqa STATE0_A, TMP_A
+ movdqa STATE0_B, TMP_B
+ punpcklqdq STATE1_A, STATE0_A // GHEF
+ punpcklqdq STATE1_B, STATE0_B
+ punpckhqdq TMP_A, STATE1_A // ABCD
+ punpckhqdq TMP_B, STATE1_B
+ pshufd $0xB1, STATE0_A, STATE0_A // HGFE
+ pshufd $0xB1, STATE0_B, STATE0_B
+ pshufd $0x1B, STATE1_A, STATE1_A // DCBA
+ pshufd $0x1B, STATE1_B, STATE1_B
+ movdqu STATE0_A, OFFSETOF_STATEA+1*16(MBCTX)
+ movdqu STATE0_B, OFFSETOF_STATEB+1*16(MBCTX)
+ movdqu STATE1_A, OFFSETOF_STATEA+0*16(MBCTX)
+ movdqu STATE1_B, OFFSETOF_STATEB+0*16(MBCTX)
+
+ mov %rbp, %rsp
+ pop %rbp
+ pop %rbx
+ RET
+SYM_FUNC_END(sha256_ni_x2)
+
.section .rodata.cst256.K256, "aM", @progbits, 256
.align 64
K256:
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index e634b89a5123..d578fd98a0d6 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -41,6 +41,11 @@
#include <asm/cpu_device_id.h>
#include <asm/simd.h>
+struct sha256_x2_mbctx {
+ u32 state[2][8];
+ const u8 *input[2];
+};
+
struct sha256_x8_mbctx {
u32 state[8][8];
const u8 *input[8];
@@ -558,7 +563,102 @@ static int sha256_mb_next(struct ahash_request *req, unsigned int len,
return sha256_mb_fill(req, final);
}
-static struct ahash_request *sha256_update_x8x1(
+static struct ahash_request *sha256_update_x1_post(
+ struct list_head *list, struct ahash_request *r2,
+ struct ahash_request **reqs, int width,
+ unsigned int len, bool nodata, bool final)
+{
+ int i = 0;
+
+ do {
+ struct sha256_reqctx *rctx = ahash_request_ctx(reqs[i]);
+
+ rctx->next = sha256_mb_next(reqs[i], len, final);
+
+ if (rctx->next) {
+ if (++i >= width)
+ break;
+ continue;
+ }
+
+ if (i < width - 1 && reqs[i + 1]) {
+ memmove(reqs + i, reqs + i + 1,
+ sizeof(r2) * (width - i - 1));
+ reqs[width - 1] = NULL;
+ continue;
+ }
+
+ reqs[i] = NULL;
+
+ do {
+ while (!list_is_last(&r2->base.list, list)) {
+ r2 = list_next_entry(r2, base.list);
+ r2->base.err = 0;
+
+ rctx = ahash_request_ctx(r2);
+ rctx->next = sha256_mb_start(r2, nodata, final);
+ if (rctx->next) {
+ reqs[i] = r2;
+ break;
+ }
+ }
+ } while (reqs[i] && ++i < width);
+
+ break;
+ } while (reqs[i]);
+
+ return r2;
+}
+
+static int sha256_chain_pre(struct ahash_request **reqs, int width,
+ struct ahash_request *req,
+ bool nodata, bool final)
+{
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+ struct ahash_request *r2;
+ int i;
+
+ req->base.err = 0;
+ reqs[0] = req;
+ rctx->next = sha256_mb_start(req, nodata, final);
+ i = !!rctx->next;
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ struct sha256_reqctx *r2ctx = ahash_request_ctx(r2);
+
+ r2->base.err = 0;
+
+ r2ctx = ahash_request_ctx(r2);
+ r2ctx->next = sha256_mb_start(r2, nodata, final);
+ if (!r2ctx->next)
+ continue;
+
+ reqs[i++] = r2;
+ if (i >= width)
+ break;
+ }
+
+ return i;
+}
+
+static void sha256_chain_post(struct ahash_request *req, bool final)
+{
+ struct sha256_reqctx *rctx = ahash_request_ctx(req);
+ struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+ unsigned int ds = crypto_ahash_digestsize(tfm);
+ struct ahash_request *r2;
+
+ if (!final)
+ return;
+
+ lib_sha256_base_finish(&rctx->state, req->result, ds);
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ struct sha256_reqctx *r2ctx = ahash_request_ctx(r2);
+
+ lib_sha256_base_finish(&r2ctx->state, r2->result, ds);
+ }
+}
+
+static struct ahash_request *sha256_avx2_update_x8x1(
struct list_head *list, struct ahash_request *r2,
struct ahash_request *reqs[8], bool nodata, bool final)
{
@@ -613,97 +713,30 @@ static struct ahash_request *sha256_update_x8x1(
}
done:
- i = 0;
- do {
- struct sha256_reqctx *rctx = ahash_request_ctx(reqs[i]);
-
- rctx->next = sha256_mb_next(reqs[i], len, final);
-
- if (rctx->next) {
- if (++i >= 8)
- break;
- continue;
- }
-
- if (i < 7 && reqs[i + 1]) {
- memmove(reqs + i, reqs + i + 1, sizeof(r2) * (7 - i));
- reqs[7] = NULL;
- continue;
- }
-
- reqs[i] = NULL;
-
- do {
- while (!list_is_last(&r2->base.list, list)) {
- r2 = list_next_entry(r2, base.list);
- r2->base.err = 0;
-
- rctx = ahash_request_ctx(r2);
- rctx->next = sha256_mb_start(r2, nodata, final);
- if (rctx->next) {
- reqs[i] = r2;
- break;
- }
- }
- } while (reqs[i] && ++i < 8);
-
- break;
- } while (reqs[i]);
-
- return r2;
+ return sha256_update_x1_post(list, r2, reqs, 8, len, nodata, final);
}
-static void sha256_update_x8(struct list_head *list,
- struct ahash_request *reqs[8], int i,
- bool nodata, bool final)
+static void sha256_avx2_update_x8(struct list_head *list,
+ struct ahash_request *reqs[8], int i,
+ bool nodata, bool final)
{
struct ahash_request *r2 = reqs[i - 1];
do {
- r2 = sha256_update_x8x1(list, r2, reqs, nodata, final);
+ r2 = sha256_avx2_update_x8x1(list, r2, reqs, nodata, final);
} while (reqs[0]);
}
-static void sha256_chain(struct ahash_request *req, bool nodata, bool final)
+static void sha256_avx2_chain(struct ahash_request *req, bool nodata, bool final)
{
- struct sha256_reqctx *rctx = ahash_request_ctx(req);
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
- unsigned int ds = crypto_ahash_digestsize(tfm);
struct ahash_request *reqs[8] = {};
- struct ahash_request *r2;
- int i;
+ int blocks;
- req->base.err = 0;
- reqs[0] = req;
- rctx->next = sha256_mb_start(req, nodata, final);
- i = !!rctx->next;
- list_for_each_entry(r2, &req->base.list, base.list) {
- struct sha256_reqctx *r2ctx = ahash_request_ctx(r2);
+ blocks = sha256_chain_pre(reqs, 8, req, nodata, final);
+ if (blocks)
+ sha256_avx2_update_x8(&req->base.list, reqs, blocks, nodata, final);
- r2->base.err = 0;
-
- r2ctx = ahash_request_ctx(r2);
- r2ctx->next = sha256_mb_start(r2, nodata, final);
- if (!r2ctx->next)
- continue;
-
- reqs[i++] = r2;
- if (i >= 8)
- break;
- }
-
- if (i)
- sha256_update_x8(&req->base.list, reqs, i, nodata, final);
-
- if (!final)
- return;
-
- lib_sha256_base_finish(&rctx->state, req->result, ds);
- list_for_each_entry(r2, &req->base.list, base.list) {
- struct sha256_reqctx *r2ctx = ahash_request_ctx(r2);
-
- lib_sha256_base_finish(&r2ctx->state, r2->result, ds);
- }
+ sha256_chain_post(req, final);
}
static int sha256_avx2_update_mb(struct ahash_request *req)
@@ -712,7 +745,7 @@ static int sha256_avx2_update_mb(struct ahash_request *req)
int err;
if (ahash_request_chained(req) && crypto_simd_usable()) {
- sha256_chain(req, false, false);
+ sha256_avx2_chain(req, false, false);
return 0;
}
@@ -736,7 +769,7 @@ static int _sha256_avx2_finup(struct ahash_request *req, bool nodata)
int err;
if (ahash_request_chained(req) && crypto_simd_usable()) {
- sha256_chain(req, nodata, true);
+ sha256_avx2_chain(req, nodata, true);
return 0;
}
@@ -860,6 +893,7 @@ static void unregister_sha256_avx2(void)
#ifdef CONFIG_AS_SHA256_NI
asmlinkage void sha256_ni_transform(struct sha256_state *digest,
const u8 *data, int rounds);
+asmlinkage void sha256_ni_x2(struct sha256_x2_mbctx *mbctx, int blocks);
static int sha256_ni_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
@@ -916,19 +950,207 @@ static struct shash_alg sha256_ni_algs[] = { {
}
} };
+static struct ahash_request *sha256_ni_update_x2x1(
+ struct list_head *list, struct ahash_request *r2,
+ struct ahash_request *reqs[2], bool nodata, bool final)
+{
+ struct sha256_state *states[2];
+ struct sha256_x2_mbctx mbctx;
+ unsigned int len = 0;
+ int i = 0;
+
+ do {
+ struct sha256_reqctx *rctx = ahash_request_ctx(reqs[i]);
+ unsigned int nbytes;
+
+ nbytes = rctx->next;
+ if (!i || nbytes < len)
+ len = nbytes;
+
+ states[i] = &rctx->state;
+ memcpy(mbctx.state[i], states[i], 32);
+ mbctx.input[i] = rctx->input;
+ } while (++i < 2 && reqs[i]);
+
+ len &= ~(SHA256_BLOCK_SIZE - 1);
+
+ if (i < 2) {
+ sha256_ni_transform(states[0], mbctx.input[0],
+ len / SHA256_BLOCK_SIZE);
+ goto done;
+ }
+
+ sha256_ni_x2(&mbctx, len / SHA256_BLOCK_SIZE);
+
+ for (i = 0; i < 2; i++)
+ memcpy(states[i], mbctx.state[i], 32);
+
+done:
+ return sha256_update_x1_post(list, r2, reqs, 2, len, nodata, final);
+}
+
+static void sha256_ni_update_x2(struct list_head *list,
+ struct ahash_request *reqs[2], int i,
+ bool nodata, bool final)
+{
+ struct ahash_request *r2 = reqs[i - 1];
+
+ do {
+ r2 = sha256_ni_update_x2x1(list, r2, reqs, nodata, final);
+ } while (reqs[0]);
+}
+
+static void sha256_ni_chain(struct ahash_request *req, bool nodata, bool final)
+{
+ struct ahash_request *reqs[2] = {};
+ int blocks;
+
+ blocks = sha256_chain_pre(reqs, 2, req, nodata, final);
+ if (blocks)
+ sha256_ni_update_x2(&req->base.list, reqs, blocks, nodata, final);
+
+ sha256_chain_post(req, final);
+}
+
+static int sha256_ni_update_mb(struct ahash_request *req)
+{
+ struct ahash_request *r2;
+ int err;
+
+ if (ahash_request_chained(req) && crypto_simd_usable()) {
+ sha256_ni_chain(req, false, false);
+ return 0;
+ }
+
+ err = sha256_ahash_update(req, sha256_ni_transform);
+ if (!ahash_request_chained(req))
+ return err;
+
+ req->base.err = err;
+
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ err = sha256_ahash_update(r2, sha256_ni_transform);
+ r2->base.err = err;
+ }
+
+ return 0;
+}
+
+static int _sha256_ni_finup(struct ahash_request *req, bool nodata)
+{
+ struct ahash_request *r2;
+ int err;
+
+ if (ahash_request_chained(req) && crypto_simd_usable()) {
+ sha256_ni_chain(req, nodata, true);
+ return 0;
+ }
+
+ err = sha256_ahash_finup(req, nodata, sha256_ni_transform);
+ if (!ahash_request_chained(req))
+ return err;
+
+ req->base.err = err;
+
+ list_for_each_entry(r2, &req->base.list, base.list) {
+ err = sha256_ahash_finup(r2, nodata, sha256_ni_transform);
+ r2->base.err = err;
+ }
+
+ return 0;
+}
+
+static int sha256_ni_finup_mb(struct ahash_request *req)
+{
+ return _sha256_ni_finup(req, false);
+}
+
+static int sha256_ni_final_mb(struct ahash_request *req)
+{
+ return _sha256_ni_finup(req, true);
+}
+
+static int sha256_ni_digest_mb(struct ahash_request *req)
+{
+ return sha256_ahash_init(req) ?:
+ sha256_ni_finup_mb(req);
+}
+
+static int sha224_ni_digest_mb(struct ahash_request *req)
+{
+ return sha224_ahash_init(req) ?:
+ sha256_ni_finup_mb(req);
+}
+
+static struct ahash_alg sha256_ni_mb_algs[] = { {
+ .halg.digestsize = SHA256_DIGEST_SIZE,
+ .halg.statesize = sizeof(struct sha256_state),
+ .reqsize = sizeof(struct sha256_reqctx),
+ .init = sha256_ahash_init,
+ .update = sha256_ni_update_mb,
+ .final = sha256_ni_final_mb,
+ .finup = sha256_ni_finup_mb,
+ .digest = sha256_ni_digest_mb,
+ .import = sha256_import,
+ .export = sha256_export,
+ .halg.base = {
+ .cra_name = "sha256",
+ .cra_driver_name = "sha256-ni-mb",
+ .cra_priority = 260,
+ .cra_blocksize = SHA256_BLOCK_SIZE,
+ .cra_module = THIS_MODULE,
+ .cra_flags = CRYPTO_ALG_REQ_CHAIN,
+ }
+}, {
+ .halg.digestsize = SHA224_DIGEST_SIZE,
+ .halg.statesize = sizeof(struct sha256_state),
+ .reqsize = sizeof(struct sha256_reqctx),
+ .init = sha224_ahash_init,
+ .update = sha256_ni_update_mb,
+ .final = sha256_ni_final_mb,
+ .finup = sha256_ni_finup_mb,
+ .digest = sha224_ni_digest_mb,
+ .import = sha256_import,
+ .export = sha256_export,
+ .halg.base = {
+ .cra_name = "sha224",
+ .cra_driver_name = "sha224-ni-mb",
+ .cra_priority = 260,
+ .cra_blocksize = SHA224_BLOCK_SIZE,
+ .cra_module = THIS_MODULE,
+ .cra_flags = CRYPTO_ALG_REQ_CHAIN,
+ }
+} };
+
static int register_sha256_ni(void)
{
- if (boot_cpu_has(X86_FEATURE_SHA_NI))
- return crypto_register_shashes(sha256_ni_algs,
- ARRAY_SIZE(sha256_ni_algs));
- return 0;
+ int err;
+
+ if (!boot_cpu_has(X86_FEATURE_SHA_NI))
+ return 0;
+
+ err = crypto_register_shashes(sha256_ni_algs,
+ ARRAY_SIZE(sha256_ni_algs));
+ if (err)
+ return err;
+
+ err = crypto_register_ahashes(sha256_ni_mb_algs,
+ ARRAY_SIZE(sha256_ni_mb_algs));
+ if (err)
+ crypto_unregister_shashes(sha256_ni_algs,
+ ARRAY_SIZE(sha256_ni_algs));
+
+ return err;
}
static void unregister_sha256_ni(void)
{
- if (boot_cpu_has(X86_FEATURE_SHA_NI))
- crypto_unregister_shashes(sha256_ni_algs,
- ARRAY_SIZE(sha256_ni_algs));
+ if (!boot_cpu_has(X86_FEATURE_SHA_NI))
+ return;
+
+ crypto_unregister_ahashes(sha256_ni_mb_algs,
+ ARRAY_SIZE(sha256_ni_mb_algs));
+ crypto_unregister_shashes(sha256_ni_algs, ARRAY_SIZE(sha256_ni_algs));
}
#else
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [v2 PATCH 09/11] crypto: hash - Add sync hash interface
2025-02-16 3:07 ` [v2 PATCH 09/11] crypto: hash - Add sync hash interface Herbert Xu
2025-02-16 10:51 ` kernel test robot
@ 2025-02-16 11:42 ` kernel test robot
1 sibling, 0 replies; 42+ messages in thread
From: kernel test robot @ 2025-02-16 11:42 UTC (permalink / raw)
To: Herbert Xu, Linux Crypto Mailing List
Cc: llvm, oe-kbuild-all, Eric Biggers, Ard Biesheuvel, Megha Dey,
Tim Chen
Hi Herbert,
kernel test robot noticed the following build warnings:
[auto build test WARNING on herbert-cryptodev-2.6/master]
[cannot apply to herbert-crypto-2.6/master brauner-vfs/vfs.all linus/master v6.14-rc2]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Herbert-Xu/crypto-ahash-Only-save-callback-and-data-in-ahash_save_req/20250216-150941
base: https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
patch link: https://lore.kernel.org/r/d6e10dbf172f0b7c791f5406d55e8f1c74492d57.1739674648.git.herbert%40gondor.apana.org.au
patch subject: [v2 PATCH 09/11] crypto: hash - Add sync hash interface
config: s390-randconfig-001-20250216 (https://download.01.org/0day-ci/archive/20250216/202502161953.REiC4YpV-lkp@intel.com/config)
compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250216/202502161953.REiC4YpV-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502161953.REiC4YpV-lkp@intel.com/
All warnings (new ones prefixed by >>, old ones prefixed by <<):
ERROR: modpost: vmlinux: 'crypto_shash_tfm_digest' exported twice. Previous export was in vmlinux
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/slub_kunit.o
>> WARNING: modpost: EXPORT symbol "crypto_shash_tfm_digest" [vmlinux] version generation failed, symbol will not be versioned.
Is "crypto_shash_tfm_digest" prototyped in <asm/asm-prototypes.h>?
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH 00/11] Multibuffer hashing take two
2025-02-16 11:09 ` Herbert Xu
@ 2025-02-16 19:51 ` Eric Biggers
2025-02-18 10:10 ` Herbert Xu
0 siblings, 1 reply; 42+ messages in thread
From: Eric Biggers @ 2025-02-16 19:51 UTC (permalink / raw)
To: Herbert Xu; +Cc: Linux Crypto Mailing List, Ard Biesheuvel, Megha Dey, Tim Chen
On Sun, Feb 16, 2025 at 07:09:57PM +0800, Herbert Xu wrote:
> On Sat, Feb 15, 2025 at 07:38:16PM -0800, Eric Biggers wrote:
> >
> > This new version hasn't fundamentally changed anything. It's still a much
> > worse, unnecessarily complex and still incomplete implementation compared to my
> > patchset which has been ready to go for nearly a year already. Please refer to
> > all the previous feedback that I've given.
>
> FWIW, my interface is a lot simpler than yours to implement, since
> it doesn't deal with the partial-buffer nonsense in assembly. In
> fact that was a big mistake with the original API; the partial data
> handling should've been moved to the API layer a long time ago.
We've already discussed this. It is part of the multibuffer optimization, as
instruction interleaving is applicable to partial block handling and
finalization too. It also makes those parts able to use SIMD. Both of those
improve performance. But more importantly it eliminates the need for a separate
descriptor for each message. That results in massive simplifications up the
stack. The per-algorithm C glue code is much simpler, the API is much simpler
and has fewer edge cases, and with only one descriptor being needed it becomes
feasible to allocate it on the stack. Overall it's just much more streamlined.
For all these reasons my version ends up with a much smaller diffstat, despite
the assembly code being longer and more optimized.
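For reference, the shape of that interface is a single descriptor plus
N complete messages. Roughly (a sketch; the name and signature here are
assumed from the earlier patchset, not quoted from it):

	u8 out1[SHA256_DIGEST_SIZE], out2[SHA256_DIGEST_SIZE];
	const u8 *msgs[2] = { data1, data2 };
	u8 *outs[2] = { out1, out2 };
	int err;

	/* desc carries the common (e.g. salted) starting state; each
	 * message is passed in full, final partial block included, so
	 * the implementation can interleave everything. */
	err = crypto_shash_finup_mb(desc, msgs, len, outs, 2);

The messages being equal length is what lets a caller like fsverity
queue up fixed-size blocks with no per-message descriptor.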
I see that you didn't even bother to include any tests for all the edge cases in
your API where descriptors and/or scatterlists aren't synced up. As I've
explained before, these cases would be a huge pain to get right.
But of course, there is no need to go there in the first place. Cryptographic
APIs should be simple and not include unnecessary edge cases. It seems you
still have a misconception that your more complex API would make my work useful
for IPsec, but again that is still incorrect, as I've explained many times. The
latest bogus claims that you've been making, like that GHASH is not
parallelizable, don't exactly inspire confidence either.
- Eric
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH 00/11] Multibuffer hashing take two
2025-02-16 19:51 ` Eric Biggers
@ 2025-02-18 10:10 ` Herbert Xu
2025-02-18 17:48 ` Eric Biggers
0 siblings, 1 reply; 42+ messages in thread
From: Herbert Xu @ 2025-02-18 10:10 UTC (permalink / raw)
To: Eric Biggers
Cc: Linux Crypto Mailing List, Ard Biesheuvel, Megha Dey, Tim Chen
On Sun, Feb 16, 2025 at 11:51:29AM -0800, Eric Biggers wrote:
>
> But of course, there is no need to go there in the first place. Cryptographic
> APIs should be simple and not include unnecessary edge cases. It seems you
> still have a misconception that your more complex API would make my work useful
> for IPsec, but again that is still incorrect, as I've explained many times. The
> latest bogus claims that you've been making, like that GHASH is not
> parallelizable, don't exactly inspire confidence either.
Sure, everyone hates complexity. But you're not removing it.
You're simply pushing the complexity into the algorithm implementation
and, more importantly, into the user. With your interface the user has
to jump through unnecessary hoops to get multiple requests going, which
is probably why you limited it to just 2.
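With chaining, by contrast, submission looks roughly like this (a
sketch using the helpers from this series, with the requests already
allocated for the same tfm and error handling elided):

	ahash_request_set_callback(req1, CRYPTO_TFM_REQ_MAY_SLEEP,
				   NULL, NULL);
	ahash_request_set_crypt(req1, sg1, out1, len1);
	ahash_request_set_callback(req2, CRYPTO_TFM_REQ_MAY_SLEEP,
				   NULL, NULL);
	ahash_request_set_crypt(req2, sg2, out2, len2);
	ahash_request_chain(req2, req1);   /* queue req2 behind req1 */

	err = crypto_ahash_digest(req1);   /* submits the whole chain */
	/* afterwards each request's status is in its base.err */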
If anything, we should be pushing the complexity into the API code
itself and away from the algorithm implementation. Why? Because
it's shared and therefore the test coverage works much better.
Look over the years at how many buggy edge cases, such as block
left-overs, we have had in arch crypto code. Now if those edge
cases were moved into shared API code it would be much better.
Sure, it could still be buggy, but it would affect everyone
equally, and that means it's much easier to catch.
It's much easier for the fuzz tests to catch a bug in shared API
code than individual assembly code.
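To make that concrete, here is a minimal sketch (illustrative names
only, kernel-style C assuming the usual string/minmax helpers) of what
update handling looks like once the API layer owns the partial-block
buffer -- the driver's block function only ever sees whole blocks:

	#define BLOCK_SIZE 64

	struct walk_state {
		u8 buf[BLOCK_SIZE];	/* tail carried between updates */
		unsigned int partial;	/* bytes currently queued in buf */
	};

	/* arch block function: whole blocks only, no tail handling */
	void block_fn(void *digest, const u8 *data, unsigned int nblocks);

	static void api_update(struct walk_state *w, void *digest,
			       const u8 *data, unsigned int len)
	{
		if (w->partial) {
			unsigned int n = min(len, BLOCK_SIZE - w->partial);

			memcpy(w->buf + w->partial, data, n);
			w->partial += n;
			data += n;
			len -= n;
			if (w->partial < BLOCK_SIZE)
				return;
			block_fn(digest, w->buf, 1);
			w->partial = 0;
		}
		if (len >= BLOCK_SIZE) {
			block_fn(digest, data, len / BLOCK_SIZE);
			data += len & ~(BLOCK_SIZE - 1);
			len &= BLOCK_SIZE - 1;
		}
		memcpy(w->buf, data, len);	/* stash the new tail */
		w->partial = len;
	}

Get this wrong once, in one place, and every fuzz run hits it.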
Here is an example of what the new hash interface looks like when used
in fsverity. It allows unlimited chaining, without holding all
those unnecessary kmaps:
commit 0a0be692829c3e69a14b7b10ed412250da458825
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Tue Feb 18 17:46:20 2025 +0800
fsverity: restore ahash support and remove chaining limit
Use the hash interface instead of shash. This allows the chaining
limit to be removed as the request no longer has to be allocated on
the stack.
Memory allocations can always fail, but they *rarely* do. Resolve
the OOM case by using a stack request as a fallback.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index 3d03fb1e41f0..9aae3381ef92 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -20,7 +20,7 @@
/* A hash algorithm supported by fs-verity */
struct fsverity_hash_alg {
- struct crypto_sync_hash *tfm; /* hash tfm, allocated on demand */
+ struct crypto_hash *tfm; /* hash tfm, allocated on demand */
const char *name; /* crypto API name, e.g. sha256 */
unsigned int digest_size; /* digest size in bytes, e.g. 32 for SHA-256 */
unsigned int block_size; /* block size in bytes, e.g. 64 for SHA-256 */
diff --git a/fs/verity/hash_algs.c b/fs/verity/hash_algs.c
index e088bcfe5ed1..8c625ddee43b 100644
--- a/fs/verity/hash_algs.c
+++ b/fs/verity/hash_algs.c
@@ -43,7 +43,7 @@ const struct fsverity_hash_alg *fsverity_get_hash_alg(const struct inode *inode,
unsigned int num)
{
struct fsverity_hash_alg *alg;
- struct crypto_sync_hash *tfm;
+ struct crypto_hash *tfm;
int err;
if (num >= ARRAY_SIZE(fsverity_hash_algs) ||
@@ -62,7 +62,7 @@ const struct fsverity_hash_alg *fsverity_get_hash_alg(const struct inode *inode,
if (alg->tfm != NULL)
goto out_unlock;
- tfm = crypto_alloc_sync_hash(alg->name, 0, 0);
+ tfm = crypto_alloc_hash(alg->name, 0, 0);
if (IS_ERR(tfm)) {
if (PTR_ERR(tfm) == -ENOENT) {
fsverity_warn(inode,
@@ -79,20 +79,20 @@ const struct fsverity_hash_alg *fsverity_get_hash_alg(const struct inode *inode,
}
err = -EINVAL;
- if (WARN_ON_ONCE(alg->digest_size != crypto_sync_hash_digestsize(tfm)))
+ if (WARN_ON_ONCE(alg->digest_size != crypto_hash_digestsize(tfm)))
goto err_free_tfm;
- if (WARN_ON_ONCE(alg->block_size != crypto_sync_hash_blocksize(tfm)))
+ if (WARN_ON_ONCE(alg->block_size != crypto_hash_blocksize(tfm)))
goto err_free_tfm;
pr_info("%s using implementation \"%s\"\n",
- alg->name, crypto_sync_hash_driver_name(tfm));
+ alg->name, crypto_hash_driver_name(tfm));
/* pairs with smp_load_acquire() above */
smp_store_release(&alg->tfm, tfm);
goto out_unlock;
err_free_tfm:
- crypto_free_sync_hash(tfm);
+ crypto_free_hash(tfm);
alg = ERR_PTR(err);
out_unlock:
mutex_unlock(&fsverity_hash_alg_init_mutex);
@@ -112,7 +112,7 @@ const u8 *fsverity_prepare_hash_state(const struct fsverity_hash_alg *alg,
const u8 *salt, size_t salt_size)
{
u8 *hashstate = NULL;
- SYNC_HASH_REQUEST_ON_STACK(req, alg->tfm);
+ HASH_REQUEST_ON_STACK(req, alg->tfm);
u8 *padded_salt = NULL;
size_t padded_salt_size;
int err;
@@ -120,7 +120,7 @@ const u8 *fsverity_prepare_hash_state(const struct fsverity_hash_alg *alg,
if (salt_size == 0)
return NULL;
- hashstate = kmalloc(crypto_sync_hash_statesize(alg->tfm), GFP_KERNEL);
+ hashstate = kmalloc(crypto_hash_statesize(alg->tfm), GFP_KERNEL);
if (!hashstate)
return ERR_PTR(-ENOMEM);
@@ -178,7 +178,7 @@ const u8 *fsverity_prepare_hash_state(const struct fsverity_hash_alg *alg,
int fsverity_hash_block(const struct merkle_tree_params *params,
const struct inode *inode, const void *data, u8 *out)
{
- SYNC_HASH_REQUEST_ON_STACK(req, params->hash_alg->tfm);
+ HASH_REQUEST_ON_STACK(req, params->hash_alg->tfm);
int err;
ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
@@ -212,7 +212,7 @@ int fsverity_hash_block(const struct merkle_tree_params *params,
int fsverity_hash_buffer(const struct fsverity_hash_alg *alg,
const void *data, size_t size, u8 *out)
{
- return crypto_sync_hash_digest(alg->tfm, data, size, out);
+ return crypto_hash_digest(alg->tfm, data, size, out);
}
void __init fsverity_check_hash_algs(void)
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 15bf0887a827..092f20704a92 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -9,26 +9,21 @@
#include <crypto/hash.h>
#include <linux/bio.h>
+#include <linux/scatterlist.h>
struct fsverity_pending_block {
- const void *data;
- u64 pos;
u8 real_hash[FS_VERITY_MAX_DIGEST_SIZE];
+ u64 pos;
+ struct scatterlist sg;
};
struct fsverity_verification_context {
struct inode *inode;
struct fsverity_info *vi;
unsigned long max_ra_pages;
-
- /*
- * This is the queue of data blocks that are pending verification. We
- * allow multiple blocks to be queued up in order to support multibuffer
- * hashing, i.e. interleaving the hashing of multiple messages. On many
- * CPUs this improves performance significantly.
- */
- int num_pending;
- struct fsverity_pending_block pending_blocks[FS_VERITY_MAX_PENDING_DATA_BLOCKS];
+ struct crypto_wait wait;
+ struct ahash_request *req;
+ struct ahash_request *fbreq;
};
static struct workqueue_struct *fsverity_read_workqueue;
@@ -111,9 +106,10 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
*/
static bool
verify_data_block(struct inode *inode, struct fsverity_info *vi,
- const struct fsverity_pending_block *dblock,
+ struct ahash_request *req,
unsigned long max_ra_pages)
{
+ struct fsverity_pending_block *dblock = container_of(req->src, struct fsverity_pending_block, sg);
const u64 data_pos = dblock->pos;
const struct merkle_tree_params *params = &vi->tree_params;
const unsigned int hsize = params->digest_size;
@@ -138,14 +134,14 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
*/
u64 hidx = data_pos >> params->log_blocksize;
- /*
- * Up to FS_VERITY_MAX_PENDING_DATA_BLOCKS + FS_VERITY_MAX_LEVELS pages
- * may be mapped at once.
- */
- BUILD_BUG_ON(FS_VERITY_MAX_PENDING_DATA_BLOCKS +
- FS_VERITY_MAX_LEVELS > KM_MAX_IDX);
+ /* Up to FS_VERITY_MAX_LEVELS pages may be mapped at once. */
+ BUILD_BUG_ON(FS_VERITY_MAX_LEVELS > KM_MAX_IDX);
if (unlikely(data_pos >= inode->i_size)) {
+ u8 *data = kmap_local_page(sg_page(&dblock->sg));
+ unsigned int offset = dblock->sg.offset;
+ bool nonzero;
+
/*
* This can happen in the data page spanning EOF when the Merkle
* tree block size is less than the page size. The Merkle tree
@@ -154,7 +150,9 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
* any part past EOF should be all zeroes. Therefore, we need
* to verify that any data blocks fully past EOF are all zeroes.
*/
- if (memchr_inv(dblock->data, 0, params->block_size)) {
+ nonzero = memchr_inv(data + offset, 0, params->block_size);
+ kunmap_local(data);
+ if (nonzero) {
fsverity_err(inode,
"FILE CORRUPTED! Data past EOF is not zeroed");
return false;
@@ -276,19 +274,17 @@ fsverity_init_verification_context(struct fsverity_verification_context *ctx,
ctx->inode = inode;
ctx->vi = inode->i_verity_info;
ctx->max_ra_pages = max_ra_pages;
- ctx->num_pending = 0;
+ ctx->req = NULL;
+ ctx->fbreq = NULL;
}
static void
fsverity_clear_pending_blocks(struct fsverity_verification_context *ctx)
{
- int i;
-
- for (i = ctx->num_pending - 1; i >= 0; i--) {
- kunmap_local(ctx->pending_blocks[i].data);
- ctx->pending_blocks[i].data = NULL;
- }
- ctx->num_pending = 0;
+ if (ctx->req != ctx->fbreq)
+ ahash_request_free(ctx->req);
+ ctx->req = NULL;
+ ctx->fbreq = NULL;
}
static bool
@@ -297,49 +293,27 @@ fsverity_verify_pending_blocks(struct fsverity_verification_context *ctx)
struct inode *inode = ctx->inode;
struct fsverity_info *vi = ctx->vi;
const struct merkle_tree_params *params = &vi->tree_params;
- SYNC_HASH_REQUESTS_ON_STACK(reqs, FS_VERITY_MAX_PENDING_DATA_BLOCKS, params->hash_alg->tfm);
- struct ahash_request *req;
- int i;
+ struct ahash_request *req = ctx->req;
+ struct ahash_request *r2;
int err;
- if (ctx->num_pending == 0)
- return true;
-
- req = sync_hash_requests(reqs, 0);
- for (i = 0; i < ctx->num_pending; i++) {
- struct ahash_request *reqi = sync_hash_requests(reqs, i);
-
- ahash_request_set_callback(reqi, CRYPTO_TFM_REQ_MAY_SLEEP,
- NULL, NULL);
- ahash_request_set_virt(reqi, ctx->pending_blocks[i].data,
- ctx->pending_blocks[i].real_hash,
- params->block_size);
- if (i)
- ahash_request_chain(reqi, req);
- if (!params->hashstate)
- continue;
-
- err = crypto_ahash_import(reqi, params->hashstate);
- if (err) {
- fsverity_err(inode, "Error %d importing hash state", err);
- return false;
- }
- }
-
+ crypto_init_wait(&ctx->wait);
if (params->hashstate)
err = crypto_ahash_finup(req);
else
err = crypto_ahash_digest(req);
+ err = crypto_wait_req(err, &ctx->wait);
if (err) {
fsverity_err(inode, "Error %d computing block hashes", err);
return false;
}
- for (i = 0; i < ctx->num_pending; i++) {
- if (!verify_data_block(inode, vi, &ctx->pending_blocks[i],
- ctx->max_ra_pages))
+ if (!verify_data_block(inode, vi, req, ctx->max_ra_pages))
+ return false;
+
+ list_for_each_entry(r2, &req->base.list, base.list)
+ if (!verify_data_block(inode, vi, r2, ctx->max_ra_pages))
return false;
- }
fsverity_clear_pending_blocks(ctx);
return true;
@@ -352,7 +326,7 @@ fsverity_add_data_blocks(struct fsverity_verification_context *ctx,
struct fsverity_info *vi = ctx->vi;
const struct merkle_tree_params *params = &vi->tree_params;
const unsigned int block_size = params->block_size;
- const int mb_max_msgs = FS_VERITY_MAX_PENDING_DATA_BLOCKS;
+ struct crypto_hash *tfm = params->hash_alg->tfm;
u64 pos = (u64)data_folio->index << PAGE_SHIFT;
if (WARN_ON_ONCE(len <= 0 || !IS_ALIGNED(len | offset, block_size)))
@@ -361,12 +335,59 @@ fsverity_add_data_blocks(struct fsverity_verification_context *ctx,
folio_test_uptodate(data_folio)))
return false;
do {
- ctx->pending_blocks[ctx->num_pending].data =
- kmap_local_folio(data_folio, offset);
- ctx->pending_blocks[ctx->num_pending].pos = pos + offset;
- if (++ctx->num_pending == mb_max_msgs &&
- !fsverity_verify_pending_blocks(ctx))
+ struct fsverity_pending_block fbblock;
+ struct fsverity_pending_block *block;
+ HASH_REQUEST_ON_STACK(fbreq, tfm);
+ struct ahash_request *req;
+
+ req = hash_request_alloc_extra(params->hash_alg->tfm,
+ sizeof(*block), GFP_NOFS);
+ if (req)
+ block = hash_request_extra(req);
+ else {
+ if (!fsverity_verify_pending_blocks(ctx))
+ return false;
+
+ req = fbreq;
+ block = &fbblock;
+ }
+
+ sg_init_table(&block->sg, 1);
+ sg_set_page(&block->sg,
+ folio_page(data_folio, offset / PAGE_SIZE),
+ block_size, offset % PAGE_SIZE);
+ block->pos = pos + offset;
+
+ ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP |
+ CRYPTO_TFM_REQ_MAY_BACKLOG,
+ crypto_req_done, &ctx->wait);
+ ahash_request_set_crypt(req, &block->sg, block->real_hash,
+ block_size);
+
+ if (params->hashstate) {
+ int err = crypto_ahash_import(req, params->hashstate);
+ if (err) {
+ fsverity_err(ctx->inode, "Error %d importing hash state", err);
+ if (req != fbreq)
+ ahash_request_free(req);
+ return false;
+ }
+ }
+
+ if (ctx->req) {
+ ahash_request_chain(req, ctx->req);
+ goto next;
+ }
+
+ ctx->req = req;
+ if (req != fbreq)
+ goto next;
+
+ ctx->fbreq = fbreq;
+ if (!fsverity_verify_pending_blocks(ctx))
return false;
+
+next:
offset += block_size;
len -= block_size;
} while (len);
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [v2 PATCH 00/11] Multibuffer hashing take two
2025-02-18 10:10 ` Herbert Xu
@ 2025-02-18 17:48 ` Eric Biggers
2025-02-21 6:10 ` Herbert Xu
0 siblings, 1 reply; 42+ messages in thread
From: Eric Biggers @ 2025-02-18 17:48 UTC (permalink / raw)
To: Herbert Xu; +Cc: Linux Crypto Mailing List, Ard Biesheuvel, Megha Dey, Tim Chen
On Tue, Feb 18, 2025 at 06:10:36PM +0800, Herbert Xu wrote:
> On Sun, Feb 16, 2025 at 11:51:29AM -0800, Eric Biggers wrote:
> >
> > But of course, there is no need to go there in the first place. Cryptographic
> > APIs should be simple and not include unnecessary edge cases. It seems you
> > still have a misconception that your more complex API would make my work useful
> > for IPsec, but again that is still incorrect, as I've explained many times. The
> > latest bogus claims that you've been making, like that GHASH is not
> > parallelizable, don't exactly inspire confidence either.
>
> Sure, everyone hates complexity. But you're not removing it.
I'm avoiding adding it in the first place.
> You're simply pushing the complexity into the algorithm implementation
> and more importantly, the user. With your interface the user has to
> jump through unnecessary hoops to get multiple requests going, which
> is probably why you limited it to just 2.
>
> If anything we should be pushing the complexity into the API code
> itself and away from the algorithm implementation. Why? Because
> it's shared and therefore the test coverage works much better.
>
> Look over the years at how many buggy edge cases such as block
> left-overs we have had in arch crypto code. Now if those edge
> cases were moved into shared API code it would be much better.
> Sure it could still be buggy, but it would affect everyone
> equally and that means it's much easier to catch.
You cannot ignore complexity in the API, as that is the worst kind.
In addition, your (slower) solution has a large amount of complexity in the
per-algorithm glue code, making it still more lines of code *per algorithm* than
my (faster) solution, which you're ignoring.
Also, users still have to queue up multiple requests anyway. There are no
"unnecessary hoops" with my patches -- just a faster, simpler, easier to use and
less error-prone API.
> Memory allocations can always fail, but they *rarely* do. Resolve
> the OOM case by using a stack request as a fallback.
Rarely executed fallbacks that are only executed in extremely rare OOM
situations that won't be covered by xfstests? No thank you. Why would you even
think that would be reasonable?
Anyway, I am getting tired of responding to all your weird arguments that don't
bring anything new to the table. Please continue to treat your patches as
nacked and don't treat silence as agreement. I am just tired of this.
- Eric
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH 00/11] Multibuffer hashing take two
2025-02-18 17:48 ` Eric Biggers
@ 2025-02-21 6:10 ` Herbert Xu
0 siblings, 0 replies; 42+ messages in thread
From: Herbert Xu @ 2025-02-21 6:10 UTC (permalink / raw)
To: Eric Biggers
Cc: Linux Crypto Mailing List, Ard Biesheuvel, Megha Dey, Tim Chen
On Tue, Feb 18, 2025 at 05:48:10PM +0000, Eric Biggers wrote:
>
> In addition, your (slower) solution has a large amount of complexity in the
> per-algorithm glue code, making it still more lines of code *per algorithm* than
> my (faster) solution, which you're ignoring.
My patches make the per-alg glue code look bad, but the intention
is to share them not just between sha256 implementations, but
across all ahash implementations as a whole. They will become
the new hash walking interface.
> Anyway, I am getting tired of responding to all your weird arguments that don't
> bring anything new to the table. Please continue to treat your patches as
> nacked and don't treat silence as agreement. I am just tired of this.
Talk about weird arguments, here's something even weirder to
ponder over:
Rather than passing a single block, let's pass the whole bio
all at once. Then, depending on how much memory is available
to store the hash results, we either return the whole thing,
or hash as much as we can fit into a page and then iterate
(down to just a single hash in the worst case, i.e. OOM).
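Roughly like this, where every name is hypothetical -- nothing of this
shape exists in the series, it's only a sketch of the idea:

	unsigned int done = 0, n;
	int err = 0;

	while (done < nr_blocks) {
		/* as many digests as the result buffer can hold;
		 * under OOM this degrades all the way to n == 1 */
		n = min(nr_blocks - done, results_capacity);

		err = hash_bio_blocks(tfm, bio, done, n, results);
		if (err)
			break;

		err = consume_results(results, n);
		if (err)
			break;

		done += n;
	}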
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH 03/11] crypto: hash - Add request chaining API
2025-02-16 3:07 ` [v2 PATCH 03/11] crypto: hash - Add request chaining API Herbert Xu
@ 2025-03-26 9:00 ` Manorit Chawdhry
2025-03-26 9:17 ` [PATCH] crypto: sa2ul - Use proper helpers to setup request Herbert Xu
0 siblings, 1 reply; 42+ messages in thread
From: Manorit Chawdhry @ 2025-03-26 9:00 UTC (permalink / raw)
To: Herbert Xu
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Manorit Chawdhry, Kamlesh Gurudasani,
Vignesh Raghavendra, Udit Kumar, Pratham T
Hi Herbert,
On 11:07-20250216, Herbert Xu wrote:
> This adds request chaining to the ahash interface. Request chaining
> allows multiple requests to be submitted in one shot. An algorithm
> can elect to receive chained requests by setting the flag
> CRYPTO_ALG_REQ_CHAIN. If this bit is not set, the API will break
> up chained requests and submit them one-by-one.
>
> A new err field is added to struct crypto_async_request to record
> the return value for each individual request.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> ---
> crypto/ahash.c | 261 +++++++++++++++++++++++++++++----
> crypto/algapi.c | 2 +-
> include/crypto/algapi.h | 11 ++
> include/crypto/hash.h | 28 ++--
> include/crypto/internal/hash.h | 10 ++
> include/linux/crypto.h | 24 +++
> 6 files changed, 299 insertions(+), 37 deletions(-)
The following patch seems to break the selftests in the SA2UL driver.
The failure signature:
root@j721e-evm:~# modprobe sa2ul
[ 32.254126] omap_rng 4e10000.rng: Random Number Generator ver. 241b34c
root@j721e-evm:~# [ 32.374996] Unable to handle kernel paging request at virtual address fefefefefefeff46
[ 32.401815] Unable to handle kernel paging request at virtual address fefefefefefeff46
[ 32.449576] Unable to handle kernel paging request at virtual address fefefefefefeff46
[ 32.459025] Mem abort info:
[ 32.461812] ESR = 0x0000000096000044
[ 32.480762] Mem abort info:
[ 32.503389] ESR = 0x0000000096000044
[ 32.512483] EC = 0x25: DABT (current EL), IL = 32 bits
[ 32.519478] Mem abort info:
[ 32.534472] EC = 0x25: DABT (current EL), IL = 32 bits
[ 32.542380] ESR = 0x0000000096000044
[ 32.546123] EC = 0x25: DABT (current EL), IL = 32 bits
[ 32.554977] SET = 0, FnV = 0
[ 32.572112] SET = 0, FnV = 0
[ 32.579134] EA = 0, S1PTW = 0
[ 32.597889] EA = 0, S1PTW = 0
[ 32.603045] FSC = 0x04: level 0 translation fault
[ 32.615500] SET = 0, FnV = 0
[ 32.628186] FSC = 0x04: level 0 translation fault
[ 32.645274] EA = 0, S1PTW = 0
[ 32.651268] Data abort info:
[ 32.654145] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
[ 32.675265] Data abort info:
[ 32.678614] FSC = 0x04: level 0 translation fault
[ 32.701391] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
[ 32.721251] Data abort info:
[ 32.725864] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 32.742907] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
[ 32.751647] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 32.770854] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 32.790381] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 32.795591] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 32.807232] [fefefefefefeff46] address between user and kernel address ranges
[ 32.826504] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 32.832211] [fefefefefefeff46] address between user and kernel address ranges
[ 32.854007] Internal error: Oops: 0000000096000044 [#1] SMP
[ 32.859661] Modules linked in: des_generic libdes cbc sa2ul authenc onboard_usb_dev rpmsg_ctrl rpmsg_char phy_cadence_torrent phy_cadence_sierra rtc_tps6594 pinctrl_tps6594 tps6594_esm tps6594_regulator gpio_regmap tps6594_pfsm ti_am335x_adc kfifo_buf pru_rproc cdns3 irq_pruss_intc cdns_pltfrm cdns_usb_common snd_soc_j721e_evm display_connector phy_can_transceiver phy_j721e_wiz omap_mailbox ti_k3_r5_remoteproc tidss tps6594_i2c drm_client_lib cdns_mhdp8546 drm_dma_helper tps6594_core at24 drm_display_helper k3_j72xx_bandgap drm_kms_helper m_can_platform m_can ti_am335x_tscadc pruss snd_soc_davinci_mcasp snd_soc_ti_udma ti_j721e_ufs can_dev snd_soc_ti_edma ti_k3_dsp_remoteproc snd_soc_ti_sdma cdns3_ti snd_soc_pcm3168a_i2c snd_soc_pcm3168a rti_wdt overlay cfg80211 rfkill fuse drm backlight ipv6
[ 32.929820] CPU: 0 UID: 0 PID: 1253 Comm: cryptomgr_test Not tainted 6.14.0-rc7-next-20250324-build-configs-dirty #2 PREEMPT
[ 32.941098] Hardware name: Texas Instruments J721e EVM (DT)
[ 32.946653] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 32.953598] pc : crypto_ahash_init+0x6c/0xf0
[ 32.957866] lr : crypto_ahash_init+0x50/0xf0
[ 32.962124] sp : ffff8000851cb590
[ 32.965425] x29: ffff8000851cb590 x28: 0000000000000000 x27: ffff8000851cb788
[ 32.972546] x26: ffff000802dace00 x25: 0000000000000000 x24: ffff000804294010
[ 32.979667] x23: ffff80007be3fb30 x22: 0000000000000000 x21: ffff0008095c1250
[ 32.986787] x20: ffff000802dace90 x19: fefefefefefefefe x18: 00000000ffffffff
[ 32.993908] x17: 0000000000373931 x16: 2d3732322d34322d x15: ffff8000851cb740
[ 33.001028] x14: ffff8001051cbaa7 x13: 0000000000000000 x12: 0000000000000000
[ 33.008148] x11: 0000000000000100 x10: 0000000000000001 x9 : ffff80007be3fadc
[ 33.015268] x8 : ffff8000851cb668 x7 : 0000000000000000 x6 : 0010000000000000
[ 33.022388] x5 : efcdab8967452301 x4 : 1032547698badcfe x3 : 00000000c3d2e1f0
[ 33.029509] x2 : 0000000000000000 x1 : ffff0008095c1280 x0 : 0000000000000000
[ 33.036629] Call trace:
[ 33.039065] crypto_ahash_init+0x6c/0xf0 (P)
[ 33.043325] sa_sha_init+0x4c/0xa0 [sa2ul]
[ 33.047419] ahash_do_req_chain+0x144/0x280
[ 33.051591] crypto_ahash_init+0xc8/0xf0
[ 33.055502] do_ahash_op+0x34/0xb8
[ 33.058895] test_ahash_vec_cfg+0x3e4/0x800
[ 33.063068] test_hash_vec+0xbc/0x230
[ 33.066719] __alg_test_hash+0x288/0x3d8
[ 33.070629] alg_test_hash+0x108/0x1a0
[ 33.074367] alg_test+0x148/0x658
[ 33.077672] cryptomgr_test+0x2c/0x50
[ 33.081322] kthread+0x134/0x218
[ 33.084542] ret_from_fork+0x10/0x20
[ 33.088109] Code: eb13029f 540001e0 d503201f f94012a1 (f9002661)
[ 33.094185] ---[ end trace 0000000000000000 ]---
[ 33.100497] [fefefefefefeff46] address between user and kernel address ranges
[ 33.114241] Internal error: Oops: 0000000096000044 [#2] SMP
[ 33.119887] Modules linked in: des_generic libdes cbc sa2ul authenc onboard_usb_dev rpmsg_ctrl rpmsg_char phy_cadence_torrent phy_cadence_sierra rtc_tps6594 pinctrl_tps6594 tps6594_esm tps6594_regulator gpio_regmap tps6594_pfsm ti_am335x_adc kfifo_buf pru_rproc cdns3 irq_pruss_intc cdns_pltfrm cdns_usb_common snd_soc_j721e_evm display_connector phy_can_transceiver phy_j721e_wiz omap_mailbox ti_k3_r5_remoteproc tidss tps6594_i2c drm_client_lib cdns_mhdp8546 drm_dma_helper tps6594_core at24 drm_display_helper k3_j72xx_bandgap drm_kms_helper m_can_platform m_can ti_am335x_tscadc pruss snd_soc_davinci_mcasp snd_soc_ti_udma ti_j721e_ufs can_dev snd_soc_ti_edma ti_k3_dsp_remoteproc snd_soc_ti_sdma cdns3_ti snd_soc_pcm3168a_i2c snd_soc_pcm3168a rti_wdt overlay cfg80211 rfkill fuse drm backlight ipv6
[ 33.190029] CPU: 0 UID: 0 PID: 1256 Comm: cryptomgr_test Tainted: G D 6.14.0-rc7-next-20250324-build-configs-dirty #2 PREEMPT
[ 33.202868] Tainted: [D]=DIE
[ 33.205737] Hardware name: Texas Instruments J721e EVM (DT)
[ 33.211293] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 33.218236] pc : crypto_ahash_init+0x6c/0xf0
[ 33.222498] lr : crypto_ahash_init+0x50/0xf0
[ 33.226756] sp : ffff8000852ab590
[ 33.230057] x29: ffff8000852ab590 x28: 0000000000000000 x27: ffff8000852ab788
[ 33.237178] x26: ffff000807d69a00 x25: 0000000000000000 x24: ffff000804204810
[ 33.244298] x23: ffff80007be3fb30 x22: 0000000000000000 x21: ffff0008095c1890
[ 33.251419] x20: ffff000807d69a90 x19: fefefefefefefefe x18: 00000000ffffffff
[ 33.258539] x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000852ab740
[ 33.265659] x14: ffff8001052abaa7 x13: 0000000000000000 x12: 0000000000000000
[ 33.272779] x11: 0000000000000000 x10: ffff00087f7c8308 x9 : ffff80007be3fadc
[ 33.279899] x8 : bb67ae8584caa73b x7 : 3c6ef372fe94f82b x6 : a54ff53a5f1d36f1
[ 33.287019] x5 : 510e527fade682d1 x4 : 9b05688c2b3e6c1f x3 : 1f83d9abfb41bd6b
[ 33.294139] x2 : 5be0cd19137e2179 x1 : ffff0008095c1100 x0 : 0000000000000000
[ 33.301259] Call trace:
[ 33.303695] crypto_ahash_init+0x6c/0xf0 (P)
[ 33.307954] sa_sha_init+0x4c/0xa0 [sa2ul]
[ 33.312045] ahash_do_req_chain+0x144/0x280
[ 33.316217] crypto_ahash_init+0xc8/0xf0
[ 33.320129] do_ahash_op+0x34/0xb8
[ 33.323520] test_ahash_vec_cfg+0x3e4/0x800
[ 33.327691] test_hash_vec+0xbc/0x230
[ 33.331341] __alg_test_hash+0x288/0x3d8
[ 33.335252] alg_test_hash+0x108/0x1a0
[ 33.338990] alg_test+0x148/0x658
[ 33.342294] cryptomgr_test+0x2c/0x50
[ 33.345944] kthread+0x134/0x218
[ 33.349161] ret_from_fork+0x10/0x20
[ 33.352727] Code: eb13029f 540001e0 d503201f f94012a1 (f9002661)
[ 33.358803] ---[ end trace 0000000000000000 ]---
[ 33.363437] Internal error: Oops: 0000000096000044 [#3] SMP
[ 33.369081] Modules linked in: des_generic libdes cbc sa2ul authenc onboard_usb_dev rpmsg_ctrl rpmsg_char phy_cadence_torrent phy_cadence_sierra rtc_tps6594 pinctrl_tps6594 tps6594_esm tps6594_regulator gpio_regmap tps6594_pfsm ti_am335x_adc kfifo_buf pru_rproc cdns3 irq_pruss_intc cdns_pltfrm cdns_usb_common snd_soc_j721e_evm display_connector phy_can_transceiver phy_j721e_wiz omap_mailbox ti_k3_r5_remoteproc tidss tps6594_i2c drm_client_lib cdns_mhdp8546 drm_dma_helper tps6594_core at24 drm_display_helper k3_j72xx_bandgap drm_kms_helper m_can_platform m_can ti_am335x_tscadc pruss snd_soc_davinci_mcasp snd_soc_ti_udma ti_j721e_ufs can_dev snd_soc_ti_edma ti_k3_dsp_remoteproc snd_soc_ti_sdma cdns3_ti snd_soc_pcm3168a_i2c snd_soc_pcm3168a rti_wdt overlay cfg80211 rfkill fuse drm backlight ipv6
[ 33.439214] CPU: 0 UID: 0 PID: 1255 Comm: cryptomgr_test Tainted: G D 6.14.0-rc7-next-20250324-build-configs-dirty #2 PREEMPT
[ 33.452052] Tainted: [D]=DIE
[ 33.454921] Hardware name: Texas Instruments J721e EVM (DT)
[ 33.460476] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 33.467419] pc : crypto_ahash_init+0x6c/0xf0
[ 33.471678] lr : crypto_ahash_init+0x50/0xf0
[ 33.475935] sp : ffff8000851fb590
[ 33.479236] x29: ffff8000851fb590 x28: 0000000000000000 x27: ffff8000851fb788
[ 33.486357] x26: ffff000808b0da00 x25: 0000000000000000 x24: ffff000801270c10
[ 33.493477] x23: ffff80007be3fb30 x22: 0000000000000000 x21: ffff0008095c1490
[ 33.500596] x20: ffff000808b0da90 x19: fefefefefefefefe x18: 00000000ffffffff
[ 33.507717] x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000851fb740
[ 33.514837] x14: ffff8001051fbaa7 x13: 0000000000000000 x12: 0000000000000000
[ 33.521956] x11: 0000000000000000 x10: ffff00087f7c8308 x9 : ffff80007be3fadc
[ 33.529077] x8 : ffff8000851fb668 x7 : 0000000000000000 x6 : 0010000000000000
[ 33.536197] x5 : bb67ae856a09e667 x4 : a54ff53a3c6ef372 x3 : 9b05688c510e527f
[ 33.543316] x2 : 5be0cd191f83d9ab x1 : ffff0008095c1680 x0 : 0000000000000000
[ 33.550437] Call trace:
[ 33.552871] crypto_ahash_init+0x6c/0xf0 (P)
[ 33.557130] sa_sha_init+0x4c/0xa0 [sa2ul]
[ 33.561217] ahash_do_req_chain+0x144/0x280
[ 33.565388] crypto_ahash_init+0xc8/0xf0
[ 33.569299] do_ahash_op+0x34/0xb8
[ 33.572691] test_ahash_vec_cfg+0x3e4/0x800
[ 33.576861] test_hash_vec+0xbc/0x230
[ 33.580512] __alg_test_hash+0x288/0x3d8
[ 33.584423] alg_test_hash+0x108/0x1a0
[ 33.588161] alg_test+0x148/0x658
[ 33.591465] cryptomgr_test+0x2c/0x50
[ 33.595115] kthread+0x134/0x218
[ 33.598334] ret_from_fork+0x10/0x20
[ 33.601898] Code: eb13029f 540001e0 d503201f f94012a1 (f9002661)
[ 33.607973] ---[ end trace 0000000000000000 ]---
[ 47.186729] kauditd_printk_skb: 5 callbacks suppressed
[ 47.186738] audit: type=1334 audit(1742976993.333:24): prog-id=20 op=UNLOAD
[ 47.200991] audit: type=1334 audit(1742976993.345:25): prog-id=19 op=UNLOAD
[ 47.208310] audit: type=1334 audit(1742976993.345:26): prog-id=18 op=UNLOAD
I don't have sufficient knowledge to understand what is going wrong with
this change; could you help out?
Regards,
Manorit
^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH] crypto: sa2ul - Use proper helpers to setup request
2025-03-26 9:00 ` Manorit Chawdhry
@ 2025-03-26 9:17 ` Herbert Xu
2025-03-26 10:00 ` Manorit Chawdhry
0 siblings, 1 reply; 42+ messages in thread
From: Herbert Xu @ 2025-03-26 9:17 UTC (permalink / raw)
To: Manorit Chawdhry
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
On Wed, Mar 26, 2025 at 02:30:35PM +0530, Manorit Chawdhry wrote:
>
> > The following patch seems to break the selftests in the SA2UL driver.
Thanks for the report.
This patch should fix the problem:
---8<---
Rather than setting up a request by hand, use the correct API helpers
to set up the new request. This is because the API helpers will set up
chaining.
Also change the fallback allocation to explicitly request a sync
algorithm, as this driver will crash if given an async one.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/drivers/crypto/sa2ul.c b/drivers/crypto/sa2ul.c
index 091612b066f1..8fae14043262 100644
--- a/drivers/crypto/sa2ul.c
+++ b/drivers/crypto/sa2ul.c
@@ -1415,22 +1415,13 @@ static int sa_sha_run(struct ahash_request *req)
(auth_len >= SA_UNSAFE_DATA_SZ_MIN &&
auth_len <= SA_UNSAFE_DATA_SZ_MAX)) {
struct ahash_request *subreq = &rctx->fallback_req;
- int ret = 0;
+ int ret;
ahash_request_set_tfm(subreq, ctx->fallback.ahash);
- subreq->base.flags = req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP;
+ ahash_request_set_callback(subreq, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
+ ahash_request_set_crypt(subreq, req->src, req->result, auth_len);
- crypto_ahash_init(subreq);
-
- subreq->nbytes = auth_len;
- subreq->src = req->src;
- subreq->result = req->result;
-
- ret |= crypto_ahash_update(subreq);
-
- subreq->nbytes = 0;
-
- ret |= crypto_ahash_final(subreq);
+ ret = crypto_ahash_digest(subreq);
return ret;
}
@@ -1502,8 +1493,7 @@ static int sa_sha_cra_init_alg(struct crypto_tfm *tfm, const char *alg_base)
return ret;
if (alg_base) {
- ctx->shash = crypto_alloc_shash(alg_base, 0,
- CRYPTO_ALG_NEED_FALLBACK);
+ ctx->shash = crypto_alloc_shash(alg_base, 0, 0);
if (IS_ERR(ctx->shash)) {
dev_err(sa_k3_dev, "base driver %s couldn't be loaded\n",
alg_base);
@@ -1511,8 +1501,7 @@ static int sa_sha_cra_init_alg(struct crypto_tfm *tfm, const char *alg_base)
}
/* for fallback */
ctx->fallback.ahash =
- crypto_alloc_ahash(alg_base, 0,
- CRYPTO_ALG_NEED_FALLBACK);
+ crypto_alloc_ahash(alg_base, 0, CRYPTO_ALG_ASYNC);
if (IS_ERR(ctx->fallback.ahash)) {
dev_err(ctx->dev_data->dev,
"Could not load fallback driver\n");
@@ -1546,54 +1535,38 @@ static int sa_sha_init(struct ahash_request *req)
crypto_ahash_digestsize(tfm), rctx);
ahash_request_set_tfm(&rctx->fallback_req, ctx->fallback.ahash);
- rctx->fallback_req.base.flags =
- req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP;
+ ahash_request_set_callback(&rctx->fallback_req, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
+ ahash_request_set_crypt(&rctx->fallback_req, NULL, NULL, 0);
return crypto_ahash_init(&rctx->fallback_req);
}
static int sa_sha_update(struct ahash_request *req)
{
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
struct sa_sha_req_ctx *rctx = ahash_request_ctx(req);
- struct sa_tfm_ctx *ctx = crypto_ahash_ctx(tfm);
- ahash_request_set_tfm(&rctx->fallback_req, ctx->fallback.ahash);
- rctx->fallback_req.base.flags =
- req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP;
- rctx->fallback_req.nbytes = req->nbytes;
- rctx->fallback_req.src = req->src;
+ ahash_request_set_callback(&rctx->fallback_req, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
+ ahash_request_set_crypt(&rctx->fallback_req, req->src, NULL, req->nbytes);
return crypto_ahash_update(&rctx->fallback_req);
}
static int sa_sha_final(struct ahash_request *req)
{
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
struct sa_sha_req_ctx *rctx = ahash_request_ctx(req);
- struct sa_tfm_ctx *ctx = crypto_ahash_ctx(tfm);
- ahash_request_set_tfm(&rctx->fallback_req, ctx->fallback.ahash);
- rctx->fallback_req.base.flags =
- req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP;
- rctx->fallback_req.result = req->result;
+ ahash_request_set_callback(&rctx->fallback_req, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
+ ahash_request_set_crypt(&rctx->fallback_req, NULL, req->result, 0);
return crypto_ahash_final(&rctx->fallback_req);
}
static int sa_sha_finup(struct ahash_request *req)
{
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
struct sa_sha_req_ctx *rctx = ahash_request_ctx(req);
- struct sa_tfm_ctx *ctx = crypto_ahash_ctx(tfm);
- ahash_request_set_tfm(&rctx->fallback_req, ctx->fallback.ahash);
- rctx->fallback_req.base.flags =
- req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP;
-
- rctx->fallback_req.nbytes = req->nbytes;
- rctx->fallback_req.src = req->src;
- rctx->fallback_req.result = req->result;
+ ahash_request_set_callback(&rctx->fallback_req, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
+ ahash_request_set_crypt(&rctx->fallback_req, req->src, req->result, req->nbytes);
return crypto_ahash_finup(&rctx->fallback_req);
}
@@ -1601,12 +1574,8 @@ static int sa_sha_finup(struct ahash_request *req)
static int sa_sha_import(struct ahash_request *req, const void *in)
{
struct sa_sha_req_ctx *rctx = ahash_request_ctx(req);
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
- struct sa_tfm_ctx *ctx = crypto_ahash_ctx(tfm);
- ahash_request_set_tfm(&rctx->fallback_req, ctx->fallback.ahash);
- rctx->fallback_req.base.flags = req->base.flags &
- CRYPTO_TFM_REQ_MAY_SLEEP;
+ ahash_request_set_callback(&rctx->fallback_req, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
return crypto_ahash_import(&rctx->fallback_req, in);
}
@@ -1614,12 +1583,9 @@ static int sa_sha_import(struct ahash_request *req, const void *in)
static int sa_sha_export(struct ahash_request *req, void *out)
{
struct sa_sha_req_ctx *rctx = ahash_request_ctx(req);
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
- struct sa_tfm_ctx *ctx = crypto_ahash_ctx(tfm);
struct ahash_request *subreq = &rctx->fallback_req;
- ahash_request_set_tfm(subreq, ctx->fallback.ahash);
- subreq->base.flags = req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP;
+ ahash_request_set_callback(subreq, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
return crypto_ahash_export(subreq, out);
}
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH] crypto: sa2ul - Use proper helpers to setup request
2025-03-26 9:17 ` [PATCH] crypto: sa2ul - Use proper helpers to setup request Herbert Xu
@ 2025-03-26 10:00 ` Manorit Chawdhry
2025-03-26 10:05 ` [v2 PATCH] " Herbert Xu
0 siblings, 1 reply; 42+ messages in thread
From: Manorit Chawdhry @ 2025-03-26 10:00 UTC (permalink / raw)
To: Herbert Xu
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
Hi Herbert,
On 17:17-20250326, Herbert Xu wrote:
> On Wed, Mar 26, 2025 at 02:30:35PM +0530, Manorit Chawdhry wrote:
> >
> > The following patch seems to be breaking selftests in SA2UL driver.
>
> Thanks for the report.
>
> This patch should fix the problem:
>
> ---8<---
> Rather than setting up a request by hand, use the correct API helpers
> to set up the new request. This is because the API helpers will set up
> chaining.
>
> Also change the fallback allocation to explicitly request a sync
> algorithm, as this driver will crash if given an async one.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
Thanks for the quick fix, though now I see an error in import rather
than in init, where it was previously.
root@j721e-evm:~# modprobe sa2ul
[ 155.283088] omap_rng 4e10000.rng: Random Number Generator ver. 241b34c
root@j721e-evm:~# [ 155.401918] Unable to handle kernel paging request at virtual address fefefefefefefeee
[ 155.430127] Unable to handle kernel paging request at virtual address fefefefefefefeee
[ 155.463959] Unable to handle kernel paging request at virtual address fefefefefefefeee
[ 155.480086] Mem abort info:
[ 155.503068] Mem abort info:
[ 155.506689] ESR = 0x0000000096000004
[ 155.527264] ESR = 0x0000000096000004
[ 155.531009] EC = 0x25: DABT (current EL), IL = 32 bits
[ 155.538758] Mem abort info:
[ 155.543371] EC = 0x25: DABT (current EL), IL = 32 bits
[ 155.559119] ESR = 0x0000000096000004
[ 155.580125] SET = 0, FnV = 0
[ 155.583176] EA = 0, S1PTW = 0
[ 155.589283] EC = 0x25: DABT (current EL), IL = 32 bits
[ 155.607886] SET = 0, FnV = 0
[ 155.610938] EA = 0, S1PTW = 0
[ 155.633650] SET = 0, FnV = 0
[ 155.638300] FSC = 0x04: level 0 translation fault
[ 155.667607] FSC = 0x04: level 0 translation fault
[ 155.673165] EA = 0, S1PTW = 0
[ 155.686121] Data abort info:
[ 155.697270] FSC = 0x04: level 0 translation fault
[ 155.709990] Data abort info:
[ 155.714312] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 155.736268] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 155.745374] Data abort info:
[ 155.763508] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 155.770677] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 155.791894] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 155.801343] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 155.815914] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 155.829811] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 155.847633] [fefefefefefefeee] address between user and kernel address ranges
[ 155.859395] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 155.872011] [fefefefefefefeee] address between user and kernel address ranges
[ 155.893911] [fefefefefefefeee] address between user and kernel address ranges
[ 155.901649] Internal error: Oops: 0000000096000004 [#1] SMP
[ 155.907297] Modules linked in: des_generic cbc libdes sa2ul authenc onboard_usb_dev rpmsg_ctrl rpmsg_char phy_cadence_torrent phy_cadence_sierra rtc_tps6594 tps6594_esm pinctrl_tps6594 tps6594_pfsm gpio_regmap tps6594_regulator ti_am335x_adc pru_rproc kfifo_buf irq_pruss_intc cdns3 cdns_pltfrm cdns_usb_common snd_soc_j721e_evm display_connector phy_can_transceiver omap_mailbox phy_j721e_wiz ti_k3_r5_remoteproc tps6594_i2c tps6594_core at24 tidss k3_j72xx_bandgap drm_client_lib drm_dma_helper cdns_mhdp8546 ti_am335x_tscadc m_can_platform snd_soc_davinci_mcasp drm_display_helper pruss snd_soc_ti_udma m_can snd_soc_ti_edma drm_kms_helper ti_j721e_ufs snd_soc_ti_sdma ti_k3_dsp_remoteproc snd_soc_pcm3168a_i2c cdns3_ti can_dev snd_soc_pcm3168a rti_wdt overlay cfg80211 rfkill fuse drm backlight ipv6
[ 155.977452] CPU: 0 UID: 0 PID: 1252 Comm: cryptomgr_test Not tainted 6.14.0-rc7-next-20250324-build-configs-00001-g82a16a3a2a73-dirty #3 PREEMPT
[ 155.990466] Hardware name: Texas Instruments J721e EVM (DT)
[ 155.996022] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 156.002966] pc : crypto_ahash_import+0x18/0x68
[ 156.007409] lr : sa_sha_import+0x48/0x60 [sa2ul]
[ 156.012024] sp : ffff8000853336f0
[ 156.015326] x29: ffff8000853336f0 x28: 0000000000000000 x27: 0000000000000000
[ 156.022447] x26: ffff000807ed6c00 x25: ffff8000853337e8 x24: 0000000000000000
[ 156.029568] x23: ffff000801307440 x22: ffff800085333c08 x21: ffff000801307400
[ 156.036688] x20: ffff800081510c10 x19: ffff8000814e13e8 x18: 00000000ffffffff
[ 156.043809] x17: 0000000000000000 x16: 0010000000000000 x15: ffff800085333740
[ 156.050929] x14: ffff800105333aa7 x13: 0000000000000000 x12: 0000000000000000
[ 156.058049] x11: 0000000000000000 x10: ffff00087f7c8308 x9 : ffff80007be3aa38
[ 156.065170] x8 : ffff000807ed6d50 x7 : fefefefefefefefe x6 : 0101010101010101
[ 156.072291] x5 : 00000000ffffff8d x4 : 0000000000000000 x3 : fefefefefefefefe
[ 156.079411] x2 : ffff000807ed6c00 x1 : ffff000803fc9c00 x0 : ffff000807ed6c90
[ 156.086531] Call trace:
[ 156.088966] crypto_ahash_import+0x18/0x68 (P)
[ 156.093399] sa_sha_import+0x48/0x60 [sa2ul]
[ 156.097658] crypto_ahash_import+0x54/0x68
[ 156.101744] test_ahash_vec_cfg+0x638/0x800
[ 156.105916] test_hash_vec+0xbc/0x230
[ 156.109568] __alg_test_hash+0x288/0x3d8
[ 156.113478] alg_test_hash+0x108/0x1a0
[ 156.117217] alg_test+0x148/0x658
[ 156.120520] cryptomgr_test+0x2c/0x50
[ 156.124171] kthread+0x134/0x218
[ 156.127390] ret_from_fork+0x10/0x20
[ 156.130958] Code: d503233f a9bf7bfd 910003fd f9401003 (385f0064)
[ 156.137034] ---[ end trace 0000000000000000 ]---
[ 156.147224] Internal error: Oops: 0000000096000004 [#2] SMP
[ 156.152873] Modules linked in: des_generic cbc libdes sa2ul authenc onboard_usb_dev rpmsg_ctrl rpmsg_char phy_cadence_torrent phy_cadence_sierra rtc_tps6594 tps6594_esm pinctrl_tps6594 tps6594_pfsm gpio_regmap tps6594_regulator ti_am335x_adc pru_rproc kfifo_buf irq_pruss_intc cdns3 cdns_pltfrm cdns_usb_common snd_soc_j721e_evm display_connector phy_can_transceiver omap_mailbox phy_j721e_wiz ti_k3_r5_remoteproc tps6594_i2c tps6594_core at24 tidss k3_j72xx_bandgap drm_client_lib drm_dma_helper cdns_mhdp8546 ti_am335x_tscadc m_can_platform snd_soc_davinci_mcasp drm_display_helper pruss snd_soc_ti_udma m_can snd_soc_ti_edma drm_kms_helper ti_j721e_ufs snd_soc_ti_sdma ti_k3_dsp_remoteproc snd_soc_pcm3168a_i2c cdns3_ti can_dev snd_soc_pcm3168a rti_wdt overlay cfg80211 rfkill fuse drm backlight ipv6
[ 156.223015] CPU: 0 UID: 0 PID: 1253 Comm: cryptomgr_test Tainted: G D 6.14.0-rc7-next-20250324-build-configs-00001-g82a16a3a2a73-dirty #3 PREEMPT
[ 156.237588] Tainted: [D]=DIE
[ 156.240456] Hardware name: Texas Instruments J721e EVM (DT)
[ 156.246010] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 156.252953] pc : crypto_ahash_import+0x18/0x68
[ 156.257389] lr : sa_sha_import+0x48/0x60 [sa2ul]
[ 156.261998] sp : ffff8000853cb6f0
[ 156.265299] x29: ffff8000853cb6f0 x28: 0000000000000000 x27: 0000000000000000
[ 156.272420] x26: ffff0008033aba00 x25: ffff8000853cb7e8 x24: 0000000000000000
[ 156.279540] x23: ffff000808b8a040 x22: ffff8000853cbc08 x21: ffff000808b8a000
[ 156.286660] x20: ffff800081510970 x19: ffff8000814e13e8 x18: ffff0008033abb28
[ 156.293781] x17: 0000000000000050 x16: 000000000000005e x15: fefefefefefefefe
[ 156.300900] x14: fefefefefefefefe x13: fefefefefefefefe x12: fefefefefefefefe
[ 156.308021] x11: fefefefefefefefe x10: fefefefefefefefe x9 : ffff80007be3aa38
[ 156.315141] x8 : ffff0008033abbb0 x7 : fefefefefefefefe x6 : 0101010101010101
[ 156.322261] x5 : 00000000ffffff8d x4 : 0000000000000000 x3 : fefefefefefefefe
[ 156.329381] x2 : ffff0008033aba00 x1 : ffff00080337da00 x0 : ffff0008033aba90
[ 156.336502] Call trace:
[ 156.338937] crypto_ahash_import+0x18/0x68 (P)
[ 156.343371] sa_sha_import+0x48/0x60 [sa2ul]
[ 156.347630] crypto_ahash_import+0x54/0x68
[ 156.351716] test_ahash_vec_cfg+0x638/0x800
[ 156.355888] test_hash_vec+0xbc/0x230
[ 156.359538] __alg_test_hash+0x288/0x3d8
[ 156.363449] alg_test_hash+0x108/0x1a0
[ 156.367187] alg_test+0x148/0x658
[ 156.370491] cryptomgr_test+0x2c/0x50
[ 156.374141] kthread+0x134/0x218
[ 156.377359] ret_from_fork+0x10/0x20
[ 156.380924] Code: d503233f a9bf7bfd 910003fd f9401003 (385f0064)
[ 156.386999] ---[ end trace 0000000000000000 ]---
[ 156.391626] Internal error: Oops: 0000000096000004 [#3] SMP
[ 156.397270] Modules linked in: des_generic cbc libdes sa2ul authenc onboard_usb_dev rpmsg_ctrl rpmsg_char phy_cadence_torrent phy_cadence_sierra rtc_tps6594 tps6594_esm pinctrl_tps6594 tps6594_pfsm gpio_regmap tps6594_regulator ti_am335x_adc pru_rproc kfifo_buf irq_pruss_intc cdns3 cdns_pltfrm cdns_usb_common snd_soc_j721e_evm display_connector phy_can_transceiver omap_mailbox phy_j721e_wiz ti_k3_r5_remoteproc tps6594_i2c tps6594_core at24 tidss k3_j72xx_bandgap drm_client_lib drm_dma_helper cdns_mhdp8546 ti_am335x_tscadc m_can_platform snd_soc_davinci_mcasp drm_display_helper pruss snd_soc_ti_udma m_can snd_soc_ti_edma drm_kms_helper ti_j721e_ufs snd_soc_ti_sdma ti_k3_dsp_remoteproc snd_soc_pcm3168a_i2c cdns3_ti can_dev snd_soc_pcm3168a rti_wdt overlay cfg80211 rfkill fuse drm backlight ipv6
[ 156.467403] CPU: 0 UID: 0 PID: 1250 Comm: cryptomgr_test Tainted: G D 6.14.0-rc7-next-20250324-build-configs-00001-g82a16a3a2a73-dirty #3 PREEMPT
[ 156.481974] Tainted: [D]=DIE
[ 156.484842] Hardware name: Texas Instruments J721e EVM (DT)
[ 156.490396] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 156.497340] pc : crypto_ahash_import+0x18/0x68
[ 156.501773] lr : sa_sha_import+0x48/0x60 [sa2ul]
[ 156.506379] sp : ffff8000851d36f0
[ 156.509680] x29: ffff8000851d36f0 x28: 0000000000000000 x27: 0000000000000000
[ 156.516800] x26: ffff000803c75400 x25: ffff8000851d37e8 x24: 0000000000000000
[ 156.523920] x23: ffff000808b8b040 x22: ffff8000851d3c08 x21: ffff000808b8b000
[ 156.531040] x20: ffff800081510e98 x19: ffff8000814e13e8 x18: 00000000ffffffff
[ 156.538160] x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000851d3740
[ 156.545280] x14: ffff8001051d3aa7 x13: 0000000000000000 x12: 0000000000000000
[ 156.552400] x11: 0000000000000000 x10: ffff00087f7c8308 x9 : ffff80007be3aa38
[ 156.559520] x8 : ffff000803c75548 x7 : fefefefefefefefe x6 : 0101010101010101
[ 156.566641] x5 : 00000000ffffff8d x4 : 0000000000000000 x3 : fefefefefefefefe
[ 156.573761] x2 : ffff000803c75400 x1 : ffff000812686a00 x0 : ffff000803c75490
[ 156.580881] Call trace:
[ 156.583316] crypto_ahash_import+0x18/0x68 (P)
[ 156.587747] sa_sha_import+0x48/0x60 [sa2ul]
[ 156.592006] crypto_ahash_import+0x54/0x68
[ 156.596090] test_ahash_vec_cfg+0x638/0x800
[ 156.600261] test_hash_vec+0xbc/0x230
[ 156.603912] __alg_test_hash+0x288/0x3d8
[ 156.607822] alg_test_hash+0x108/0x1a0
[ 156.611560] alg_test+0x148/0x658
[ 156.614864] cryptomgr_test+0x2c/0x50
[ 156.618514] kthread+0x134/0x218
[ 156.621731] ret_from_fork+0x10/0x20
[ 156.625298] Code: d503233f a9bf7bfd 910003fd f9401003 (385f0064)
[ 156.631372] ---[ end trace 0000000000000000 ]---
Regards,
Manorit
^ permalink raw reply [flat|nested] 42+ messages in thread
* [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request
2025-03-26 10:00 ` Manorit Chawdhry
@ 2025-03-26 10:05 ` Herbert Xu
2025-03-26 12:31 ` Manorit Chawdhry
0 siblings, 1 reply; 42+ messages in thread
From: Herbert Xu @ 2025-03-26 10:05 UTC (permalink / raw)
To: Manorit Chawdhry
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
On Wed, Mar 26, 2025 at 03:30:27PM +0530, Manorit Chawdhry wrote:
>
> Thanks for the quick fix, though now I see an error in import rather
> than in init, where it was previously.
Oops, I removed one line too many from the import function. It
should set the tfm just like init:
---8<---
Rather than setting up a request by hand, use the correct API helpers
to set up the new request. This is because the API helpers will set up
chaining.
Also change the fallback allocation to explicitly request a sync
algorithm, as this driver will crash if given an async one.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
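For reference, the helper calls map onto the fields the old code poked
by hand, roughly as follows (a sketch; per the description above, the
helpers additionally initialise the request's chain list, which the
open-coded assignments never did):

        ahash_request_set_callback(subreq, flags, NULL, NULL);
        /* roughly: subreq->base.flags = flags, plus the callback/data
         * pointers and the chain-list initialisation */

        ahash_request_set_crypt(subreq, src, result, nbytes);
        /* roughly: subreq->src = src; subreq->result = result;
         * subreq->nbytes = nbytes */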
diff --git a/drivers/crypto/sa2ul.c b/drivers/crypto/sa2ul.c
index 091612b066f1..fdc0b2486069 100644
--- a/drivers/crypto/sa2ul.c
+++ b/drivers/crypto/sa2ul.c
@@ -1415,22 +1415,13 @@ static int sa_sha_run(struct ahash_request *req)
(auth_len >= SA_UNSAFE_DATA_SZ_MIN &&
auth_len <= SA_UNSAFE_DATA_SZ_MAX)) {
struct ahash_request *subreq = &rctx->fallback_req;
- int ret = 0;
+ int ret;
ahash_request_set_tfm(subreq, ctx->fallback.ahash);
- subreq->base.flags = req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP;
+ ahash_request_set_callback(subreq, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
+ ahash_request_set_crypt(subreq, req->src, req->result, auth_len);
- crypto_ahash_init(subreq);
-
- subreq->nbytes = auth_len;
- subreq->src = req->src;
- subreq->result = req->result;
-
- ret |= crypto_ahash_update(subreq);
-
- subreq->nbytes = 0;
-
- ret |= crypto_ahash_final(subreq);
+ ret = crypto_ahash_digest(subreq);
return ret;
}
@@ -1502,8 +1493,7 @@ static int sa_sha_cra_init_alg(struct crypto_tfm *tfm, const char *alg_base)
return ret;
if (alg_base) {
- ctx->shash = crypto_alloc_shash(alg_base, 0,
- CRYPTO_ALG_NEED_FALLBACK);
+ ctx->shash = crypto_alloc_shash(alg_base, 0, 0);
if (IS_ERR(ctx->shash)) {
dev_err(sa_k3_dev, "base driver %s couldn't be loaded\n",
alg_base);
@@ -1511,8 +1501,7 @@ static int sa_sha_cra_init_alg(struct crypto_tfm *tfm, const char *alg_base)
}
/* for fallback */
ctx->fallback.ahash =
- crypto_alloc_ahash(alg_base, 0,
- CRYPTO_ALG_NEED_FALLBACK);
+ crypto_alloc_ahash(alg_base, 0, CRYPTO_ALG_ASYNC);
if (IS_ERR(ctx->fallback.ahash)) {
dev_err(ctx->dev_data->dev,
"Could not load fallback driver\n");
@@ -1546,54 +1535,38 @@ static int sa_sha_init(struct ahash_request *req)
crypto_ahash_digestsize(tfm), rctx);
ahash_request_set_tfm(&rctx->fallback_req, ctx->fallback.ahash);
- rctx->fallback_req.base.flags =
- req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP;
+ ahash_request_set_callback(&rctx->fallback_req, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
+ ahash_request_set_crypt(&rctx->fallback_req, NULL, NULL, 0);
return crypto_ahash_init(&rctx->fallback_req);
}
static int sa_sha_update(struct ahash_request *req)
{
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
struct sa_sha_req_ctx *rctx = ahash_request_ctx(req);
- struct sa_tfm_ctx *ctx = crypto_ahash_ctx(tfm);
- ahash_request_set_tfm(&rctx->fallback_req, ctx->fallback.ahash);
- rctx->fallback_req.base.flags =
- req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP;
- rctx->fallback_req.nbytes = req->nbytes;
- rctx->fallback_req.src = req->src;
+ ahash_request_set_callback(&rctx->fallback_req, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
+ ahash_request_set_crypt(&rctx->fallback_req, req->src, NULL, req->nbytes);
return crypto_ahash_update(&rctx->fallback_req);
}
static int sa_sha_final(struct ahash_request *req)
{
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
struct sa_sha_req_ctx *rctx = ahash_request_ctx(req);
- struct sa_tfm_ctx *ctx = crypto_ahash_ctx(tfm);
- ahash_request_set_tfm(&rctx->fallback_req, ctx->fallback.ahash);
- rctx->fallback_req.base.flags =
- req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP;
- rctx->fallback_req.result = req->result;
+ ahash_request_set_callback(&rctx->fallback_req, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
+ ahash_request_set_crypt(&rctx->fallback_req, NULL, req->result, 0);
return crypto_ahash_final(&rctx->fallback_req);
}
static int sa_sha_finup(struct ahash_request *req)
{
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
struct sa_sha_req_ctx *rctx = ahash_request_ctx(req);
- struct sa_tfm_ctx *ctx = crypto_ahash_ctx(tfm);
- ahash_request_set_tfm(&rctx->fallback_req, ctx->fallback.ahash);
- rctx->fallback_req.base.flags =
- req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP;
-
- rctx->fallback_req.nbytes = req->nbytes;
- rctx->fallback_req.src = req->src;
- rctx->fallback_req.result = req->result;
+ ahash_request_set_callback(&rctx->fallback_req, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
+ ahash_request_set_crypt(&rctx->fallback_req, req->src, req->result, req->nbytes);
return crypto_ahash_finup(&rctx->fallback_req);
}
@@ -1605,8 +1578,7 @@ static int sa_sha_import(struct ahash_request *req, const void *in)
struct sa_tfm_ctx *ctx = crypto_ahash_ctx(tfm);
ahash_request_set_tfm(&rctx->fallback_req, ctx->fallback.ahash);
- rctx->fallback_req.base.flags = req->base.flags &
- CRYPTO_TFM_REQ_MAY_SLEEP;
+ ahash_request_set_callback(&rctx->fallback_req, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
return crypto_ahash_import(&rctx->fallback_req, in);
}
@@ -1614,12 +1586,9 @@ static int sa_sha_import(struct ahash_request *req, const void *in)
static int sa_sha_export(struct ahash_request *req, void *out)
{
struct sa_sha_req_ctx *rctx = ahash_request_ctx(req);
- struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
- struct sa_tfm_ctx *ctx = crypto_ahash_ctx(tfm);
struct ahash_request *subreq = &rctx->fallback_req;
- ahash_request_set_tfm(subreq, ctx->fallback.ahash);
- subreq->base.flags = req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP;
+ ahash_request_set_callback(subreq, req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
return crypto_ahash_export(subreq, out);
}
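A side note on the sa_sha_run() hunk above: folding the open-coded
init/update/final sequence into a single crypto_ahash_digest() call
also drops the ret |= error accumulation, which could OR distinct
error codes into a meaningless value. In sketch form:

        /* before: three calls into the fallback, OR-ed error codes */
        crypto_ahash_init(subreq);
        ret |= crypto_ahash_update(subreq);
        ret |= crypto_ahash_final(subreq);

        /* after: one call, one error code */
        ret = crypto_ahash_digest(subreq);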
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request
2025-03-26 10:05 ` [v2 PATCH] " Herbert Xu
@ 2025-03-26 12:31 ` Manorit Chawdhry
2025-03-26 13:06 ` Herbert Xu
2025-04-11 5:34 ` [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request Manorit Chawdhry
0 siblings, 2 replies; 42+ messages in thread
From: Manorit Chawdhry @ 2025-03-26 12:31 UTC (permalink / raw)
To: Herbert Xu
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
Hi Herbert,
On 18:05-20250326, Herbert Xu wrote:
> On Wed, Mar 26, 2025 at 03:30:27PM +0530, Manorit Chawdhry wrote:
> >
> > Thanks for the quick fix, though now I see an error in import rather
> > than in init, where it was previously.
>
> Oops, I removed one line too many from the import function. It
> should set the tfm just like init:
>
> ---8<---
> Rather than setting up a request by hand, use the correct API helpers
> to set up the new request. This is because the API helpers will set up
> chaining.
>
> Also change the fallback allocation to explicitly request a sync
> algorithm, as this driver will crash if given an async one.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Thanks for the fix! It still fails, though, probably due to the
introduction of multibuffer hash testing in "crypto: testmgr - Add
multibuffer hash testing", but I assume that is something we will have
to fix in our driver.
[ 32.408283] alg: ahash: sha1-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: inplace_two_sglists may_sleep use_digest multibuffer(9/13/uneven) src_divs=[100.0%@+860] key_offset=17"
[...]
[ 32.885927] alg: ahash: sha512-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: use_digest multibuffer(6/9/uneven) nosimd src_divs=[93.34%@+3634, 6.66%@+16] iv_offset=9 key_offset=70"
[...]
[ 33.135286] alg: ahash: sha256-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: inplace_two_sglists may_sleep use_digest multibuffer(15/16/uneven) src_divs=[100.0%@alignmask+26] key_offset=1"
Tested-by: Manorit Chawdhry <m-chawdhry@ti.com>
Regards,
Manorit
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request
2025-03-26 12:31 ` Manorit Chawdhry
@ 2025-03-26 13:06 ` Herbert Xu
2025-03-26 13:07 ` Herbert Xu
2025-03-27 7:34 ` Manorit Chawdhry
2025-04-11 5:34 ` [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request Manorit Chawdhry
1 sibling, 2 replies; 42+ messages in thread
From: Herbert Xu @ 2025-03-26 13:06 UTC (permalink / raw)
To: Manorit Chawdhry
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
On Wed, Mar 26, 2025 at 06:01:20PM +0530, Manorit Chawdhry wrote:
>
> Thanks for the fix! It still fails, though, probably due to the
> introduction of multibuffer hash testing in "crypto: testmgr - Add
> multibuffer hash testing", but I assume that is something we will have
> to fix in our driver.
>
> [ 32.408283] alg: ahash: sha1-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: inplace_two_sglists may_sleep use_digest multibuffer(9/13/uneven) src_divs=[100.0%@+860] key_offset=17"
> [...]
> [ 32.885927] alg: ahash: sha512-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: use_digest multibuffer(6/9/uneven) nosimd src_divs=[93.34%@+3634, 6.66%@+16] iv_offset=9 key_offset=70"
> [...]
> [ 33.135286] alg: ahash: sha256-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: inplace_two_sglists may_sleep use_digest multibuffer(15/16/uneven) src_divs=[100.0%@alignmask+26] key_offset=1"
There are no other messages?
This means that one of the filler test requests triggered an EINVAL
from your driver. A filler request in an uneven test can range from
0 to 2 * PAGE_SIZE bytes long.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request
2025-03-26 13:06 ` Herbert Xu
@ 2025-03-26 13:07 ` Herbert Xu
2025-03-27 7:34 ` Manorit Chawdhry
1 sibling, 0 replies; 42+ messages in thread
From: Herbert Xu @ 2025-03-26 13:07 UTC (permalink / raw)
To: Manorit Chawdhry
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
On Wed, Mar 26, 2025 at 09:06:59PM +0800, Herbert Xu wrote:
>
> This means that one of the filler test requests triggered an EINVAL
> from your driver. A filler request in an uneven test can range from
> 0 to 2 * PAGE_SIZE bytes long.
Make that 0 to 16 * PAGE_SIZE bytes.
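With 4 KiB pages that works out to:

        16 * PAGE_SIZE = 16 * 4096 = 65536 bytes (64 KiB)

so a single filler request can be several times larger than one
2 * PAGE_SIZE test buffer.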
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request
2025-03-26 13:06 ` Herbert Xu
2025-03-26 13:07 ` Herbert Xu
@ 2025-03-27 7:34 ` Manorit Chawdhry
2025-03-27 8:15 ` Manorit Chawdhry
1 sibling, 1 reply; 42+ messages in thread
From: Manorit Chawdhry @ 2025-03-27 7:34 UTC (permalink / raw)
To: Herbert Xu
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
Hi Herbert,
On 21:06-20250326, Herbert Xu wrote:
> On Wed, Mar 26, 2025 at 06:01:20PM +0530, Manorit Chawdhry wrote:
> >
> > Thanks for the fix! It still fails, though, probably due to the
> > introduction of multibuffer hash testing in "crypto: testmgr - Add
> > multibuffer hash testing", but I assume that is something we will have
> > to fix in our driver.
> >
> > [ 32.408283] alg: ahash: sha1-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: inplace_two_sglists may_sleep use_digest multibuffer(9/13/uneven) src_divs=[100.0%@+860] key_offset=17"
> > [...]
> > [ 32.885927] alg: ahash: sha512-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: use_digest multibuffer(6/9/uneven) nosimd src_divs=[93.34%@+3634, 6.66%@+16] iv_offset=9 key_offset=70"
> > [...]
> > [ 33.135286] alg: ahash: sha256-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: inplace_two_sglists may_sleep use_digest multibuffer(15/16/uneven) src_divs=[100.0%@alignmask+26] key_offset=1"
>
> There are no other messages?
This is the full failure log:
root@j721e-evm:~# modprobe sa2ul
[59910.170612] omap_rng 4e10000.rng: Random Number Generator ver. 241b34c
root@j721e-evm:~# [59910.331792] alg: ahash: sha1-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: may_sleep use_digest multibuffer(0/10/uneven) src_divs=[53.50%@+816, 14.50%@+2101, 32.0%@+1281] key_offset=114"
[59910.354517] alg: ahash: sha512-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: may_sleep use_digest multibuffer(0/6/uneven) src_divs=[3.96%@+26, 88.54%@+3968, 7.50%@+20] dst_divs=[100.0%@alignmask+2] key_offset=33"
[59910.454646] alg: ahash: sha256-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: use_digest multibuffer(4/14/uneven) nosimd src_divs=[50.0%@+29, 25.0%@+28, 25.0%@+4] key_offset=65"
[59910.494415] alg: self-tests for sha1 using sha1-sa2ul failed (rc=-22)
[59910.494424] ------------[ cut here ]------------
[59910.505522] alg: self-tests for sha1 using sha1-sa2ul failed (rc=-22)
[59910.512463] alg: self-tests for sha512 using sha512-sa2ul failed (rc=-22)
[59910.548673] ------------[ cut here ]------------
[59910.560115] alg: self-tests for sha512 using sha512-sa2ul failed (rc=-22)
[59910.577041] WARNING: CPU: 0 PID: 1959 at crypto/testmgr.c:5997 alg_test+0x5d0/0x658
[59910.591470] Modules linked in: sa2ul authenc des_generic libdes cbc onboard_usb_dev rpmsg_ctrl rpmsg_char phy_cadence_torrent phy_cadence_sierra rtc_tps6594 tps6594_pfsm tps6594_esm pinctrl_tps6594 tps6594_regulator gpio_regmap ti_am335x_adc kfifo_buf pru_rproc irq_pruss_intc cdns3 cdns_usb_common cdns_pltfrm snd_soc_j721e_evm display_connector phy_j721e_wiz phy_can_transceiver omap_mailbox ti_k3_r5_remoteproc at24 tps6594_i2c tps6594_core tidss drm_client_lib k3_j72xx_bandgap drm_dma_helper cdns_mhdp8546 m_can_platform drm_display_helper m_can ti_am335x_tscadc pruss drm_kms_helper snd_soc_pcm3168a_i2c snd_soc_davinci_mcasp can_dev snd_soc_pcm3168a snd_soc_ti_udma snd_soc_ti_edma ti_j721e_ufs ti_k3_dsp_remoteproc cdns3_ti snd_soc_ti_sdma rti_wdt overlay cfg80211 rfkill fuse drm backlight ipv6 [last unloaded: authenc]
[59910.663794] CPU: 0 UID: 0 PID: 1959 Comm: cryptomgr_test Tainted: G W 6.14.0-rc1-build-configs-00186-g8b54e6a8f415-dirty #1
[59910.676286] Tainted: [W]=WARN
[59910.679241] Hardware name: Texas Instruments J721e EVM (DT)
[59910.684797] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[59910.691740] pc : alg_test+0x5d0/0x658
[59910.695392] lr : alg_test+0x5d0/0x658
[59910.699042] sp : ffff80008515bd40
[59910.702343] x29: ffff80008515bde0 x28: 0000000000000000 x27: 0000000000000000
[59910.709464] x26: 00000000ffffffea x25: 00000000ffffffff x24: 000000000000017b
[59910.716585] x23: ffff80008384be88 x22: 000000000000118f x21: ffff0008032c5a80
[59910.723705] x20: ffff0008032c5a00 x19: ffff8000814bf320 x18: 0000000002004c00
[59910.730825] x17: 0000000002004400 x16: 00000000000000ee x15: cb299d3b567fbd0e
[59910.737945] x14: 6cc9dff4249846de x13: 0000000000000000 x12: 0000000000020005
[59910.745066] x11: 000000e200000016 x10: 0000000000000af0 x9 : ffff8000800f8ba0
[59910.752186] x8 : ffff000809b10b50 x7 : 00000000005285f6 x6 : 000000000000001e
[59910.759305] x5 : 0000000000000000 x4 : 0000000000000002 x3 : 0000000000000208
[59910.766425] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000809b10000
[59910.773545] Call trace:
[59910.775981] alg_test+0x5d0/0x658 (P)
[59910.779634] cryptomgr_test+0x2c/0x50
[59910.783286] kthread+0x134/0x218
[59910.786504] ret_from_fork+0x10/0x20
[59910.790070] ---[ end trace 0000000000000000 ]---
[59910.799050] alg: self-tests for sha256 using sha256-sa2ul failed (rc=-22)
[59910.799057] ------------[ cut here ]------------
[59910.810440] alg: self-tests for sha256 using sha256-sa2ul failed (rc=-22)
[59910.810468] WARNING: CPU: 0 PID: 1962 at crypto/testmgr.c:5997 alg_test+0x5d0/0x658
[59910.824882] Modules linked in: sa2ul authenc des_generic libdes cbc onboard_usb_dev rpmsg_ctrl rpmsg_char phy_cadence_torrent phy_cadence_sierra rtc_tps6594 tps6594_pfsm tps6594_esm pinctrl_tps6594 tps6594_regulator gpio_regmap ti_am335x_adc kfifo_buf pru_rproc irq_pruss_intc cdns3 cdns_usb_common cdns_pltfrm snd_soc_j721e_evm display_connector phy_j721e_wiz phy_can_transceiver omap_mailbox ti_k3_r5_remoteproc at24 tps6594_i2c tps6594_core tidss drm_client_lib k3_j72xx_bandgap drm_dma_helper cdns_mhdp8546 m_can_platform drm_display_helper m_can ti_am335x_tscadc pruss drm_kms_helper snd_soc_pcm3168a_i2c snd_soc_davinci_mcasp can_dev snd_soc_pcm3168a snd_soc_ti_udma snd_soc_ti_edma ti_j721e_ufs ti_k3_dsp_remoteproc cdns3_ti snd_soc_ti_sdma rti_wdt overlay cfg80211 rfkill fuse drm backlight ipv6 [last unloaded: authenc]
[59910.897196] CPU: 0 UID: 0 PID: 1962 Comm: cryptomgr_test Tainted: G W 6.14.0-rc1-build-configs-00186-g8b54e6a8f415-dirty #1
[59910.909688] Tainted: [W]=WARN
[59910.912642] Hardware name: Texas Instruments J721e EVM (DT)
[59910.918198] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[59910.925141] pc : alg_test+0x5d0/0x658
[59910.928792] lr : alg_test+0x5d0/0x658
[59910.932443] sp : ffff800085313d40
[59910.935744] x29: ffff800085313de0 x28: 0000000000000000 x27: 0000000000000000
[59910.942865] x26: 00000000ffffffea x25: 00000000ffffffff x24: 000000000000018d
[59910.949986] x23: ffff80008384be88 x22: 000000000000118f x21: ffff000808b03e80
[59910.957106] x20: ffff000808b03e00 x19: ffff8000814bf320 x18: 00000000fffffffe
[59910.964226] x17: ffff8007fd27e000 x16: ffff800080000000 x15: ffff8000852bb8e0
[59910.971346] x14: 0000000000000000 x13: ffff800083814452 x12: 0000000000000000
[59910.978465] x11: ffff00087f7a4d80 x10: 0000000000000af0 x9 : ffff8000800f8ba0
[59910.985586] x8 : ffff0008052e8b50 x7 : 0000000000019aff x6 : 000000000000000d
[59910.992705] x5 : 0000000000000000 x4 : 0000000000000002 x3 : 0000000000000208
[59910.999825] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0008052e8000
[59911.006946] Call trace:
[59911.009382] alg_test+0x5d0/0x658 (P)
[59911.013034] cryptomgr_test+0x2c/0x50
[59911.016685] kthread+0x134/0x218
[59911.019905] ret_from_fork+0x10/0x20
[59911.023469] ---[ end trace 0000000000000000 ]---
[59911.028107] WARNING: CPU: 0 PID: 1961 at crypto/testmgr.c:5997 alg_test+0x5d0/0x658
[59911.035749] Modules linked in: sa2ul authenc des_generic libdes cbc onboard_usb_dev rpmsg_ctrl rpmsg_char phy_cadence_torrent phy_cadence_sierra rtc_tps6594 tps6594_pfsm tps6594_esm pinctrl_tps6594 tps6594_regulator gpio_regmap ti_am335x_adc kfifo_buf pru_rproc irq_pruss_intc cdns3 cdns_usb_common cdns_pltfrm snd_soc_j721e_evm display_connector phy_j721e_wiz phy_can_transceiver omap_mailbox ti_k3_r5_remoteproc at24 tps6594_i2c tps6594_core tidss drm_client_lib k3_j72xx_bandgap drm_dma_helper cdns_mhdp8546 m_can_platform drm_display_helper m_can ti_am335x_tscadc pruss drm_kms_helper snd_soc_pcm3168a_i2c snd_soc_davinci_mcasp can_dev snd_soc_pcm3168a snd_soc_ti_udma snd_soc_ti_edma ti_j721e_ufs ti_k3_dsp_remoteproc cdns3_ti snd_soc_ti_sdma rti_wdt overlay cfg80211 rfkill fuse drm backlight ipv6 [last unloaded: authenc]
[59911.108050] CPU: 0 UID: 0 PID: 1961 Comm: cryptomgr_test Tainted: G W 6.14.0-rc1-build-configs-00186-g8b54e6a8f415-dirty #1
[59911.120541] Tainted: [W]=WARN
[59911.123495] Hardware name: Texas Instruments J721e EVM (DT)
[59911.129049] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[59911.135992] pc : alg_test+0x5d0/0x658
[59911.139643] lr : alg_test+0x5d0/0x658
[59911.143293] sp : ffff8000852bbd40
[59911.146594] x29: ffff8000852bbde0 x28: 0000000000000000 x27: 0000000000000000
[59911.153714] x26: 00000000ffffffea x25: 00000000ffffffff x24: 000000000000017f
[59911.160835] x23: ffff80008384be88 x22: 000000000000118f x21: ffff000803f5aa80
[59911.167954] x20: ffff000803f5aa00 x19: ffff8000814bf320 x18: 00000000fffffffe
[59911.175074] x17: ffff8007fd27e000 x16: ffff800080000000 x15: 0000000000000000
[59911.182194] x14: 00003d0971c5fa00 x13: ffffffff919fcffd x12: 0000000000000000
[59911.189313] x11: ffff00087f7a4d80 x10: 0000000000000af0 x9 : ffff8000800f8ba0
[59911.196433] x8 : ffff000809a37450 x7 : 0000000000026a7c x6 : 000000000000000f
[59911.203553] x5 : 0000000000000000 x4 : 0000000000000002 x3 : 0000000000000208
[59911.210672] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000809a36900
[59911.217793] Call trace:
[59911.220227] alg_test+0x5d0/0x658 (P)
[59911.223879] cryptomgr_test+0x2c/0x50
[59911.227529] kthread+0x134/0x218
[59911.230746] ret_from_fork+0x10/0x20
[59911.234310] ---[ end trace 0000000000000000 ]---
>
> This means that one of the filler test requests triggered an EINVAL
> from your driver. A filler request in an uneven test can range from
> 0 to 2 * PAGE_SIZE bytes long.
>
I tracked it down and see [0] returning -EINVAL. Do you have any
insight into what changed such that it no longer works?
[0]: https://github.com/torvalds/linux/blob/38fec10eb60d687e30c8c6b5420d86e8149f7557/drivers/crypto/sa2ul.c#L1177
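For context, the failure at that line comes down to the SG entry count
computed for the requested length before DMA mapping. As a sketch
(assuming the count comes from sg_nents_for_len()):

        sg_nents = sg_nents_for_len(src, req->size);
        /* returns -EINVAL when the SG list cannot cover req->size;
         * stored unchecked, it ends up in sgt.orig_nents and makes
         * dma_map_sgtable() fail */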
Regards,
Manorit
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request
2025-03-27 7:34 ` Manorit Chawdhry
@ 2025-03-27 8:15 ` Manorit Chawdhry
2025-03-27 8:23 ` [PATCH] crypto: testmgr - Initialise full_sgl properly Herbert Xu
0 siblings, 1 reply; 42+ messages in thread
From: Manorit Chawdhry @ 2025-03-27 8:15 UTC (permalink / raw)
To: Herbert Xu
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
Hi Herbert,
On 13:04-20250327, Manorit Chawdhry wrote:
> Hi Herbert,
>
> On 21:06-20250326, Herbert Xu wrote:
> > On Wed, Mar 26, 2025 at 06:01:20PM +0530, Manorit Chawdhry wrote:
> > >
> > > Thanks for the fix! It still fails, though, probably due to the
> > > introduction of multibuffer hash testing in "crypto: testmgr - Add
> > > multibuffer hash testing", but I assume that is something we will have
> > > to fix in our driver.
> > >
> > > [ 32.408283] alg: ahash: sha1-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: inplace_two_sglists may_sleep use_digest multibuffer(9/13/uneven) src_divs=[100.0%@+860] key_offset=17"
> > > [...]
> > > [ 32.885927] alg: ahash: sha512-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: use_digest multibuffer(6/9/uneven) nosimd src_divs=[93.34%@+3634, 6.66%@+16] iv_offset=9 key_offset=70"
> > > [...]
> > > [ 33.135286] alg: ahash: sha256-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: inplace_two_sglists may_sleep use_digest multibuffer(15/16/uneven) src_divs=[100.0%@alignmask+26] key_offset=1"
[..]
> >
> > This means that one of the filler test requests triggered an EINVAL
> > from your driver. A filler request in an uneven test can range from
> > 0 to 2 * PAGE_SIZE bytes long.
> >
>
> I tracked it down and see [0] returning -EINVAL. Do you have any
> insight into what changed such that it no longer works?
>
> [0]: https://github.com/torvalds/linux/blob/38fec10eb60d687e30c8c6b5420d86e8149f7557/drivers/crypto/sa2ul.c#L1177
Added some more prints.
diff --git a/drivers/crypto/sa2ul.c b/drivers/crypto/sa2ul.c
index 091612b066f1..0e7692ae60e5 100644
--- a/drivers/crypto/sa2ul.c
+++ b/drivers/crypto/sa2ul.c
@@ -1176,7 +1176,13 @@ static int sa_run(struct sa_req *req)
mapped_sg->sgt.orig_nents = sg_nents;
ret = dma_map_sgtable(ddev, &mapped_sg->sgt, dir_src, 0);
if (ret) {
+ for (struct scatterlist *temp = src; temp; temp = sg_next(temp)) {
+ pr_info("%s: %d: temp->length: %d, temp->offset: %d\n", __func__, __LINE__, temp->length, temp->offset);
+ }
+ pr_info("%s: %d: req->size: %d, src: %p\n", __func__, __LINE__, req->size, req->src);
+ pr_info("%s: %d: sgl: %p, orig_nents: %d\n", __func__, __LINE__, mapped_sg->sgt.sgl, mapped_sg->sgt.orig_nents);
kfree(rxd);
+ pr_info("%s: %d: ret=%d\n", __func__, __LINE__, ret);
return ret;
}
root@j721e-evm:~# modprobe sa2ul
[ 32.890801] omap_rng 4e10000.rng: Random Number Generator ver. 241b34c
root@j721e-evm:~# [ 32.981093] sa_run: 1180: temp->length: 8192, temp->offset: 0
[ 32.996268] sa_run: 1180: temp->length: 8192, temp->offset: 0
[ 33.002029] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.007512] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.012986] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.018458] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.023930] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.029402] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.034874] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.040345] sa_run: 1182: req->size: 40187, src: 00000000f1859ae0
[ 33.046426] sa_run: 1183: sgl: 00000000f1859ae0, orig_nents: -22
[ 33.052419] sa_run: 1185: ret=-22
[ 33.055852] sa_run: 1180: temp->length: 8192, temp->offset: 0
[ 33.061589] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.067061] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.072532] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.078004] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.083475] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.088947] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.094419] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.099890] sa_run: 1182: req->size: 32768, src: 00000000f1859ae0
[ 33.105969] sa_run: 1183: sgl: 00000000f1859ae0, orig_nents: -22
[ 33.111962] sa_run: 1185: ret=-22
[ 33.115268] sa_run: 1180: temp->length: 8192, temp->offset: 0
[ 33.121001] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.126472] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.131944] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.137416] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.142888] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.148360] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.153832] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.159304] sa_run: 1182: req->size: 34117, src: 00000000f1859ae0
[ 33.165382] sa_run: 1183: sgl: 00000000f1859ae0, orig_nents: -22
[ 33.171375] sa_run: 1185: ret=-22
[ 33.174725] sa_run: 1180: temp->length: 8192, temp->offset: 0
[ 33.180459] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.185936] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.191408] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.196879] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.202351] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.207822] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.213294] sa_run: 1180: temp->length: 0, temp->offset: 0
[ 33.218765] sa_run: 1182: req->size: 31875, src: 00000000f1859ae0
[ 33.224845] sa_run: 1183: sgl: 00000000f1859ae0, orig_nents: -22
[ 33.230838] sa_run: 1185: ret=-22
[ 33.234204] alg: ahash: sha1-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: may_sleep use_digest multibuffer(8/14/uneven) src_divs=[100.0%@alignmask+3110] dst_divs=[100.0%@+2132] key_offset=63"
Regards,
Manorit
>
> Regards,
> Manorit
>
> > Cheers,
> > --
> > Email: Herbert Xu <herbert@gondor.apana.org.au>
> > Home Page: http://gondor.apana.org.au/~herbert/
> > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH] crypto: testmgr - Initialise full_sgl properly
2025-03-27 8:15 ` Manorit Chawdhry
@ 2025-03-27 8:23 ` Herbert Xu
2025-03-27 8:40 ` Manorit Chawdhry
0 siblings, 1 reply; 42+ messages in thread
From: Herbert Xu @ 2025-03-27 8:23 UTC (permalink / raw)
To: Manorit Chawdhry
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
On Thu, Mar 27, 2025 at 01:45:55PM +0530, Manorit Chawdhry wrote:
>
> [ 33.040345] sa_run: 1182: req->size: 40187, src: 00000000f1859ae0
> [ 33.046426] sa_run: 1183: sgl: 00000000f1859ae0, orig_nents: -22
Thanks for the info! The filler SG initialisation was broken:
---8<---
Initialise the whole full_sgl array rather than the first entry.
Fixes: 8b54e6a8f415 ("crypto: testmgr - Add multibuffer hash testing")
Reported-by: Manorit Chawdhry <m-chawdhry@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 74b3cadc0d40..455ce6e434fd 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -689,7 +689,7 @@ static int build_test_sglist(struct test_sglist *tsgl,
sg_init_table(tsgl->full_sgl, XBUFSIZE);
for (i = 0; i < XBUFSIZE; i++)
- sg_set_buf(tsgl->full_sgl, tsgl->bufs[i], PAGE_SIZE * 2);
+ sg_set_buf(&tsgl->full_sgl[i], tsgl->bufs[i], PAGE_SIZE * 2);
return 0;
}
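As a sketch, here is the old loop next to the fix (a standalone
rendering of the hunk above, nothing beyond it):

        /* before: every iteration writes full_sgl[0]; entries
         * 1..XBUFSIZE-1 keep the zero length left by sg_init_table() */
        for (i = 0; i < XBUFSIZE; i++)
                sg_set_buf(tsgl->full_sgl, tsgl->bufs[i], PAGE_SIZE * 2);

        /* after: each entry points at its own two-page buffer */
        for (i = 0; i < XBUFSIZE; i++)
                sg_set_buf(&tsgl->full_sgl[i], tsgl->bufs[i], PAGE_SIZE * 2);

This also matches the temp->length: 0 entries and the orig_nents: -22
in the debug output above.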
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH] crypto: testmgr - Initialise full_sgl properly
2025-03-27 8:23 ` [PATCH] crypto: testmgr - Initialise full_sgl properly Herbert Xu
@ 2025-03-27 8:40 ` Manorit Chawdhry
2025-03-27 9:09 ` Manorit Chawdhry
0 siblings, 1 reply; 42+ messages in thread
From: Manorit Chawdhry @ 2025-03-27 8:40 UTC (permalink / raw)
To: Herbert Xu
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
Hi Herbert,
On 16:23-20250327, Herbert Xu wrote:
> On Thu, Mar 27, 2025 at 01:45:55PM +0530, Manorit Chawdhry wrote:
> >
> > [ 33.040345] sa_run: 1182: req->size: 40187, src: 00000000f1859ae0
> > [ 33.046426] sa_run: 1183: sgl: 00000000f1859ae0, orig_nents: -22
>
> Thanks for the info! The filler SG initialisation was broken:
>
> ---8<---
> Initialise the whole full_sgl array rather than the first entry.
>
> Fixes: 8b54e6a8f415 ("crypto: testmgr - Add multibuffer hash testing")
> Reported-by: Manorit Chawdhry <m-chawdhry@ti.com>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
Thanks, this fixes it.
root@j721e-evm:~# modprobe sa2ul
[ 35.293140] omap_rng 4e10000.rng: Random Number Generator ver. 241b34c
Tested-by: Manorit Chawdhry <m-chawdhry@ti.com>
Regards,
Manorit
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] crypto: testmgr - Initialise full_sgl properly
2025-03-27 8:40 ` Manorit Chawdhry
@ 2025-03-27 9:09 ` Manorit Chawdhry
2025-03-31 10:13 ` Herbert Xu
0 siblings, 1 reply; 42+ messages in thread
From: Manorit Chawdhry @ 2025-03-27 9:09 UTC (permalink / raw)
To: Herbert Xu
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
Hi Herbert,
On 14:10-20250327, Manorit Chawdhry wrote:
> Hi Herbert,
>
> On 16:23-20250327, Herbert Xu wrote:
> > On Thu, Mar 27, 2025 at 01:45:55PM +0530, Manorit Chawdhry wrote:
> > >
> > > [ 33.040345] sa_run: 1182: req->size: 40187, src: 00000000f1859ae0
> > > [ 33.046426] sa_run: 1183: sgl: 00000000f1859ae0, orig_nents: -22
> >
> > Thanks for the info! The filler SG initialisation was broken:
> >
> > ---8<---
> > Initialise the whole full_sgl array rather than the first entry.
> >
> > Fixes: 8b54e6a8f415 ("crypto: testmgr - Add multibuffer hash testing")
> > Reported-by: Manorit Chawdhry <m-chawdhry@ti.com>
> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> >
>
> Thanks, this fixes it.
Though it really makes me wonder... I had been thinking it was a
problem in our driver rather than in the core, since everything was
fine when I fell back to the software fallbacks. How is that possible?
Do you have any insight into that? Is something missing?
[ Without this patch + the patch mentioned below ]
diff --git a/drivers/crypto/sa2ul.c b/drivers/crypto/sa2ul.c
index 35570c06eb3c..90d754bbea9b 100644
--- a/drivers/crypto/sa2ul.c
+++ b/drivers/crypto/sa2ul.c
@@ -1417,9 +1417,9 @@ static int sa_sha_run(struct ahash_request *req)
if (!auth_len)
return zero_message_process(req);
- if (auth_len > SA_MAX_DATA_SZ ||
+ if (1 || (auth_len > SA_MAX_DATA_SZ ||
(auth_len >= SA_UNSAFE_DATA_SZ_MIN &&
- auth_len <= SA_UNSAFE_DATA_SZ_MAX)) {
+ auth_len <= SA_UNSAFE_DATA_SZ_MAX))) {
struct ahash_request *subreq = &rctx->fallback_req;
int ret;
This passes the tests.
root@j721e-evm:~# modprobe sa2ul
[ 53.380168] omap_rng 4e10000.rng: Random Number Generator ver. 241b34c
Regards,
Manorit
>
> root@j721e-evm:~# modprobe sa2ul
> [ 35.293140] omap_rng 4e10000.rng: Random Number Generator ver. 241b34c
>
> Tested-by: Manorit Chawdhry <m-chawdhry@ti.com>
>
> Regards,
> Manorit
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH] crypto: testmgr - Initialise full_sgl properly
2025-03-27 9:09 ` Manorit Chawdhry
@ 2025-03-31 10:13 ` Herbert Xu
0 siblings, 0 replies; 42+ messages in thread
From: Herbert Xu @ 2025-03-31 10:13 UTC (permalink / raw)
To: Manorit Chawdhry
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
On Thu, Mar 27, 2025 at 02:39:55PM +0530, Manorit Chawdhry wrote:
>
> Though it really makes me wonder... I had been thinking it was a
> problem in our driver rather than in the core, since everything was
> fine when I fell back to the software fallbacks. How is that possible?
> Do you have any insight into that? Is something missing?
I think the software SG walker simply exits if it detects a
shorter-than-expected SG list. So if you ask it to hash 128KB of data,
but only supply an 8KB SG list, it will hash 8KB and then declare that
the job is done.
That is arguably suboptimal.
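For reference, the shash-backed update path is roughly the following
sketch of shash_ahash_update() (from memory, details may differ):

        for (nbytes = crypto_hash_walk_first(req, &walk); nbytes > 0;
             nbytes = crypto_hash_walk_done(&walk, nbytes))
                nbytes = crypto_shash_update(desc, walk.data, nbytes);

        /* a zero-length or exhausted SG entry makes the walk return 0,
         * so the loop ends and the hash is reported complete even if
         * fewer than req->nbytes bytes were consumed */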
I'm in the process of rewriting the software walker to add multibuffer
support, so I might fix this along the way.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request
2025-03-26 12:31 ` Manorit Chawdhry
2025-03-26 13:06 ` Herbert Xu
@ 2025-04-11 5:34 ` Manorit Chawdhry
2025-04-11 5:37 ` Herbert Xu
1 sibling, 1 reply; 42+ messages in thread
From: Manorit Chawdhry @ 2025-04-11 5:34 UTC (permalink / raw)
To: Herbert Xu
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
Hi Herbert,
On 18:01-20250326, Manorit Chawdhry wrote:
> Hi Herbert,
>
> On 18:05-20250326, Herbert Xu wrote:
> > On Wed, Mar 26, 2025 at 03:30:27PM +0530, Manorit Chawdhry wrote:
> > >
> > > Thanks for the quick fix, though now I see an error in import rather
> > > than in init, where it was previously.
> >
> > Oops, I removed one line too many from the import function. It
> > should set the tfm just like init:
> >
> > ---8<---
> > Rather than setting up a request by hand, use the correct API helpers
> > to set up the new request. This is because the API helpers will set up
> > chaining.
> >
> > Also change the fallback allocation to explicitly request a sync
> > algorithm, as this driver will crash if given an async one.
> >
> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> Thanks for the fix! It still fails, though, probably due to the
> introduction of multibuffer hash testing in "crypto: testmgr - Add
> multibuffer hash testing", but I assume that is something we will have
> to fix in our driver.
I see multibuffer hashing has been reverted, but with the chaining
changes we would still require the following patch. I see the chaining
changes in 6.15-rc1, but I don't see the following patch there; could
you queue it for the next RC?
Regards,
Manorit
>
> [ 32.408283] alg: ahash: sha1-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: inplace_two_sglists may_sleep use_digest multibuffer(9/13/uneven) src_divs=[100.0%@+860] key_offset=17"
> [...]
> [ 32.885927] alg: ahash: sha512-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: use_digest multibuffer(6/9/uneven) nosimd src_divs=[93.34%@+3634, 6.66%@+16] iv_offset=9 key_offset=70"
> [...]
> [ 33.135286] alg: ahash: sha256-sa2ul digest() failed on test vector 0; expected_error=0, actual_error=-22, cfg="random: inplace_two_sglists may_sleep use_digest multibuffer(15/16/uneven) src_divs=[100.0%@alignmask+26] key_offset=1"
>
> Tested-by: Manorit Chawdhry <m-chawdhry@ti.com>
>
> Regards,
> Manorit
>
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request
2025-04-11 5:34 ` [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request Manorit Chawdhry
@ 2025-04-11 5:37 ` Herbert Xu
2025-04-11 5:44 ` Manorit Chawdhry
0 siblings, 1 reply; 42+ messages in thread
From: Herbert Xu @ 2025-04-11 5:37 UTC (permalink / raw)
To: Manorit Chawdhry
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
On Fri, Apr 11, 2025 at 11:04:26AM +0530, Manorit Chawdhry wrote:
>
> I see multibuffer hashing has been reverted, but with the chaining
> changes we would still require the following patch. I see the chaining
> changes in 6.15-rc1, but I don't see the following patch there; could
> you queue it for the next RC?
This patch is in cryptodev. There won't be any chaining in 6.15
so it's not needed there.
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request
2025-04-11 5:37 ` Herbert Xu
@ 2025-04-11 5:44 ` Manorit Chawdhry
2025-04-11 5:46 ` Herbert Xu
0 siblings, 1 reply; 42+ messages in thread
From: Manorit Chawdhry @ 2025-04-11 5:44 UTC (permalink / raw)
To: Herbert Xu
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
Hi Herbert,
On 13:37-20250411, Herbert Xu wrote:
> On Fri, Apr 11, 2025 at 11:04:26AM +0530, Manorit Chawdhry wrote:
> >
> > I see multibuffer hashing has been reverted, but with the chaining
> > changes we would still require the following patch. I see the chaining
> > changes in 6.15-rc1, but I don't see the following patch there; could
> > you queue it for the next RC?
>
> This patch is in cryptodev. There won't be any chaining in 6.15
> so it's not needed there.
I still see the chaining patches in 6.15-rc1 [0]. Are you planning to
revert them as well?
[0]: https://github.com/torvalds/linux/commits/v6.15-rc1/crypto/ahash.c
Regards,
Manorit
>
> Thanks,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request
2025-04-11 5:44 ` Manorit Chawdhry
@ 2025-04-11 5:46 ` Herbert Xu
2025-04-11 6:14 ` Manorit Chawdhry
0 siblings, 1 reply; 42+ messages in thread
From: Herbert Xu @ 2025-04-11 5:46 UTC (permalink / raw)
To: Manorit Chawdhry
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
On Fri, Apr 11, 2025 at 11:14:58AM +0530, Manorit Chawdhry wrote:
>
> I still see the chaining patches in 6.15-rc1 [0]. Are you planning to
> revert them as well?
>
> [0]: https://github.com/torvalds/linux/commits/v6.15-rc1/crypto/ahash.c
With the multibuffer tests removed there is no way to call into
the chaining code in 6.15.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request
2025-04-11 5:46 ` Herbert Xu
@ 2025-04-11 6:14 ` Manorit Chawdhry
2025-04-11 7:14 ` [PATCH] crypto: ahash - Disable request chaining Herbert Xu
0 siblings, 1 reply; 42+ messages in thread
From: Manorit Chawdhry @ 2025-04-11 6:14 UTC (permalink / raw)
To: Herbert Xu
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
Hi Herbert,
On 13:46-20250411, Herbert Xu wrote:
> On Fri, Apr 11, 2025 at 11:14:58AM +0530, Manorit Chawdhry wrote:
> >
> > I see the chaining patches in 6.15-rc1.. [0] Are you planning to revert
> > them as well?
> >
> > [0]: https://github.com/torvalds/linux/commits/v6.15-rc1/crypto/ahash.c
>
> With the multibuffer tests removed there is no way to call into
> the chaining code in 6.15.
>
root@j721e-evm:~# uname -a
Linux j721e-evm 6.15.0-rc1 #3 SMP PREEMPT Fri Apr 11 11:37:56 IST 2025 aarch64 GNU/Linux
root@j721e-evm:~# modprobe sa2ul
[ 42.395465] omap_rng 4e10000.rng: Random Number Generator ver. 241b34c
root@j721e-evm:~# [ 42.515589] Unable to handle kernel paging request at virtual address fefefefefefeff46
[ 42.549426] Unable to handle kernel paging request at virtual address fefefefefefeff46
[ 42.558286] Unable to handle kernel paging request at virtual address fefefefefefeff46
[ 42.592270] Mem abort info:
[ 42.623088] ESR = 0x0000000096000044
[ 42.628660] Mem abort info:
[ 42.631496] ESR = 0x0000000096000044
[ 42.635478] Mem abort info:
[ 42.655279] ESR = 0x0000000096000044
[ 42.661660] EC = 0x25: DABT (current EL), IL = 32 bits
[ 42.681101] EC = 0x25: DABT (current EL), IL = 32 bits
[ 42.687850] EC = 0x25: DABT (current EL), IL = 32 bits
[ 42.701449] SET = 0, FnV = 0
[ 42.704524] EA = 0, S1PTW = 0
[ 42.727712] SET = 0, FnV = 0
[ 42.731640] SET = 0, FnV = 0
[ 42.755404] FSC = 0x04: level 0 translation fault
[ 42.760711] EA = 0, S1PTW = 0
[ 42.763923] EA = 0, S1PTW = 0
[ 42.793049] FSC = 0x04: level 0 translation fault
[ 42.798305] FSC = 0x04: level 0 translation fault
[ 42.809586] Data abort info:
[ 42.812463] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
[ 42.830629] Data abort info:
[ 42.833982] Data abort info:
[ 42.854520] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
[ 42.860776] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
[ 42.880775] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 42.897045] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 42.902752] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 42.913445] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 42.931845] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 42.937545] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 42.948838] [fefefefefefeff46] address between user and kernel address ranges
[ 42.968662] [fefefefefefeff46] address between user and kernel address ranges
[ 42.976773] [fefefefefefeff46] address between user and kernel address ranges
[..]
Maybe it's not the chaining itself but the way chaining was
implemented: it requires us to start using these correct API helpers,
otherwise we crash. So I think we would still require the following
patch.
Regards,
Manorit
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH] crypto: ahash - Disable request chaining
2025-04-11 6:14 ` Manorit Chawdhry
@ 2025-04-11 7:14 ` Herbert Xu
2025-04-11 7:58 ` Manorit Chawdhry
0 siblings, 1 reply; 42+ messages in thread
From: Herbert Xu @ 2025-04-11 7:14 UTC (permalink / raw)
To: Manorit Chawdhry
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
On Fri, Apr 11, 2025 at 11:44:17AM +0530, Manorit Chawdhry wrote:
>
> Maybe it's not the chaining itself but the way chaining was
> implemented: it requires us to start using these correct API helpers,
> otherwise we crash. So I think we would still require the following
> patch.
You're right. This needs to be disabled more thoroughly for 6.15.
Please try this patch:
---8<---
Disable hash request chaining in case a driver that copies an
ahash_request object by hand accidentally triggers chaining.
Reported-by: Manorit Chawdhry <m-chawdhry@ti.com>
Fixes: f2ffe5a9183d ("crypto: hash - Add request chaining API")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
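To make the failure mode concrete (a sketch, not code lifted from any
one driver): a driver that fills in a subrequest field by field never
initialises base.list, so the chaining code ends up walking an
uninitialised list_head:

        /* hand-rolled subrequest setup, as seen in sa2ul and others */
        rctx->fallback_req.base.flags = req->base.flags &
                                        CRYPTO_TFM_REQ_MAY_SLEEP;
        rctx->fallback_req.nbytes = req->nbytes;
        /* base.list is never initialised here ... */

        /* ... so a chain walk like this in the core dereferences junk
         * pointers, hence the faults at addresses such as
         * fefefefefefeff46 above */
        list_for_each_entry(r2, &req->base.list, base.list)
                r2->base.err = err;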
diff --git a/crypto/ahash.c b/crypto/ahash.c
index 9f57b925b116..2d9eec2b2b1c 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -315,16 +315,7 @@ EXPORT_SYMBOL_GPL(crypto_ahash_setkey);
static bool ahash_request_hasvirt(struct ahash_request *req)
{
- struct ahash_request *r2;
-
- if (ahash_request_isvirt(req))
- return true;
-
- list_for_each_entry(r2, &req->base.list, base.list)
- if (ahash_request_isvirt(r2))
- return true;
-
- return false;
+ return ahash_request_isvirt(req);
}
static int ahash_reqchain_virt(struct ahash_save_req_state *state,
@@ -472,7 +463,6 @@ static int ahash_do_req_chain(struct ahash_request *req,
bool update = op == crypto_ahash_alg(tfm)->update;
struct ahash_save_req_state *state;
struct ahash_save_req_state state0;
- struct ahash_request *r2;
u8 *page = NULL;
int err;
@@ -509,7 +499,6 @@ static int ahash_do_req_chain(struct ahash_request *req,
state->offset = 0;
state->nbytes = 0;
INIT_LIST_HEAD(&state->head);
- list_splice_init(&req->base.list, &state->head);
if (page)
sg_init_one(&state->sg, page, PAGE_SIZE);
@@ -540,9 +529,6 @@ static int ahash_do_req_chain(struct ahash_request *req,
out_set_chain:
req->base.err = err;
- list_for_each_entry(r2, &req->base.list, base.list)
- r2->base.err = err;
-
return err;
}
@@ -551,19 +537,10 @@ int crypto_ahash_init(struct ahash_request *req)
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
if (likely(tfm->using_shash)) {
- struct ahash_request *r2;
int err;
err = crypto_shash_init(prepare_shash_desc(req, tfm));
req->base.err = err;
-
- list_for_each_entry(r2, &req->base.list, base.list) {
- struct shash_desc *desc;
-
- desc = prepare_shash_desc(r2, tfm);
- r2->base.err = crypto_shash_init(desc);
- }
-
return err;
}
@@ -620,19 +597,10 @@ int crypto_ahash_update(struct ahash_request *req)
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
if (likely(tfm->using_shash)) {
- struct ahash_request *r2;
int err;
err = shash_ahash_update(req, ahash_request_ctx(req));
req->base.err = err;
-
- list_for_each_entry(r2, &req->base.list, base.list) {
- struct shash_desc *desc;
-
- desc = ahash_request_ctx(r2);
- r2->base.err = shash_ahash_update(r2, desc);
- }
-
return err;
}
@@ -645,19 +613,10 @@ int crypto_ahash_final(struct ahash_request *req)
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
if (likely(tfm->using_shash)) {
- struct ahash_request *r2;
int err;
err = crypto_shash_final(ahash_request_ctx(req), req->result);
req->base.err = err;
-
- list_for_each_entry(r2, &req->base.list, base.list) {
- struct shash_desc *desc;
-
- desc = ahash_request_ctx(r2);
- r2->base.err = crypto_shash_final(desc, r2->result);
- }
-
return err;
}
@@ -670,19 +629,10 @@ int crypto_ahash_finup(struct ahash_request *req)
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
if (likely(tfm->using_shash)) {
- struct ahash_request *r2;
int err;
err = shash_ahash_finup(req, ahash_request_ctx(req));
req->base.err = err;
-
- list_for_each_entry(r2, &req->base.list, base.list) {
- struct shash_desc *desc;
-
- desc = ahash_request_ctx(r2);
- r2->base.err = shash_ahash_finup(r2, desc);
- }
-
return err;
}
@@ -757,19 +707,10 @@ int crypto_ahash_digest(struct ahash_request *req)
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
if (likely(tfm->using_shash)) {
- struct ahash_request *r2;
int err;
err = shash_ahash_digest(req, prepare_shash_desc(req, tfm));
req->base.err = err;
-
- list_for_each_entry(r2, &req->base.list, base.list) {
- struct shash_desc *desc;
-
- desc = prepare_shash_desc(r2, tfm);
- r2->base.err = shash_ahash_digest(r2, desc);
- }
-
return err;
}
@@ -1133,20 +1074,5 @@ int ahash_register_instance(struct crypto_template *tmpl,
}
EXPORT_SYMBOL_GPL(ahash_register_instance);
-void ahash_request_free(struct ahash_request *req)
-{
- struct ahash_request *tmp;
- struct ahash_request *r2;
-
- if (unlikely(!req))
- return;
-
- list_for_each_entry_safe(r2, tmp, &req->base.list, base.list)
- kfree_sensitive(r2);
-
- kfree_sensitive(req);
-}
-EXPORT_SYMBOL_GPL(ahash_request_free);
-
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Asynchronous cryptographic hash type");
diff --git a/include/crypto/hash.h b/include/crypto/hash.h
index 2aa83ee0ec98..a67988316d06 100644
--- a/include/crypto/hash.h
+++ b/include/crypto/hash.h
@@ -10,6 +10,7 @@
#include <linux/atomic.h>
#include <linux/crypto.h>
+#include <linux/slab.h>
#include <linux/string.h>
/* Set this bit for virtual address instead of SG list. */
@@ -581,7 +582,10 @@ static inline struct ahash_request *ahash_request_alloc_noprof(
* ahash_request_free() - zeroize and free the request data structure
* @req: request data structure cipher handle to be freed
*/
-void ahash_request_free(struct ahash_request *req);
+static inline void ahash_request_free(struct ahash_request *req)
+{
+ kfree_sensitive(req);
+}
static inline struct ahash_request *ahash_request_cast(
struct crypto_async_request *req)
diff --git a/include/crypto/internal/hash.h b/include/crypto/internal/hash.h
index 485e22cf517e..052ac7924af3 100644
--- a/include/crypto/internal/hash.h
+++ b/include/crypto/internal/hash.h
@@ -249,7 +249,7 @@ static inline struct crypto_shash *__crypto_shash_cast(struct crypto_tfm *tfm)
static inline bool ahash_request_chained(struct ahash_request *req)
{
- return crypto_request_chained(&req->base);
+ return false;
}
static inline bool ahash_request_isvirt(struct ahash_request *req)
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
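One caller-facing note on the ahash_request_free() change above: a
minimal usage sketch (illustrative only, assuming an already-allocated
tfm) showing that the alloc/free pairing is unchanged. Only the walk
over chained sub-requests is gone; kfree_sensitive() still zeroises the
request before freeing it.

	struct ahash_request *req;

	req = ahash_request_alloc(tfm, GFP_KERNEL);
	if (!req)
		return -ENOMEM;

	/* ... set up and run the hash as before ... */

	ahash_request_free(req);	/* now a static inline kfree_sensitive() */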
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH] crypto: ahash - Disable request chaining
2025-04-11 7:14 ` [PATCH] crypto: ahash - Disable request chaining Herbert Xu
@ 2025-04-11 7:58 ` Manorit Chawdhry
0 siblings, 0 replies; 42+ messages in thread
From: Manorit Chawdhry @ 2025-04-11 7:58 UTC (permalink / raw)
To: Herbert Xu
Cc: Linux Crypto Mailing List, Eric Biggers, Ard Biesheuvel,
Megha Dey, Tim Chen, Kamlesh Gurudasani, Vignesh Raghavendra,
Udit Kumar, Pratham T
Hi Herbert,
On 15:14-20250411, Herbert Xu wrote:
> On Fri, Apr 11, 2025 at 11:44:17AM +0530, Manorit Chawdhry wrote:
> >
> > Maybe it's not the chaining itself but the way chaining was implemented
> > that requires us to start using the correct API helpers; otherwise we
> > crash. So I think we would need the following patch.
>
> You're right. This needs to be disabled more thoroughly for 6.15.
> Please try this patch:
>
> ---8<---
> Disable hash request chaining in case a driver that copies an
> ahash_request object by hand accidentally triggers chaining.
>
> Reported-by: Manorit Chawdhry <m-chawdhry@ti.com>
> Fixes: f2ffe5a9183d ("crypto: hash - Add request chaining API")
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
This fixes it, thanks!
root@j721e-evm:~# uname -a
Linux j721e-evm 6.15.0-rc1-00001-gdcd7f62f8e5e-dirty #4 SMP PREEMPT Fri Apr 11 13:18:03 IST 2025 aarch64 GNU/Linux
root@j721e-evm:~# modprobe sa2ul
[ 414.110972] omap_rng 4e10000.rng: Random Number Generator ver. 241b34c
Tested-by: Manorit Chawdhry <m-chawdhry@ti.com>
Regards,
Manorit
^ permalink raw reply [flat|nested] 42+ messages in thread
end of thread, other threads:[~2025-04-11 7:58 UTC | newest]
Thread overview: 42+ messages -- links below jump to the message on this page --
2025-02-16 3:07 [v2 PATCH 00/11] Multibuffer hashing take two Herbert Xu
2025-02-16 3:07 ` [v2 PATCH 01/11] crypto: ahash - Only save callback and data in ahash_save_req Herbert Xu
2025-02-16 3:07 ` [v2 PATCH 02/11] crypto: x86/ghash - Use proper helpers to clone request Herbert Xu
2025-02-16 3:07 ` [v2 PATCH 03/11] crypto: hash - Add request chaining API Herbert Xu
2025-03-26 9:00 ` Manorit Chawdhry
2025-03-26 9:17 ` [PATCH] crypto: sa2ul - Use proper helpers to setup request Herbert Xu
2025-03-26 10:00 ` Manorit Chawdhry
2025-03-26 10:05 ` [v2 PATCH] " Herbert Xu
2025-03-26 12:31 ` Manorit Chawdhry
2025-03-26 13:06 ` Herbert Xu
2025-03-26 13:07 ` Herbert Xu
2025-03-27 7:34 ` Manorit Chawdhry
2025-03-27 8:15 ` Manorit Chawdhry
2025-03-27 8:23 ` [PATCH] crypto: testmgr - Initialise full_sgl properly Herbert Xu
2025-03-27 8:40 ` Manorit Chawdhry
2025-03-27 9:09 ` Manorit Chawdhry
2025-03-31 10:13 ` Herbert Xu
2025-04-11 5:34 ` [v2 PATCH] crypto: sa2ul - Use proper helpers to setup request Manorit Chawdhry
2025-04-11 5:37 ` Herbert Xu
2025-04-11 5:44 ` Manorit Chawdhry
2025-04-11 5:46 ` Herbert Xu
2025-04-11 6:14 ` Manorit Chawdhry
2025-04-11 7:14 ` [PATCH] crypto: ahash - Disable request chaining Herbert Xu
2025-04-11 7:58 ` Manorit Chawdhry
2025-02-16 3:07 ` [v2 PATCH 04/11] crypto: tcrypt - Restore multibuffer ahash tests Herbert Xu
2025-02-16 3:07 ` [v2 PATCH 05/11] crypto: ahash - Add virtual address support Herbert Xu
2025-02-16 3:07 ` [v2 PATCH 06/11] crypto: ahash - Set default reqsize from ahash_alg Herbert Xu
2025-02-16 3:07 ` [v2 PATCH 07/11] crypto: testmgr - Add multibuffer hash testing Herbert Xu
2025-02-16 9:18 ` kernel test robot
2025-02-16 3:07 ` [v2 PATCH 08/11] crypto: x86/sha2 - Restore multibuffer AVX2 support Herbert Xu
2025-02-16 3:07 ` [v2 PATCH 09/11] crypto: hash - Add sync hash interface Herbert Xu
2025-02-16 10:51 ` kernel test robot
2025-02-16 11:42 ` kernel test robot
2025-02-16 3:07 ` [v2 PATCH 10/11] fsverity: Use sync hash instead of shash Herbert Xu
2025-02-16 3:07 ` [v2 PATCH 11/11] fsverity: improve performance by using multibuffer hashing Eric Biggers
2025-02-16 3:10 ` Herbert Xu
2025-02-16 3:38 ` [v2 PATCH 00/11] Multibuffer hashing take two Eric Biggers
2025-02-16 11:09 ` Herbert Xu
2025-02-16 19:51 ` Eric Biggers
2025-02-18 10:10 ` Herbert Xu
2025-02-18 17:48 ` Eric Biggers
2025-02-21 6:10 ` Herbert Xu