* Re: [BUG] crypto: atmel-aes - erro when compiling with VERBOSE_DEBUG enable
From: Cyrille Pitchen @ 2016-10-03 10:20 UTC (permalink / raw)
To: Herbert Xu; +Cc: levent demir, linux-crypto
In-Reply-To: <20161002143858.GE18268@gondor.apana.org.au>
Hi all,
On 02/10/2016 at 16:38, Herbert Xu wrote:
> On Tue, Sep 27, 2016 at 06:45:18PM +0200, Cyrille Pitchen wrote:
>> Hi Levent,
>>
>> there is a typo in the subject line: erroR.
>> Also it would be better to start the summary phrase of the subject line with a
>> verb:
>>
>> crypto: atmel-aes: fix compiler error when VERBOSE_DEBUG is defined
>>
>> On 22/09/2016 at 14:45, levent demir wrote:
>>> Fix debug function call in atmel_aes_write
>>>
>>> Signed-off-by: Levent DEMIR <levent.demir@inria.fr>
>>> ---
>>> drivers/crypto/atmel-aes.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/crypto/atmel-aes.c b/drivers/crypto/atmel-aes.c
>>> index e3d40a8..2b0f926 100644
>>> --- a/drivers/crypto/atmel-aes.c
>>> +++ b/drivers/crypto/atmel-aes.c
>>> @@ -317,7 +317,7 @@ static inline void atmel_aes_write(struct
>>> atmel_aes_dev *dd,
>>> char tmp[16];
>>>
>>> dev_vdbg(dd->dev, "write 0x%08x into %s\n", value,
>>> - atmel_aes_reg_name(offset, tmp));
>>> + atmel_aes_reg_name(offset, tmp, sizeof(tmp)));
>> It looks like a space has been removed.
>
> It's been completely mangled by the mailer and cannot be applied.
>
I've sent a new version in this thread:
https://lkml.org/lkml/2016/9/29/463
I added a Reported-by tag for Levent but if you want to use a Signed-off-by
tag instead, it's fine with me!
Best regards,
Cyrille
^ permalink raw reply
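The fix discussed above amounts to passing the size of the scratch buffer along with the buffer itself, so that the name helper can bound its snprintf() call. A minimal user-space sketch of that pattern follows. The register offsets and the name reg_name() are hypothetical stand-ins chosen for illustration; this is not the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical register offsets, for illustration only. */
#define REG_MR        0x04u
#define REG_KEYWR(x)  (0x20u + ((x) * 0x04u))

/*
 * Return a printable name for a register offset. Indexed or unknown
 * registers are formatted into the caller's buffer; 'sz' bounds the
 * write so a short buffer truncates the name instead of overflowing.
 */
static const char *reg_name(uint32_t offset, char *tmp, size_t sz)
{
	assert(tmp != NULL && sz > 0);

	switch (offset) {
	case REG_MR:
		return "MR";

	case REG_KEYWR(0):
	case REG_KEYWR(1):
	case REG_KEYWR(2):
	case REG_KEYWR(3):
		snprintf(tmp, sz, "KEYWR[%u]",
			 (unsigned int)((offset - REG_KEYWR(0)) >> 2));
		return tmp;

	default:
		snprintf(tmp, sz, "0x%02x", (unsigned int)offset);
		return tmp;
	}
}
```

A caller would declare `char tmp[16];` and call `reg_name(offset, tmp, sizeof(tmp))`, which mirrors the three-argument call in the patch.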
* Re: [PATCH 0/3] Fix crypto/vmx/p8_ghash memory corruption
From: Marcelo Cerri @ 2016-10-03 12:08 UTC (permalink / raw)
To: Herbert Xu
Cc: Anton Blanchard, linux-crypto, Leonidas S. Barbosa, linux-kernel,
Paul Mackerras, Paulo Flabiano Smorigo, George Wilson,
linuxppc-dev, David S. Miller
In-Reply-To: <20161002144047.GG18268@gondor.apana.org.au>
Thank you.
--
Regards,
Marcelo
On Sun, Oct 02, 2016 at 10:40:47PM +0800, Herbert Xu wrote:
> On Thu, Sep 29, 2016 at 06:59:08AM +1000, Anton Blanchard wrote:
> > Hi Marcelo
> >
> > > This series fixes the memory corruption found by Jan Stancek in
> > > 4.8-rc7. The problem however also affects previous versions of the
> > > driver.
> >
> > If it affects previous versions, please add the lines in the sign off to
> > get it into the stable kernels.
>
> I have added them to patches 1 and 2. Thanks.
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* [PATCH v2 1/1] crypto: atmel-aes: add support to the XTS mode
From: Cyrille Pitchen @ 2016-10-03 12:33 UTC (permalink / raw)
To: herbert, davem, nicolas.ferre
Cc: linux-crypto, linux-kernel, linux-arm-kernel, smueller,
Cyrille Pitchen
This patch adds the xts(aes) algorithm, which is supported from
hardware version 0x500 and above (sama5d2x).
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
---
ChangeLog:
v1 -> v2:
- fix typo in comment inside atmel_aes_xts_process_data():
s/reverted/reversed/
- use xts_check_key() from atmel_aes_xts_setkey() as suggested by
Stephan Mueller.
drivers/crypto/atmel-aes-regs.h | 4 +
drivers/crypto/atmel-aes.c | 185 ++++++++++++++++++++++++++++++++++++++--
2 files changed, 183 insertions(+), 6 deletions(-)
diff --git a/drivers/crypto/atmel-aes-regs.h b/drivers/crypto/atmel-aes-regs.h
index 6c2951bb70b1..0ec04407b533 100644
--- a/drivers/crypto/atmel-aes-regs.h
+++ b/drivers/crypto/atmel-aes-regs.h
@@ -28,6 +28,7 @@
#define AES_MR_OPMOD_CFB (0x3 << 12)
#define AES_MR_OPMOD_CTR (0x4 << 12)
#define AES_MR_OPMOD_GCM (0x5 << 12)
+#define AES_MR_OPMOD_XTS (0x6 << 12)
#define AES_MR_LOD (0x1 << 15)
#define AES_MR_CFBS_MASK (0x7 << 16)
#define AES_MR_CFBS_128b (0x0 << 16)
@@ -67,6 +68,9 @@
#define AES_CTRR 0x98
#define AES_GCMHR(x) (0x9c + ((x) * 0x04))
+#define AES_TWR(x) (0xc0 + ((x) * 0x04))
+#define AES_ALPHAR(x) (0xd0 + ((x) * 0x04))
+
#define AES_HW_VERSION 0xFC
#endif /* __ATMEL_AES_REGS_H__ */
diff --git a/drivers/crypto/atmel-aes.c b/drivers/crypto/atmel-aes.c
index 1d9e7bd3f377..6b656f4a9378 100644
--- a/drivers/crypto/atmel-aes.c
+++ b/drivers/crypto/atmel-aes.c
@@ -36,6 +36,7 @@
#include <crypto/scatterwalk.h>
#include <crypto/algapi.h>
#include <crypto/aes.h>
+#include <crypto/xts.h>
#include <crypto/internal/aead.h>
#include <linux/platform_data/crypto-atmel.h>
#include <dt-bindings/dma/at91.h>
@@ -68,6 +69,7 @@
#define AES_FLAGS_CFB8 (AES_MR_OPMOD_CFB | AES_MR_CFBS_8b)
#define AES_FLAGS_CTR AES_MR_OPMOD_CTR
#define AES_FLAGS_GCM AES_MR_OPMOD_GCM
+#define AES_FLAGS_XTS AES_MR_OPMOD_XTS
#define AES_FLAGS_MODE_MASK (AES_FLAGS_OPMODE_MASK | \
AES_FLAGS_ENCRYPT | \
@@ -89,6 +91,7 @@ struct atmel_aes_caps {
bool has_cfb64;
bool has_ctr32;
bool has_gcm;
+ bool has_xts;
u32 max_burst_size;
};
@@ -135,6 +138,12 @@ struct atmel_aes_gcm_ctx {
atmel_aes_fn_t ghash_resume;
};
+struct atmel_aes_xts_ctx {
+ struct atmel_aes_base_ctx base;
+
+ u32 key2[AES_KEYSIZE_256 / sizeof(u32)];
+};
+
struct atmel_aes_reqctx {
unsigned long mode;
};
@@ -282,6 +291,20 @@ static const char *atmel_aes_reg_name(u32 offset, char *tmp, size_t sz)
snprintf(tmp, sz, "GCMHR[%u]", (offset - AES_GCMHR(0)) >> 2);
break;
+ case AES_TWR(0):
+ case AES_TWR(1):
+ case AES_TWR(2):
+ case AES_TWR(3):
+ snprintf(tmp, sz, "TWR[%u]", (offset - AES_TWR(0)) >> 2);
+ break;
+
+ case AES_ALPHAR(0):
+ case AES_ALPHAR(1):
+ case AES_ALPHAR(2):
+ case AES_ALPHAR(3):
+ snprintf(tmp, sz, "ALPHAR[%u]", (offset - AES_ALPHAR(0)) >> 2);
+ break;
+
default:
snprintf(tmp, sz, "0x%02x", offset);
break;
@@ -453,15 +476,15 @@ static inline int atmel_aes_complete(struct atmel_aes_dev *dd, int err)
return err;
}
-static void atmel_aes_write_ctrl(struct atmel_aes_dev *dd, bool use_dma,
- const u32 *iv)
+static void atmel_aes_write_ctrl_key(struct atmel_aes_dev *dd, bool use_dma,
+ const u32 *iv, const u32 *key, int keylen)
{
u32 valmr = 0;
/* MR register must be set before IV registers */
- if (dd->ctx->keylen == AES_KEYSIZE_128)
+ if (keylen == AES_KEYSIZE_128)
valmr |= AES_MR_KEYSIZE_128;
- else if (dd->ctx->keylen == AES_KEYSIZE_192)
+ else if (keylen == AES_KEYSIZE_192)
valmr |= AES_MR_KEYSIZE_192;
else
valmr |= AES_MR_KEYSIZE_256;
@@ -478,13 +501,19 @@ static void atmel_aes_write_ctrl(struct atmel_aes_dev *dd, bool use_dma,
atmel_aes_write(dd, AES_MR, valmr);
- atmel_aes_write_n(dd, AES_KEYWR(0), dd->ctx->key,
- SIZE_IN_WORDS(dd->ctx->keylen));
+ atmel_aes_write_n(dd, AES_KEYWR(0), key, SIZE_IN_WORDS(keylen));
if (iv && (valmr & AES_MR_OPMOD_MASK) != AES_MR_OPMOD_ECB)
atmel_aes_write_block(dd, AES_IVR(0), iv);
}
+static inline void atmel_aes_write_ctrl(struct atmel_aes_dev *dd, bool use_dma,
+ const u32 *iv)
+
+{
+ atmel_aes_write_ctrl_key(dd, use_dma, iv,
+ dd->ctx->key, dd->ctx->keylen);
+}
/* CPU transfer */
@@ -1769,6 +1798,137 @@ static struct aead_alg aes_gcm_alg = {
};
+/* xts functions */
+
+static inline struct atmel_aes_xts_ctx *
+atmel_aes_xts_ctx_cast(struct atmel_aes_base_ctx *ctx)
+{
+ return container_of(ctx, struct atmel_aes_xts_ctx, base);
+}
+
+static int atmel_aes_xts_process_data(struct atmel_aes_dev *dd);
+
+static int atmel_aes_xts_start(struct atmel_aes_dev *dd)
+{
+ struct atmel_aes_xts_ctx *ctx = atmel_aes_xts_ctx_cast(dd->ctx);
+ struct ablkcipher_request *req = ablkcipher_request_cast(dd->areq);
+ struct atmel_aes_reqctx *rctx = ablkcipher_request_ctx(req);
+ unsigned long flags;
+ int err;
+
+ atmel_aes_set_mode(dd, rctx);
+
+ err = atmel_aes_hw_init(dd);
+ if (err)
+ return atmel_aes_complete(dd, err);
+
+ /* Compute the tweak value from req->info with ecb(aes). */
+ flags = dd->flags;
+ dd->flags &= ~AES_FLAGS_MODE_MASK;
+ dd->flags |= (AES_FLAGS_ECB | AES_FLAGS_ENCRYPT);
+ atmel_aes_write_ctrl_key(dd, false, NULL,
+ ctx->key2, ctx->base.keylen);
+ dd->flags = flags;
+
+ atmel_aes_write_block(dd, AES_IDATAR(0), req->info);
+ return atmel_aes_wait_for_data_ready(dd, atmel_aes_xts_process_data);
+}
+
+static int atmel_aes_xts_process_data(struct atmel_aes_dev *dd)
+{
+ struct ablkcipher_request *req = ablkcipher_request_cast(dd->areq);
+ bool use_dma = (req->nbytes >= ATMEL_AES_DMA_THRESHOLD);
+ u32 tweak[AES_BLOCK_SIZE / sizeof(u32)];
+ static const u32 one[AES_BLOCK_SIZE / sizeof(u32)] = {cpu_to_le32(1), };
+ u8 *tweak_bytes = (u8 *)tweak;
+ int i;
+
+ /* Read the computed ciphered tweak value. */
+ atmel_aes_read_block(dd, AES_ODATAR(0), tweak);
+ /*
+ * Hardware quirk:
+ * the order of the ciphered tweak bytes needs to be reversed before
+ * writing them into the TWRx registers.
+ */
+ for (i = 0; i < AES_BLOCK_SIZE/2; ++i) {
+ u8 tmp = tweak_bytes[AES_BLOCK_SIZE - 1 - i];
+
+ tweak_bytes[AES_BLOCK_SIZE - 1 - i] = tweak_bytes[i];
+ tweak_bytes[i] = tmp;
+ }
+
+ /* Process the data. */
+ atmel_aes_write_ctrl(dd, use_dma, NULL);
+ atmel_aes_write_block(dd, AES_TWR(0), tweak);
+ atmel_aes_write_block(dd, AES_ALPHAR(0), one);
+ if (use_dma)
+ return atmel_aes_dma_start(dd, req->src, req->dst, req->nbytes,
+ atmel_aes_transfer_complete);
+
+ return atmel_aes_cpu_start(dd, req->src, req->dst, req->nbytes,
+ atmel_aes_transfer_complete);
+}
+
+static int atmel_aes_xts_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
+ unsigned int keylen)
+{
+ struct atmel_aes_xts_ctx *ctx = crypto_ablkcipher_ctx(tfm);
+ int err;
+
+ err = xts_check_key(crypto_ablkcipher_tfm(tfm), key, keylen);
+ if (err)
+ return err;
+
+ memcpy(ctx->base.key, key, keylen/2);
+ memcpy(ctx->key2, key + keylen/2, keylen/2);
+ ctx->base.keylen = keylen/2;
+
+ return 0;
+}
+
+static int atmel_aes_xts_encrypt(struct ablkcipher_request *req)
+{
+ return atmel_aes_crypt(req, AES_FLAGS_XTS | AES_FLAGS_ENCRYPT);
+}
+
+static int atmel_aes_xts_decrypt(struct ablkcipher_request *req)
+{
+ return atmel_aes_crypt(req, AES_FLAGS_XTS);
+}
+
+static int atmel_aes_xts_cra_init(struct crypto_tfm *tfm)
+{
+ struct atmel_aes_xts_ctx *ctx = crypto_tfm_ctx(tfm);
+
+ tfm->crt_ablkcipher.reqsize = sizeof(struct atmel_aes_reqctx);
+ ctx->base.start = atmel_aes_xts_start;
+
+ return 0;
+}
+
+static struct crypto_alg aes_xts_alg = {
+ .cra_name = "xts(aes)",
+ .cra_driver_name = "atmel-xts-aes",
+ .cra_priority = ATMEL_AES_PRIORITY,
+ .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
+ .cra_blocksize = AES_BLOCK_SIZE,
+ .cra_ctxsize = sizeof(struct atmel_aes_xts_ctx),
+ .cra_alignmask = 0xf,
+ .cra_type = &crypto_ablkcipher_type,
+ .cra_module = THIS_MODULE,
+ .cra_init = atmel_aes_xts_cra_init,
+ .cra_exit = atmel_aes_cra_exit,
+ .cra_u.ablkcipher = {
+ .min_keysize = 2 * AES_MIN_KEY_SIZE,
+ .max_keysize = 2 * AES_MAX_KEY_SIZE,
+ .ivsize = AES_BLOCK_SIZE,
+ .setkey = atmel_aes_xts_setkey,
+ .encrypt = atmel_aes_xts_encrypt,
+ .decrypt = atmel_aes_xts_decrypt,
+ }
+};
+
+
/* Probe functions */
static int atmel_aes_buff_init(struct atmel_aes_dev *dd)
@@ -1877,6 +2037,9 @@ static void atmel_aes_unregister_algs(struct atmel_aes_dev *dd)
{
int i;
+ if (dd->caps.has_xts)
+ crypto_unregister_alg(&aes_xts_alg);
+
if (dd->caps.has_gcm)
crypto_unregister_aead(&aes_gcm_alg);
@@ -1909,8 +2072,16 @@ static int atmel_aes_register_algs(struct atmel_aes_dev *dd)
goto err_aes_gcm_alg;
}
+ if (dd->caps.has_xts) {
+ err = crypto_register_alg(&aes_xts_alg);
+ if (err)
+ goto err_aes_xts_alg;
+ }
+
return 0;
+err_aes_xts_alg:
+ crypto_unregister_aead(&aes_gcm_alg);
err_aes_gcm_alg:
crypto_unregister_alg(&aes_cfb64_alg);
err_aes_cfb64_alg:
@@ -1928,6 +2099,7 @@ static void atmel_aes_get_cap(struct atmel_aes_dev *dd)
dd->caps.has_cfb64 = 0;
dd->caps.has_ctr32 = 0;
dd->caps.has_gcm = 0;
+ dd->caps.has_xts = 0;
dd->caps.max_burst_size = 1;
/* keep only major version number */
@@ -1937,6 +2109,7 @@ static void atmel_aes_get_cap(struct atmel_aes_dev *dd)
dd->caps.has_cfb64 = 1;
dd->caps.has_ctr32 = 1;
dd->caps.has_gcm = 1;
+ dd->caps.has_xts = 1;
dd->caps.max_burst_size = 4;
break;
case 0x200:
--
2.7.4
^ permalink raw reply related
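Two steps of the patch above are easy to check in isolation: the setkey path splits the XTS key into a data key and a tweak key of equal length, and the hardware quirk requires reversing the 16 bytes of the encrypted tweak. The following user-space sketch illustrates both. The struct and function names are hypothetical, and the length check is a simplified assumption that does not reproduce what xts_check_key() does in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE   16  /* AES block size in bytes */
#define MAX_HALF_KEY 32  /* largest AES key (256 bits) in bytes */

/* Hypothetical stand-in for the driver's XTS context. */
struct xts_keys {
	uint8_t key1[MAX_HALF_KEY];  /* first half: data key */
	uint8_t key2[MAX_HALF_KEY];  /* second half: tweak key */
	size_t  keylen;              /* length of each half in bytes */
};

/*
 * Split an XTS key into its two halves.
 * Returns 0 on success, -1 if the length is not twice an AES key size
 * (simplified validation, an assumption of this sketch).
 */
static int xts_split_key(struct xts_keys *ctx, const uint8_t *key,
			 size_t keylen)
{
	size_t half = keylen / 2;

	if (keylen % 2 || (half != 16 && half != 24 && half != 32))
		return -1;

	memcpy(ctx->key1, key, half);
	memcpy(ctx->key2, key + half, half);
	ctx->keylen = half;
	return 0;
}

/* Reverse the byte order of one 16-byte block in place. */
static void reverse_block(uint8_t block[BLOCK_SIZE])
{
	assert(block != NULL);

	for (size_t i = 0; i < BLOCK_SIZE / 2; i++) {
		uint8_t tmp = block[BLOCK_SIZE - 1 - i];

		block[BLOCK_SIZE - 1 - i] = block[i];
		block[i] = tmp;
	}
}
```

Swapping only the first half of the indices against the second half is what makes the loop a reversal; iterating over all 16 indices would swap every pair twice and leave the block unchanged.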
* Re: [PATCH 0/3] Fix crypto/vmx/p8_ghash memory corruption
From: Marcelo Cerri @ 2016-10-03 15:07 UTC (permalink / raw)
To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <1475080931-7926-1-git-send-email-marcelo.cerri@canonical.com>
Hi Herbert,
Sorry for bothering you. I noticed you included two of the patches in
the crypto-2.6 repository and the remaining one in cryptodev-2.6. Is
that right? I thought all 3 patches would be included in the cryptodev
repository.
--
Regards,
Marcelo
On Wed, Sep 28, 2016 at 01:42:08PM -0300, Marcelo Cerri wrote:
> This series fixes the memory corruption found by Jan Stancek in 4.8-rc7. The
> problem however also affects previous versions of the driver.
>
> Marcelo Cerri (3):
> crypto: ghash-generic - move common definitions to a new header file
> crypto: vmx - Fix memory corruption caused by p8_ghash
> crypto: vmx - Ensure ghash-generic is enabled
>
> crypto/ghash-generic.c | 13 +------------
> drivers/crypto/vmx/Kconfig | 2 +-
> drivers/crypto/vmx/ghash.c | 31 ++++++++++++++++---------------
> include/crypto/ghash.h | 23 +++++++++++++++++++++++
> 4 files changed, 41 insertions(+), 28 deletions(-)
> create mode 100644 include/crypto/ghash.h
>
> --
> 2.7.4
>
^ permalink raw reply
* [PATCH v2 0/2] Improve DMA chaining for ahash requests
From: Romain Perier @ 2016-10-03 15:17 UTC (permalink / raw)
To: Boris Brezillon, Arnaud Ebalard
Cc: David S. Miller, Herbert Xu, Thomas Petazzoni, Jason Cooper,
Andrew Lunn, Sebastian Hesselbarth, Gregory Clement, linux-crypto,
linux-arm-kernel
This series improves performance for ahash requests.
So far, ahash requests were never chained at the DMA level.
However, in some cases, for example with IPSec, some ahash requests
can be processed directly by the engine and have no intermediate
partial update states.
This series first reworks the way outer IVs are copied from the SRAM
into the dma pool. To do so, we introduce a common dma pool for all types
of requests that contain outer results (like IV or digest). Then, for
ahash requests that can be processed directly by the engine, outer
results are copied from the SRAM into the common dma pool. These requests
are then allowed to be chained at the DMA level.
Benchmarking results with iperf through IPSec
=============================================
              ESP             AH
Before        343 Mbits/s     492 Mbits/s
After         392 Mbits/s     557 Mbits/s
Improvement   +14%            +13%
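The improvement figures can be recomputed from the before/after throughputs as (after - before) / before. A small sketch, with a helper name that is only illustrative:

```c
#include <assert.h>

/* Relative throughput gain of 'after' over 'before', in percent. */
static double improvement_percent(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

With the numbers above, ESP gives (392 - 343) / 343, about 14.3%, and AH gives (557 - 492) / 492, about 13.2%, which round to the +14% and +13% reported.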
Romain Perier (2):
crypto: marvell - Use an unique pool to copy results of requests
crypto: marvell - Don't break chain for computable last ahash requests
drivers/crypto/marvell/cesa.c | 4 ---
drivers/crypto/marvell/cesa.h | 5 ++-
drivers/crypto/marvell/cipher.c | 6 ++--
drivers/crypto/marvell/hash.c | 79 +++++++++++++++++++++++++++++++++--------
drivers/crypto/marvell/tdma.c | 17 +++++----
5 files changed, 78 insertions(+), 33 deletions(-)
--
2.9.3
^ permalink raw reply
* [PATCH v2 2/2] crypto: marvell - Don't break chain for computable last ahash requests
From: Romain Perier @ 2016-10-03 15:17 UTC (permalink / raw)
To: Boris Brezillon, Arnaud Ebalard
Cc: David S. Miller, Herbert Xu, Thomas Petazzoni, Jason Cooper,
Andrew Lunn, Sebastian Hesselbarth, Gregory Clement, linux-crypto,
linux-arm-kernel
In-Reply-To: <20161003151739.11615-1-romain.perier@free-electrons.com>
Currently, the driver breaks the chain for all kinds of hash requests in order
not to overwrite intermediate states of partial ahash updates. However, some
final ahash requests can be processed directly by the engine, without any
intermediate state. This is typically the case for most of the HMAC requests
processed via IPSec.
This commit adds a TDMA descriptor to copy outer data for these requests
into the "op" dma pool, then allows these requests to be chained at the DMA
level. The 'complete' operation is also updated to retrieve the MAC digest from
the right location.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---
Changes in v2:
- Replaced BUG_ON by an error
- Add a variable "break_chain", with "type" to break the chain
with ahash requests. It improves code readability.
drivers/crypto/marvell/cipher.c | 10 +++---
drivers/crypto/marvell/hash.c | 79 +++++++++++++++++++++++++++++++++--------
drivers/crypto/marvell/tdma.c | 4 +--
3 files changed, 71 insertions(+), 22 deletions(-)
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index 098871a..8bc52bf 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -212,8 +212,7 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
struct mv_cesa_req *basereq;
basereq = &creq->base;
- memcpy(ablkreq->info, basereq->chain.last->op->ctx.blkcipher.iv,
- ivsize);
+ memcpy(ablkreq->info, basereq->chain.last->data, ivsize);
} else {
memcpy_fromio(ablkreq->info,
engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
@@ -374,9 +373,10 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
/* Add output data for IV */
ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
- ret = mv_cesa_dma_add_result_op(&basereq->chain, CESA_SA_CFG_SRAM_OFFSET,
- CESA_SA_DATA_SRAM_OFFSET,
- CESA_TDMA_SRC_IN_SRAM, flags);
+ ret = mv_cesa_dma_add_result_op(&basereq->chain,
+ CESA_SA_CRYPT_IV_SRAM_OFFSET,
+ ivsize,
+ CESA_TDMA_SRC_IN_SRAM, flags);
if (ret)
goto err_free_tdma;
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 9f28468..35baf4f 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -312,24 +312,53 @@ static void mv_cesa_ahash_complete(struct crypto_async_request *req)
int i;
digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq));
- for (i = 0; i < digsize / 4; i++)
- creq->state[i] = readl_relaxed(engine->regs + CESA_IVDIG(i));
- if (creq->last_req) {
+ if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ &&
+ !(creq->base.chain.last->flags & CESA_TDMA_BREAK_CHAIN)) {
+ struct mv_cesa_tdma_desc *tdma = NULL;
+ __le32 *data = NULL;
+
+ for (tdma = creq->base.chain.first; tdma; tdma = tdma->next) {
+ u32 type = tdma->flags & CESA_TDMA_TYPE_MSK;
+ if (type == CESA_TDMA_RESULT)
+ break;
+ }
+
+ if (!tdma) {
+ dev_err(cesa_dev->dev, "Failed to retrieve tdma "
+ "descriptor for outer data\n");
+ return;
+ }
+
/*
- * Hardware's MD5 digest is in little endian format, but
- * SHA in big endian format
+ * Result is already in the correct endianess when the SA is
+ * used
*/
- if (creq->algo_le) {
- __le32 *result = (void *)ahashreq->result;
+ data = tdma->data;
+ for (i = 0; i < digsize / 4; i++)
+ creq->state[i] = cpu_to_le32(data[i]);
- for (i = 0; i < digsize / 4; i++)
- result[i] = cpu_to_le32(creq->state[i]);
- } else {
- __be32 *result = (void *)ahashreq->result;
+ memcpy(ahashreq->result, data, digsize);
+ } else {
+ for (i = 0; i < digsize / 4; i++)
+ creq->state[i] = readl_relaxed(engine->regs +
+ CESA_IVDIG(i));
+ if (creq->last_req) {
+ /*
+ * Hardware's MD5 digest is in little endian format, but
+ * SHA in big endian format
+ */
+ if (creq->algo_le) {
+ __le32 *result = (void *)ahashreq->result;
+
+ for (i = 0; i < digsize / 4; i++)
+ result[i] = cpu_to_le32(creq->state[i]);
+ } else {
+ __be32 *result = (void *)ahashreq->result;
- for (i = 0; i < digsize / 4; i++)
- result[i] = cpu_to_be32(creq->state[i]);
+ for (i = 0; i < digsize / 4; i++)
+ result[i] = cpu_to_be32(creq->state[i]);
+ }
}
}
@@ -504,6 +533,12 @@ mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain,
CESA_SA_DESC_CFG_LAST_FRAG,
CESA_SA_DESC_CFG_FRAG_MSK);
+ ret = mv_cesa_dma_add_result_op(chain,
+ CESA_SA_MAC_DIG_SRAM_OFFSET,
+ 32,
+ CESA_TDMA_SRC_IN_SRAM, flags);
+ if (ret)
+ return ERR_PTR(-ENOMEM);
return op;
}
@@ -564,6 +599,8 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
struct mv_cesa_op_ctx *op = NULL;
unsigned int frag_len;
int ret;
+ u32 type;
+ bool break_chain = true;
basereq->chain.first = NULL;
basereq->chain.last = NULL;
@@ -635,6 +672,16 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
goto err_free_tdma;
}
+ /*
+ * If results are copied via DMA, this means that this
+ * request can be directly processed by the engine,
+ * without partial updates. So we can chain it at the
+ * DMA level with other requests.
+ */
+ type = basereq->chain.last->flags & CESA_TDMA_TYPE_MSK;
+ if (type == CESA_TDMA_RESULT)
+ break_chain = false;
+
if (op) {
/* Add dummy desc to wait for crypto operation end */
ret = mv_cesa_dma_add_dummy_end(&basereq->chain, flags);
@@ -648,8 +695,10 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
else
creq->cache_ptr = 0;
- basereq->chain.last->flags |= (CESA_TDMA_END_OF_REQ |
- CESA_TDMA_BREAK_CHAIN);
+ basereq->chain.last->flags |= CESA_TDMA_END_OF_REQ;
+
+ if (break_chain)
+ basereq->chain.last->flags |= CESA_TDMA_BREAK_CHAIN;
return 0;
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 2e15f19..8d33003 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -70,7 +70,7 @@ void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq)
dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
le32_to_cpu(tdma->src));
else if (type == CESA_TDMA_RESULT)
- dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
+ dma_pool_free(cesa_dev->dma->op_pool, tdma->data,
le32_to_cpu(tdma->dst));
tdma = tdma->next;
@@ -227,7 +227,7 @@ int mv_cesa_dma_add_result_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
tdma->byte_cnt = cpu_to_le32(size | BIT(31));
tdma->src = src;
tdma->dst = cpu_to_le32(dma_handle);
- tdma->op = result;
+ tdma->data = result;
flags &= (CESA_TDMA_DST_IN_SRAM | CESA_TDMA_SRC_IN_SRAM);
tdma->flags = flags | CESA_TDMA_RESULT;
--
2.9.3
^ permalink raw reply related
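The patch above relies on two small pieces of logic: walking the descriptor chain to find the descriptor whose type is RESULT, and setting BREAK_CHAIN only when the last descriptor is not a RESULT descriptor. The user-space sketch below models that logic. The struct layout, the flag bit positions and the type mask are assumptions made for illustration; only the type values 0 to 3 follow the cesa.h hunk in patch 1/2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Descriptor types, same numbering as the cesa.h hunk in patch 1/2. */
#define TDMA_DUMMY   0u
#define TDMA_DATA    1u
#define TDMA_OP      2u
#define TDMA_RESULT  3u

/* Assumed layout: type in the low bits, control flags in high bits. */
#define TDMA_TYPE_MSK     0x3u
#define TDMA_END_OF_REQ   (1u << 29)
#define TDMA_BREAK_CHAIN  (1u << 28)

/* Hypothetical, simplified descriptor. */
struct tdma_desc {
	uint32_t flags;
	const void *data;
	struct tdma_desc *next;
};

/* Return the first RESULT descriptor of the chain, or NULL if none. */
static struct tdma_desc *find_result_desc(struct tdma_desc *first)
{
	for (struct tdma_desc *d = first; d; d = d->next)
		if ((d->flags & TDMA_TYPE_MSK) == TDMA_RESULT)
			return d;
	return NULL;
}

/* A chain must be broken unless its results are copied out by DMA. */
static bool needs_break_chain(const struct tdma_desc *last)
{
	return (last->flags & TDMA_TYPE_MSK) != TDMA_RESULT;
}

/* Mark the end of a request, breaking the chain only when required. */
static void mark_end_of_request(struct tdma_desc *last, bool break_chain)
{
	last->flags |= TDMA_END_OF_REQ;
	if (break_chain)
		last->flags |= TDMA_BREAK_CHAIN;
}
```

Masking with the type mask before comparing matters because the same flags word also carries the control bits, so a plain equality test on flags would stop matching once END_OF_REQ is set.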
* [PATCH v2 1/2] crypto: marvell - Use an unique pool to copy results of requests
From: Romain Perier @ 2016-10-03 15:17 UTC (permalink / raw)
To: Boris Brezillon, Arnaud Ebalard
Cc: David S. Miller, Herbert Xu, Thomas Petazzoni, Jason Cooper,
Andrew Lunn, Sebastian Hesselbarth, Gregory Clement, linux-crypto,
linux-arm-kernel
In-Reply-To: <20161003151739.11615-1-romain.perier@free-electrons.com>
So far, we used a dedicated dma pool to copy the result of the outer IV for
cipher requests. Instead of using one dma pool per kind of outer data, we
prefer to use the op dma pool, which contains all parts of the request from
the SRAM. The outer data that is likely to be used by the 'complete'
operation is then copied later. This way, any type of result can be
retrieved by DMA for cipher or ahash requests.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---
Changes in v2:
- Use the dma pool "op" to retrieve outer data instead of introducing
a new one.
drivers/crypto/marvell/cesa.c | 4 ----
drivers/crypto/marvell/cesa.h | 5 ++---
drivers/crypto/marvell/cipher.c | 8 +++++---
drivers/crypto/marvell/tdma.c | 17 ++++++++---------
4 files changed, 15 insertions(+), 19 deletions(-)
diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index 37dadb2..6e7a5c7 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -375,10 +375,6 @@ static int mv_cesa_dev_dma_init(struct mv_cesa_dev *cesa)
if (!dma->padding_pool)
return -ENOMEM;
- dma->iv_pool = dmam_pool_create("cesa_iv", dev, 16, 1, 0);
- if (!dma->iv_pool)
- return -ENOMEM;
-
cesa->dma = dma;
return 0;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index e423d33..a768da7 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -277,7 +277,7 @@ struct mv_cesa_op_ctx {
#define CESA_TDMA_DUMMY 0
#define CESA_TDMA_DATA 1
#define CESA_TDMA_OP 2
-#define CESA_TDMA_IV 3
+#define CESA_TDMA_RESULT 3
/**
* struct mv_cesa_tdma_desc - TDMA descriptor
@@ -393,7 +393,6 @@ struct mv_cesa_dev_dma {
struct dma_pool *op_pool;
struct dma_pool *cache_pool;
struct dma_pool *padding_pool;
- struct dma_pool *iv_pool;
};
/**
@@ -839,7 +838,7 @@ mv_cesa_tdma_desc_iter_init(struct mv_cesa_tdma_chain *chain)
memset(chain, 0, sizeof(*chain));
}
-int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
+int mv_cesa_dma_add_result_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
u32 size, u32 flags, gfp_t gfp_flags);
struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index d19dc96..098871a 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -212,7 +212,8 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
struct mv_cesa_req *basereq;
basereq = &creq->base;
- memcpy(ablkreq->info, basereq->chain.last->data, ivsize);
+ memcpy(ablkreq->info, basereq->chain.last->op->ctx.blkcipher.iv,
+ ivsize);
} else {
memcpy_fromio(ablkreq->info,
engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
@@ -373,8 +374,9 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
/* Add output data for IV */
ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
- ret = mv_cesa_dma_add_iv_op(&basereq->chain, CESA_SA_CRYPT_IV_SRAM_OFFSET,
- ivsize, CESA_TDMA_SRC_IN_SRAM, flags);
+ ret = mv_cesa_dma_add_result_op(&basereq->chain, CESA_SA_CFG_SRAM_OFFSET,
+ CESA_SA_DATA_SRAM_OFFSET,
+ CESA_TDMA_SRC_IN_SRAM, flags);
if (ret)
goto err_free_tdma;
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 9fd7a5f..2e15f19 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -69,8 +69,8 @@ void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq)
if (type == CESA_TDMA_OP)
dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
le32_to_cpu(tdma->src));
- else if (type == CESA_TDMA_IV)
- dma_pool_free(cesa_dev->dma->iv_pool, tdma->data,
+ else if (type == CESA_TDMA_RESULT)
+ dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
le32_to_cpu(tdma->dst));
tdma = tdma->next;
@@ -209,29 +209,28 @@ mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
return new_tdma;
}
-int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
+int mv_cesa_dma_add_result_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
u32 size, u32 flags, gfp_t gfp_flags)
{
-
struct mv_cesa_tdma_desc *tdma;
- u8 *iv;
+ struct mv_cesa_op_ctx *result;
dma_addr_t dma_handle;
tdma = mv_cesa_dma_add_desc(chain, gfp_flags);
if (IS_ERR(tdma))
return PTR_ERR(tdma);
- iv = dma_pool_alloc(cesa_dev->dma->iv_pool, gfp_flags, &dma_handle);
- if (!iv)
+ result = dma_pool_alloc(cesa_dev->dma->op_pool, gfp_flags, &dma_handle);
+ if (!result)
return -ENOMEM;
tdma->byte_cnt = cpu_to_le32(size | BIT(31));
tdma->src = src;
tdma->dst = cpu_to_le32(dma_handle);
- tdma->data = iv;
+ tdma->op = result;
flags &= (CESA_TDMA_DST_IN_SRAM | CESA_TDMA_SRC_IN_SRAM);
- tdma->flags = flags | CESA_TDMA_IV;
+ tdma->flags = flags | CESA_TDMA_RESULT;
return 0;
}
--
2.9.3
^ permalink raw reply related
* Re: [PATCH v2 0/2] Improve DMA chaining for ahash requests
From: Romain Perier @ 2016-10-03 15:48 UTC (permalink / raw)
To: Boris Brezillon, Arnaud Ebalard
Cc: Thomas Petazzoni, Andrew Lunn, Jason Cooper, linux-crypto,
Gregory Clement, Herbert Xu, David S. Miller, linux-arm-kernel,
Sebastian Hesselbarth
In-Reply-To: <20161003151739.11615-1-romain.perier@free-electrons.com>
Hello,
On 03/10/2016 17:17, Romain Perier wrote:
> This series improves performance for ahash requests.
> So far, ahash requests were never chained at the DMA level.
> However, in some cases, for example with IPSec, some ahash requests
> can be processed directly by the engine and have no intermediate
> partial update states.
>
> This series first reworks the way outer IVs are copied from the SRAM
> into the dma pool. To do so, we introduce a common dma pool for all types
> of requests that contain outer results (like IV or digest). Then, for
> ahash requests that can be processed directly by the engine, outer
> results are copied from the SRAM into the common dma pool. These requests
> are then allowed to be chained at the DMA level.
>
>
> Benchmarking results with iperf through IPSec
> =============================================
>               ESP             AH
>
> Before        343 Mbits/s     492 Mbits/s
> After         392 Mbits/s     557 Mbits/s
> Improvement   +14%            +13%
>
> Romain Perier (2):
> crypto: marvell - Use an unique pool to copy results of requests
> crypto: marvell - Don't break chain for computable last ahash requests
>
> drivers/crypto/marvell/cesa.c | 4 ---
> drivers/crypto/marvell/cesa.h | 5 ++-
> drivers/crypto/marvell/cipher.c | 6 ++--
> drivers/crypto/marvell/hash.c | 79 +++++++++++++++++++++++++++++++++--------
> drivers/crypto/marvell/tdma.c | 17 +++++----
> 5 files changed, 78 insertions(+), 33 deletions(-)
>
After an internal discussion, we can handle things differently. Instead
of allocating a new mv_cesa_op_ctx in the op_pool, we can just allocate
a new descriptor which points to the first op ctx of the chain, and then
copy outer data.
Please ignore this series.
Regards,
--
Romain Perier, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
^ permalink raw reply
* Moving from blkcipher to skcipher
From: Alex Cope @ 2016-10-03 17:06 UTC (permalink / raw)
To: linux-crypto; +Cc: Michael Halcrow, Eric Biggers
I'm currently working on implementing HEH encryption, and am in the
process of switching from the blkcipher interface to the skcipher
interface. All the examples I have found that use skcipher are
wrapping another mode of operation I.E. cts in cts(cbc(aes)) rather
than being directly above the block cipher I.E. ctr in ctr(aes). Are
there any existing examples of the latter type that I could use as a
reference? If not, is there an estimate on when that work will be
available?
Thanks,
-Alex
^ permalink raw reply
* Re: Moving from blkcipher to skcipher
From: Stephan Mueller @ 2016-10-03 17:36 UTC (permalink / raw)
To: Alex Cope; +Cc: linux-crypto, Michael Halcrow, Eric Biggers
In-Reply-To: <CA+cSK1134BGVm7FprZcrDxVoU7P33GkJP17itLGvTpc8ECB9zw@mail.gmail.com>
On Monday, 3 October 2016, 10:06:23 CEST, Alex Cope wrote:
Hi Alex,
> I'm currently working on implementing HEH encryption, and am in the
> process of switching from the blkcipher interface to the skcipher
> interface. All the examples I have found that use skcipher are
> wrapping another mode of operation I.E. cts in cts(cbc(aes)) rather
> than being directly above the block cipher I.E. ctr in ctr(aes). Are
> there any existing examples of the latter type that I could use as a
> reference? If not, is there an estimate on when that work will be
> available?
The issue is that a blkcipher is a synchronous version of the skcipher. So,
you could easily move from blkcipher to skcipher and just rename the invoked
API, provided you change the initialization to the following which triggers a
synchronous operation:
tfm = crypto_alloc_skcipher(kccavs_test->name, 0, CRYPTO_ALG_ASYNC);
Note, you can only use ciphers marked as blkcipher or cipher in /proc/crypto
with that.
If you want to use all symmetric cipher implementations, you must use the async
skcipher operation which is identical to the previous ablkcipher API. An
example is given in the crypto API documentation, such as
http://www.chronox.de/crypto-API/Code.html#id-1.8.2
Ciao
Stephan
^ permalink raw reply
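The effect of passing type 0 with mask CRYPTO_ALG_ASYNC can be pictured with the usual type/mask test: an implementation is eligible only if its flags agree with the requested type on every bit set in the mask. The user-space sketch below models that selection under stated assumptions. The flag value, the struct and the driver names are illustrative, and the real kernel lookup also takes priority into account, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for CRYPTO_ALG_ASYNC; the value is assumed for this sketch. */
#define ALG_ASYNC 0x80u

/* Hypothetical, minimal view of a registered implementation. */
struct alg {
	const char *driver;
	uint32_t flags;
};

/* Eligible if flags match 'type' on all bits selected by 'mask'. */
static bool alg_matches(uint32_t alg_flags, uint32_t type, uint32_t mask)
{
	return ((alg_flags ^ type) & mask) == 0;
}

/* Return the first eligible implementation, or NULL if none matches. */
static const struct alg *pick_alg(const struct alg *algs, size_t n,
				  uint32_t type, uint32_t mask)
{
	for (size_t i = 0; i < n; i++)
		if (alg_matches(algs[i].flags, type, mask))
			return &algs[i];
	return NULL;
}
```

With type 0 and mask ALG_ASYNC, any implementation that has the async bit set is rejected, so only synchronous implementations remain. With mask 0, the async bit is not examined and both kinds are eligible.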
* Re: Moving from blkcipher to skcipher
From: Alex Cope @ 2016-10-03 17:58 UTC (permalink / raw)
To: Stephan Mueller; +Cc: linux-crypto, Michael Halcrow, Eric Biggers
In-Reply-To: <3528786.X6pky90e93@tauon.atsec.com>
I was unclear in my initial message. I'm implementing a block cipher
mode of operation. I'm hoping there is another block cipher mode of
operation that already uses skcipher, so I can use it as a reference
and avoid re-inventing the wheel. In particular, it would be helpful
if there is some implementation of a block cipher mode of operation
that is directly above the underlying block cipher, like CTR or CBC,
rather than something like CTS or rfc3686 which wrap around another
block cipher mode of operation.
On Mon, Oct 3, 2016 at 10:36 AM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Montag, 3. Oktober 2016, 10:06:23 CEST schrieb Alex Cope:
>
> Hi Alex,
>
>> I'm currently working on implementing HEH encryption, and am in the
>> process of switching from the blkcipher interface to the skcipher
>> interface. All the examples I have found that use skcipher are
>> wrapping another mode of operation I.E. cts in cts(cbc(aes)) rather
>> than being directly above the block cipher I.E. ctr in ctr(aes). Are
>> there any existing examples of the latter type that I could use as a
>> reference? If not, is there an estimate on when that work will be
>> available?
>
> The issue is that a blkcipher is a synchronous version of the skcipher. So,
> you could easily move from blkcipher to skcipher and just rename the invoked
> API, provided you change the initialization to the following which triggers a
> synchronous operation:
>
> tfm = crypto_alloc_skcipher(kccavs_test->name, 0, CRYPTO_ALG_ASYNC);
>
> Note, you can only use ciphers marked as blkcipher or cipher in /proc/crypto
> with that.
>
> If you want to use all symmetric cipher implementation, you must use the async
> skcipher operation which is identical to the previous ablkcipher API. An
> example is given in the crypto API documentation, such as http://
> www.chronox.de/crypto-API/Code.html#id-1.8.2
>
> Ciao
> Stephan
* Re: Moving from blkcipher to skcipher
From: Stephan Mueller @ 2016-10-03 19:22 UTC (permalink / raw)
To: Alex Cope; +Cc: linux-crypto, Michael Halcrow, Eric Biggers
In-Reply-To: <CA+cSK116PYracoP=tDhNg_q5+S-JLo+6H3rRAmuORErwW1EM5w@mail.gmail.com>
Am Montag, 3. Oktober 2016, 10:58:03 CEST schrieb Alex Cope:
Hi Alex,
> I was unclear in my initial message. I'm implementing a block cipher
> mode of operation. I'm hoping there is a another block cipher mode of
> operation that already uses skcipher, so I can use it as a reference
> and avoid re-inventing the wheel. In particular, it would be helpful
> if there is some implementation of a block cipher mode of operation
> that is directly above the underlying block cipher, like CTR or CBC,
> rather than something like CTS or rfc3686 which wrap around another
> block cipher mode of operation.
See gcm.c with the implementation referenced with crypto_gcm_base_tmpl or
crypto_gcm_tmpl -- it uses the aead API which is conceptually similar to the
skcipher API. And again you see the same concept applied as in the example I
provided below.
>
> On Mon, Oct 3, 2016 at 10:36 AM, Stephan Mueller <smueller@chronox.de> wrote:
> > Am Montag, 3. Oktober 2016, 10:06:23 CEST schrieb Alex Cope:
> >
> > Hi Alex,
> >
> >> I'm currently working on implementing HEH encryption, and am in the
> >> process of switching from the blkcipher interface to the skcipher
> >> interface. All the examples I have found that use skcipher are
> >> wrapping another mode of operation I.E. cts in cts(cbc(aes)) rather
> >> than being directly above the block cipher I.E. ctr in ctr(aes). Are
> >> there any existing examples of the latter type that I could use as a
> >> reference? If not, is there an estimate on when that work will be
> >> available?
> >
> > The issue is that a blkcipher is a synchronous version of the skcipher.
> > So,
> > you could easily move from blkcipher to skcipher and just rename the
> > invoked API, provided you change the initialization to the following
> > which triggers a synchronous operation:
> >
> > tfm = crypto_alloc_skcipher(kccavs_test->name, 0, CRYPTO_ALG_ASYNC);
> >
> > Note, you can only use ciphers marked as blkcipher or cipher in
> > /proc/crypto with that.
> >
> > If you want to use all symmetric cipher implementation, you must use the
> > async skcipher operation which is identical to the previous ablkcipher
> > API. An example is given in the crypto API documentation, such as http://
> > www.chronox.de/crypto-API/Code.html#id-1.8.2
> >
> > Ciao
> > Stephan
Ciao
Stephan
* RE: sha1_mb broken
From: Dey, Megha @ 2016-10-04 0:25 UTC (permalink / raw)
To: Stephan Mueller; +Cc: linux-crypto@vger.kernel.org, tim.c.chen@linux.intel.com
In-Reply-To: <9643256.KgnRM79R5B@positron.chronox.de>
-----Original Message-----
From: Stephan Mueller [mailto:smueller@chronox.de]
Sent: Wednesday, September 28, 2016 10:31 PM
To: Dey, Megha <megha.dey@intel.com>
Cc: linux-crypto@vger.kernel.org; tim.c.chen@linux.intel.com
Subject: Re: sha1_mb broken
Am Mittwoch, 28. September 2016, 22:52:46 CEST schrieb Dey, Megha:
Hi Megha,
see a self contained example code attached.
> Hi Stephan,
>
> Your test code initialized the completion structure at the wrong point, which led to the completion being missed. The init_completion call should be made before the crypto_ahash_digest call. The following change to your test code fixes things. (I have also fixed what I believe is a typo: aead instead of ahash.)
> @@ -74,6 +74,8 @@ static unsigned int kccavs_ahash_op(struct kccavs_ahash_def *ahash)
> {
> int rc = 0;
>
> + init_completion(&ahash->result.completion);
> +
> rc = crypto_ahash_digest(ahash->req);
>
> switch (rc) {
>@@ -84,7 +86,7 @@ static unsigned int kccavs_ahash_op(struct kccavs_ahash_def *ahash)
> rc = wait_for_completion_interruptible(&ahash->result.completion);
> if (!rc && !ahash->result.err) {
> #ifdef OLDASYNC
> - INIT_COMPLETION(aead->result.completion);
> + INIT_COMPLETION(&ahash->result.completion);
> #else
> reinit_completion(&ahash->result.completion);
> #endif
> @@ -95,7 +97,6 @@ static unsigned int kccavs_ahash_op(struct kccavs_ahash_def *ahash)
> " %d\n",rc, ahash->result.err);
> break;
> }
> - init_completion(&ahash->result.completion);
>
> return rc;
>}
> This initialization of the completion structure happens correctly in the tcrypt test module used by the kernel, hence I did not come across this issue earlier.
> Thanks,
> Megha
Ciao
Stephan
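The ordering problem described above can be sketched in userspace C. This is
a simplified model under stated assumptions, not the test module itself: the
struct and every sketch_* function are hypothetical stand-ins, and the submit
call is assumed to run its completion callback before it returns. It shows
one way a late initialization can lose a completion; it does not claim to
explain why only some implementations trigger the problem.

```c
#include <stdbool.h>

/* Minimal stand-in for the kernel's struct completion (hypothetical). */
struct sketch_completion {
	unsigned int done;
};

void sketch_init_completion(struct sketch_completion *c)
{
	c->done = 0;
}

void sketch_complete(struct sketch_completion *c)
{
	c->done++;
}

/* True if a waiter would return instead of sleeping forever. */
bool sketch_would_wake(const struct sketch_completion *c)
{
	return c->done > 0;
}

/* Models a digest call whose callback fires before the call returns. */
void sketch_submit(struct sketch_completion *c)
{
	sketch_complete(c);
}

/* Correct order: initialize first, then submit. */
bool sketch_init_before_submit(void)
{
	struct sketch_completion c;

	sketch_init_completion(&c);
	sketch_submit(&c);
	return sketch_would_wake(&c);
}

/* Wrong order: the late init wipes the completion that already arrived. */
bool sketch_init_after_submit(void)
{
	struct sketch_completion c = { 0 };

	sketch_submit(&c);
	sketch_init_completion(&c);
	return sketch_would_wake(&c);
}
```

In this model sketch_init_before_submit() reports that the waiter wakes up,
while sketch_init_after_submit() reports that it would wait forever.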
* Re: [PATCH] crypto: sha1-powerpc: little-endian support
From: Michael Ellerman @ 2016-10-04 6:23 UTC (permalink / raw)
To: Marcelo Cerri, Herbert Xu
Cc: linux-crypto, linuxppc-dev, linux-kernel, Benjamin Herrenschmidt,
Paul Mackerras, George Wilson, Claudio Carvalho,
Paulo Flabiano Smorigo, joy.latten, David S. Miller
In-Reply-To: <20160928132718.GB31082@gallifrey>
Marcelo Cerri <marcelo.cerri@canonical.com> writes:
> [ Unknown signature status ]
> On Wed, Sep 28, 2016 at 09:20:15PM +0800, Herbert Xu wrote:
>> On Wed, Sep 28, 2016 at 10:15:51AM -0300, Marcelo Cerri wrote:
>> > Hi Herbert,
>> >
>> > Any thoughts on this one?
>>
>> Can this patch wait until the next merge window? On the broken
>> platforms it should just fail the self-test, right?
>
> Yes. It fails on any LE platform (including Ubuntu and RHEL 7.1).
How are you testing this? I thought I was running the crypto tests but
I've never seen this fail.
cheers
* Re: [PATCH] crypto: sha1-powerpc: little-endian support
From: Marcelo Cerri @ 2016-10-04 12:07 UTC (permalink / raw)
To: Michael Ellerman
Cc: Herbert Xu, linux-crypto, linuxppc-dev, linux-kernel,
Benjamin Herrenschmidt, Paul Mackerras, George Wilson,
Claudio Carvalho, Paulo Flabiano Smorigo, joy.latten,
David S. Miller
In-Reply-To: <87pong4oq3.fsf@concordia.ellerman.id.au>
Hi Michael,
On Ubuntu, CRYPTO_MANAGER_DISABLE_TESTS is set by default. So I had to
disable this config in order to make sha1-powerpc fail in the crypto API
tests. However, even with tests disabled, any usage of sha1-powerpc
should result in incorrect results.
--
Regards,
Marcelo
On Tue, Oct 04, 2016 at 05:23:16PM +1100, Michael Ellerman wrote:
> Marcelo Cerri <marcelo.cerri@canonical.com> writes:
>
> > [ Unknown signature status ]
> > On Wed, Sep 28, 2016 at 09:20:15PM +0800, Herbert Xu wrote:
> >> On Wed, Sep 28, 2016 at 10:15:51AM -0300, Marcelo Cerri wrote:
> >> > Hi Herbert,
> >> >
> >> > Any thoughts on this one?
> >>
> >> Can this patch wait until the next merge window? On the broken
> >> platforms it should just fail the self-test, right?
> >
> > Yes. It fails on any LE platform (including Ubuntu and RHEL 7.1).
>
> How are you testing this? I thought I was running the crypto tests but
> I've never seen this fail.
>
> cheers
* [PATCH v3 2/2] crypto: marvell - Don't break chain for computable last ahash requests
From: Romain Perier @ 2016-10-04 12:57 UTC (permalink / raw)
To: Boris Brezillon, Arnaud Ebalard
Cc: David S. Miller, Herbert Xu, Thomas Petazzoni, Jason Cooper,
Andrew Lunn, Sebastian Hesselbarth, Gregory Clement, Nadav Haklai,
Ofer Heifetz, linux-crypto, linux-arm-kernel
In-Reply-To: <20161004125720.3347-1-romain.perier@free-electrons.com>
Currently, the driver breaks the chain for all kinds of hash requests in order
not to overwrite intermediate states of partial ahash updates. However, some final
ahash requests can be processed directly by the engine, and so have no
intermediate state. This is typically the case for most of the HMAC requests
processed via IPSec.
This commit adds a TDMA descriptor to copy the context for these requests
into the "op" dma pool, which allows these requests to be chained at the DMA level.
The 'complete' operation is also updated to retrieve the MAC digest from the
right location.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---
Changes in v3:
- Copy the whole context back to RAM and not just the digest. Also
fixed a rebase issue ^^ (whoops)
Changes in v2:
- Replaced BUG_ON by an error
- Add a variable "break_chain", with "type" to break the chain
with ahash requests. It improves code readability.
drivers/crypto/marvell/hash.c | 79 +++++++++++++++++++++++++++++++++++--------
1 file changed, 64 insertions(+), 15 deletions(-)
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 9f28468..b36f196 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -312,24 +312,53 @@ static void mv_cesa_ahash_complete(struct crypto_async_request *req)
int i;
digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq));
- for (i = 0; i < digsize / 4; i++)
- creq->state[i] = readl_relaxed(engine->regs + CESA_IVDIG(i));
- if (creq->last_req) {
+ if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ &&
+ !(creq->base.chain.last->flags & CESA_TDMA_BREAK_CHAIN)) {
+ struct mv_cesa_tdma_desc *tdma = NULL;
+ __le32 *data = NULL;
+
+ for (tdma = creq->base.chain.first; tdma; tdma = tdma->next) {
+ u32 type = tdma->flags & CESA_TDMA_TYPE_MSK;
+ if (type == CESA_TDMA_RESULT)
+ break;
+ }
+
+ if (!tdma) {
+ dev_err(cesa_dev->dev, "Failed to retrieve tdma "
+ "descriptor for outer data\n");
+ return;
+ }
+
/*
- * Hardware's MD5 digest is in little endian format, but
- * SHA in big endian format
+ * Result is already in the correct endianess when the SA is
+ * used
*/
- if (creq->algo_le) {
- __le32 *result = (void *)ahashreq->result;
+ data = tdma->op->ctx.hash.hash;
+ for (i = 0; i < digsize / 4; i++)
+ creq->state[i] = cpu_to_le32(data[i]);
- for (i = 0; i < digsize / 4; i++)
- result[i] = cpu_to_le32(creq->state[i]);
- } else {
- __be32 *result = (void *)ahashreq->result;
+ memcpy(ahashreq->result, data, digsize);
+ } else {
+ for (i = 0; i < digsize / 4; i++)
+ creq->state[i] = readl_relaxed(engine->regs +
+ CESA_IVDIG(i));
+ if (creq->last_req) {
+ /*
+ * Hardware's MD5 digest is in little endian format, but
+ * SHA in big endian format
+ */
+ if (creq->algo_le) {
+ __le32 *result = (void *)ahashreq->result;
+
+ for (i = 0; i < digsize / 4; i++)
+ result[i] = cpu_to_le32(creq->state[i]);
+ } else {
+ __be32 *result = (void *)ahashreq->result;
- for (i = 0; i < digsize / 4; i++)
- result[i] = cpu_to_be32(creq->state[i]);
+ for (i = 0; i < digsize / 4; i++)
+ result[i] = cpu_to_be32(creq->state[i]);
+ }
}
}
@@ -504,6 +533,12 @@ mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain,
CESA_SA_DESC_CFG_LAST_FRAG,
CESA_SA_DESC_CFG_FRAG_MSK);
+ ret = mv_cesa_dma_add_result_op(chain,
+ CESA_SA_CFG_SRAM_OFFSET,
+ CESA_SA_DATA_SRAM_OFFSET,
+ CESA_TDMA_SRC_IN_SRAM, flags);
+ if (ret)
+ return ERR_PTR(-ENOMEM);
return op;
}
@@ -564,6 +599,8 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
struct mv_cesa_op_ctx *op = NULL;
unsigned int frag_len;
int ret;
+ u32 type;
+ bool break_chain = true;
basereq->chain.first = NULL;
basereq->chain.last = NULL;
@@ -635,6 +672,16 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
goto err_free_tdma;
}
+ /*
+ * If results are copied via DMA, this means that this
+ * request can be directly processed by the engine,
+ * without partial updates. So we can chain it at the
+ * DMA level with other requests.
+ */
+ type = basereq->chain.last->flags & CESA_TDMA_TYPE_MSK;
+ if (type == CESA_TDMA_RESULT)
+ break_chain = false;
+
if (op) {
/* Add dummy desc to wait for crypto operation end */
ret = mv_cesa_dma_add_dummy_end(&basereq->chain, flags);
@@ -648,8 +695,10 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
else
creq->cache_ptr = 0;
- basereq->chain.last->flags |= (CESA_TDMA_END_OF_REQ |
- CESA_TDMA_BREAK_CHAIN);
+ basereq->chain.last->flags |= CESA_TDMA_END_OF_REQ;
+
+ if (break_chain)
+ basereq->chain.last->flags |= CESA_TDMA_BREAK_CHAIN;
return 0;
--
2.9.3
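The flag decision made at the end of mv_cesa_ahash_dma_req_init() in the
patch above can be sketched as a small userspace function. This is only a
model: the SK_* bit positions and the type mask are assumptions chosen for
illustration (only the type value 3 for a result descriptor comes from this
series), and sketch_finish_flags is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical flag layout; the real values live in cesa.h. */
#define SK_TDMA_TYPE_MSK    0x0fffffffu
#define SK_TDMA_BREAK_CHAIN (1u << 28)
#define SK_TDMA_END_OF_REQ  (1u << 29)
#define SK_TDMA_RESULT      3u /* type value used by this series */

/*
 * checked_flags: flags of the descriptor inspected before the optional
 *                dummy descriptor is appended.
 * last_flags:    flags of the final descriptor in the chain.
 * Returns the final descriptor's flags after the request is closed.
 */
uint32_t sketch_finish_flags(uint32_t checked_flags, uint32_t last_flags)
{
	uint32_t type = checked_flags & SK_TDMA_TYPE_MSK;

	/* Every request is marked as ended. */
	last_flags |= SK_TDMA_END_OF_REQ;

	/* Only requests without a DMA result copy still break the chain. */
	if (type != SK_TDMA_RESULT)
		last_flags |= SK_TDMA_BREAK_CHAIN;

	return last_flags;
}
```

A request whose inspected descriptor is of the result type ends up with only
the end-of-request flag set, so it can be chained with the next request at
the DMA level. Any other request keeps the break-chain flag as before.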
* [PATCH v3 1/2] crypto: marvell - Use an unique pool to copy results of requests
From: Romain Perier @ 2016-10-04 12:57 UTC (permalink / raw)
To: Boris Brezillon, Arnaud Ebalard
Cc: David S. Miller, Herbert Xu, Thomas Petazzoni, Jason Cooper,
Andrew Lunn, Sebastian Hesselbarth, Gregory Clement, Nadav Haklai,
Ofer Heifetz, linux-crypto, linux-arm-kernel
In-Reply-To: <20161004125720.3347-1-romain.perier@free-electrons.com>
So far, we used a dedicated dma pool to copy the result of outer IV for
cipher requests. Instead of using a dma pool per outer data, we prefer to
use the op dma pool, which contains all parts of the request from the SRAM.
Then, the outer data that is likely to be used by the 'complete'
operation is copied later. In this way, any type of result can be
retrieved by DMA for cipher or ahash requests.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---
Changes in v3:
- Don't allocate a new op ctx for the last tdma descriptor. Instead
we point to the last op ctx in the tdma chain, and copy the context
of the current request to this location.
Changes in v2:
- Use the dma pool "op" to retrieve outer data instead of introducing
a new one.
drivers/crypto/marvell/cesa.c | 4 ----
drivers/crypto/marvell/cesa.h | 5 ++---
drivers/crypto/marvell/cipher.c | 8 +++++---
drivers/crypto/marvell/tdma.c | 28 ++++++++++++++--------------
4 files changed, 21 insertions(+), 24 deletions(-)
diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index 37dadb2..6e7a5c7 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -375,10 +375,6 @@ static int mv_cesa_dev_dma_init(struct mv_cesa_dev *cesa)
if (!dma->padding_pool)
return -ENOMEM;
- dma->iv_pool = dmam_pool_create("cesa_iv", dev, 16, 1, 0);
- if (!dma->iv_pool)
- return -ENOMEM;
-
cesa->dma = dma;
return 0;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index e423d33..a768da7 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -277,7 +277,7 @@ struct mv_cesa_op_ctx {
#define CESA_TDMA_DUMMY 0
#define CESA_TDMA_DATA 1
#define CESA_TDMA_OP 2
-#define CESA_TDMA_IV 3
+#define CESA_TDMA_RESULT 3
/**
* struct mv_cesa_tdma_desc - TDMA descriptor
@@ -393,7 +393,6 @@ struct mv_cesa_dev_dma {
struct dma_pool *op_pool;
struct dma_pool *cache_pool;
struct dma_pool *padding_pool;
- struct dma_pool *iv_pool;
};
/**
@@ -839,7 +838,7 @@ mv_cesa_tdma_desc_iter_init(struct mv_cesa_tdma_chain *chain)
memset(chain, 0, sizeof(*chain));
}
-int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
+int mv_cesa_dma_add_result_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
u32 size, u32 flags, gfp_t gfp_flags);
struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index d19dc96..098871a 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -212,7 +212,8 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
struct mv_cesa_req *basereq;
basereq = &creq->base;
- memcpy(ablkreq->info, basereq->chain.last->data, ivsize);
+ memcpy(ablkreq->info, basereq->chain.last->op->ctx.blkcipher.iv,
+ ivsize);
} else {
memcpy_fromio(ablkreq->info,
engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
@@ -373,8 +374,9 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
/* Add output data for IV */
ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
- ret = mv_cesa_dma_add_iv_op(&basereq->chain, CESA_SA_CRYPT_IV_SRAM_OFFSET,
- ivsize, CESA_TDMA_SRC_IN_SRAM, flags);
+ ret = mv_cesa_dma_add_result_op(&basereq->chain, CESA_SA_CFG_SRAM_OFFSET,
+ CESA_SA_DATA_SRAM_OFFSET,
+ CESA_TDMA_SRC_IN_SRAM, flags);
if (ret)
goto err_free_tdma;
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 9fd7a5f..991dc3f 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -69,9 +69,6 @@ void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq)
if (type == CESA_TDMA_OP)
dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
le32_to_cpu(tdma->src));
- else if (type == CESA_TDMA_IV)
- dma_pool_free(cesa_dev->dma->iv_pool, tdma->data,
- le32_to_cpu(tdma->dst));
tdma = tdma->next;
dma_pool_free(cesa_dev->dma->tdma_desc_pool, old_tdma,
@@ -209,29 +206,32 @@ mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
return new_tdma;
}
-int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
+int mv_cesa_dma_add_result_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
u32 size, u32 flags, gfp_t gfp_flags)
{
-
- struct mv_cesa_tdma_desc *tdma;
- u8 *iv;
- dma_addr_t dma_handle;
+ struct mv_cesa_tdma_desc *tdma, *op_desc;
tdma = mv_cesa_dma_add_desc(chain, gfp_flags);
if (IS_ERR(tdma))
return PTR_ERR(tdma);
- iv = dma_pool_alloc(cesa_dev->dma->iv_pool, gfp_flags, &dma_handle);
- if (!iv)
- return -ENOMEM;
+ for (op_desc = chain->first; op_desc; op_desc = op_desc->next) {
+ u32 type = op_desc->flags & CESA_TDMA_TYPE_MSK;
+
+ if (type == CESA_TDMA_OP)
+ break;
+ }
+
+ if (!op_desc)
+ return -EIO;
tdma->byte_cnt = cpu_to_le32(size | BIT(31));
tdma->src = src;
- tdma->dst = cpu_to_le32(dma_handle);
- tdma->data = iv;
+ tdma->dst = op_desc->src;
+ tdma->op = op_desc->op;
flags &= (CESA_TDMA_DST_IN_SRAM | CESA_TDMA_SRC_IN_SRAM);
- tdma->flags = flags | CESA_TDMA_IV;
+ tdma->flags = flags | CESA_TDMA_RESULT;
return 0;
}
--
2.9.3
* [PATCH v3 0/2] Improve DMA chaining for ahash requests
From: Romain Perier @ 2016-10-04 12:57 UTC (permalink / raw)
To: Boris Brezillon, Arnaud Ebalard
Cc: David S. Miller, Herbert Xu, Thomas Petazzoni, Jason Cooper,
Andrew Lunn, Sebastian Hesselbarth, Gregory Clement, Nadav Haklai,
Ofer Heifetz, linux-crypto, linux-arm-kernel
This series contains performance improvements for ahash requests.
So far, ahash requests were systematically not chained at the DMA level.
However, in some cases, as when using IPSec, some ahash
requests can be processed directly by the engine and have no
intermediate partial update states.
This series first reworks the way outer IVs are copied from the SRAM
into the dma pool. To do so, we introduce a common dma pool, for all types
of requests, that contains outer results (like IV or digest). Then, for
ahash requests that can be processed directly by the engine, outer
results are copied from the SRAM into the common dma pool. These requests
are then allowed to be chained at the DMA level.
Benchmarking results with iperf through IPSec
=============================================
               ESP            AH
Before         343 Mbits/s    492 Mbits/s
After          422 Mbits/s    577 Mbits/s
Improvement    +23%           +17%
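The improvement figures follow from the before/after throughput numbers. A
minimal sketch of the arithmetic, assuming the percentages are truncated to
whole numbers (sketch_gain_percent is a hypothetical helper, not part of the
series):

```c
/* Relative gain in whole percent, truncated toward zero. */
int sketch_gain_percent(int before, int after)
{
	return (after - before) * 100 / before;
}
```

For ESP, (422 - 343) * 100 / 343 gives 23, and for AH,
(577 - 492) * 100 / 492 gives 17.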
Romain Perier (2):
crypto: marvell - Use an unique pool to copy results of requests
crypto: marvell - Don't break chain for computable last ahash requests
drivers/crypto/marvell/cesa.c | 4 ---
drivers/crypto/marvell/cesa.h | 5 ++-
drivers/crypto/marvell/cipher.c | 8 +++--
drivers/crypto/marvell/hash.c | 79 +++++++++++++++++++++++++++++++++--------
drivers/crypto/marvell/tdma.c | 28 +++++++--------
5 files changed, 85 insertions(+), 39 deletions(-)
--
2.9.3
* Re: [PATCH v3 1/2] crypto: marvell - Use an unique pool to copy results of requests
From: Boris Brezillon @ 2016-10-04 13:17 UTC (permalink / raw)
To: Romain Perier
Cc: Arnaud Ebalard, David S. Miller, Herbert Xu, Thomas Petazzoni,
Jason Cooper, Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
Nadav Haklai, Ofer Heifetz, linux-crypto, linux-arm-kernel
In-Reply-To: <20161004125720.3347-2-romain.perier@free-electrons.com>
On Tue, 4 Oct 2016 14:57:19 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:
> So far, we used a dedicated dma pool to copy the result of outer IV for
> cipher requests. Instead of using a dma pool per outer data, we prefer
> use the op dma pool that contains all part of the request from the SRAM.
> Then, the outer data that is likely to be used by the 'complete'
> operation, is copied later. In this way, any type of result can be
> retrieved by DMA for cipher or ahash requests.
>
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
> ---
>
> Changes in v3:
> - Don't allocate a new op ctx for the last tdma descriptor. Instead
> we point to the last op ctx in the tdma chain, and copy the context
> of the current request to this location.
>
> Changes in v2:
> - Use the dma pool "op" to retrieve outer data intead of introducing
> a new one.
>
> drivers/crypto/marvell/cesa.c | 4 ----
> drivers/crypto/marvell/cesa.h | 5 ++---
> drivers/crypto/marvell/cipher.c | 8 +++++---
> drivers/crypto/marvell/tdma.c | 28 ++++++++++++++--------------
> 4 files changed, 21 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
> index 37dadb2..6e7a5c7 100644
> --- a/drivers/crypto/marvell/cesa.c
> +++ b/drivers/crypto/marvell/cesa.c
> @@ -375,10 +375,6 @@ static int mv_cesa_dev_dma_init(struct mv_cesa_dev *cesa)
> if (!dma->padding_pool)
> return -ENOMEM;
>
> - dma->iv_pool = dmam_pool_create("cesa_iv", dev, 16, 1, 0);
> - if (!dma->iv_pool)
> - return -ENOMEM;
> -
> cesa->dma = dma;
>
> return 0;
> diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
> index e423d33..a768da7 100644
> --- a/drivers/crypto/marvell/cesa.h
> +++ b/drivers/crypto/marvell/cesa.h
> @@ -277,7 +277,7 @@ struct mv_cesa_op_ctx {
> #define CESA_TDMA_DUMMY 0
> #define CESA_TDMA_DATA 1
> #define CESA_TDMA_OP 2
> -#define CESA_TDMA_IV 3
> +#define CESA_TDMA_RESULT 3
>
> /**
> * struct mv_cesa_tdma_desc - TDMA descriptor
> @@ -393,7 +393,6 @@ struct mv_cesa_dev_dma {
> struct dma_pool *op_pool;
> struct dma_pool *cache_pool;
> struct dma_pool *padding_pool;
> - struct dma_pool *iv_pool;
> };
>
> /**
> @@ -839,7 +838,7 @@ mv_cesa_tdma_desc_iter_init(struct mv_cesa_tdma_chain *chain)
> memset(chain, 0, sizeof(*chain));
> }
>
> -int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
> +int mv_cesa_dma_add_result_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
> u32 size, u32 flags, gfp_t gfp_flags);
>
> struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
> diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
> index d19dc96..098871a 100644
> --- a/drivers/crypto/marvell/cipher.c
> +++ b/drivers/crypto/marvell/cipher.c
> @@ -212,7 +212,8 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
> struct mv_cesa_req *basereq;
>
> basereq = &creq->base;
> - memcpy(ablkreq->info, basereq->chain.last->data, ivsize);
> + memcpy(ablkreq->info, basereq->chain.last->op->ctx.blkcipher.iv,
> + ivsize);
> } else {
> memcpy_fromio(ablkreq->info,
> engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
> @@ -373,8 +374,9 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
>
> /* Add output data for IV */
> ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
> - ret = mv_cesa_dma_add_iv_op(&basereq->chain, CESA_SA_CRYPT_IV_SRAM_OFFSET,
> - ivsize, CESA_TDMA_SRC_IN_SRAM, flags);
> + ret = mv_cesa_dma_add_result_op(&basereq->chain, CESA_SA_CFG_SRAM_OFFSET,
> + CESA_SA_DATA_SRAM_OFFSET,
> + CESA_TDMA_SRC_IN_SRAM, flags);
>
> if (ret)
> goto err_free_tdma;
> diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
> index 9fd7a5f..991dc3f 100644
> --- a/drivers/crypto/marvell/tdma.c
> +++ b/drivers/crypto/marvell/tdma.c
> @@ -69,9 +69,6 @@ void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq)
> if (type == CESA_TDMA_OP)
> dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
> le32_to_cpu(tdma->src));
> - else if (type == CESA_TDMA_IV)
> - dma_pool_free(cesa_dev->dma->iv_pool, tdma->data,
> - le32_to_cpu(tdma->dst));
>
> tdma = tdma->next;
> dma_pool_free(cesa_dev->dma->tdma_desc_pool, old_tdma,
> @@ -209,29 +206,32 @@ mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
> return new_tdma;
> }
>
> -int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
> +int mv_cesa_dma_add_result_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
> u32 size, u32 flags, gfp_t gfp_flags)
> {
> -
> - struct mv_cesa_tdma_desc *tdma;
> - u8 *iv;
> - dma_addr_t dma_handle;
> + struct mv_cesa_tdma_desc *tdma, *op_desc;
>
> tdma = mv_cesa_dma_add_desc(chain, gfp_flags);
> if (IS_ERR(tdma))
> return PTR_ERR(tdma);
>
> - iv = dma_pool_alloc(cesa_dev->dma->iv_pool, gfp_flags, &dma_handle);
> - if (!iv)
> - return -ENOMEM;
Can you add a comment explaining what you're doing here?
/* We re-use an existing op_desc object to retrieve the context
* and result instead of allocating a new one.
* There is at least one object of this type in a CESA crypto
* req, just pick the first one in the chain.
*/
Once this is addressed, you can add my
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
> + for (op_desc = chain->first; op_desc; op_desc = op_desc->next) {
> + u32 type = op_desc->flags & CESA_TDMA_TYPE_MSK;
> +
> + if (type == CESA_TDMA_OP)
> + break;
> + }
> +
> + if (!op_desc)
> + return -EIO;
>
> tdma->byte_cnt = cpu_to_le32(size | BIT(31));
> tdma->src = src;
> - tdma->dst = cpu_to_le32(dma_handle);
> - tdma->data = iv;
> + tdma->dst = op_desc->src;
> + tdma->op = op_desc->op;
>
> flags &= (CESA_TDMA_DST_IN_SRAM | CESA_TDMA_SRC_IN_SRAM);
> - tdma->flags = flags | CESA_TDMA_IV;
> + tdma->flags = flags | CESA_TDMA_RESULT;
> return 0;
> }
>
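The lookup that the requested comment describes can be sketched as a
standalone list walk. This is a userspace model with assumed names: struct
sketch_desc, sketch_find_type and the mask value are hypothetical, and only
the type numbers (1 for data, 2 for op, 3 for result) come from the header
changes quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type mask; the real one is defined in cesa.h. */
#define SK_TDMA_TYPE_MSK 0x0fffffffu
#define SK_TDMA_OP       2u

/* Reduced stand-in for a TDMA descriptor: flags plus the chain link. */
struct sketch_desc {
	uint32_t flags;
	struct sketch_desc *next;
};

/*
 * Walk the chain from its first element and return the first descriptor
 * whose type field equals `wanted`, or NULL when there is none.
 */
struct sketch_desc *sketch_find_type(struct sketch_desc *first, uint32_t wanted)
{
	struct sketch_desc *d;

	for (d = first; d; d = d->next)
		if ((d->flags & SK_TDMA_TYPE_MSK) == wanted)
			break;

	return d;
}
```

Returning NULL for a chain without a matching descriptor corresponds to the
-EIO path in the patch, where no op descriptor is found to re-use.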
* [PATCH] crypto: caam: add support for iMX6UL
From: Marcus Folkesson @ 2016-10-04 13:32 UTC (permalink / raw)
To: herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q,
davem-fT/PcQaiUtIeIZ0/mPfg9Q, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
mark.rutland-5wv7dgnIgG8, horia.geanta-3arQi8VN3Tc,
tudor-dan.ambarus-3arQi8VN3Tc,
marcus.folkesson-Re5JQEeQqe8AvxtiuMwx3w,
alexandru.porosanu-3arQi8VN3Tc, arnd-r2nGTMty4D4
Cc: linux-crypto-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
i.MX6UL requires only three clocks to enable the CAAM module.
Signed-off-by: Marcus Folkesson <marcus.folkesson-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
.../devicetree/bindings/crypto/fsl-sec4.txt | 20 +++++++++++++
drivers/crypto/caam/ctrl.c | 35 ++++++++++++----------
2 files changed, 40 insertions(+), 15 deletions(-)
diff --git a/Documentation/devicetree/bindings/crypto/fsl-sec4.txt b/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
index adeca34..6617e24 100644
--- a/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
+++ b/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
@@ -123,6 +123,9 @@ PROPERTIES
EXAMPLE
+
+i.MX6Q/DL/SX requires four clocks
+
crypto@300000 {
compatible = "fsl,sec-v4.0";
fsl,sec-era = <2>;
@@ -139,6 +142,23 @@ EXAMPLE
clock-names = "mem", "aclk", "ipg", "emi_slow";
};
+
+i.MX6UL does only require three clocks
+
+ crypto: caam@2140000 {
+ compatible = "fsl,sec-v4.0";
+ #address-cells = <1>;
+ #size-cells = <1>;
+ reg = <0x2140000 0x3c000>;
+ ranges = <0 0x2140000 0x3c000>;
+ interrupts = <GIC_SPI 48 IRQ_TYPE_LEVEL_HIGH>;
+
+ clocks = <&clks IMX6UL_CLK_CAAM_MEM>,
+ <&clks IMX6UL_CLK_CAAM_ACLK>,
+ <&clks IMX6UL_CLK_CAAM_IPG>;
+ clock-names = "mem", "aclk", "ipg";
+ };
+
=====================================================================
Job Ring (JR) Node
diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index 0ec112e..5abaf37 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -329,8 +329,8 @@ static int caam_remove(struct platform_device *pdev)
clk_disable_unprepare(ctrlpriv->caam_ipg);
clk_disable_unprepare(ctrlpriv->caam_mem);
clk_disable_unprepare(ctrlpriv->caam_aclk);
- clk_disable_unprepare(ctrlpriv->caam_emi_slow);
-
+ if (!of_machine_is_compatible("fsl,imx6ul"))
+ clk_disable_unprepare(ctrlpriv->caam_emi_slow);
return 0;
}
@@ -481,14 +481,16 @@ static int caam_probe(struct platform_device *pdev)
}
ctrlpriv->caam_aclk = clk;
- clk = caam_drv_identify_clk(&pdev->dev, "emi_slow");
- if (IS_ERR(clk)) {
- ret = PTR_ERR(clk);
- dev_err(&pdev->dev,
- "can't identify CAAM emi_slow clk: %d\n", ret);
- return ret;
+ if (!of_machine_is_compatible("fsl,imx6ul")) {
+ clk = caam_drv_identify_clk(&pdev->dev, "emi_slow");
+ if (IS_ERR(clk)) {
+ ret = PTR_ERR(clk);
+ dev_err(&pdev->dev,
+ "can't identify CAAM emi_slow clk: %d\n", ret);
+ return ret;
+ }
+ ctrlpriv->caam_emi_slow = clk;
}
- ctrlpriv->caam_emi_slow = clk;
ret = clk_prepare_enable(ctrlpriv->caam_ipg);
if (ret < 0) {
@@ -509,11 +511,13 @@ static int caam_probe(struct platform_device *pdev)
goto disable_caam_mem;
}
- ret = clk_prepare_enable(ctrlpriv->caam_emi_slow);
- if (ret < 0) {
- dev_err(&pdev->dev, "can't enable CAAM emi slow clock: %d\n",
- ret);
- goto disable_caam_aclk;
+ if (!of_machine_is_compatible("fsl,imx6ul")) {
+ ret = clk_prepare_enable(ctrlpriv->caam_emi_slow);
+ if (ret < 0) {
+ dev_err(&pdev->dev, "can't enable CAAM emi slow clock: %d\n",
+ ret);
+ goto disable_caam_aclk;
+ }
}
/* Get configuration properties from device tree */
@@ -829,7 +833,8 @@ caam_remove:
iounmap_ctrl:
iounmap(ctrl);
disable_caam_emi_slow:
- clk_disable_unprepare(ctrlpriv->caam_emi_slow);
+ if (!of_machine_is_compatible("fsl,imx6ul"))
+ clk_disable_unprepare(ctrlpriv->caam_emi_slow);
disable_caam_aclk:
clk_disable_unprepare(ctrlpriv->caam_aclk);
disable_caam_mem:
--
2.8.0
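The idea of skipping a clock that a given SoC does not have, while still
unwinding cleanly on failure, can be sketched in userspace C. Note that this
is a different structure from the patch: the patch tests the machine
compatible string at each site, whereas this model assumes a per-clock
"present" flag. All sketch_* names and the "fail" field are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical clock model: `fail` forces the enable step to fail. */
struct sketch_clk {
	bool present;
	bool enabled;
	bool fail;
};

int sketch_clk_enable(struct sketch_clk *clk)
{
	if (clk->fail)
		return -1;
	clk->enabled = true;
	return 0;
}

void sketch_clk_disable(struct sketch_clk *clk)
{
	clk->enabled = false;
}

/*
 * Enable every present clock in order. On failure, disable the clocks
 * that were already enabled, in reverse order, and report the error.
 */
int sketch_enable_all(struct sketch_clk *clks, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!clks[i].present)
			continue; /* e.g. no emi_slow clock on this SoC */
		if (sketch_clk_enable(&clks[i]) < 0)
			goto unwind;
	}
	return 0;

unwind:
	while (i-- > 0)
		if (clks[i].present)
			sketch_clk_disable(&clks[i]);
	return -1;
}
```

With three present clocks and an absent fourth, all three are enabled and
the fourth is left untouched. If an enable fails midway, the clocks enabled
before it are switched off again.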
* Re: sha1_mb broken
From: Stephan Mueller @ 2016-10-04 14:10 UTC (permalink / raw)
To: Dey, Megha; +Cc: linux-crypto@vger.kernel.org, tim.c.chen@linux.intel.com
In-Reply-To: <C440BA31B54DCD4AAC682D2365C8FEE703CEF891@ORSMSX111.amr.corp.intel.com>
Am Dienstag, 4. Oktober 2016, 00:25:07 CEST schrieb Dey, Megha:
Hi Megha,
>
> > Hi Stephan,
> >
> > Your test code initialized the completion structure incorrectly, that led
> > to the missing completion from being received. The init_completion call
> > should be made before the crypto_ahash_digest call. The following change
Thanks a lot for pointing that one out. Can you help me understand why your
code trips over that issue whereas other ahash implementations do not (all
other SHA-1 or SHA-2 implementations work perfectly fine with that code)?
Thanks again!
Ciao
Stephan
^ permalink raw reply
* Re: [PATCH v3 2/2] crypto: marvell - Don't break chain for computable last ahash requests
From: Boris Brezillon @ 2016-10-04 14:14 UTC (permalink / raw)
To: Romain Perier
Cc: Arnaud Ebalard, David S. Miller, Herbert Xu, Thomas Petazzoni,
Jason Cooper, Andrew Lunn, Sebastian Hesselbarth, Gregory Clement,
Nadav Haklai, Ofer Heifetz, linux-crypto, linux-arm-kernel
In-Reply-To: <20161004125720.3347-3-romain.perier@free-electrons.com>
On Tue, 4 Oct 2016 14:57:20 +0200
Romain Perier <romain.perier@free-electrons.com> wrote:
> Currently, the driver breaks the chain for all kinds of hash requests in order
> not to override intermediate states of partial ahash updates. However, some
> final ahash requests can be processed directly by the engine, and so without
> intermediate state. This is typically the case for most of the HMAC requests
> processed via IPSec.
>
> This commit adds a TDMA descriptor to copy the context for these requests
> into the "op" dma pool, which then allows these requests to be chained at the
> DMA level. The 'complete' operation is also updated to retrieve the MAC digest
> from the right location.
>
> Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
> ---
>
> Changes in v3:
> - Copy the whole context back to RAM and not just the digest. Also
> fixed a rebase issue ^^ (whoops)
>
> Changes in v2:
> - Replaced BUG_ON by an error
> - Add a variable "break_chain", with "type" to break the chain
> with ahash requests. It improves code readability.
>
> drivers/crypto/marvell/hash.c | 79 +++++++++++++++++++++++++++++++++++--------
> 1 file changed, 64 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
> index 9f28468..b36f196 100644
> --- a/drivers/crypto/marvell/hash.c
> +++ b/drivers/crypto/marvell/hash.c
> @@ -312,24 +312,53 @@ static void mv_cesa_ahash_complete(struct crypto_async_request *req)
> int i;
>
> digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq));
> - for (i = 0; i < digsize / 4; i++)
> - creq->state[i] = readl_relaxed(engine->regs + CESA_IVDIG(i));
>
> - if (creq->last_req) {
> + if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ &&
> + !(creq->base.chain.last->flags & CESA_TDMA_BREAK_CHAIN)) {
> + struct mv_cesa_tdma_desc *tdma = NULL;
> + __le32 *data = NULL;
> +
> + for (tdma = creq->base.chain.first; tdma; tdma = tdma->next) {
> + u32 type = tdma->flags & CESA_TDMA_TYPE_MSK;
> + if (type == CESA_TDMA_RESULT)
> + break;
> + }
You should be able to drop the DUMMY desc at the end of the chain and
replace it with the RESULT desc. This way, you won't have to iterate over
the chain to find the TDMA_RESULT element: it should always be the last
desc in the chain.
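A minimal standalone sketch of that lookup, assuming the RESULT desc really
is the last element of the chain. The struct layout, names and flag values
below are hypothetical stand-ins for illustration, not the driver's actual
definitions:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the values are placeholders, not the driver's. */
#define DEMO_TDMA_TYPE_MSK	0x7u
#define DEMO_TDMA_RESULT	0x4u

struct demo_tdma_desc {
	uint32_t flags;
	struct demo_tdma_desc *next;
};

struct demo_tdma_chain {
	struct demo_tdma_desc *first;
	struct demo_tdma_desc *last;
};

/*
 * Return the RESULT descriptor if the chain ends with one, NULL otherwise.
 * Only the last element is inspected, so the chain is never walked.
 */
static struct demo_tdma_desc *
demo_chain_result_desc(const struct demo_tdma_chain *chain)
{
	struct demo_tdma_desc *last = chain->last;

	if (last && (last->flags & DEMO_TDMA_TYPE_MSK) == DEMO_TDMA_RESULT)
		return last;

	return NULL;
}
```

With this shape, a NULL return covers both the empty chain and the case
where the request was not set up with a RESULT desc.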
> +
> + if (!tdma) {
> + dev_err(cesa_dev->dev, "Failed to retrieve tdma "
> + "descriptor for outer data\n");
> + return;
> + }
> +
> /*
> - * Hardware's MD5 digest is in little endian format, but
> - * SHA in big endian format
> + * Result is already in the correct endianess when the SA is
> + * used
> */
> - if (creq->algo_le) {
> - __le32 *result = (void *)ahashreq->result;
> + data = tdma->op->ctx.hash.hash;
> + for (i = 0; i < digsize / 4; i++)
> + creq->state[i] = cpu_to_le32(data[i]);
>
> - for (i = 0; i < digsize / 4; i++)
> - result[i] = cpu_to_le32(creq->state[i]);
> - } else {
> - __be32 *result = (void *)ahashreq->result;
> + memcpy(ahashreq->result, data, digsize);
> + } else {
> + for (i = 0; i < digsize / 4; i++)
> + creq->state[i] = readl_relaxed(engine->regs +
> + CESA_IVDIG(i));
> + if (creq->last_req) {
> + /*
> + * Hardware's MD5 digest is in little endian format, but
> + * SHA in big endian format
> + */
> + if (creq->algo_le) {
> + __le32 *result = (void *)ahashreq->result;
> +
> + for (i = 0; i < digsize / 4; i++)
> + result[i] = cpu_to_le32(creq->state[i]);
> + } else {
> + __be32 *result = (void *)ahashreq->result;
>
> - for (i = 0; i < digsize / 4; i++)
> - result[i] = cpu_to_be32(creq->state[i]);
> + for (i = 0; i < digsize / 4; i++)
> + result[i] = cpu_to_be32(creq->state[i]);
> + }
> }
> }
>
> @@ -504,6 +533,12 @@ mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain,
> CESA_SA_DESC_CFG_LAST_FRAG,
> CESA_SA_DESC_CFG_FRAG_MSK);
>
> + ret = mv_cesa_dma_add_result_op(chain,
> + CESA_SA_CFG_SRAM_OFFSET,
> + CESA_SA_DATA_SRAM_OFFSET,
> + CESA_TDMA_SRC_IN_SRAM, flags);
> + if (ret)
> + return ERR_PTR(-ENOMEM);
> return op;
> }
>
> @@ -564,6 +599,8 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
> struct mv_cesa_op_ctx *op = NULL;
> unsigned int frag_len;
> int ret;
> + u32 type;
> + bool break_chain = true;
>
> basereq->chain.first = NULL;
> basereq->chain.last = NULL;
> @@ -635,6 +672,16 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
> goto err_free_tdma;
> }
>
> + /*
> + * If results are copied via DMA, this means that this
> + * request can be directly processed by the engine,
> + * without partial updates. So we can chain it at the
> + * DMA level with other requests.
> + */
> + type = basereq->chain.last->flags & CESA_TDMA_TYPE_MSK;
> + if (type == CESA_TDMA_RESULT)
> + break_chain = false;
> +
> if (op) {
> /* Add dummy desc to wait for crypto operation end */
> ret = mv_cesa_dma_add_dummy_end(&basereq->chain, flags);
> @@ -648,8 +695,10 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
> else
> creq->cache_ptr = 0;
>
> - basereq->chain.last->flags |= (CESA_TDMA_END_OF_REQ |
> - CESA_TDMA_BREAK_CHAIN);
> + basereq->chain.last->flags |= CESA_TDMA_END_OF_REQ;
> +
> + if (break_chain)
> + basereq->chain.last->flags |= CESA_TDMA_BREAK_CHAIN;
Not sure this break_chain variable is really needed. You can directly
test the type of the last element in the TDMA chain here and, if it's
!= CESA_TDMA_RESULT, set the CESA_TDMA_BREAK_CHAIN flag.
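Roughly something like the sketch below. It is standalone and only models
the flag logic: the DEMO_* names and bit values are hypothetical
placeholders, and it assumes no dummy desc is appended after the RESULT
desc, so that the RESULT desc is still the last element at this point:

```c
#include <stdint.h>

/* Hypothetical stand-ins; the values are placeholders, not the driver's. */
#define DEMO_TDMA_TYPE_MSK	0x7u
#define DEMO_TDMA_RESULT	0x4u
#define DEMO_TDMA_BREAK_CHAIN	(1u << 28)
#define DEMO_TDMA_END_OF_REQ	(1u << 29)

/*
 * Compute the final flags of the last descriptor in a chain.
 * END_OF_REQ is always set; BREAK_CHAIN is set only when the last
 * descriptor is not a RESULT descriptor.
 */
static uint32_t demo_finalize_last_flags(uint32_t flags)
{
	flags |= DEMO_TDMA_END_OF_REQ;

	if ((flags & DEMO_TDMA_TYPE_MSK) != DEMO_TDMA_RESULT)
		flags |= DEMO_TDMA_BREAK_CHAIN;

	return flags;
}
```

This keeps the decision next to the place where the flags are written and
removes the need for a separate boolean.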
>
> return 0;
>
^ permalink raw reply
* Re: sha1_mb broken
From: Tim Chen @ 2016-10-04 16:08 UTC (permalink / raw)
To: Stephan Mueller, Dey, Megha; +Cc: linux-crypto@vger.kernel.org
In-Reply-To: <2176107.gSz0A05ekE@tauon.atsec.com>
On Tue, 2016-10-04 at 16:10 +0200, Stephan Mueller wrote:
> On Tuesday, 4 October 2016, 00:25:07 CEST, Dey, Megha wrote:
>
> Hi Megha,
>
> >
> >
> > >
> > > Hi Stephan,
> > >
> > > Your test code initialized the completion structure incorrectly, that led
> > > to the missing completion from being received. The init_completion call
> > > should be made before the crypto_ahash_digest call. The following change
> Thanks a lot for pointing that one out. Can you help me understand why your
> code trips over that issue whereas other ahash implementations do not (all
> other SHA-1 or SHA-2 implementations work perfectly fine with that code)?
>
A spin lock protects the completion's wait queue (the processes waiting for
the job to complete) and the queue head. My suspicion is that if these
structures are not initialized properly, we fail to find the waiting process
in the queue and so never wake it. The other tested implementations may not
be true ahash operations, in the sense that they do not pass the request
through the crypto daemon and do not have to context switch to let the crypto
daemon complete the job: the computation proceeds and returns in the same
call chain.
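To illustrate the ordering, here is a minimal userspace sketch, not kernel
code. It assumes POSIX threads; struct demo_completion, demo_worker() and
demo_run_async_request() are hypothetical names that only mimic the roles of
a completion and of the crypto daemon. The point is that the completion is
initialized before the asynchronous work is started, so the worker can never
signal an uninitialized or later-reset structure:

```c
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for a completion: a flag guarded by a mutex/condvar. */
struct demo_completion {
	pthread_mutex_t lock;
	pthread_cond_t wait;
	bool done;
};

static void demo_init_completion(struct demo_completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->wait, NULL);
	c->done = false;
}

static void demo_complete(struct demo_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->wait);
	pthread_mutex_unlock(&c->lock);
}

static void demo_wait_for_completion(struct demo_completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->wait, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

struct demo_request {
	struct demo_completion *done;
	int result;
};

/* Hypothetical worker playing the role of the asynchronous backend. */
static void *demo_worker(void *arg)
{
	struct demo_request *req = arg;

	req->result = 42;		/* pretend this is the digest */
	demo_complete(req->done);
	return NULL;
}

/* Returns the worker's result, or -1 if the worker could not be started. */
static int demo_run_async_request(void)
{
	struct demo_completion done;
	struct demo_request req = { .done = &done, .result = 0 };
	pthread_t tid;

	/* Initialize first: the worker may signal as soon as it starts. */
	demo_init_completion(&done);

	if (pthread_create(&tid, NULL, demo_worker, &req) != 0)
		return -1;

	demo_wait_for_completion(&done);
	pthread_join(tid, NULL);

	pthread_cond_destroy(&done.wait);
	pthread_mutex_destroy(&done.lock);
	return req.result;
}
```

If the initialization were moved after pthread_create() in this sketch, the
worker could lock an uninitialized mutex, or its signal could be wiped out by
the late reset of the flag, leaving the waiter blocked. Build with -pthread
where the toolchain requires it.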
Thanks.
Tim
^ permalink raw reply
* Re: sha1_mb broken
From: Stephan Mueller @ 2016-10-04 16:27 UTC (permalink / raw)
To: Tim Chen; +Cc: Dey, Megha, linux-crypto@vger.kernel.org
In-Reply-To: <1475597322.3916.283.camel@linux.intel.com>
On Tuesday, 4 October 2016, 09:08:42 CEST, Tim Chen wrote:
Hi Tim,
> A spin lock protects the completion's wait queue (the processes waiting for
> the job to complete) and the queue head. My suspicion is that if these
> structures are not initialized properly, we fail to find the waiting process
> in the queue and so never wake it. The other tested implementations may not
> be true ahash operations, in the sense that they do not pass the request
> through the crypto daemon and do not have to context switch to let the crypto
> daemon complete the job: the computation proceeds and returns in the same
> call chain.
Thanks a lot for the clarification.
Ciao
Stephan
^ permalink raw reply
* [PATCH] Fix Kconfig dependencies for FIPS
From: Alec Ari @ 2016-10-04 22:34 UTC (permalink / raw)
To: linux-crypto
Currently FIPS depends on MODULE_SIG, even if MODULES is disabled.
This change allows the enabling of FIPS without support for modules.
If module loading support is enabled, only then does
FIPS require MODULE_SIG.
Signed-off-by: Alec Ari <neotheuser@gmail.com>
---
crypto/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 84d7148..fd28805 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -24,7 +24,7 @@ comment "Crypto core or helper"
config CRYPTO_FIPS
bool "FIPS 200 compliance"
depends on (CRYPTO_ANSI_CPRNG || CRYPTO_DRBG) && !CRYPTO_MANAGER_DISABLE_TESTS
- depends on MODULE_SIG
+ depends on (MODULE_SIG || !MODULES)
help
This options enables the fips boot option which is
required if you want to system to operate in a FIPS 200
--
2.7.3
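For illustration only, the old and new dependency expressions can be modelled
as plain boolean functions. This is a sketch in C rather than Kconfig, and
the function names are hypothetical; it treats MODULES and MODULE_SIG as
simple booleans:

```c
#include <stdbool.h>

/* Old rule: "depends on MODULE_SIG". MODULES plays no direct role. */
static bool demo_fips_dep_old(bool modules, bool module_sig)
{
	(void)modules;
	return module_sig;
}

/* New rule: "depends on (MODULE_SIG || !MODULES)". */
static bool demo_fips_dep_new(bool modules, bool module_sig)
{
	return module_sig || !modules;
}
```

The two rules differ only for MODULES=n with MODULE_SIG=n: the old rule
rejects that configuration and the new one accepts it. With MODULES=y, both
rules still require MODULE_SIG.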
^ permalink raw reply related