Linux cryptographic layer development
 help / color / mirror / Atom feed
* Char.c
From: Fontaine david @ 2016-09-30  8:09 UTC (permalink / raw)
  To: linux-crypto

Hi Linus:

This push fixes a weakness in random number generation of file random.c.

The two polynomials of LFSR used in Linux/drivers/char/random.c are
P1(X) = x^128 + x^104 + x^76 + x^51 +x^25 + x + 1, for input pool
P2(X) = x^32 + x^26 + x^19 + x^14 + x^7 + x + 1 , for output pool

These polynomials Q1(X) = alpha ^3 *(P1(X)-1)+1 and Q2(X) = alpha ^3
*(P2(X)-1)+1 are not primitive over GF(2^32) where alpha is an
primitive element of GF(2^32).
It turns out that periods of LFSR corresponding to these polynomials
are not optimal, it means that the space of numbers generated by these
LFSR is not  GF(2^(32*deg(Pi))-1), i=1,2.

As mentioned in the random.c file, these polynomials come from an
article http://eprint.iacr.org/2012/251.pdf. It is stated in the
article that these polynomials have periods (2^(32*deg(Pi))-1)/3,
i=1,2, so not optimal.

We can improve these LFSR choosing these polynomials as primitive and
therefore increase the space of numbers generated by 3.

The polynomials used in the current implementation of the PRNG and the
point presented here, do not conclude a practical attack on the PRNG.

After several calculations, we propose here the following polynomials:
R1(x) = x^128 + x^106 +x^79 + x^51 +x^25 + x+ 1, as new polynomial of input pool
R2(x) = x^32 + x^27 + x^21 + x^14 + x^7 + x +1, as new polynomial of
the output pool

So polynomials S1(X) = alpha^4*(R1(X)-1)+1 and S2(X) =
alpha^4*(R2(X)-1)+1 are primitive on GF(2^32).
It is very easy to check their primitive with magma. Use the online
tool (http://magma.maths.usyd.edu.au/calc/) and the following code:

K0:=GF(2);
P<X> := PolynomialRing(K0);
K1<a> := ext<K0|X^32+X^26+X^23+X^22+X^16+X^12+X^11+X^10+X^8+X^7+X^5+X^4+X^2+X^1+1>;K1;
P<t> := PolynomialRing(K1);

R1 :=  t^128 + t^106 + t^79 + t^51 + t^25 + t + 1;
S1 := a^4*(R1-1)+1;

R2 := t^32 + t^27 + t^21 + t^14 + t^7 + t + 1;
S2 := a^4*(R2-1)+1;

S1test := Evaluate((1/(t^128))*S1,1/t);S1test;
> t^128 + a*t^127 + a*t^100 + a*t^74 + a*t^53 + a*t^25 + a

S1test := t^128 + a^4*t^127 + a^4*t^103 + a^4*t^77 + a^4*t^49 + a^4*t^22 + a^4;
IsPrimitive(S1test);
> true

S2test := Evaluate((1/(t^32))*S2,1/t);S2test;
> t^32 + a*t^31 + a*t^24 + a*t^16 + a*t^12 + a*t^6 + a

S2test := t^32 + a^4*t^31 + a^4*t^25 + a^4*t^18 + a^4*t^11 + a^4*t^5 + a^4;
IsPrimitive(S2test);
> true

To use these polynomials, the following changes in the random.c file
should be applied:

olivier@Zebulon:~/Documents/linux-4.7.4/drivers/char$ diff random.c random-new.c
371,372c371,373
<        /* x^128 + x^104 + x^76 + x^51 +x^25 + x + 1 */
<        { S(128),         104,     76,       51,       25,       1 },
---
>        /* was: x^128 + x^104 + x^76 + x^51 +x^25 + x + 1 */
>        /* x^128 + x^106 + x^79 + x^51 +x^25 + x + 1 */
>        { S(128),         106,     79,       51,       25,       1 },
374,375c375,377
<        /* x^32 + x^26 + x^19 + x^14 + x^7 + x + 1 */
<        { S(32),           26,       19,       14,       7,         1 },
---
>        /* was: x^32 + x^26 + x^19 + x^14 + x^7 + x + 1 */
>        /* x^32 + x^27 + x^21 + x^14 + x^7 + x + 1 */
>        { S(32),           27,       21,       14,       7,         1 },
478a481
> /* was:
481a485,490
> */
> static __u32 const twist_table[16] = {
>        0x00000000, 0x1db71064, 0x3b6e20c8, 0x26d930ac,
>         0x76dc4190, 0x6b6b51f4, 0x4db26158, 0x5005713c,
>         0xedb88320, 0xf00f9344, 0xd6d6a3e8, 0xcb61b38c,
>         0x9b64c2b0, 0x86d3d2d4, 0xa00ae278, 0xbdbdf21c };
525c534,536
<                    r->pool[i] = (w >> 3) ^ twist_table[w & 7];
---
>                    /*
>                   was: r->pool[i] = (w >> 3) ^ twist_table[w & 7];*/
>                    r->pool[i] = (w >> 4) ^ twist_table[w & 15];

Regards,
David Fontaine & Olivier Vivolo

^ permalink raw reply

* [PATCH v2] crypto: caam - treat SGT address pointer as u64
From: Tudor Ambarus @ 2016-09-30  9:09 UTC (permalink / raw)
  To: fabio.estevam, horia.geanta, herbert; +Cc: linux-crypto, Tudor Ambarus

Even for i.MX, CAAM is able to use address pointers greater than
32 bits, the address pointer field being interpreted as a double word.
Enforce u64 address pointer in the sec4_sg_entry struct.

This patch fixes the SGT address pointer endianness issue for
32bit platforms where core endianness != caam endianness.

Signed-off-by: Tudor Ambarus <tudor-dan.ambarus@nxp.com>
---
v2: Removed mx7d restriction.

 drivers/crypto/caam/desc.h       | 6 ------
 drivers/crypto/caam/regs.h       | 8 ++++++++
 drivers/crypto/caam/sg_sw_sec4.h | 2 +-
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/crypto/caam/desc.h b/drivers/crypto/caam/desc.h
index 26427c1..513b664 100644
--- a/drivers/crypto/caam/desc.h
+++ b/drivers/crypto/caam/desc.h
@@ -23,13 +23,7 @@
 #define SEC4_SG_OFFSET_MASK	0x00001fff
 
 struct sec4_sg_entry {
-#if !defined(CONFIG_ARCH_DMA_ADDR_T_64BIT) && \
-	defined(CONFIG_CRYPTO_DEV_FSL_CAAM_IMX)
-	u32 rsvd1;
-	dma_addr_t ptr;
-#else
 	u64 ptr;
-#endif /* CONFIG_CRYPTO_DEV_FSL_CAAM_IMX */
 	u32 len;
 	u32 bpid_offset;
 };
diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h
index b3c5016..84d2f83 100644
--- a/drivers/crypto/caam/regs.h
+++ b/drivers/crypto/caam/regs.h
@@ -196,6 +196,14 @@ static inline u64 rd_reg64(void __iomem *reg)
 #define caam_dma_to_cpu(value) caam32_to_cpu(value)
 #endif /* CONFIG_ARCH_DMA_ADDR_T_64BIT  */
 
+#ifdef CONFIG_CRYPTO_DEV_FSL_CAAM_IMX
+#define cpu_to_caam_dma64(value) \
+		(((u64)cpu_to_caam32(lower_32_bits(value)) << 32) | \
+		 (u64)cpu_to_caam32(upper_32_bits(value)))
+#else
+#define cpu_to_caam_dma64(value) cpu_to_caam64(value)
+#endif
+
 /*
  * jr_outentry
  * Represents each entry in a JobR output ring
diff --git a/drivers/crypto/caam/sg_sw_sec4.h b/drivers/crypto/caam/sg_sw_sec4.h
index 19dc64f..41cd5a3 100644
--- a/drivers/crypto/caam/sg_sw_sec4.h
+++ b/drivers/crypto/caam/sg_sw_sec4.h
@@ -15,7 +15,7 @@ struct sec4_sg_entry;
 static inline void dma_to_sec4_sg_one(struct sec4_sg_entry *sec4_sg_ptr,
 				      dma_addr_t dma, u32 len, u16 offset)
 {
-	sec4_sg_ptr->ptr = cpu_to_caam_dma(dma);
+	sec4_sg_ptr->ptr = cpu_to_caam_dma64(dma);
 	sec4_sg_ptr->len = cpu_to_caam32(len);
 	sec4_sg_ptr->bpid_offset = cpu_to_caam32(offset & SEC4_SG_OFFSET_MASK);
 #ifdef DEBUG
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH] arm64: add support for SHA256 using NEON instructions
From: Andy Polyakov @ 2016-09-30 10:44 UTC (permalink / raw)
  To: Ard Biesheuvel, linux-arm-kernel, linux-crypto, herbert
  Cc: catalin.marinas, victor.chong, will.deacon, daniel.thompson
In-Reply-To: <1475189503-9175-1-git-send-email-ard.biesheuvel@linaro.org>

> This is a port of the ARMv7 implementation in arch/arm/crypto. For a Cortex-A57
> (r2p1), the performance numbers are listed below. In summary, 40% - 50% speedup
> where it counts, i.e., block sizes over 256 bytes with few updates.

Cool! Great! Just in case for reference. You compare generic, new NEON
and hardware-assisted implementations. I assume that first one refers to
C compiler-generated code. But there is another option, i.e. non-NEON
assembly. Now to the "for reference" part. The reason for why NEON is
not utilized in OpenSSL is because it's deemed that it doesn't provide
"extraordinary" improvement over non-NEON assembly code, especially on
less sophisticated processors such as Cortex-A53. Note that I'm not
saying that NEON SHA256 subroutine is not faster, it is, only that it's
not "extraordinarily" faster in most relevant cases(*). In other words
it's reckoned that non-NEON assembly provides adequate *all-round*
performance, taking into consideration that it does it without being
dependent on optional NEON. Non-NEON assembly should also be interesting
in kernel context, because there are situations when you can't call NEON
procedure, be it suggested one or hardware-assisted, which itself relies
on NEON. And of course another nice quality about SHA2 module in OpenSSL
is that it emits both SHA256 and SHA512 codes ;-) On related note it
should be noted that NEON-izing SHA512 on ARM64 makes lesser sense, it's
bound to provide lesser improvement than SHA256 [if any at all in some
cases]. This is because in SHA256 you engage 4 lanes of NEON registers,
while in SHA512 case you have only 2.

(*) Well, this is also question of priorities. My rationale is that
there is a lot of Cortex-A53 and A57 phones out there that don't have
crypto-extensions, I refer to Qualcomm SoCs, where NEON gives less than
10% improvement [over non-NEON assembly]. Yes, it gives more on X-Gene,
but X-Gene is not wide-spread, and the rest (including upcoming X-Gene)
have crypto-extensions, so alternative code path doesn't matter.

^ permalink raw reply

* [PATCH] padata: add helper function for queue length
From: Jason A. Donenfeld @ 2016-10-02  1:46 UTC (permalink / raw)
  To: Steffen Klassert, linux-crypto, linux-kernel; +Cc: Jason A. Donenfeld

Since padata has a maximum number of inflight jobs, currently 1000, it's
very useful to know how many jobs are currently queued up. This adds a
simple helper function to expose this information.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 include/linux/padata.h |  2 ++
 kernel/padata.c        | 16 ++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index 113ee62..4840ae4 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -3,6 +3,7 @@
  *
  * Copyright (C) 2008, 2009 secunet Security Networks AG
  * Copyright (C) 2008, 2009 Steffen Klassert <steffen.klassert@secunet.com>
+ * Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -181,4 +182,5 @@ extern int padata_register_cpumask_notifier(struct padata_instance *pinst,
 					    struct notifier_block *nblock);
 extern int padata_unregister_cpumask_notifier(struct padata_instance *pinst,
 					      struct notifier_block *nblock);
+extern int padata_queue_len(struct padata_instance *pinst);
 #endif
diff --git a/kernel/padata.c b/kernel/padata.c
index 9932788..17c1e08 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -5,6 +5,7 @@
  *
  * Copyright (C) 2008, 2009 secunet Security Networks AG
  * Copyright (C) 2008, 2009 Steffen Klassert <steffen.klassert@secunet.com>
+ * Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -1039,3 +1040,18 @@ void padata_free(struct padata_instance *pinst)
 	kobject_put(&pinst->kobj);
 }
 EXPORT_SYMBOL(padata_free);
+
+/**
+ * padata_queue_len - retreive the number of in progress jobs
+ *
+ * @padata_inst: padata instance from which to read the queue size
+ */
+int padata_queue_len(struct padata_instance *pinst)
+{
+	int len;
+	rcu_read_lock_bh();
+	len = atomic_read(&rcu_dereference_bh(pinst->pd)->refcnt);
+	rcu_read_unlock_bh();
+	return len;
+}
+EXPORT_SYMBOL(padata_queue_len);
-- 
2.10.0

^ permalink raw reply related

* Re: [PATCH] crypto: arm64/sha256 - add support for SHA256 using NEON instructions
From: Ard Biesheuvel @ 2016-10-02  2:58 UTC (permalink / raw)
  To: linux-arm-kernel@lists.infradead.org,
	linux-crypto@vger.kernel.org, Herbert Xu
  Cc: Daniel Thompson, Ard Biesheuvel, Catalin Marinas, Will Deacon,
	Andy Polyakov, Victor Chong
In-Reply-To: <CAKv+Gu8J9rHA2opKqHQr74GU6xxwMpaJak1mNpRVpOf=+BZ81w@mail.gmail.com>

On 29 September 2016 at 16:37, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> On 29 September 2016 at 15:51, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>> This is a port to arm64 of the NEON implementation of SHA256 that lives
>> under arch/arm/crypto.
>>
>> Due to the fact that the AArch64 assembler dialect deviates from the
>> 32-bit ARM one in ways that makes sharing code problematic, and given
>> that this version only uses the NEON version whereas the original
>> implementation supports plain ALU assembler, NEON and Crypto Extensions,
>> this code is built from a version sha256-armv4.pl that has been
>> transliterated to the AArch64 NEON dialect.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  arch/arm64/crypto/Kconfig               |   5 +
>>  arch/arm64/crypto/Makefile              |  11 +
>>  arch/arm64/crypto/sha256-armv4.pl       | 413 +++++++++
>>  arch/arm64/crypto/sha256-core.S_shipped | 883 ++++++++++++++++++++
>>  arch/arm64/crypto/sha256_neon_glue.c    | 103 +++
>>  5 files changed, 1415 insertions(+)
>>
>> diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
>> index 2cf32e9887e1..d32371198474 100644
>> --- a/arch/arm64/crypto/Kconfig
>> +++ b/arch/arm64/crypto/Kconfig
>> @@ -18,6 +18,11 @@ config CRYPTO_SHA2_ARM64_CE
>>         depends on ARM64 && KERNEL_MODE_NEON
>>         select CRYPTO_HASH
>>
>> +config CRYPTO_SHA2_ARM64_NEON
>> +       tristate "SHA-224/SHA-256 digest algorithm (ARMv8 NEON)"
>> +       depends on ARM64 && KERNEL_MODE_NEON
>> +       select CRYPTO_HASH
>> +
>>  config CRYPTO_GHASH_ARM64_CE
>>         tristate "GHASH (for GCM chaining mode) using ARMv8 Crypto Extensions"
>>         depends on ARM64 && KERNEL_MODE_NEON
>> diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
>> index abb79b3cfcfe..5156ebee0488 100644
>> --- a/arch/arm64/crypto/Makefile
>> +++ b/arch/arm64/crypto/Makefile
>> @@ -29,6 +29,9 @@ aes-ce-blk-y := aes-glue-ce.o aes-ce.o
>>  obj-$(CONFIG_CRYPTO_AES_ARM64_NEON_BLK) += aes-neon-blk.o
>>  aes-neon-blk-y := aes-glue-neon.o aes-neon.o
>>
>> +obj-$(CONFIG_CRYPTO_SHA2_ARM64_NEON) := sha256-neon.o
>
> There is a typo here that I only spotted just now: this should be += not :=
>
> Herbert, if you're picking this up, could you please fix this at merge
> time? Or do you need me to resend?
>

Please disregard this patch for now. I will follow up with a more
elaborate series for SHA256 on arm64

^ permalink raw reply

* 20071 linux-crypto
From: ccmembership @ 2016-10-02  8:16 UTC (permalink / raw)
  To: linux-crypto

[-- Attachment #1: EMAIL_8347552430_linux-crypto.zip --]
[-- Type: application/zip, Size: 4558 bytes --]

^ permalink raw reply

* Re: [PATCH] crypto: testmgr - add guard to dst buffer for ahash_export
From: Herbert Xu @ 2016-10-02 14:37 UTC (permalink / raw)
  To: Jan Stancek; +Cc: linux-crypto, linux-kernel, marcelo.cerri
In-Reply-To: <26da11d26bf0d34f9a5896ab9f9540db5be17012.1475072761.git.jstancek@redhat.com>

On Wed, Sep 28, 2016 at 04:38:37PM +0200, Jan Stancek wrote:
> Add a guard to 'state' buffer and warn if its consistency after
> call to crypto_ahash_export() changes, so that any write that
> goes beyond advertised statesize (and thus causing potential
> memory corruption [1]) is more visible.
> 
> [1] https://marc.info/?l=linux-crypto-vger&m=147467656516085
> 
> Signed-off-by: Jan Stancek <jstancek@redhat.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Marcelo Cerri <marcelo.cerri@canonical.com>

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] crypto: sha1-powerpc: little-endian support
From: Herbert Xu @ 2016-10-02 14:37 UTC (permalink / raw)
  To: Marcelo Cerri
  Cc: David S. Miller, linux-crypto, linuxppc-dev, linux-kernel,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	George Wilson, Claudio Carvalho, Paulo Flabiano Smorigo,
	joy.latten
In-Reply-To: <1474659116-4689-1-git-send-email-marcelo.cerri@canonical.com>

On Fri, Sep 23, 2016 at 04:31:56PM -0300, Marcelo Cerri wrote:
> The driver does not handle endianness properly when loading the input
> data.
> 
> Signed-off-by: Marcelo Cerri <marcelo.cerri@canonical.com>

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] crypto: sunxi-ss: mark sun4i_hash() static
From: Herbert Xu @ 2016-10-02 14:37 UTC (permalink / raw)
  To: Baoyou Xie
  Cc: clabbe.montjoie, davem, maxime.ripard, wens, linux-crypto,
	linux-arm-kernel, linux-kernel, arnd, xie.baoyou
In-Reply-To: <1474691326-15541-1-git-send-email-baoyou.xie@linaro.org>

On Sat, Sep 24, 2016 at 12:28:46PM +0800, Baoyou Xie wrote:
> We get 1 warning when building kernel with W=1:
> drivers/crypto/sunxi-ss/sun4i-ss-hash.c:168:5: warning: no previous prototype for 'sun4i_hash' [-Wmissing-prototypes]
> 
> In fact, this function is only used in the file in which it is
> declared and don't need a declaration, but can be made static.
> So this patch marks it 'static'.
> 
> Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>

This patch has already been applied.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH v2] crypto: gcm - Fix IV buffer size in crypto_gcm_setkey
From: Herbert Xu @ 2016-10-02 14:38 UTC (permalink / raw)
  To: Ondrej Mosnacek; +Cc: linux-crypto
In-Reply-To: <1474620452-7278-1-git-send-email-omosnacek@gmail.com>

On Fri, Sep 23, 2016 at 10:47:32AM +0200, Ondrej Mosnacek wrote:
> The cipher block size for GCM is 16 bytes, and thus the CTR transform
> used in crypto_gcm_setkey() will also expect a 16-byte IV. However,
> the code currently reserves only 8 bytes for the IV, causing
> an out-of-bounds access in the CTR transform. This patch fixes
> the issue by setting the size of the IV buffer to 16 bytes.
> 
> Fixes: 84c911523020 ("[CRYPTO] gcm: Add support for async ciphers")
> Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [BUG] crypto: atmel-aes - erro when compiling with VERBOSE_DEBUG enable
From: Herbert Xu @ 2016-10-02 14:38 UTC (permalink / raw)
  To: Cyrille Pitchen; +Cc: levent demir, linux-crypto
In-Reply-To: <de4c3d75-eb4f-f29b-5bde-e2b5ee79c8d9@atmel.com>

On Tue, Sep 27, 2016 at 06:45:18PM +0200, Cyrille Pitchen wrote:
> Hi Levent,
> 
> there is a typo in the subject line: erroR.
> Also it would be better to start the summary phrase of the subject line with a
> verb:
> 
> crypto: atmel-aes: fix compiler error when VERBODE_DEBUG is defined
> 
> Le 22/09/2016 à 14:45, levent demir a écrit :
> > Fix debug function call in atmel_aes_write
> > 
> > Signed-off-by: Levent DEMIR <levent.demir@inria.fr>
> > ---
> >  drivers/crypto/atmel-aes.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/crypto/atmel-aes.c b/drivers/crypto/atmel-aes.c
> > index e3d40a8..2b0f926 100644
> > --- a/drivers/crypto/atmel-aes.c
> > +++ b/drivers/crypto/atmel-aes.c
> > @@ -317,7 +317,7 @@ static inline void atmel_aes_write(struct
> > atmel_aes_dev *dd,
> >  		char tmp[16];
> >  
> >  		dev_vdbg(dd->dev, "write 0x%08x into %s\n", value,
> > -			 atmel_aes_reg_name(offset, tmp));
> > +			atmel_aes_reg_name(offset, tmp, sizeof(tmp)));
> It looks like a space has been removed.

It's been completely mangled by the mailer and cannot be applied.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 0/3] Fix crypto/vmx/p8_ghash memory corruption
From: Herbert Xu @ 2016-10-02 14:40 UTC (permalink / raw)
  To: Marcelo Cerri
  Cc: linux-crypto, David S. Miller, Paulo Flabiano Smorigo,
	Leonidas S. Barbosa, linuxppc-dev, linux-kernel,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	George Wilson
In-Reply-To: <1475080931-7926-1-git-send-email-marcelo.cerri@canonical.com>

On Wed, Sep 28, 2016 at 01:42:08PM -0300, Marcelo Cerri wrote:
> This series fixes the memory corruption found by Jan Stancek in 4.8-rc7. The
> problem however also affects previous versions of the driver.
> 
> Marcelo Cerri (3):
>   crypto: ghash-generic - move common definitions to a new header file
>   crypto: vmx - Fix memory corruption caused by p8_ghash
>   crypto: vmx - Ensure ghash-generic is enabled

Patch 1/2 applied to crypto.  I have changed patch 3 to use select
instead of depend and have applied it to cryptodev.

Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 0/3] Fix crypto/vmx/p8_ghash memory corruption
From: Herbert Xu @ 2016-10-02 14:40 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: Marcelo Cerri, linux-crypto, Leonidas S. Barbosa, linux-kernel,
	Paul Mackerras, Paulo Flabiano Smorigo, George Wilson,
	linuxppc-dev, David S. Miller
In-Reply-To: <20160929065908.653ec5c2@kryten>

On Thu, Sep 29, 2016 at 06:59:08AM +1000, Anton Blanchard wrote:
> Hi Marcelo
> 
> > This series fixes the memory corruption found by Jan Stancek in
> > 4.8-rc7. The problem however also affects previous versions of the
> > driver.
> 
> If it affects previous versions, please add the lines in the sign off to
> get it into the stable kernels.

I have added them to patches 1 and 2.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH v2 0/2] Minor CCP driver changes
From: Herbert Xu @ 2016-10-02 14:41 UTC (permalink / raw)
  To: Gary R Hook; +Cc: linux-crypto, thomas.lendacky, davem
In-Reply-To: <20160928165204.23263.77515.stgit@taos>

On Wed, Sep 28, 2016 at 11:53:31AM -0500, Gary R Hook wrote:
> V2: point a goto statement at the correct label
> 
> The following series is for miscellaneous small changes.

Both patches applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] crypto: arm64/sha256 - add support for SHA256 using NEON instructions
From: Herbert Xu @ 2016-10-02 14:46 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel@lists.infradead.org,
	linux-crypto@vger.kernel.org, Andy Polyakov, Victor Chong,
	Daniel Thompson, Will Deacon, Catalin Marinas
In-Reply-To: <CAKv+Gu_M+FcuDSaGsJYjD96=Q1iXyFDD3KWy=4vUbeLWRgrHJw@mail.gmail.com>

On Sat, Oct 01, 2016 at 07:58:56PM -0700, Ard Biesheuvel wrote:
>
> Please disregard this patch for now. I will follow up with a more
> elaborate series for SHA256 on arm64

Thanks for the heads up.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH v2] crypto: caam - treat SGT address pointer as u64
From: Herbert Xu @ 2016-10-02 14:47 UTC (permalink / raw)
  To: Tudor Ambarus; +Cc: fabio.estevam, horia.geanta, linux-crypto
In-Reply-To: <1475226579-2078-1-git-send-email-tudor-dan.ambarus@nxp.com>

On Fri, Sep 30, 2016 at 12:09:39PM +0300, Tudor Ambarus wrote:
> Even for i.MX, CAAM is able to use address pointers greater than
> 32 bits, the address pointer field being interpreted as a double word.
> Enforce u64 address pointer in the sec4_sg_entry struct.
> 
> This patch fixes the SGT address pointer endianness issue for
> 32bit platforms where core endianness != caam endianness.
> 
> Signed-off-by: Tudor Ambarus <tudor-dan.ambarus@nxp.com>

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [BUG] crypto: atmel-aes - erro when compiling with VERBOSE_DEBUG enable
From: Cyrille Pitchen @ 2016-10-03 10:20 UTC (permalink / raw)
  To: Herbert Xu; +Cc: levent demir, linux-crypto
In-Reply-To: <20161002143858.GE18268@gondor.apana.org.au>

Hi all,

Le 02/10/2016 à 16:38, Herbert Xu a écrit :
> On Tue, Sep 27, 2016 at 06:45:18PM +0200, Cyrille Pitchen wrote:
>> Hi Levent,
>>
>> there is a typo in the subject line: erroR.
>> Also it would be better to start the summary phrase of the subject line with a
>> verb:
>>
>> crypto: atmel-aes: fix compiler error when VERBODE_DEBUG is defined
>>
>> Le 22/09/2016 à 14:45, levent demir a écrit :
>>> Fix debug function call in atmel_aes_write
>>>
>>> Signed-off-by: Levent DEMIR <levent.demir@inria.fr>
>>> ---
>>>  drivers/crypto/atmel-aes.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/crypto/atmel-aes.c b/drivers/crypto/atmel-aes.c
>>> index e3d40a8..2b0f926 100644
>>> --- a/drivers/crypto/atmel-aes.c
>>> +++ b/drivers/crypto/atmel-aes.c
>>> @@ -317,7 +317,7 @@ static inline void atmel_aes_write(struct
>>> atmel_aes_dev *dd,
>>>  		char tmp[16];
>>>  
>>>  		dev_vdbg(dd->dev, "write 0x%08x into %s\n", value,
>>> -			 atmel_aes_reg_name(offset, tmp));
>>> +			atmel_aes_reg_name(offset, tmp, sizeof(tmp)));
>> It looks like a space has been removed.
> 
> It's been completely mangled by the mailer and cannot be applied.
> 
I've sent a new version in this thread:
https://lkml.org/lkml/2016/9/29/463

I added a Reported-by tag for Levent but if you want to use a Signed-off-by
tag instead, it's fine with me!

Best regards,

Cyrille

^ permalink raw reply

* Re: [PATCH 0/3] Fix crypto/vmx/p8_ghash memory corruption
From: Marcelo Cerri @ 2016-10-03 12:08 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Anton Blanchard, linux-crypto, Leonidas S. Barbosa, linux-kernel,
	Paul Mackerras, Paulo Flabiano Smorigo, George Wilson,
	linuxppc-dev, David S. Miller
In-Reply-To: <20161002144047.GG18268@gondor.apana.org.au>

[-- Attachment #1: Type: text/plain, Size: 694 bytes --]

Thank you.

-- 
Regards,
Marcelo

On Sun, Oct 02, 2016 at 10:40:47PM +0800, Herbert Xu wrote:
> On Thu, Sep 29, 2016 at 06:59:08AM +1000, Anton Blanchard wrote:
> > Hi Marcelo
> > 
> > > This series fixes the memory corruption found by Jan Stancek in
> > > 4.8-rc7. The problem however also affects previous versions of the
> > > driver.
> > 
> > If it affects previous versions, please add the lines in the sign off to
> > get it into the stable kernels.
> 
> I have added them to patches 1 and 2.  Thanks.
> -- 
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply

* [PATCH v2 1/1] crypto: atmel-aes: add support to the XTS mode
From: Cyrille Pitchen @ 2016-10-03 12:33 UTC (permalink / raw)
  To: herbert, davem, nicolas.ferre
  Cc: linux-crypto, linux-kernel, linux-arm-kernel, smueller,
	Cyrille Pitchen

This patch adds the xts(aes) algorithm, which is supported from
hardware version 0x500 and above (sama5d2x).

Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
---
ChangeLog:

v1 -> v2:
- fix typo in comment inside atmel_aes_xts_process_data():
  s/reverted/reversed/
- use xts_check_key() from atmel_aes_xts_setkey() as suggested by
  Stephan Mueller.

 drivers/crypto/atmel-aes-regs.h |   4 +
 drivers/crypto/atmel-aes.c      | 185 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 183 insertions(+), 6 deletions(-)

diff --git a/drivers/crypto/atmel-aes-regs.h b/drivers/crypto/atmel-aes-regs.h
index 6c2951bb70b1..0ec04407b533 100644
--- a/drivers/crypto/atmel-aes-regs.h
+++ b/drivers/crypto/atmel-aes-regs.h
@@ -28,6 +28,7 @@
 #define AES_MR_OPMOD_CFB		(0x3 << 12)
 #define AES_MR_OPMOD_CTR		(0x4 << 12)
 #define AES_MR_OPMOD_GCM		(0x5 << 12)
+#define AES_MR_OPMOD_XTS		(0x6 << 12)
 #define AES_MR_LOD				(0x1 << 15)
 #define AES_MR_CFBS_MASK		(0x7 << 16)
 #define AES_MR_CFBS_128b		(0x0 << 16)
@@ -67,6 +68,9 @@
 #define AES_CTRR	0x98
 #define AES_GCMHR(x)	(0x9c + ((x) * 0x04))
 
+#define AES_TWR(x)	(0xc0 + ((x) * 0x04))
+#define AES_ALPHAR(x)	(0xd0 + ((x) * 0x04))
+
 #define AES_HW_VERSION	0xFC
 
 #endif /* __ATMEL_AES_REGS_H__ */
diff --git a/drivers/crypto/atmel-aes.c b/drivers/crypto/atmel-aes.c
index 1d9e7bd3f377..6b656f4a9378 100644
--- a/drivers/crypto/atmel-aes.c
+++ b/drivers/crypto/atmel-aes.c
@@ -36,6 +36,7 @@
 #include <crypto/scatterwalk.h>
 #include <crypto/algapi.h>
 #include <crypto/aes.h>
+#include <crypto/xts.h>
 #include <crypto/internal/aead.h>
 #include <linux/platform_data/crypto-atmel.h>
 #include <dt-bindings/dma/at91.h>
@@ -68,6 +69,7 @@
 #define AES_FLAGS_CFB8		(AES_MR_OPMOD_CFB | AES_MR_CFBS_8b)
 #define AES_FLAGS_CTR		AES_MR_OPMOD_CTR
 #define AES_FLAGS_GCM		AES_MR_OPMOD_GCM
+#define AES_FLAGS_XTS		AES_MR_OPMOD_XTS
 
 #define AES_FLAGS_MODE_MASK	(AES_FLAGS_OPMODE_MASK |	\
 				 AES_FLAGS_ENCRYPT |		\
@@ -89,6 +91,7 @@ struct atmel_aes_caps {
 	bool			has_cfb64;
 	bool			has_ctr32;
 	bool			has_gcm;
+	bool			has_xts;
 	u32			max_burst_size;
 };
 
@@ -135,6 +138,12 @@ struct atmel_aes_gcm_ctx {
 	atmel_aes_fn_t		ghash_resume;
 };
 
+struct atmel_aes_xts_ctx {
+	struct atmel_aes_base_ctx	base;
+
+	u32			key2[AES_KEYSIZE_256 / sizeof(u32)];
+};
+
 struct atmel_aes_reqctx {
 	unsigned long		mode;
 };
@@ -282,6 +291,20 @@ static const char *atmel_aes_reg_name(u32 offset, char *tmp, size_t sz)
 		snprintf(tmp, sz, "GCMHR[%u]", (offset - AES_GCMHR(0)) >> 2);
 		break;
 
+	case AES_TWR(0):
+	case AES_TWR(1):
+	case AES_TWR(2):
+	case AES_TWR(3):
+		snprintf(tmp, sz, "TWR[%u]", (offset - AES_TWR(0)) >> 2);
+		break;
+
+	case AES_ALPHAR(0):
+	case AES_ALPHAR(1):
+	case AES_ALPHAR(2):
+	case AES_ALPHAR(3):
+		snprintf(tmp, sz, "ALPHAR[%u]", (offset - AES_ALPHAR(0)) >> 2);
+		break;
+
 	default:
 		snprintf(tmp, sz, "0x%02x", offset);
 		break;
@@ -453,15 +476,15 @@ static inline int atmel_aes_complete(struct atmel_aes_dev *dd, int err)
 	return err;
 }
 
-static void atmel_aes_write_ctrl(struct atmel_aes_dev *dd, bool use_dma,
-				 const u32 *iv)
+static void atmel_aes_write_ctrl_key(struct atmel_aes_dev *dd, bool use_dma,
+				     const u32 *iv, const u32 *key, int keylen)
 {
 	u32 valmr = 0;
 
 	/* MR register must be set before IV registers */
-	if (dd->ctx->keylen == AES_KEYSIZE_128)
+	if (keylen == AES_KEYSIZE_128)
 		valmr |= AES_MR_KEYSIZE_128;
-	else if (dd->ctx->keylen == AES_KEYSIZE_192)
+	else if (keylen == AES_KEYSIZE_192)
 		valmr |= AES_MR_KEYSIZE_192;
 	else
 		valmr |= AES_MR_KEYSIZE_256;
@@ -478,13 +501,19 @@ static void atmel_aes_write_ctrl(struct atmel_aes_dev *dd, bool use_dma,
 
 	atmel_aes_write(dd, AES_MR, valmr);
 
-	atmel_aes_write_n(dd, AES_KEYWR(0), dd->ctx->key,
-			  SIZE_IN_WORDS(dd->ctx->keylen));
+	atmel_aes_write_n(dd, AES_KEYWR(0), key, SIZE_IN_WORDS(keylen));
 
 	if (iv && (valmr & AES_MR_OPMOD_MASK) != AES_MR_OPMOD_ECB)
 		atmel_aes_write_block(dd, AES_IVR(0), iv);
 }
 
+static inline void atmel_aes_write_ctrl(struct atmel_aes_dev *dd, bool use_dma,
+					const u32 *iv)
+
+{
+	atmel_aes_write_ctrl_key(dd, use_dma, iv,
+				 dd->ctx->key, dd->ctx->keylen);
+}
 
 /* CPU transfer */
 
@@ -1769,6 +1798,137 @@ static struct aead_alg aes_gcm_alg = {
 };
 
 
+/* xts functions */
+
+static inline struct atmel_aes_xts_ctx *
+atmel_aes_xts_ctx_cast(struct atmel_aes_base_ctx *ctx)
+{
+	return container_of(ctx, struct atmel_aes_xts_ctx, base);
+}
+
+static int atmel_aes_xts_process_data(struct atmel_aes_dev *dd);
+
+static int atmel_aes_xts_start(struct atmel_aes_dev *dd)
+{
+	struct atmel_aes_xts_ctx *ctx = atmel_aes_xts_ctx_cast(dd->ctx);
+	struct ablkcipher_request *req = ablkcipher_request_cast(dd->areq);
+	struct atmel_aes_reqctx *rctx = ablkcipher_request_ctx(req);
+	unsigned long flags;
+	int err;
+
+	atmel_aes_set_mode(dd, rctx);
+
+	err = atmel_aes_hw_init(dd);
+	if (err)
+		return atmel_aes_complete(dd, err);
+
+	/* Compute the tweak value from req->info with ecb(aes). */
+	flags = dd->flags;
+	dd->flags &= ~AES_FLAGS_MODE_MASK;
+	dd->flags |= (AES_FLAGS_ECB | AES_FLAGS_ENCRYPT);
+	atmel_aes_write_ctrl_key(dd, false, NULL,
+				 ctx->key2, ctx->base.keylen);
+	dd->flags = flags;
+
+	atmel_aes_write_block(dd, AES_IDATAR(0), req->info);
+	return atmel_aes_wait_for_data_ready(dd, atmel_aes_xts_process_data);
+}
+
+static int atmel_aes_xts_process_data(struct atmel_aes_dev *dd)
+{
+	struct ablkcipher_request *req = ablkcipher_request_cast(dd->areq);
+	bool use_dma = (req->nbytes >= ATMEL_AES_DMA_THRESHOLD);
+	u32 tweak[AES_BLOCK_SIZE / sizeof(u32)];
+	static const u32 one[AES_BLOCK_SIZE / sizeof(u32)] = {cpu_to_le32(1), };
+	u8 *tweak_bytes = (u8 *)tweak;
+	int i;
+
+	/* Read the computed ciphered tweak value. */
+	atmel_aes_read_block(dd, AES_ODATAR(0), tweak);
+	/*
+	 * Hardware quirk:
+	 * the order of the ciphered tweak bytes need to be reversed before
+	 * writing them into the ODATARx registers.
+	 */
+	for (i = 0; i < AES_BLOCK_SIZE/2; ++i) {
+		u8 tmp = tweak_bytes[AES_BLOCK_SIZE - 1 - i];
+
+		tweak_bytes[AES_BLOCK_SIZE - 1 - i] = tweak_bytes[i];
+		tweak_bytes[i] = tmp;
+	}
+
+	/* Process the data. */
+	atmel_aes_write_ctrl(dd, use_dma, NULL);
+	atmel_aes_write_block(dd, AES_TWR(0), tweak);
+	atmel_aes_write_block(dd, AES_ALPHAR(0), one);
+	if (use_dma)
+		return atmel_aes_dma_start(dd, req->src, req->dst, req->nbytes,
+					   atmel_aes_transfer_complete);
+
+	return atmel_aes_cpu_start(dd, req->src, req->dst, req->nbytes,
+				   atmel_aes_transfer_complete);
+}
+
+static int atmel_aes_xts_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
+				unsigned int keylen)
+{
+	struct atmel_aes_xts_ctx *ctx = crypto_ablkcipher_ctx(tfm);
+	int err;
+
+	err = xts_check_key(crypto_ablkcipher_tfm(tfm), key, keylen);
+	if (err)
+		return err;
+
+	memcpy(ctx->base.key, key, keylen/2);
+	memcpy(ctx->key2, key + keylen/2, keylen/2);
+	ctx->base.keylen = keylen/2;
+
+	return 0;
+}
+
+static int atmel_aes_xts_encrypt(struct ablkcipher_request *req)
+{
+	return atmel_aes_crypt(req, AES_FLAGS_XTS | AES_FLAGS_ENCRYPT);
+}
+
+static int atmel_aes_xts_decrypt(struct ablkcipher_request *req)
+{
+	return atmel_aes_crypt(req, AES_FLAGS_XTS);
+}
+
+static int atmel_aes_xts_cra_init(struct crypto_tfm *tfm)
+{
+	struct atmel_aes_xts_ctx *ctx = crypto_tfm_ctx(tfm);
+
+	tfm->crt_ablkcipher.reqsize = sizeof(struct atmel_aes_reqctx);
+	ctx->base.start = atmel_aes_xts_start;
+
+	return 0;
+}
+
+static struct crypto_alg aes_xts_alg = {
+	.cra_name		= "xts(aes)",
+	.cra_driver_name	= "atmel-xts-aes",
+	.cra_priority		= ATMEL_AES_PRIORITY,
+	.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
+	.cra_blocksize		= AES_BLOCK_SIZE,
+	.cra_ctxsize		= sizeof(struct atmel_aes_xts_ctx),
+	.cra_alignmask		= 0xf,
+	.cra_type		= &crypto_ablkcipher_type,
+	.cra_module		= THIS_MODULE,
+	.cra_init		= atmel_aes_xts_cra_init,
+	.cra_exit		= atmel_aes_cra_exit,
+	.cra_u.ablkcipher = {
+		.min_keysize	= 2 * AES_MIN_KEY_SIZE,
+		.max_keysize	= 2 * AES_MAX_KEY_SIZE,
+		.ivsize		= AES_BLOCK_SIZE,
+		.setkey		= atmel_aes_xts_setkey,
+		.encrypt	= atmel_aes_xts_encrypt,
+		.decrypt	= atmel_aes_xts_decrypt,
+	}
+};
+
+
 /* Probe functions */
 
 static int atmel_aes_buff_init(struct atmel_aes_dev *dd)
@@ -1877,6 +2037,9 @@ static void atmel_aes_unregister_algs(struct atmel_aes_dev *dd)
 {
 	int i;
 
+	if (dd->caps.has_xts)
+		crypto_unregister_alg(&aes_xts_alg);
+
 	if (dd->caps.has_gcm)
 		crypto_unregister_aead(&aes_gcm_alg);
 
@@ -1909,8 +2072,16 @@ static int atmel_aes_register_algs(struct atmel_aes_dev *dd)
 			goto err_aes_gcm_alg;
 	}
 
+	if (dd->caps.has_xts) {
+		err = crypto_register_alg(&aes_xts_alg);
+		if (err)
+			goto err_aes_xts_alg;
+	}
+
 	return 0;
 
+err_aes_xts_alg:
+	crypto_unregister_aead(&aes_gcm_alg);
 err_aes_gcm_alg:
 	crypto_unregister_alg(&aes_cfb64_alg);
 err_aes_cfb64_alg:
@@ -1928,6 +2099,7 @@ static void atmel_aes_get_cap(struct atmel_aes_dev *dd)
 	dd->caps.has_cfb64 = 0;
 	dd->caps.has_ctr32 = 0;
 	dd->caps.has_gcm = 0;
+	dd->caps.has_xts = 0;
 	dd->caps.max_burst_size = 1;
 
 	/* keep only major version number */
@@ -1937,6 +2109,7 @@ static void atmel_aes_get_cap(struct atmel_aes_dev *dd)
 		dd->caps.has_cfb64 = 1;
 		dd->caps.has_ctr32 = 1;
 		dd->caps.has_gcm = 1;
+		dd->caps.has_xts = 1;
 		dd->caps.max_burst_size = 4;
 		break;
 	case 0x200:
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH 0/3] Fix crypto/vmx/p8_ghash memory corruption
From: Marcelo Cerri @ 2016-10-03 15:07 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <1475080931-7926-1-git-send-email-marcelo.cerri@canonical.com>

[-- Attachment #1: Type: text/plain, Size: 1035 bytes --]

Hi Herbert,

Sorry for bothering you. I noticed you included two of the patches in
the crypto-2.6 repository and the remaining one in cryptodev-2.6. Is
that right? I thought all 3 patches would be included in the cruptodev
repository.

-- 
Regards,
Marcelo

On Wed, Sep 28, 2016 at 01:42:08PM -0300, Marcelo Cerri wrote:
> This series fixes the memory corruption found by Jan Stancek in 4.8-rc7. The
> problem however also affects previous versions of the driver.
> 
> Marcelo Cerri (3):
>   crypto: ghash-generic - move common definitions to a new header file
>   crypto: vmx - Fix memory corruption caused by p8_ghash
>   crypto: vmx - Ensure ghash-generic is enabled
> 
>  crypto/ghash-generic.c     | 13 +------------
>  drivers/crypto/vmx/Kconfig |  2 +-
>  drivers/crypto/vmx/ghash.c | 31 ++++++++++++++++---------------
>  include/crypto/ghash.h     | 23 +++++++++++++++++++++++
>  4 files changed, 41 insertions(+), 28 deletions(-)
>  create mode 100644 include/crypto/ghash.h
> 
> -- 
> 2.7.4
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply

* [PATCH v2 0/2] Improve DMA chaining for ahash requests
From: Romain Perier @ 2016-10-03 15:17 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: David S. Miller, Herbert Xu, Thomas Petazzoni, Jason Cooper,
	Andrew Lunn, Sebastian Hesselbarth, Gregory Clement, linux-crypto,
	linux-arm-kernel

This series contain performance improvement regarding ahash requests.
So far, ahash requests were systematically not chained at the DMA level.
However, in some case, like this is the case by using IPSec, some ahash
requests can be processed directly by the engine, and don't have
intermediaire partial update states.

This series firstly re-work the way outer IVs are copied from the SRAM
into the dma pool. To do so, we introduce a common dma pool for all type
of requests that contains outer results (like IV or digest). Then, for
ahash requests that can be processed directly by the engine, outer
results are copied from the SRAM into the common dma pool. These requests
are then allowed to be chained at the DMA level.


Benchmarking results with iperf throught IPSec
==============================================
		ESP			AH

Before		343 Mbits/s		492 Mbits/s
After		392 Mbits/s		557 Mbits/s
Improvement	+14%			+13%

Romain Perier (2):
  crypto: marvell - Use an unique pool to copy results of requests
  crypto: marvell - Don't break chain for computable last ahash requests

 drivers/crypto/marvell/cesa.c   |  4 ---
 drivers/crypto/marvell/cesa.h   |  5 ++-
 drivers/crypto/marvell/cipher.c |  6 ++--
 drivers/crypto/marvell/hash.c   | 79 +++++++++++++++++++++++++++++++++--------
 drivers/crypto/marvell/tdma.c   | 17 +++++----
 5 files changed, 78 insertions(+), 33 deletions(-)

-- 
2.9.3

^ permalink raw reply

* [PATCH v2 2/2] crypto: marvell - Don't break chain for computable last ahash requests
From: Romain Perier @ 2016-10-03 15:17 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: David S. Miller, Herbert Xu, Thomas Petazzoni, Jason Cooper,
	Andrew Lunn, Sebastian Hesselbarth, Gregory Clement, linux-crypto,
	linux-arm-kernel
In-Reply-To: <20161003151739.11615-1-romain.perier@free-electrons.com>

Currently, the driver breaks chain for all kind of hash requests in order to
don't override intermediate states of partial ahash updates. However, some final
ahash requests can be directly processed by the engine, and so without
intermediate state. This is typically the case for most for the HMAC requests
processed via IPSec.

This commits adds a TDMA descriptor to copy outer data for these of requests
into the "op" dma pool, then it allow to chain these requests at the DMA level.
The 'complete' operation is also updated to retrieve the MAC digest from the
right location.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---

Changes in v2:
 - Replaced BUG_ON by an error
 - Add a variable "break_chain", with "type" to break the chain
   with ahash requests. It improves code readability.

 drivers/crypto/marvell/cipher.c | 10 +++---
 drivers/crypto/marvell/hash.c   | 79 +++++++++++++++++++++++++++++++++--------
 drivers/crypto/marvell/tdma.c   |  4 +--
 3 files changed, 71 insertions(+), 22 deletions(-)

diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index 098871a..8bc52bf 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -212,8 +212,7 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
 		struct mv_cesa_req *basereq;
 
 		basereq = &creq->base;
-		memcpy(ablkreq->info, basereq->chain.last->op->ctx.blkcipher.iv,
-		       ivsize);
+		memcpy(ablkreq->info, basereq->chain.last->data, ivsize);
 	} else {
 		memcpy_fromio(ablkreq->info,
 			      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
@@ -374,9 +373,10 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 
 	/* Add output data for IV */
 	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
-	ret = mv_cesa_dma_add_result_op(&basereq->chain, CESA_SA_CFG_SRAM_OFFSET,
-				    CESA_SA_DATA_SRAM_OFFSET,
-				    CESA_TDMA_SRC_IN_SRAM, flags);
+	ret = mv_cesa_dma_add_result_op(&basereq->chain,
+					CESA_SA_CRYPT_IV_SRAM_OFFSET,
+					ivsize,
+					CESA_TDMA_SRC_IN_SRAM, flags);
 
 	if (ret)
 		goto err_free_tdma;
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 9f28468..35baf4f 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -312,24 +312,53 @@ static void mv_cesa_ahash_complete(struct crypto_async_request *req)
 	int i;
 
 	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq));
-	for (i = 0; i < digsize / 4; i++)
-		creq->state[i] = readl_relaxed(engine->regs + CESA_IVDIG(i));
 
-	if (creq->last_req) {
+	if (mv_cesa_req_get_type(&creq->base) == CESA_DMA_REQ &&
+	    !(creq->base.chain.last->flags & CESA_TDMA_BREAK_CHAIN)) {
+		struct mv_cesa_tdma_desc *tdma = NULL;
+		__le32 *data = NULL;
+
+		for (tdma = creq->base.chain.first; tdma; tdma = tdma->next) {
+			u32 type = tdma->flags & CESA_TDMA_TYPE_MSK;
+			if (type ==  CESA_TDMA_RESULT)
+				break;
+		}
+
+		if (!tdma) {
+			dev_err(cesa_dev->dev, "Failed to retrieve tdma "
+					       "descriptor for outer data\n");
+			return;
+		}
+
 		/*
-		 * Hardware's MD5 digest is in little endian format, but
-		 * SHA in big endian format
+		 * Result is already in the correct endianess when the SA is
+		 * used
 		 */
-		if (creq->algo_le) {
-			__le32 *result = (void *)ahashreq->result;
+		data = tdma->data;
+		for (i = 0; i < digsize / 4; i++)
+			creq->state[i] = cpu_to_le32(data[i]);
 
-			for (i = 0; i < digsize / 4; i++)
-				result[i] = cpu_to_le32(creq->state[i]);
-		} else {
-			__be32 *result = (void *)ahashreq->result;
+		memcpy(ahashreq->result, data, digsize);
+	} else {
+		for (i = 0; i < digsize / 4; i++)
+			creq->state[i] = readl_relaxed(engine->regs +
+						       CESA_IVDIG(i));
+		if (creq->last_req) {
+			/*
+			* Hardware's MD5 digest is in little endian format, but
+			* SHA in big endian format
+			*/
+			if (creq->algo_le) {
+				__le32 *result = (void *)ahashreq->result;
+
+				for (i = 0; i < digsize / 4; i++)
+					result[i] = cpu_to_le32(creq->state[i]);
+			} else {
+				__be32 *result = (void *)ahashreq->result;
 
-			for (i = 0; i < digsize / 4; i++)
-				result[i] = cpu_to_be32(creq->state[i]);
+				for (i = 0; i < digsize / 4; i++)
+					result[i] = cpu_to_be32(creq->state[i]);
+			}
 		}
 	}
 
@@ -504,6 +533,12 @@ mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain,
 						CESA_SA_DESC_CFG_LAST_FRAG,
 				      CESA_SA_DESC_CFG_FRAG_MSK);
 
+		ret = mv_cesa_dma_add_result_op(chain,
+						CESA_SA_MAC_DIG_SRAM_OFFSET,
+						32,
+						CESA_TDMA_SRC_IN_SRAM, flags);
+		if (ret)
+			return ERR_PTR(-ENOMEM);
 		return op;
 	}
 
@@ -564,6 +599,8 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	struct mv_cesa_op_ctx *op = NULL;
 	unsigned int frag_len;
 	int ret;
+	u32 type;
+	bool break_chain = true;
 
 	basereq->chain.first = NULL;
 	basereq->chain.last = NULL;
@@ -635,6 +672,16 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 		goto err_free_tdma;
 	}
 
+	/*
+	 * If results are copied via DMA, this means that this
+	 * request can be directly processed by the engine,
+	 * without partial updates. So we can chain it at the
+	 * DMA level with other requests.
+	 */
+	type = basereq->chain.last->flags & CESA_TDMA_TYPE_MSK;
+	if (type == CESA_TDMA_RESULT)
+		break_chain = false;
+
 	if (op) {
 		/* Add dummy desc to wait for crypto operation end */
 		ret = mv_cesa_dma_add_dummy_end(&basereq->chain, flags);
@@ -648,8 +695,10 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
 	else
 		creq->cache_ptr = 0;
 
-	basereq->chain.last->flags |= (CESA_TDMA_END_OF_REQ |
-				       CESA_TDMA_BREAK_CHAIN);
+	basereq->chain.last->flags |= CESA_TDMA_END_OF_REQ;
+
+	if (break_chain)
+		basereq->chain.last->flags |= CESA_TDMA_BREAK_CHAIN;
 
 	return 0;
 
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 2e15f19..8d33003 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -70,7 +70,7 @@ void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq)
 			dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
 				      le32_to_cpu(tdma->src));
 		else if (type == CESA_TDMA_RESULT)
-			dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
+			dma_pool_free(cesa_dev->dma->op_pool, tdma->data,
 				      le32_to_cpu(tdma->dst));
 
 		tdma = tdma->next;
@@ -227,7 +227,7 @@ int mv_cesa_dma_add_result_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
 	tdma->byte_cnt = cpu_to_le32(size | BIT(31));
 	tdma->src = src;
 	tdma->dst = cpu_to_le32(dma_handle);
-	tdma->op = result;
+	tdma->data = result;
 
 	flags &= (CESA_TDMA_DST_IN_SRAM | CESA_TDMA_SRC_IN_SRAM);
 	tdma->flags = flags | CESA_TDMA_RESULT;
-- 
2.9.3

^ permalink raw reply related

* [PATCH v2 1/2] crypto: marvell - Use an unique pool to copy results of requests
From: Romain Perier @ 2016-10-03 15:17 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: David S. Miller, Herbert Xu, Thomas Petazzoni, Jason Cooper,
	Andrew Lunn, Sebastian Hesselbarth, Gregory Clement, linux-crypto,
	linux-arm-kernel
In-Reply-To: <20161003151739.11615-1-romain.perier@free-electrons.com>

So far, we used a dedicated dma pool to copy the result of outer IV for
cipher requests. Instead of using a dma pool per outer data, we prefer
use the op dma pool that contains all part of the request from the SRAM.
Then, the outer data that is likely to be used by the 'complete'
operation, is copied later. In this way, any type of result can be
retrieved by DMA for cipher or ahash requests.

Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
---

Changes in v2:
  - Use the dma pool "op" to retrieve outer data intead of introducing
    a new one.

 drivers/crypto/marvell/cesa.c   |  4 ----
 drivers/crypto/marvell/cesa.h   |  5 ++---
 drivers/crypto/marvell/cipher.c |  8 +++++---
 drivers/crypto/marvell/tdma.c   | 17 ++++++++---------
 4 files changed, 15 insertions(+), 19 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index 37dadb2..6e7a5c7 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -375,10 +375,6 @@ static int mv_cesa_dev_dma_init(struct mv_cesa_dev *cesa)
 	if (!dma->padding_pool)
 		return -ENOMEM;
 
-	dma->iv_pool = dmam_pool_create("cesa_iv", dev, 16, 1, 0);
-	if (!dma->iv_pool)
-		return -ENOMEM;
-
 	cesa->dma = dma;
 
 	return 0;
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index e423d33..a768da7 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -277,7 +277,7 @@ struct mv_cesa_op_ctx {
 #define CESA_TDMA_DUMMY				0
 #define CESA_TDMA_DATA				1
 #define CESA_TDMA_OP				2
-#define CESA_TDMA_IV				3
+#define CESA_TDMA_RESULT			3
 
 /**
  * struct mv_cesa_tdma_desc - TDMA descriptor
@@ -393,7 +393,6 @@ struct mv_cesa_dev_dma {
 	struct dma_pool *op_pool;
 	struct dma_pool *cache_pool;
 	struct dma_pool *padding_pool;
-	struct dma_pool *iv_pool;
 };
 
 /**
@@ -839,7 +838,7 @@ mv_cesa_tdma_desc_iter_init(struct mv_cesa_tdma_chain *chain)
 	memset(chain, 0, sizeof(*chain));
 }
 
-int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
+int mv_cesa_dma_add_result_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
 			  u32 size, u32 flags, gfp_t gfp_flags);
 
 struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index d19dc96..098871a 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -212,7 +212,8 @@ mv_cesa_ablkcipher_complete(struct crypto_async_request *req)
 		struct mv_cesa_req *basereq;
 
 		basereq = &creq->base;
-		memcpy(ablkreq->info, basereq->chain.last->data, ivsize);
+		memcpy(ablkreq->info, basereq->chain.last->op->ctx.blkcipher.iv,
+		       ivsize);
 	} else {
 		memcpy_fromio(ablkreq->info,
 			      engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET,
@@ -373,8 +374,9 @@ static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
 
 	/* Add output data for IV */
 	ivsize = crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
-	ret = mv_cesa_dma_add_iv_op(&basereq->chain, CESA_SA_CRYPT_IV_SRAM_OFFSET,
-				    ivsize, CESA_TDMA_SRC_IN_SRAM, flags);
+	ret = mv_cesa_dma_add_result_op(&basereq->chain, CESA_SA_CFG_SRAM_OFFSET,
+				    CESA_SA_DATA_SRAM_OFFSET,
+				    CESA_TDMA_SRC_IN_SRAM, flags);
 
 	if (ret)
 		goto err_free_tdma;
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
index 9fd7a5f..2e15f19 100644
--- a/drivers/crypto/marvell/tdma.c
+++ b/drivers/crypto/marvell/tdma.c
@@ -69,8 +69,8 @@ void mv_cesa_dma_cleanup(struct mv_cesa_req *dreq)
 		if (type == CESA_TDMA_OP)
 			dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
 				      le32_to_cpu(tdma->src));
-		else if (type == CESA_TDMA_IV)
-			dma_pool_free(cesa_dev->dma->iv_pool, tdma->data,
+		else if (type == CESA_TDMA_RESULT)
+			dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
 				      le32_to_cpu(tdma->dst));
 
 		tdma = tdma->next;
@@ -209,29 +209,28 @@ mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
 	return new_tdma;
 }
 
-int mv_cesa_dma_add_iv_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
+int mv_cesa_dma_add_result_op(struct mv_cesa_tdma_chain *chain, dma_addr_t src,
 			  u32 size, u32 flags, gfp_t gfp_flags)
 {
-
 	struct mv_cesa_tdma_desc *tdma;
-	u8 *iv;
+	struct mv_cesa_op_ctx *result;
 	dma_addr_t dma_handle;
 
 	tdma = mv_cesa_dma_add_desc(chain, gfp_flags);
 	if (IS_ERR(tdma))
 		return PTR_ERR(tdma);
 
-	iv = dma_pool_alloc(cesa_dev->dma->iv_pool, gfp_flags, &dma_handle);
-	if (!iv)
+	result = dma_pool_alloc(cesa_dev->dma->op_pool, gfp_flags, &dma_handle);
+	if (!result)
 		return -ENOMEM;
 
 	tdma->byte_cnt = cpu_to_le32(size | BIT(31));
 	tdma->src = src;
 	tdma->dst = cpu_to_le32(dma_handle);
-	tdma->data = iv;
+	tdma->op = result;
 
 	flags &= (CESA_TDMA_DST_IN_SRAM | CESA_TDMA_SRC_IN_SRAM);
-	tdma->flags = flags | CESA_TDMA_IV;
+	tdma->flags = flags | CESA_TDMA_RESULT;
 	return 0;
 }
 
-- 
2.9.3

^ permalink raw reply related

* Re: [PATCH v2 0/2] Improve DMA chaining for ahash requests
From: Romain Perier @ 2016-10-03 15:48 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: Thomas Petazzoni, Andrew Lunn, Jason Cooper, linux-crypto,
	Gregory Clement, Herbert Xu, David S. Miller, linux-arm-kernel,
	Sebastian Hesselbarth
In-Reply-To: <20161003151739.11615-1-romain.perier@free-electrons.com>

Hello,

Le 03/10/2016 17:17, Romain Perier a écrit :
> This series contain performance improvement regarding ahash requests.
> So far, ahash requests were systematically not chained at the DMA level.
> However, in some case, like this is the case by using IPSec, some ahash
> requests can be processed directly by the engine, and don't have
> intermediaire partial update states.
>
> This series firstly re-work the way outer IVs are copied from the SRAM
> into the dma pool. To do so, we introduce a common dma pool for all type
> of requests that contains outer results (like IV or digest). Then, for
> ahash requests that can be processed directly by the engine, outer
> results are copied from the SRAM into the common dma pool. These requests
> are then allowed to be chained at the DMA level.
>
>
> Benchmarking results with iperf throught IPSec
> ==============================================
> 		ESP			AH
>
> Before		343 Mbits/s		492 Mbits/s
> After		392 Mbits/s		557 Mbits/s
> Improvement	+14%			+13%
>
> Romain Perier (2):
>    crypto: marvell - Use an unique pool to copy results of requests
>    crypto: marvell - Don't break chain for computable last ahash requests
>
>   drivers/crypto/marvell/cesa.c   |  4 ---
>   drivers/crypto/marvell/cesa.h   |  5 ++-
>   drivers/crypto/marvell/cipher.c |  6 ++--
>   drivers/crypto/marvell/hash.c   | 79 +++++++++++++++++++++++++++++++++--------
>   drivers/crypto/marvell/tdma.c   | 17 +++++----
>   5 files changed, 78 insertions(+), 33 deletions(-)
>

After an internal discussion, we can handle things differently. Instead 
of allocating a new mv_cesa_op_ctx in the op_pool, we can just allocate 
a new descriptor which points to the first op ctx of the chain, and then 
copy outer data.

Please ignore this series.

Regards,

-- 
Romain Perier, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply

* Moving from blkcipher to skcipher
From: Alex Cope @ 2016-10-03 17:06 UTC (permalink / raw)
  To: linux-crypto; +Cc: Michael Halcrow, Eric Biggers

I'm currently working on implementing HEH encryption, and am in the
process of switching from the blkcipher interface to the skcipher
interface.  All the examples I have found that use skcipher are
wrapping another mode of operation I.E. cts in cts(cbc(aes)) rather
than being directly above the block cipher I.E. ctr in ctr(aes). Are
there any existing examples of the latter type that I could use as a
reference? If not, is there an estimate on when that work will be
available?
Thanks,
-Alex

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox