Linux cryptographic layer development
 help / color / mirror / Atom feed
* [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
@ 2014-03-24 16:10 Mathias Krause
  2014-03-24 16:10 ` [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant Mathias Krause
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-24 16:10 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller
  Cc: linux-crypto, Mathias Krause, Chandramouli Narayanan,
	H. Peter Anvin, Marek Vasut

The recent addition of the AVX2 variant of the SHA1 hash function wrongly
disabled the AVX variant by introducing a flaw in the feature test. Fixed
in patch 1.

The alignment calculations of the AVX2 assembler implementation are
questionable, too. Especially the page alignment of the stack pointer is
broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
code alignment is fixed.

Please apply!

Mathias Krause (3):
  crypto: x86/sha1 - re-enable the AVX variant
  crypto: x86/sha1 - fix stack alignment of AVX2 variant
  crypto: x86/sha1 - reduce size of the AVX2 asm implementation

 arch/x86/crypto/sha1_avx2_x86_64_asm.S |    8 ++------
 arch/x86/crypto/sha1_ssse3_glue.c      |   26 ++++++++++++++++----------
 2 files changed, 18 insertions(+), 16 deletions(-)

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant
  2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
@ 2014-03-24 16:10 ` Mathias Krause
  2014-03-24 16:10 ` [PATCH 2/3] crypto: x86/sha1 - fix stack alignment of AVX2 variant Mathias Krause
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-24 16:10 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller
  Cc: linux-crypto, Mathias Krause, Chandramouli Narayanan,
	H. Peter Anvin, Marek Vasut

Commit 7c1da8d0d0 "crypto: sha - SHA1 transform x86_64 AVX2"
accidentally disabled the AVX variant by making the avx_usable() test
not only fail in case the CPU doesn't support AVX or OSXSAVE but also
if it doesn't support AVX2.

Fix that regression by splitting up the AVX/AVX2 test into two
functions. Also test for the BMI1 extension in the avx2_usable() test
as the AVX2 implementation not only makes use of BMI2 but also BMI1
instructions.

Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Marek Vasut <marex@denx.de>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
 arch/x86/crypto/sha1_ssse3_glue.c |   26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index 139a55c04d..74d16ef707 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -208,11 +208,7 @@ static bool __init avx_usable(void)
 {
 	u64 xcr0;
 
-#if defined(CONFIG_AS_AVX2)
-	if (!cpu_has_avx || !cpu_has_avx2 || !cpu_has_osxsave)
-#else
 	if (!cpu_has_avx || !cpu_has_osxsave)
-#endif
 		return false;
 
 	xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
@@ -224,11 +220,23 @@ static bool __init avx_usable(void)
 
 	return true;
 }
+
+#ifdef CONFIG_AS_AVX2
+static bool __init avx2_usable(void)
+{
+	if (avx_usable() && cpu_has_avx2 && boot_cpu_has(X86_FEATURE_BMI1) &&
+	    boot_cpu_has(X86_FEATURE_BMI2))
+		return true;
+
+	return false;
+}
+#endif
 #endif
 
 static int __init sha1_ssse3_mod_init(void)
 {
 	char *algo_name;
+
 	/* test for SSSE3 first */
 	if (cpu_has_ssse3) {
 		sha1_transform_asm = sha1_transform_ssse3;
@@ -238,13 +246,11 @@ static int __init sha1_ssse3_mod_init(void)
 #ifdef CONFIG_AS_AVX
 	/* allow AVX to override SSSE3, it's a little faster */
 	if (avx_usable()) {
-		if (cpu_has_avx) {
-			sha1_transform_asm = sha1_transform_avx;
-			algo_name = "AVX";
-		}
+		sha1_transform_asm = sha1_transform_avx;
+		algo_name = "AVX";
 #ifdef CONFIG_AS_AVX2
-		if (cpu_has_avx2 && boot_cpu_has(X86_FEATURE_BMI2)) {
-			/* allow AVX2 to override AVX, it's a little faster */
+		/* allow AVX2 to override AVX, it's a little faster */
+		if (avx2_usable()) {
 			sha1_transform_asm = sha1_apply_transform_avx2;
 			algo_name = "AVX2";
 		}
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/3] crypto: x86/sha1 - fix stack alignment of AVX2 variant
  2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
  2014-03-24 16:10 ` [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant Mathias Krause
@ 2014-03-24 16:10 ` Mathias Krause
  2014-03-24 16:10 ` [PATCH 3/3] crypto: x86/sha1 - reduce size of the AVX2 asm implementation Mathias Krause
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-24 16:10 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller
  Cc: linux-crypto, Mathias Krause, Chandramouli Narayanan,
	H. Peter Anvin, Marek Vasut

The AVX2 implementation might waste up to a page of stack memory because
of a wrong alignment calculation. This will, in the worst case, increase
the stack usage of sha1_transform_avx2() alone to 5.4 kB -- way to big
for a kernel function. Even worse, it might also allocate *less* bytes
than needed if the stack pointer is already aligned bacause in that case
the 'sub %rbx, %rsp' is effectively moving the stack pointer upwards,
not downwards.

Fix those issues by changing and simplifying the alignment calculation
to use a 32 byte alignment, the alignment really needed.

Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Marek Vasut <marex@denx.de>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
 arch/x86/crypto/sha1_avx2_x86_64_asm.S |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
index 4f348544d1..bacac22b20 100644
--- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
+++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
@@ -636,9 +636,7 @@ _loop3:
 
 	/* Align stack */
 	mov	%rsp, %rbx
-	and	$(0x1000-1), %rbx
-	sub	$(8+32), %rbx
-	sub	%rbx, %rsp
+	and	$~(0x20-1), %rsp
 	push	%rbx
 	sub	$RESERVE_STACK, %rsp
 
@@ -665,8 +663,7 @@ _loop3:
 	avx2_zeroupper
 
 	add	$RESERVE_STACK, %rsp
-	pop	%rbx
-	add	%rbx, %rsp
+	pop	%rsp
 
 	pop	%r15
 	pop	%r14
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/3] crypto: x86/sha1 - reduce size of the AVX2 asm implementation
  2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
  2014-03-24 16:10 ` [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant Mathias Krause
  2014-03-24 16:10 ` [PATCH 2/3] crypto: x86/sha1 - fix stack alignment of AVX2 variant Mathias Krause
@ 2014-03-24 16:10 ` Mathias Krause
  2014-03-24 17:04 ` [PATCH 0/3] crypto: x86/sha1 - regression and other fixes H. Peter Anvin
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-24 16:10 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller
  Cc: linux-crypto, Mathias Krause, Chandramouli Narayanan,
	H. Peter Anvin, Marek Vasut

There is really no need to page align sha1_transform_avx2. The default
alignment is just fine. This is not the hot code but only the entry
point, after all.

Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Marek Vasut <marex@denx.de>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
 arch/x86/crypto/sha1_avx2_x86_64_asm.S |    1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
index bacac22b20..1cd792db15 100644
--- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
+++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
@@ -623,7 +623,6 @@ _loop3:
  */
 .macro SHA1_VECTOR_ASM  name
 	ENTRY(\name)
-	.align 4096
 
 	push	%rbx
 	push	%rbp
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
  2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
                   ` (2 preceding siblings ...)
  2014-03-24 16:10 ` [PATCH 3/3] crypto: x86/sha1 - reduce size of the AVX2 asm implementation Mathias Krause
@ 2014-03-24 17:04 ` H. Peter Anvin
  2014-03-24 17:29 ` chandramouli narayanan
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: H. Peter Anvin @ 2014-03-24 17:04 UTC (permalink / raw)
  To: Mathias Krause, Herbert Xu, David S. Miller
  Cc: linux-crypto, Chandramouli Narayanan, Marek Vasut

On 03/24/2014 09:10 AM, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
> 
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
> 
> Please apply!
> 

Yikes.  Yes... the alignment calculation is confused in the extreme.

	-hpa

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
  2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
                   ` (3 preceding siblings ...)
  2014-03-24 17:04 ` [PATCH 0/3] crypto: x86/sha1 - regression and other fixes H. Peter Anvin
@ 2014-03-24 17:29 ` chandramouli narayanan
  2014-03-25  7:55   ` Mathias Krause
  2014-03-24 17:31 ` H. Peter Anvin
  2014-03-24 20:19 ` Marek Vasut
  6 siblings, 1 reply; 10+ messages in thread
From: chandramouli narayanan @ 2014-03-24 17:29 UTC (permalink / raw)
  To: Mathias Krause
  Cc: Herbert Xu, David S. Miller, linux-crypto, H. Peter Anvin,
	Marek Vasut

On Mon, 2014-03-24 at 17:10 +0100, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
> 
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
> 
> Please apply!
> 
> Mathias Krause (3):
>   crypto: x86/sha1 - re-enable the AVX variant
>   crypto: x86/sha1 - fix stack alignment of AVX2 variant
>   crypto: x86/sha1 - reduce size of the AVX2 asm implementation
> 
>  arch/x86/crypto/sha1_avx2_x86_64_asm.S |    8 ++------
>  arch/x86/crypto/sha1_ssse3_glue.c      |   26 ++++++++++++++++----------
>  2 files changed, 18 insertions(+), 16 deletions(-)
> 
Your fixes are the right on mark. I went through your patches and tested
them and found to be correct.

Sorry for causing regression and missing alignment issues in the patches
I submitted.

- mouli

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
  2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
                   ` (4 preceding siblings ...)
  2014-03-24 17:29 ` chandramouli narayanan
@ 2014-03-24 17:31 ` H. Peter Anvin
  2014-03-24 20:19 ` Marek Vasut
  6 siblings, 0 replies; 10+ messages in thread
From: H. Peter Anvin @ 2014-03-24 17:31 UTC (permalink / raw)
  To: Mathias Krause, Herbert Xu, David S. Miller
  Cc: linux-crypto, Chandramouli Narayanan, Marek Vasut

On 03/24/2014 09:10 AM, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
> 
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
> 
> Please apply!
> 
> Mathias Krause (3):
>   crypto: x86/sha1 - re-enable the AVX variant
>   crypto: x86/sha1 - fix stack alignment of AVX2 variant
>   crypto: x86/sha1 - reduce size of the AVX2 asm implementation
> 
>  arch/x86/crypto/sha1_avx2_x86_64_asm.S |    8 ++------
>  arch/x86/crypto/sha1_ssse3_glue.c      |   26 ++++++++++++++++----------
>  2 files changed, 18 insertions(+), 16 deletions(-)
> 

For all these patches:

Reviewed-by: H. Peter Anvin <hpa@linux.intel.com>

Thank you for doing these.

	-hpa

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
  2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
                   ` (5 preceding siblings ...)
  2014-03-24 17:31 ` H. Peter Anvin
@ 2014-03-24 20:19 ` Marek Vasut
  2014-03-25 12:44   ` Herbert Xu
  6 siblings, 1 reply; 10+ messages in thread
From: Marek Vasut @ 2014-03-24 20:19 UTC (permalink / raw)
  To: Mathias Krause
  Cc: Herbert Xu, David S. Miller, linux-crypto, Chandramouli Narayanan,
	H. Peter Anvin

On Monday, March 24, 2014 at 05:10:36 PM, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
> 
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
> 
> Please apply!

Nice,

Reviewed-by: Marek Vasut <marex@denx.de>

Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
  2014-03-24 17:29 ` chandramouli narayanan
@ 2014-03-25  7:55   ` Mathias Krause
  0 siblings, 0 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-25  7:55 UTC (permalink / raw)
  To: chandramouli narayanan
  Cc: Herbert Xu, David S. Miller, linux-crypto@vger.kernel.org,
	H. Peter Anvin, Marek Vasut

On 24 March 2014 18:29, chandramouli narayanan <mouli@linux.intel.com> wrote:
> On Mon, 2014-03-24 at 17:10 +0100, Mathias Krause wrote:
>> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
>> disabled the AVX variant by introducing a flaw in the feature test. Fixed
>> in patch 1.
>>
>> The alignment calculations of the AVX2 assembler implementation are
>> questionable, too. Especially the page alignment of the stack pointer is
>> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
>> code alignment is fixed.
>>
>> Please apply!
>>
>> Mathias Krause (3):
>>   crypto: x86/sha1 - re-enable the AVX variant
>>   crypto: x86/sha1 - fix stack alignment of AVX2 variant
>>   crypto: x86/sha1 - reduce size of the AVX2 asm implementation
>>
>>  arch/x86/crypto/sha1_avx2_x86_64_asm.S |    8 ++------
>>  arch/x86/crypto/sha1_ssse3_glue.c      |   26 ++++++++++++++++----------
>>  2 files changed, 18 insertions(+), 16 deletions(-)
>>
> Your fixes are the right on mark. I went through your patches and tested
> them and found to be correct.

Thanks for double-checking!

> Sorry for causing regression and missing alignment issues in the patches
> I submitted.

No problem with that. But as I'm not subscribed to the linux-crypto
mailing list I haven't seen your earlier submissions. Otherwise I
would have objected earlier. ;)

Thanks,
Mathias

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
  2014-03-24 20:19 ` Marek Vasut
@ 2014-03-25 12:44   ` Herbert Xu
  0 siblings, 0 replies; 10+ messages in thread
From: Herbert Xu @ 2014-03-25 12:44 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Mathias Krause, David S. Miller, linux-crypto,
	Chandramouli Narayanan, H. Peter Anvin

On Mon, Mar 24, 2014 at 09:19:44PM +0100, Marek Vasut wrote:
> On Monday, March 24, 2014 at 05:10:36 PM, Mathias Krause wrote:
> > The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> > disabled the AVX variant by introducing a flaw in the feature test. Fixed
> > in patch 1.
> > 
> > The alignment calculations of the AVX2 assembler implementation are
> > questionable, too. Especially the page alignment of the stack pointer is
> > broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> > code alignment is fixed.
> > 
> > Please apply!
> 
> Nice,
> 
> Reviewed-by: Marek Vasut <marex@denx.de>

All applied.  Thanks Mathias!
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-03-25 12:44 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
2014-03-24 16:10 ` [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant Mathias Krause
2014-03-24 16:10 ` [PATCH 2/3] crypto: x86/sha1 - fix stack alignment of AVX2 variant Mathias Krause
2014-03-24 16:10 ` [PATCH 3/3] crypto: x86/sha1 - reduce size of the AVX2 asm implementation Mathias Krause
2014-03-24 17:04 ` [PATCH 0/3] crypto: x86/sha1 - regression and other fixes H. Peter Anvin
2014-03-24 17:29 ` chandramouli narayanan
2014-03-25  7:55   ` Mathias Krause
2014-03-24 17:31 ` H. Peter Anvin
2014-03-24 20:19 ` Marek Vasut
2014-03-25 12:44   ` Herbert Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox