* [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
@ 2014-03-24 16:10 ` Mathias Krause
2014-03-24 16:10 ` [PATCH 2/3] crypto: x86/sha1 - fix stack alignment of AVX2 variant Mathias Krause
` (5 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-24 16:10 UTC (permalink / raw)
To: Herbert Xu, David S. Miller
Cc: linux-crypto, Mathias Krause, Chandramouli Narayanan,
H. Peter Anvin, Marek Vasut
Commit 7c1da8d0d0 "crypto: sha - SHA1 transform x86_64 AVX2"
accidentally disabled the AVX variant by making the avx_usable() test
not only fail in case the CPU doesn't support AVX or OSXSAVE but also
if it doesn't support AVX2.
Fix that regression by splitting up the AVX/AVX2 test into two
functions. Also test for the BMI1 extension in the avx2_usable() test
as the AVX2 implementation not only makes use of BMI2 but also BMI1
instructions.
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Marek Vasut <marex@denx.de>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)
diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index 139a55c04d..74d16ef707 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -208,11 +208,7 @@ static bool __init avx_usable(void)
{
u64 xcr0;
-#if defined(CONFIG_AS_AVX2)
- if (!cpu_has_avx || !cpu_has_avx2 || !cpu_has_osxsave)
-#else
if (!cpu_has_avx || !cpu_has_osxsave)
-#endif
return false;
xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
@@ -224,11 +220,23 @@ static bool __init avx_usable(void)
return true;
}
+
+#ifdef CONFIG_AS_AVX2
+static bool __init avx2_usable(void)
+{
+ if (avx_usable() && cpu_has_avx2 && boot_cpu_has(X86_FEATURE_BMI1) &&
+ boot_cpu_has(X86_FEATURE_BMI2))
+ return true;
+
+ return false;
+}
+#endif
#endif
static int __init sha1_ssse3_mod_init(void)
{
char *algo_name;
+
/* test for SSSE3 first */
if (cpu_has_ssse3) {
sha1_transform_asm = sha1_transform_ssse3;
@@ -238,13 +246,11 @@ static int __init sha1_ssse3_mod_init(void)
#ifdef CONFIG_AS_AVX
/* allow AVX to override SSSE3, it's a little faster */
if (avx_usable()) {
- if (cpu_has_avx) {
- sha1_transform_asm = sha1_transform_avx;
- algo_name = "AVX";
- }
+ sha1_transform_asm = sha1_transform_avx;
+ algo_name = "AVX";
#ifdef CONFIG_AS_AVX2
- if (cpu_has_avx2 && boot_cpu_has(X86_FEATURE_BMI2)) {
- /* allow AVX2 to override AVX, it's a little faster */
+ /* allow AVX2 to override AVX, it's a little faster */
+ if (avx2_usable()) {
sha1_transform_asm = sha1_apply_transform_avx2;
algo_name = "AVX2";
}
--
1.7.10.4
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 2/3] crypto: x86/sha1 - fix stack alignment of AVX2 variant
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
2014-03-24 16:10 ` [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant Mathias Krause
@ 2014-03-24 16:10 ` Mathias Krause
2014-03-24 16:10 ` [PATCH 3/3] crypto: x86/sha1 - reduce size of the AVX2 asm implementation Mathias Krause
` (4 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-24 16:10 UTC (permalink / raw)
To: Herbert Xu, David S. Miller
Cc: linux-crypto, Mathias Krause, Chandramouli Narayanan,
H. Peter Anvin, Marek Vasut
The AVX2 implementation might waste up to a page of stack memory because
of a wrong alignment calculation. This will, in the worst case, increase
the stack usage of sha1_transform_avx2() alone to 5.4 kB -- way to big
for a kernel function. Even worse, it might also allocate *less* bytes
than needed if the stack pointer is already aligned bacause in that case
the 'sub %rbx, %rsp' is effectively moving the stack pointer upwards,
not downwards.
Fix those issues by changing and simplifying the alignment calculation
to use a 32 byte alignment, the alignment really needed.
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Marek Vasut <marex@denx.de>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
arch/x86/crypto/sha1_avx2_x86_64_asm.S | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
index 4f348544d1..bacac22b20 100644
--- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
+++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
@@ -636,9 +636,7 @@ _loop3:
/* Align stack */
mov %rsp, %rbx
- and $(0x1000-1), %rbx
- sub $(8+32), %rbx
- sub %rbx, %rsp
+ and $~(0x20-1), %rsp
push %rbx
sub $RESERVE_STACK, %rsp
@@ -665,8 +663,7 @@ _loop3:
avx2_zeroupper
add $RESERVE_STACK, %rsp
- pop %rbx
- add %rbx, %rsp
+ pop %rsp
pop %r15
pop %r14
--
1.7.10.4
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 3/3] crypto: x86/sha1 - reduce size of the AVX2 asm implementation
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
2014-03-24 16:10 ` [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant Mathias Krause
2014-03-24 16:10 ` [PATCH 2/3] crypto: x86/sha1 - fix stack alignment of AVX2 variant Mathias Krause
@ 2014-03-24 16:10 ` Mathias Krause
2014-03-24 17:04 ` [PATCH 0/3] crypto: x86/sha1 - regression and other fixes H. Peter Anvin
` (3 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-24 16:10 UTC (permalink / raw)
To: Herbert Xu, David S. Miller
Cc: linux-crypto, Mathias Krause, Chandramouli Narayanan,
H. Peter Anvin, Marek Vasut
There is really no need to page align sha1_transform_avx2. The default
alignment is just fine. This is not the hot code but only the entry
point, after all.
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Marek Vasut <marex@denx.de>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
arch/x86/crypto/sha1_avx2_x86_64_asm.S | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
index bacac22b20..1cd792db15 100644
--- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
+++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
@@ -623,7 +623,6 @@ _loop3:
*/
.macro SHA1_VECTOR_ASM name
ENTRY(\name)
- .align 4096
push %rbx
push %rbp
--
1.7.10.4
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
` (2 preceding siblings ...)
2014-03-24 16:10 ` [PATCH 3/3] crypto: x86/sha1 - reduce size of the AVX2 asm implementation Mathias Krause
@ 2014-03-24 17:04 ` H. Peter Anvin
2014-03-24 17:29 ` chandramouli narayanan
` (2 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: H. Peter Anvin @ 2014-03-24 17:04 UTC (permalink / raw)
To: Mathias Krause, Herbert Xu, David S. Miller
Cc: linux-crypto, Chandramouli Narayanan, Marek Vasut
On 03/24/2014 09:10 AM, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
>
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
>
> Please apply!
>
Yikes. Yes... the alignment calculation is confused in the extreme.
-hpa
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
` (3 preceding siblings ...)
2014-03-24 17:04 ` [PATCH 0/3] crypto: x86/sha1 - regression and other fixes H. Peter Anvin
@ 2014-03-24 17:29 ` chandramouli narayanan
2014-03-25 7:55 ` Mathias Krause
2014-03-24 17:31 ` H. Peter Anvin
2014-03-24 20:19 ` Marek Vasut
6 siblings, 1 reply; 10+ messages in thread
From: chandramouli narayanan @ 2014-03-24 17:29 UTC (permalink / raw)
To: Mathias Krause
Cc: Herbert Xu, David S. Miller, linux-crypto, H. Peter Anvin,
Marek Vasut
On Mon, 2014-03-24 at 17:10 +0100, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
>
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
>
> Please apply!
>
> Mathias Krause (3):
> crypto: x86/sha1 - re-enable the AVX variant
> crypto: x86/sha1 - fix stack alignment of AVX2 variant
> crypto: x86/sha1 - reduce size of the AVX2 asm implementation
>
> arch/x86/crypto/sha1_avx2_x86_64_asm.S | 8 ++------
> arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
> 2 files changed, 18 insertions(+), 16 deletions(-)
>
Your fixes are the right on mark. I went through your patches and tested
them and found to be correct.
Sorry for causing regression and missing alignment issues in the patches
I submitted.
- mouli
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
2014-03-24 17:29 ` chandramouli narayanan
@ 2014-03-25 7:55 ` Mathias Krause
0 siblings, 0 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-25 7:55 UTC (permalink / raw)
To: chandramouli narayanan
Cc: Herbert Xu, David S. Miller, linux-crypto@vger.kernel.org,
H. Peter Anvin, Marek Vasut
On 24 March 2014 18:29, chandramouli narayanan <mouli@linux.intel.com> wrote:
> On Mon, 2014-03-24 at 17:10 +0100, Mathias Krause wrote:
>> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
>> disabled the AVX variant by introducing a flaw in the feature test. Fixed
>> in patch 1.
>>
>> The alignment calculations of the AVX2 assembler implementation are
>> questionable, too. Especially the page alignment of the stack pointer is
>> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
>> code alignment is fixed.
>>
>> Please apply!
>>
>> Mathias Krause (3):
>> crypto: x86/sha1 - re-enable the AVX variant
>> crypto: x86/sha1 - fix stack alignment of AVX2 variant
>> crypto: x86/sha1 - reduce size of the AVX2 asm implementation
>>
>> arch/x86/crypto/sha1_avx2_x86_64_asm.S | 8 ++------
>> arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
>> 2 files changed, 18 insertions(+), 16 deletions(-)
>>
> Your fixes are the right on mark. I went through your patches and tested
> them and found to be correct.
Thanks for double-checking!
> Sorry for causing regression and missing alignment issues in the patches
> I submitted.
No problem with that. But as I'm not subscribed to the linux-crypto
mailing list I haven't seen your earlier submissions. Otherwise I
would have objected earlier. ;)
Thanks,
Mathias
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
` (4 preceding siblings ...)
2014-03-24 17:29 ` chandramouli narayanan
@ 2014-03-24 17:31 ` H. Peter Anvin
2014-03-24 20:19 ` Marek Vasut
6 siblings, 0 replies; 10+ messages in thread
From: H. Peter Anvin @ 2014-03-24 17:31 UTC (permalink / raw)
To: Mathias Krause, Herbert Xu, David S. Miller
Cc: linux-crypto, Chandramouli Narayanan, Marek Vasut
On 03/24/2014 09:10 AM, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
>
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
>
> Please apply!
>
> Mathias Krause (3):
> crypto: x86/sha1 - re-enable the AVX variant
> crypto: x86/sha1 - fix stack alignment of AVX2 variant
> crypto: x86/sha1 - reduce size of the AVX2 asm implementation
>
> arch/x86/crypto/sha1_avx2_x86_64_asm.S | 8 ++------
> arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
> 2 files changed, 18 insertions(+), 16 deletions(-)
>
For all these patches:
Reviewed-by: H. Peter Anvin <hpa@linux.intel.com>
Thank you for doing these.
-hpa
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
` (5 preceding siblings ...)
2014-03-24 17:31 ` H. Peter Anvin
@ 2014-03-24 20:19 ` Marek Vasut
2014-03-25 12:44 ` Herbert Xu
6 siblings, 1 reply; 10+ messages in thread
From: Marek Vasut @ 2014-03-24 20:19 UTC (permalink / raw)
To: Mathias Krause
Cc: Herbert Xu, David S. Miller, linux-crypto, Chandramouli Narayanan,
H. Peter Anvin
On Monday, March 24, 2014 at 05:10:36 PM, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
>
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
>
> Please apply!
Nice,
Reviewed-by: Marek Vasut <marex@denx.de>
Best regards,
Marek Vasut
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
2014-03-24 20:19 ` Marek Vasut
@ 2014-03-25 12:44 ` Herbert Xu
0 siblings, 0 replies; 10+ messages in thread
From: Herbert Xu @ 2014-03-25 12:44 UTC (permalink / raw)
To: Marek Vasut
Cc: Mathias Krause, David S. Miller, linux-crypto,
Chandramouli Narayanan, H. Peter Anvin
On Mon, Mar 24, 2014 at 09:19:44PM +0100, Marek Vasut wrote:
> On Monday, March 24, 2014 at 05:10:36 PM, Mathias Krause wrote:
> > The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> > disabled the AVX variant by introducing a flaw in the feature test. Fixed
> > in patch 1.
> >
> > The alignment calculations of the AVX2 assembler implementation are
> > questionable, too. Especially the page alignment of the stack pointer is
> > broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> > code alignment is fixed.
> >
> > Please apply!
>
> Nice,
>
> Reviewed-by: Marek Vasut <marex@denx.de>
All applied. Thanks Mathias!
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 10+ messages in thread