* [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
@ 2014-03-24 16:10 Mathias Krause
2014-03-24 16:10 ` [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant Mathias Krause
` (6 more replies)
0 siblings, 7 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-24 16:10 UTC (permalink / raw)
To: Herbert Xu, David S. Miller
Cc: linux-crypto, Mathias Krause, Chandramouli Narayanan,
H. Peter Anvin, Marek Vasut
The recent addition of the AVX2 variant of the SHA1 hash function wrongly
disabled the AVX variant by introducing a flaw in the feature test. Fixed
in patch 1.
The alignment calculations of the AVX2 assembler implementation are
questionable, too. Especially the page alignment of the stack pointer is
broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
code alignment is fixed.
Please apply!
Mathias Krause (3):
crypto: x86/sha1 - re-enable the AVX variant
crypto: x86/sha1 - fix stack alignment of AVX2 variant
crypto: x86/sha1 - reduce size of the AVX2 asm implementation
arch/x86/crypto/sha1_avx2_x86_64_asm.S | 8 ++------
arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
2 files changed, 18 insertions(+), 16 deletions(-)
--
1.7.10.4
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
@ 2014-03-24 16:10 ` Mathias Krause
2014-03-24 16:10 ` [PATCH 2/3] crypto: x86/sha1 - fix stack alignment of AVX2 variant Mathias Krause
` (5 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-24 16:10 UTC (permalink / raw)
To: Herbert Xu, David S. Miller
Cc: linux-crypto, Mathias Krause, Chandramouli Narayanan,
H. Peter Anvin, Marek Vasut
Commit 7c1da8d0d0 "crypto: sha - SHA1 transform x86_64 AVX2"
accidentally disabled the AVX variant by making the avx_usable() test
not only fail in case the CPU doesn't support AVX or OSXSAVE but also
if it doesn't support AVX2.
Fix that regression by splitting up the AVX/AVX2 test into two
functions. Also test for the BMI1 extension in the avx2_usable() test
as the AVX2 implementation not only makes use of BMI2 but also BMI1
instructions.
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Marek Vasut <marex@denx.de>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)
diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index 139a55c04d..74d16ef707 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -208,11 +208,7 @@ static bool __init avx_usable(void)
{
u64 xcr0;
-#if defined(CONFIG_AS_AVX2)
- if (!cpu_has_avx || !cpu_has_avx2 || !cpu_has_osxsave)
-#else
if (!cpu_has_avx || !cpu_has_osxsave)
-#endif
return false;
xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
@@ -224,11 +220,23 @@ static bool __init avx_usable(void)
return true;
}
+
+#ifdef CONFIG_AS_AVX2
+static bool __init avx2_usable(void)
+{
+ if (avx_usable() && cpu_has_avx2 && boot_cpu_has(X86_FEATURE_BMI1) &&
+ boot_cpu_has(X86_FEATURE_BMI2))
+ return true;
+
+ return false;
+}
+#endif
#endif
static int __init sha1_ssse3_mod_init(void)
{
char *algo_name;
+
/* test for SSSE3 first */
if (cpu_has_ssse3) {
sha1_transform_asm = sha1_transform_ssse3;
@@ -238,13 +246,11 @@ static int __init sha1_ssse3_mod_init(void)
#ifdef CONFIG_AS_AVX
/* allow AVX to override SSSE3, it's a little faster */
if (avx_usable()) {
- if (cpu_has_avx) {
- sha1_transform_asm = sha1_transform_avx;
- algo_name = "AVX";
- }
+ sha1_transform_asm = sha1_transform_avx;
+ algo_name = "AVX";
#ifdef CONFIG_AS_AVX2
- if (cpu_has_avx2 && boot_cpu_has(X86_FEATURE_BMI2)) {
- /* allow AVX2 to override AVX, it's a little faster */
+ /* allow AVX2 to override AVX, it's a little faster */
+ if (avx2_usable()) {
sha1_transform_asm = sha1_apply_transform_avx2;
algo_name = "AVX2";
}
--
1.7.10.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/3] crypto: x86/sha1 - fix stack alignment of AVX2 variant
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
2014-03-24 16:10 ` [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant Mathias Krause
@ 2014-03-24 16:10 ` Mathias Krause
2014-03-24 16:10 ` [PATCH 3/3] crypto: x86/sha1 - reduce size of the AVX2 asm implementation Mathias Krause
` (4 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-24 16:10 UTC (permalink / raw)
To: Herbert Xu, David S. Miller
Cc: linux-crypto, Mathias Krause, Chandramouli Narayanan,
H. Peter Anvin, Marek Vasut
The AVX2 implementation might waste up to a page of stack memory because
of a wrong alignment calculation. This will, in the worst case, increase
the stack usage of sha1_transform_avx2() alone to 5.4 kB -- way to big
for a kernel function. Even worse, it might also allocate *less* bytes
than needed if the stack pointer is already aligned bacause in that case
the 'sub %rbx, %rsp' is effectively moving the stack pointer upwards,
not downwards.
Fix those issues by changing and simplifying the alignment calculation
to use a 32 byte alignment, the alignment really needed.
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Marek Vasut <marex@denx.de>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
arch/x86/crypto/sha1_avx2_x86_64_asm.S | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
index 4f348544d1..bacac22b20 100644
--- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
+++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
@@ -636,9 +636,7 @@ _loop3:
/* Align stack */
mov %rsp, %rbx
- and $(0x1000-1), %rbx
- sub $(8+32), %rbx
- sub %rbx, %rsp
+ and $~(0x20-1), %rsp
push %rbx
sub $RESERVE_STACK, %rsp
@@ -665,8 +663,7 @@ _loop3:
avx2_zeroupper
add $RESERVE_STACK, %rsp
- pop %rbx
- add %rbx, %rsp
+ pop %rsp
pop %r15
pop %r14
--
1.7.10.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 3/3] crypto: x86/sha1 - reduce size of the AVX2 asm implementation
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
2014-03-24 16:10 ` [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant Mathias Krause
2014-03-24 16:10 ` [PATCH 2/3] crypto: x86/sha1 - fix stack alignment of AVX2 variant Mathias Krause
@ 2014-03-24 16:10 ` Mathias Krause
2014-03-24 17:04 ` [PATCH 0/3] crypto: x86/sha1 - regression and other fixes H. Peter Anvin
` (3 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-24 16:10 UTC (permalink / raw)
To: Herbert Xu, David S. Miller
Cc: linux-crypto, Mathias Krause, Chandramouli Narayanan,
H. Peter Anvin, Marek Vasut
There is really no need to page align sha1_transform_avx2. The default
alignment is just fine. This is not the hot code but only the entry
point, after all.
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Marek Vasut <marex@denx.de>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
arch/x86/crypto/sha1_avx2_x86_64_asm.S | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
index bacac22b20..1cd792db15 100644
--- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
+++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
@@ -623,7 +623,6 @@ _loop3:
*/
.macro SHA1_VECTOR_ASM name
ENTRY(\name)
- .align 4096
push %rbx
push %rbp
--
1.7.10.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
` (2 preceding siblings ...)
2014-03-24 16:10 ` [PATCH 3/3] crypto: x86/sha1 - reduce size of the AVX2 asm implementation Mathias Krause
@ 2014-03-24 17:04 ` H. Peter Anvin
2014-03-24 17:29 ` chandramouli narayanan
` (2 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: H. Peter Anvin @ 2014-03-24 17:04 UTC (permalink / raw)
To: Mathias Krause, Herbert Xu, David S. Miller
Cc: linux-crypto, Chandramouli Narayanan, Marek Vasut
On 03/24/2014 09:10 AM, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
>
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
>
> Please apply!
>
Yikes. Yes... the alignment calculation is confused in the extreme.
-hpa
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
` (3 preceding siblings ...)
2014-03-24 17:04 ` [PATCH 0/3] crypto: x86/sha1 - regression and other fixes H. Peter Anvin
@ 2014-03-24 17:29 ` chandramouli narayanan
2014-03-25 7:55 ` Mathias Krause
2014-03-24 17:31 ` H. Peter Anvin
2014-03-24 20:19 ` Marek Vasut
6 siblings, 1 reply; 10+ messages in thread
From: chandramouli narayanan @ 2014-03-24 17:29 UTC (permalink / raw)
To: Mathias Krause
Cc: Herbert Xu, David S. Miller, linux-crypto, H. Peter Anvin,
Marek Vasut
On Mon, 2014-03-24 at 17:10 +0100, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
>
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
>
> Please apply!
>
> Mathias Krause (3):
> crypto: x86/sha1 - re-enable the AVX variant
> crypto: x86/sha1 - fix stack alignment of AVX2 variant
> crypto: x86/sha1 - reduce size of the AVX2 asm implementation
>
> arch/x86/crypto/sha1_avx2_x86_64_asm.S | 8 ++------
> arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
> 2 files changed, 18 insertions(+), 16 deletions(-)
>
Your fixes are the right on mark. I went through your patches and tested
them and found to be correct.
Sorry for causing regression and missing alignment issues in the patches
I submitted.
- mouli
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
` (4 preceding siblings ...)
2014-03-24 17:29 ` chandramouli narayanan
@ 2014-03-24 17:31 ` H. Peter Anvin
2014-03-24 20:19 ` Marek Vasut
6 siblings, 0 replies; 10+ messages in thread
From: H. Peter Anvin @ 2014-03-24 17:31 UTC (permalink / raw)
To: Mathias Krause, Herbert Xu, David S. Miller
Cc: linux-crypto, Chandramouli Narayanan, Marek Vasut
On 03/24/2014 09:10 AM, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
>
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
>
> Please apply!
>
> Mathias Krause (3):
> crypto: x86/sha1 - re-enable the AVX variant
> crypto: x86/sha1 - fix stack alignment of AVX2 variant
> crypto: x86/sha1 - reduce size of the AVX2 asm implementation
>
> arch/x86/crypto/sha1_avx2_x86_64_asm.S | 8 ++------
> arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
> 2 files changed, 18 insertions(+), 16 deletions(-)
>
For all these patches:
Reviewed-by: H. Peter Anvin <hpa@linux.intel.com>
Thank you for doing these.
-hpa
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
` (5 preceding siblings ...)
2014-03-24 17:31 ` H. Peter Anvin
@ 2014-03-24 20:19 ` Marek Vasut
2014-03-25 12:44 ` Herbert Xu
6 siblings, 1 reply; 10+ messages in thread
From: Marek Vasut @ 2014-03-24 20:19 UTC (permalink / raw)
To: Mathias Krause
Cc: Herbert Xu, David S. Miller, linux-crypto, Chandramouli Narayanan,
H. Peter Anvin
On Monday, March 24, 2014 at 05:10:36 PM, Mathias Krause wrote:
> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> disabled the AVX variant by introducing a flaw in the feature test. Fixed
> in patch 1.
>
> The alignment calculations of the AVX2 assembler implementation are
> questionable, too. Especially the page alignment of the stack pointer is
> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> code alignment is fixed.
>
> Please apply!
Nice,
Reviewed-by: Marek Vasut <marex@denx.de>
Best regards,
Marek Vasut
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
2014-03-24 17:29 ` chandramouli narayanan
@ 2014-03-25 7:55 ` Mathias Krause
0 siblings, 0 replies; 10+ messages in thread
From: Mathias Krause @ 2014-03-25 7:55 UTC (permalink / raw)
To: chandramouli narayanan
Cc: Herbert Xu, David S. Miller, linux-crypto@vger.kernel.org,
H. Peter Anvin, Marek Vasut
On 24 March 2014 18:29, chandramouli narayanan <mouli@linux.intel.com> wrote:
> On Mon, 2014-03-24 at 17:10 +0100, Mathias Krause wrote:
>> The recent addition of the AVX2 variant of the SHA1 hash function wrongly
>> disabled the AVX variant by introducing a flaw in the feature test. Fixed
>> in patch 1.
>>
>> The alignment calculations of the AVX2 assembler implementation are
>> questionable, too. Especially the page alignment of the stack pointer is
>> broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
>> code alignment is fixed.
>>
>> Please apply!
>>
>> Mathias Krause (3):
>> crypto: x86/sha1 - re-enable the AVX variant
>> crypto: x86/sha1 - fix stack alignment of AVX2 variant
>> crypto: x86/sha1 - reduce size of the AVX2 asm implementation
>>
>> arch/x86/crypto/sha1_avx2_x86_64_asm.S | 8 ++------
>> arch/x86/crypto/sha1_ssse3_glue.c | 26 ++++++++++++++++----------
>> 2 files changed, 18 insertions(+), 16 deletions(-)
>>
> Your fixes are the right on mark. I went through your patches and tested
> them and found to be correct.
Thanks for double-checking!
> Sorry for causing regression and missing alignment issues in the patches
> I submitted.
No problem with that. But as I'm not subscribed to the linux-crypto
mailing list I haven't seen your earlier submissions. Otherwise I
would have objected earlier. ;)
Thanks,
Mathias
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
2014-03-24 20:19 ` Marek Vasut
@ 2014-03-25 12:44 ` Herbert Xu
0 siblings, 0 replies; 10+ messages in thread
From: Herbert Xu @ 2014-03-25 12:44 UTC (permalink / raw)
To: Marek Vasut
Cc: Mathias Krause, David S. Miller, linux-crypto,
Chandramouli Narayanan, H. Peter Anvin
On Mon, Mar 24, 2014 at 09:19:44PM +0100, Marek Vasut wrote:
> On Monday, March 24, 2014 at 05:10:36 PM, Mathias Krause wrote:
> > The recent addition of the AVX2 variant of the SHA1 hash function wrongly
> > disabled the AVX variant by introducing a flaw in the feature test. Fixed
> > in patch 1.
> >
> > The alignment calculations of the AVX2 assembler implementation are
> > questionable, too. Especially the page alignment of the stack pointer is
> > broken in multiple ways. Fixed in patch 2. In patch 3 another issue for
> > code alignment is fixed.
> >
> > Please apply!
>
> Nice,
>
> Reviewed-by: Marek Vasut <marex@denx.de>
All applied. Thanks Mathias!
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2014-03-25 12:44 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-24 16:10 [PATCH 0/3] crypto: x86/sha1 - regression and other fixes Mathias Krause
2014-03-24 16:10 ` [PATCH 1/3] crypto: x86/sha1 - re-enable the AVX variant Mathias Krause
2014-03-24 16:10 ` [PATCH 2/3] crypto: x86/sha1 - fix stack alignment of AVX2 variant Mathias Krause
2014-03-24 16:10 ` [PATCH 3/3] crypto: x86/sha1 - reduce size of the AVX2 asm implementation Mathias Krause
2014-03-24 17:04 ` [PATCH 0/3] crypto: x86/sha1 - regression and other fixes H. Peter Anvin
2014-03-24 17:29 ` chandramouli narayanan
2014-03-25 7:55 ` Mathias Krause
2014-03-24 17:31 ` H. Peter Anvin
2014-03-24 20:19 ` Marek Vasut
2014-03-25 12:44 ` Herbert Xu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox