Linux cryptographic layer development
 help / color / mirror / Atom feed
From: Josh Poimboeuf <jpoimboe@redhat.com>
To: Eric Biggers <ebiggers3@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Mathias Krause <minipli@googlemail.com>,
	Chandramouli Narayanan <mouli@linux.intel.com>,
	Jussi Kivilinna <jussi.kivilinna@iki.fi>,
	Peter Zijlstra <peterz@infradead.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	linux-crypto@vger.kernel.org, Eric Biggers <ebiggers@google.com>,
	Andy Lutomirski <luto@kernel.org>, Jiri Slaby <jslaby@suse.cz>
Subject: Re: [PATCH 00/12] x86/crypto: Fix RBP usage in several crypto .S files
Date: Wed, 13 Sep 2017 16:24:28 -0500	[thread overview]
Message-ID: <20170913212428.kibwbqs2f7dkeslb@treble> (raw)
In-Reply-To: <20170908175705.GA623@zzz.localdomain>

On Fri, Sep 08, 2017 at 10:57:05AM -0700, Eric Biggers wrote:
> On Thu, Sep 07, 2017 at 11:26:47PM +0200, Ingo Molnar wrote:
> > 
> > * Eric Biggers <ebiggers3@gmail.com> wrote:
> > 
> > > On Thu, Sep 07, 2017 at 09:15:34AM +0200, Ingo Molnar wrote:
> > > > 
> > > > * Eric Biggers <ebiggers3@gmail.com> wrote:
> > > > 
> > > > > Thanks for fixing these!  I don't have time to review these in detail, but I ran
> > > > > the crypto self-tests on the affected algorithms, and they all pass.  I also
> > > > > benchmarked them before and after; the only noticable performance difference was
> > > > > that sha256-avx2 and sha512-avx2 became a few percent slower.  I don't suppose
> > > > > there is a way around that?  Otherwise it's probably not a big deal.
> > > > 
> > > > Which CPU model did you use for the test?
> > > > 
> > > > Thanks,
> > > > 
> > > > 	Ingo
> > > 
> > > This was on Haswell, "Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz".
> > 
> > Any chance to test this with the latest microarchitecture - any Skylake derivative 
> > Intel CPU you have access to?
> > 
> > Thanks,
> > 
> > 	Ingo
> 
> Tested with Skylake, "Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz".  The results
> were the following which seemed a bit worse than Haswell:
> 
> 	sha256-avx2 became 3.5% slower
> 	sha512-avx2 became 7.5% slower
> 
> Note: it's tricky to benchmark this, especially with just a few percent
> difference, so don't read too much into the exact numbers.

Here's a v2 for the sha256-avx2 patch, would you mind seeing if this
closes the performance gap?

I'm still looking at the other one (sha512-avx2), but so far I haven't
found a way to speed it back up.

From: Josh Poimboeuf <jpoimboe@redhat.com>
Subject: [PATCH] x86/crypto: Fix RBP usage in sha256-avx2-asm.S

Using RBP as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.

There's no need to use RBP as a temporary register for the TBL value,
because it always stores the same value: the address of the K256 table.
Instead just reference the address of K256 directly.

Reported-by: Eric Biggers <ebiggers@google.com>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/crypto/sha256-avx2-asm.S | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/arch/x86/crypto/sha256-avx2-asm.S b/arch/x86/crypto/sha256-avx2-asm.S
index 89c8f09787d2..1420db15dcdd 100644
--- a/arch/x86/crypto/sha256-avx2-asm.S
+++ b/arch/x86/crypto/sha256-avx2-asm.S
@@ -98,8 +98,6 @@ d	= %r8d
 e       = %edx	# clobbers NUM_BLKS
 y3	= %esi	# clobbers INP
 
-
-TBL	= %rbp
 SRND	= CTX	# SRND is same register as CTX
 
 a = %eax
@@ -531,7 +529,6 @@ STACK_SIZE	= _RSP      + _RSP_SIZE
 ENTRY(sha256_transform_rorx)
 .align 32
 	pushq	%rbx
-	pushq	%rbp
 	pushq	%r12
 	pushq	%r13
 	pushq	%r14
@@ -568,8 +565,6 @@ ENTRY(sha256_transform_rorx)
 	mov	CTX, _CTX(%rsp)
 
 loop0:
-	lea     K256(%rip), TBL
-
 	## Load first 16 dwords from two blocks
 	VMOVDQ	0*32(INP),XTMP0
 	VMOVDQ	1*32(INP),XTMP1
@@ -597,19 +592,19 @@ last_block_enter:
 
 .align 16
 loop1:
-	vpaddd	0*32(TBL, SRND), X0, XFER
+	vpaddd	K256+0*32(SRND), X0, XFER
 	vmovdqa XFER, 0*32+_XFER(%rsp, SRND)
 	FOUR_ROUNDS_AND_SCHED	_XFER + 0*32
 
-	vpaddd	1*32(TBL, SRND), X0, XFER
+	vpaddd	K256+1*32(SRND), X0, XFER
 	vmovdqa XFER, 1*32+_XFER(%rsp, SRND)
 	FOUR_ROUNDS_AND_SCHED	_XFER + 1*32
 
-	vpaddd	2*32(TBL, SRND), X0, XFER
+	vpaddd	K256+2*32(SRND), X0, XFER
 	vmovdqa XFER, 2*32+_XFER(%rsp, SRND)
 	FOUR_ROUNDS_AND_SCHED	_XFER + 2*32
 
-	vpaddd	3*32(TBL, SRND), X0, XFER
+	vpaddd	K256+3*32(SRND), X0, XFER
 	vmovdqa XFER, 3*32+_XFER(%rsp, SRND)
 	FOUR_ROUNDS_AND_SCHED	_XFER + 3*32
 
@@ -619,10 +614,11 @@ loop1:
 
 loop2:
 	## Do last 16 rounds with no scheduling
-	vpaddd	0*32(TBL, SRND), X0, XFER
+	vpaddd	K256+0*32(SRND), X0, XFER
 	vmovdqa XFER, 0*32+_XFER(%rsp, SRND)
 	DO_4ROUNDS	_XFER + 0*32
-	vpaddd	1*32(TBL, SRND), X1, XFER
+
+	vpaddd	K256+1*32(SRND), X1, XFER
 	vmovdqa XFER, 1*32+_XFER(%rsp, SRND)
 	DO_4ROUNDS	_XFER + 1*32
 	add	$2*32, SRND
@@ -676,9 +672,6 @@ loop3:
 	ja	done_hash
 
 do_last_block:
-	#### do last block
-	lea	K256(%rip), TBL
-
 	VMOVDQ	0*16(INP),XWORD0
 	VMOVDQ	1*16(INP),XWORD1
 	VMOVDQ	2*16(INP),XWORD2
@@ -718,7 +711,6 @@ done_hash:
 	popq	%r14
 	popq	%r13
 	popq	%r12
-	popq	%rbp
 	popq	%rbx
 	ret
 ENDPROC(sha256_transform_rorx)
-- 
2.13.5

  reply	other threads:[~2017-09-13 21:24 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-29 18:05 [PATCH 00/12] x86/crypto: Fix RBP usage in several crypto .S files Josh Poimboeuf
2017-08-29 18:05 ` [PATCH 01/12] x86/crypto: Fix RBP usage in blowfish-x86_64-asm_64.S Josh Poimboeuf
2017-08-29 18:05 ` [PATCH 02/12] x86/crypto: Fix RBP usage in camellia-x86_64-asm_64.S Josh Poimboeuf
2017-08-29 18:05 ` [PATCH 03/12] x86/crypto: Fix RBP usage in cast5-avx-x86_64-asm_64.S Josh Poimboeuf
2017-08-29 18:05 ` [PATCH 04/12] x86/crypto: Fix RBP usage in cast6-avx-x86_64-asm_64.S Josh Poimboeuf
2017-08-29 18:05 ` [PATCH 05/12] x86/crypto: Fix RBP usage in des3_ede-asm_64.S Josh Poimboeuf
2017-08-29 18:05 ` [PATCH 06/12] x86/crypto: Fix RBP usage in sha1_avx2_x86_64_asm.S Josh Poimboeuf
2017-09-06 16:11   ` Tim Chen
2017-08-29 18:05 ` [PATCH 07/12] x86/crypto: Fix RBP usage in sha1_ssse3_asm.S Josh Poimboeuf
2017-08-29 18:05 ` [PATCH 08/12] x86/crypto: Fix RBP usage in sha256-avx-asm.S Josh Poimboeuf
2017-08-29 18:05 ` [PATCH 09/12] x86/crypto: Fix RBP usage in sha256-avx2-asm.S Josh Poimboeuf
2017-08-29 18:05 ` [PATCH 10/12] x86/crypto: Fix RBP usage in sha256-ssse3-asm.S Josh Poimboeuf
2017-08-29 18:05 ` [PATCH 11/12] x86/crypto: Fix RBP usage in sha512-avx2-asm.S Josh Poimboeuf
2017-08-29 18:05 ` [PATCH 12/12] x86/crypto: Fix RBP usage in twofish-avx-x86_64-asm_64.S Josh Poimboeuf
2017-09-02  0:09 ` [PATCH 00/12] x86/crypto: Fix RBP usage in several crypto .S files Eric Biggers
2017-09-07  0:15   ` Josh Poimboeuf
2017-09-07  7:15   ` Ingo Molnar
2017-09-07 17:58     ` Eric Biggers
2017-09-07 21:26       ` Ingo Molnar
2017-09-08 17:57         ` Eric Biggers
2017-09-13 21:24           ` Josh Poimboeuf [this message]
2017-09-13 22:33             ` Josh Poimboeuf
2017-09-15  4:54               ` Eric Biggers
2017-09-15  5:34                 ` Ingo Molnar
2017-09-15 16:07                   ` Eric Biggers
2017-09-15 21:06                     ` Ingo Molnar
2017-09-19  3:00                       ` Herbert Xu
2017-09-14  9:16             ` Ingo Molnar
2017-09-14  9:28               ` Ingo Molnar
2017-09-14 13:28               ` Josh Poimboeuf
2017-09-15  5:37                 ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170913212428.kibwbqs2f7dkeslb@treble \
    --to=jpoimboe@redhat.com \
    --cc=davem@davemloft.net \
    --cc=ebiggers3@gmail.com \
    --cc=ebiggers@google.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=jslaby@suse.cz \
    --cc=jussi.kivilinna@iki.fi \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=minipli@googlemail.com \
    --cc=mouli@linux.intel.com \
    --cc=peterz@infradead.org \
    --cc=tim.c.chen@linux.intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox