From: Eric Biggers <ebiggers@kernel.org>
To: Stefan Kanthak <stefan.kanthak@nexgo.de>
Cc: ardb@kernel.org, linux-crypto@vger.kernel.org
Subject: Re: [PATCH 1/2] crypto: s(h)aving 40+ bytes off arch/x86/crypto/sha256_ni_asm.S
Date: Tue, 9 Apr 2024 08:32:40 -0400 [thread overview]
Message-ID: <20240409123240.GB717@quark.localdomain> (raw)
In-Reply-To: <4D8C090A8BD54C05A97F57A0F640E94F@H270>
On Tue, Apr 09, 2024 at 12:23:13PM +0200, Stefan Kanthak wrote:
> "Eric Biggers" <ebiggers@kernel.org> wrote:
>
> > [+Cc linux-crypto]
> >
> > Please use reply-all so that the list gets included.
> >
> > On Mon, Apr 08, 2024 at 04:15:32PM +0200, Stefan Kanthak wrote:
> >> Hi Eric,
> >>
> >> > On Mon, Apr 08, 2024 at 11:26:52AM +0200, Stefan Kanthak wrote:
> >> >> Use shorter SSE2 instructions instead of some SSE4.1
> >> >> use short displacements into K256
> >> >>
> >> >> --- -/arch/x86/crypto/sha256_ni_asm.S
> >> >> +++ +/arch/x86/crypto/sha256_ni_asm.S
> >> >
> >> > Thanks! I'd like to benchmark this to see how it affects performance,
> >>
> >> Performance is NOT affected: if CPUs weren't superscalar, the patch might
> >> save 2 to 4 processor cycles as it replaces palignr/pblendw (slow) with
> >> punpck*qdq (fast and shorter)
> >>
> >> > but unfortunately this patch doesn't apply. It looks your email client
> >> > corrupted your patch by replacing tabs with spaces. Can you please use
> >> > 'git send-email' to send patches?
> >>
> >> I don't use git at all; I'll use cURL instead.
>
> [...]
>
> >> > Please make sure to run the crypto self-tests too.
> >>
> >> I can't, I don't use Linux at all; I just noticed that this function uses
> >> 4-byte displacements and palignr/pblendw instead of punpck?qdq after pshufd
> >>
> >> > The above is storing the two halves of the state in the wrong order.
> >>
> >> ARGH, you are right; I recognized it too, wanted to correct it, but was
> >> interrupted and forgot it after returning to the patch. Sorry.
> >
> > I'm afraid that if you don't submit a probably formatted and tested patch, your
> > patch can't be accepted. We can treat it as a suggestion, though since you're
> > sending actual code it would really help if it had your Signed-off-by.
>
> Treat is as suggestion.
All right. I'll send out a properly formatted and tested patch then. I'd also
like to convert the SHA-256 rounds to use macros, which would make the source
150 lines shorter (without changing the output). I'll probably do that first.
> I but wonder that in the past 9 years since Tim Chen submitted the SHA-NI code
> (which was copied umpteen times by others and included in their own code bases)
> nobody noticed/questioned (or if so, bothered to submit a patch like mine, that
> reduces the code size by 5%, upstream) why he used 16x "pshufd $14, %xmm0, %xmm0"
> instead of the 1 byte shorter "punpckhqdq %xmm0, %xmm0" or "psrldq $8, %xmm0"
> (which both MAY execute on more ports or faster than the shuffle instructions,
> depending on the micro-architecture), why he used 8x a 4-byte instead of a 1-byte
> displacement, or why he used "palignr/pblendw" instead of the shorter "punpck?qdq".
Not many people work on crypto assembly code, and x86 SIMD is especially
difficult because there are often multiple ways to do things that differ in
subtle ways such as instruction length and the execution ports used on different
models of CPU. I think your suggestions are good, so thanks for them.
>
> regards
> Stefan
>
> PS: aaaahhhh, you picked my suggestion up and applied it to the AES routine.
Yes, I realized that a similar optimization can apply to AES round keys, as they
can be indexed them from -112 through 112 instead of 0 through 224.
- Eric
next prev parent reply other threads:[~2024-04-09 12:32 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-08 9:26 [PATCH 1/2] crypto: s(h)aving 40+ bytes off arch/x86/crypto/sha256_ni_asm.S Stefan Kanthak
2024-04-08 12:37 ` Eric Biggers
[not found] ` <9088939CC5454139901CEDD97DAFB004@H270>
2024-04-08 15:18 ` Eric Biggers
2024-04-09 10:23 ` Stefan Kanthak
2024-04-09 12:32 ` Eric Biggers [this message]
2024-04-08 13:12 ` Eric Biggers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240409123240.GB717@quark.localdomain \
--to=ebiggers@kernel.org \
--cc=ardb@kernel.org \
--cc=linux-crypto@vger.kernel.org \
--cc=stefan.kanthak@nexgo.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox