public inbox for linux-crypto@vger.kernel.org
 help / color / mirror / Atom feed
From: David Laight <David.Laight@ACULAB.COM>
To: 'Eric Biggers' <ebiggers@kernel.org>,
	Arvind Sankar <nivedita@alum.mit.edu>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	"linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH v2 6/6] crypto: lib/sha - Combine round constants and message schedule
Date: Thu, 22 Oct 2020 08:20:01 +0000	[thread overview]
Message-ID: <d272bd02d90343e8a92821ff457609f8@AcuMS.aculab.com> (raw)
In-Reply-To: <20201022043450.GC857@sol.localdomain>


From: Eric Biggers
> Sent: 22 October 2020 05:35
> 
> On Tue, Oct 20, 2020 at 04:39:57PM -0400, Arvind Sankar wrote:
> > Putting the round constants and the message schedule arrays together in
> > one structure saves one register, which can be a significant benefit on
> > register-constrained architectures. On x86-32 (tested on Broadwell
> > Xeon), this gives a 10% performance benefit.
> >
> > Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
> > Suggested-by: David Laight <David.Laight@ACULAB.COM>
> > ---
> >  lib/crypto/sha256.c | 49 ++++++++++++++++++++++++++-------------------
> >  1 file changed, 28 insertions(+), 21 deletions(-)
> >
> > diff --git a/lib/crypto/sha256.c b/lib/crypto/sha256.c
> > index 3a8802d5f747..985cd0560d79 100644
> > --- a/lib/crypto/sha256.c
> > +++ b/lib/crypto/sha256.c
> > @@ -29,6 +29,11 @@ static const u32 SHA256_K[] = {
> >  	0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
> >  };
> >
> > +struct KW {
> > +	u32 K[64];
> > +	u32 W[64];
> > +};
> 
> Note that this doubles the stack usage from 256 to 512 bytes.  That's pretty
> large for kernel code, especially when compiler options can increase the stack
> usage well beyond the "expected" value.
> 
> So unless this gives a big performance improvement on architectures other than
> 32-bit x86 (which people don't really care about these days), we probably
> shouldn't do this.

IIRC the gain came from an odd side effect - which can probably
be got (for some compiler versions) by other means.

> FWIW, it's possible to reduce the length of 'W' to 16 words by computing the
> next W value just before each round 16-63,

I was looking at that.
You'd need to do the first 16 rounds then rounds 17-63 in a second
loop to avoid the conditional.
The problem is that it needs too many registers.
You'd need registers for 16 W values, the 8 a-h and a few spare.

...

Looking closely each round is like:
        t1 = h + e1(e) + Ch(e, f, g) + 0x428a2f98 + W[0];
        t2 = e0(a) + Maj(a, b, c);
        h = t1 + t2;   // Not used for a few rounds
        d += t1;       // Needed next round
So only needs 4 of the state variables (e, f, g, h).
The next round uses d, e, f and g.
So with extreme care d and h can use the same register.
Although I'm not sure how you'd get the compiler to do it.
Possibly making state[] volatile (or using READ/WRITE_ONCE).
So the last two lines become:
        state[7] = t1 + t2;
        d = state[3] + t1;
That might stop the x86 (32bit) spilling registers.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


      reply	other threads:[~2020-10-22  8:20 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-20 20:39 [PATCH v2 0/6] crypto: lib/sha256 - cleanup/optimization Arvind Sankar
2020-10-20 20:39 ` [PATCH v2 1/6] crypto: Use memzero_explicit() for clearing state Arvind Sankar
2020-10-22  4:36   ` Eric Biggers
2020-10-23 15:39     ` Arvind Sankar
2020-10-23 15:56       ` Eric Biggers
2020-10-23 20:45         ` Herbert Xu
2020-10-23 21:53           ` Eric Biggers
2020-10-29  7:00             ` Herbert Xu
2020-10-20 20:39 ` [PATCH v2 2/6] crypto: lib/sha256 - Don't clear temporary variables Arvind Sankar
2020-10-22  4:58   ` Eric Biggers
2020-10-23  3:17     ` Arvind Sankar
2020-10-20 20:39 ` [PATCH v2 3/6] crypto: lib/sha256 - Clear W[] in sha256_update() instead of sha256_transform() Arvind Sankar
2020-10-22  4:59   ` Eric Biggers
2020-10-20 20:39 ` [PATCH v2 4/6] crypto: lib/sha256 - Unroll SHA256 loop 8 times intead of 64 Arvind Sankar
2020-10-22  5:02   ` Eric Biggers
2020-10-23  3:12     ` Arvind Sankar
2020-10-23  3:16       ` Herbert Xu
2020-10-20 20:39 ` [PATCH v2 5/6] crypto: lib/sha256 - Unroll LOAD and BLEND loops Arvind Sankar
2020-10-22  5:02   ` Eric Biggers
2020-10-20 20:39 ` [PATCH v2 6/6] crypto: lib/sha - Combine round constants and message schedule Arvind Sankar
2020-10-20 21:36   ` David Laight
2020-10-21 15:16     ` Arvind Sankar
2020-10-22  4:34   ` Eric Biggers
2020-10-22  8:20     ` David Laight [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d272bd02d90343e8a92821ff457609f8@AcuMS.aculab.com \
    --to=david.laight@aculab.com \
    --cc=davem@davemloft.net \
    --cc=ebiggers@kernel.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nivedita@alum.mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox