public inbox for linux-crypto@vger.kernel.org
From: Eric Biggers <ebiggers@kernel.org>
To: Nathan Huckleberry <nhuck@google.com>
Cc: linux-crypto@vger.kernel.org,
	Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	linux-arm-kernel@lists.infradead.org,
	Paul Crowley <paulcrowley@google.com>,
	Sami Tolvanen <samitolvanen@google.com>,
	Ard Biesheuvel <ardb@kernel.org>
Subject: Re: [PATCH v4 4/8] crypto: x86/aesni-xctr: Add accelerated implementation of XCTR
Date: Thu, 21 Apr 2022 15:29:10 -0700
Message-ID: <YmHatvKDOZ5z1A9I@sol.localdomain>
In-Reply-To: <CAJkfWY51QXF3Mf6jfNc54yRPhR5J9azyaXkkd2=x6Q-RkfdBsA@mail.gmail.com>

On Thu, Apr 21, 2022 at 04:59:31PM -0500, Nathan Huckleberry wrote:
> On Mon, Apr 18, 2022 at 7:13 PM Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > On Tue, Apr 12, 2022 at 05:28:12PM +0000, Nathan Huckleberry wrote:
> > > diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
> > > index 363699dd7220..ce17fe630150 100644
> > > --- a/arch/x86/crypto/aesni-intel_asm.S
> > > +++ b/arch/x86/crypto/aesni-intel_asm.S
> > > @@ -2821,6 +2821,76 @@ SYM_FUNC_END(aesni_ctr_enc)
> > >
> > >  #endif
> > >
> > > +#ifdef __x86_64__
> > > +/*
> > > + * void aesni_xctr_enc(struct crypto_aes_ctx *ctx, const u8 *dst, u8 *src,
> > > + *                 size_t len, u8 *iv, int byte_ctr)
> > > + */
> > > +SYM_FUNC_START(aesni_xctr_enc)
> > > +     FRAME_BEGIN
> > > +     cmp $16, LEN
> > > +     jb .Lxctr_ret
> > > +     shr     $4, %arg6
> > > +     movq %arg6, CTR
> > > +     mov 480(KEYP), KLEN
> > > +     movups (IVP), IV
> > > +     cmp $64, LEN
> > > +     jb .Lxctr_enc_loop1
> > > +.align 4
> > > +.Lxctr_enc_loop4:
> > > +     movaps IV, STATE1
> > > +     vpaddq ONE(%rip), CTR, CTR
> > > +     vpxor CTR, STATE1, STATE1
> > > +     movups (INP), IN1
> > > +     movaps IV, STATE2
> > > +     vpaddq ONE(%rip), CTR, CTR
> > > +     vpxor CTR, STATE2, STATE2
> > > +     movups 0x10(INP), IN2
> > > +     movaps IV, STATE3
> > > +     vpaddq ONE(%rip), CTR, CTR
> > > +     vpxor CTR, STATE3, STATE3
> > > +     movups 0x20(INP), IN3
> > > +     movaps IV, STATE4
> > > +     vpaddq ONE(%rip), CTR, CTR
> > > +     vpxor CTR, STATE4, STATE4
> > > +     movups 0x30(INP), IN4
> > > +     call _aesni_enc4
> > > +     pxor IN1, STATE1
> > > +     movups STATE1, (OUTP)
> > > +     pxor IN2, STATE2
> > > +     movups STATE2, 0x10(OUTP)
> > > +     pxor IN3, STATE3
> > > +     movups STATE3, 0x20(OUTP)
> > > +     pxor IN4, STATE4
> > > +     movups STATE4, 0x30(OUTP)
> > > +     sub $64, LEN
> > > +     add $64, INP
> > > +     add $64, OUTP
> > > +     cmp $64, LEN
> > > +     jge .Lxctr_enc_loop4
> > > +     cmp $16, LEN
> > > +     jb .Lxctr_ret
> > > +.align 4
> > > +.Lxctr_enc_loop1:
> > > +     movaps IV, STATE
> > > +     vpaddq ONE(%rip), CTR, CTR
> > > +     vpxor CTR, STATE1, STATE1
> > > +     movups (INP), IN
> > > +     call _aesni_enc1
> > > +     pxor IN, STATE
> > > +     movups STATE, (OUTP)
> > > +     sub $16, LEN
> > > +     add $16, INP
> > > +     add $16, OUTP
> > > +     cmp $16, LEN
> > > +     jge .Lxctr_enc_loop1
> > > +.Lxctr_ret:
> > > +     FRAME_END
> > > +     RET
> > > +SYM_FUNC_END(aesni_xctr_enc)
> > > +
> > > +#endif
> >
> > Sorry, I missed this file.  This is the non-AVX version, right?  That means that
> > AVX instructions, i.e. basically any instruction starting with "v", can't
> > be used here.  So the above isn't going to work.  (There might be a way to test
> > this with QEMU; maybe --cpu-type=Nehalem without --enable-kvm?)
> >
> > You could rewrite this without using AVX instructions.  However, polyval-clmulni
> > is broken in the same way; it uses AVX instructions without checking whether
> > they are available.  But your patchset doesn't aim to provide a non-AVX polyval
> > implementation at all.  So even if you got the non-AVX XCTR working, it wouldn't
> > be paired with an accelerated polyval.
> >
> > So I think you should just not provide non-AVX versions for now.  That would
> > mean:
> >
> >         1.) Drop the change to aesni-intel_asm.S
> >         2.) Don't register the AES XCTR algorithm unless AVX is available
> >             (in addition to AES-NI)
> 
> Is there a preferred way to conditionally register xctr? It looks like
> aesni-intel_glue.c registers a default implementation for all the
> algorithms in the array, then better versions are enabled depending on
> cpu features. Should I remove xctr from the list of other algorithms
> and register it separately?
> 

Yes, it will need to be removed from the aesni_skciphers array.  I don't see any
other algorithms in that file that are conditional on AES-NI && AVX, so it will
have to go by itself.

- Eric
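
The registration split discussed above can be pictured with a small userspace
model. This is only a sketch under stated assumptions, not the code in
aesni-intel_glue.c: struct cpu_features, register_alg(), is_registered(),
base_algs and aesni_model_init() are all hypothetical names invented for
illustration, and the algorithm strings are placeholders. It shows the shape of
the change: the unconditional list no longer contains XCTR, and XCTR is
registered on its own only when both AES-NI and AVX are present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ALGS 8

/* Hypothetical stand-in for the CPU feature checks done at module init. */
struct cpu_features {
	bool aesni;
	bool avx;
};

/* Toy registry; the real kernel crypto API is not modelled here. */
static const char *registered[MAX_ALGS];
static size_t num_registered;

static int register_alg(const char *name)
{
	if (num_registered >= MAX_ALGS)
		return -1;
	registered[num_registered++] = name;
	return 0;
}

static bool is_registered(const char *name)
{
	for (size_t i = 0; i < num_registered; i++)
		if (strcmp(registered[i], name) == 0)
			return true;
	return false;
}

/* Algorithms that only need AES-NI; XCTR is deliberately not in this list. */
static const char *const base_algs[] = { "ecb(aes)", "cbc(aes)", "ctr(aes)" };

static int aesni_model_init(const struct cpu_features *cpu)
{
	assert(cpu != NULL);
	num_registered = 0;

	if (!cpu->aesni)
		return -1;	/* nothing is registered without AES-NI */

	for (size_t i = 0; i < sizeof(base_algs) / sizeof(base_algs[0]); i++)
		if (register_alg(base_algs[i]) != 0)
			return -1;

	/* XCTR goes by itself: it requires AVX in addition to AES-NI. */
	if (cpu->avx && register_alg("xctr(aes)") != 0)
		return -1;

	return 0;
}
```

In this model, a CPU with AES-NI but no AVX ends up with the base algorithms
only, while a CPU with both also gets "xctr(aes)". An unregister path would
need the same condition so that it only tears down what was registered.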
