public inbox for linux-crypto@vger.kernel.org
From: Eric Biggers <ebiggers@kernel.org>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org, jason@zx2c4.com,
	linux-crypto@vger.kernel.org, arnd@arndb.de,
	herbert@gondor.apana.org.au
Subject: Re: [PATCH 2/3] crypto: crypto_xor - use unaligned accessors for aligned fast path
Date: Mon, 8 Oct 2018 20:47:25 -0700	[thread overview]
Message-ID: <20181009034725.GB718@sol.localdomain> (raw)
In-Reply-To: <20181008211554.5355-3-ard.biesheuvel@linaro.org>

Hi Ard,

On Mon, Oct 08, 2018 at 11:15:53PM +0200, Ard Biesheuvel wrote:
> On ARMv6 and later, we define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> because the ordinary load/store instructions (ldr, ldrh, ldrb) can
> tolerate any misalignment of the memory address. However, load/store
> double and load/store multiple instructions (ldrd, ldm) may still only
> be used on memory addresses that are 32-bit aligned, and so we have to
> use the CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS macro with care, or we
> may end up with a severe performance hit due to alignment traps that
> require fixups by the kernel.
> 
> Fortunately, the get_unaligned() accessors do the right thing: when
> building for ARMv6 or later, the compiler will emit unaligned accesses
> using the ordinary load/store instructions (but avoid the ones that
> require 32-bit alignment). When building for older ARM, those accessors
> will emit the appropriate sequence of ldrb/mov/orr instructions. And on
> architectures that can truly tolerate any kind of misalignment, the
> get_unaligned() accessors resolve to the leXX_to_cpup accessors that
> operate on aligned addresses.
> 
> So switch to the unaligned accessors for the aligned fast path. This
> will create the exact same code on architectures that can really
> tolerate any kind of misalignment, and generate code for ARMv6+ that
> avoids load/store instructions that trigger alignment faults.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  crypto/algapi.c         |  7 +++----
>  include/crypto/algapi.h | 11 +++++++++--
>  2 files changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/crypto/algapi.c b/crypto/algapi.c
> index 2545c5f89c4c..52ce3c5a0499 100644
> --- a/crypto/algapi.c
> +++ b/crypto/algapi.c
> @@ -988,11 +988,10 @@ void crypto_inc(u8 *a, unsigned int size)
>  	__be32 *b = (__be32 *)(a + size);
>  	u32 c;
>  
> -	if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) ||
> -	    IS_ALIGNED((unsigned long)b, __alignof__(*b)))
> +	if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS))
>  		for (; size >= 4; size -= 4) {
> -			c = be32_to_cpu(*--b) + 1;
> -			*b = cpu_to_be32(c);
> +			c = get_unaligned_be32(--b) + 1;
> +			put_unaligned_be32(c, b);
>  			if (likely(c))
>  				return;
>  		}
> diff --git a/include/crypto/algapi.h b/include/crypto/algapi.h
> index 4a5ad10e75f0..86267c232f34 100644
> --- a/include/crypto/algapi.h
> +++ b/include/crypto/algapi.h
> @@ -17,6 +17,8 @@
>  #include <linux/kernel.h>
>  #include <linux/skbuff.h>
>  
> +#include <asm/unaligned.h>
> +
>  /*
>   * Maximum values for blocksize and alignmask, used to allocate
>   * static buffers that are big enough for any combination of
> @@ -212,7 +214,9 @@ static inline void crypto_xor(u8 *dst, const u8 *src, unsigned int size)
>  		unsigned long *s = (unsigned long *)src;
>  
>  		while (size > 0) {
> -			*d++ ^= *s++;
> +			put_unaligned(get_unaligned(d) ^ get_unaligned(s), d);
> +			d++;
> +			s++;
>  			size -= sizeof(unsigned long);
>  		}
>  	} else {
> @@ -231,7 +235,10 @@ static inline void crypto_xor_cpy(u8 *dst, const u8 *src1, const u8 *src2,
>  		unsigned long *s2 = (unsigned long *)src2;
>  
>  		while (size > 0) {
> -			*d++ = *s1++ ^ *s2++;
> +			put_unaligned(get_unaligned(s1) ^ get_unaligned(s2), d);
> +			d++;
> +			s1++;
> +			s2++;
>  			size -= sizeof(unsigned long);
>  		}
>  	} else {
> -- 
> 2.11.0
> 
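[For readers unfamiliar with the accessors described above, here is a
user-space sketch of the byte-wise fallback that get_unaligned_be32() /
put_unaligned_be32() reduce to on strict-alignment architectures. The
function names are illustrative, not the kernel's:]

```c
#include <stdint.h>

/* Illustrative user-space sketch, not the kernel implementation:
 * on strict-alignment targets the portable fallback behind
 * get_unaligned_be32()/put_unaligned_be32() is individual byte
 * loads/stores combined with shifts, which are legal at any address. */
static uint32_t load_be32_any_align(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static void store_be32_any_align(uint8_t *p, uint32_t v)
{
	p[0] = v >> 24;
	p[1] = v >> 16;
	p[2] = v >> 8;
	p[3] = v;
}
```

[On targets where unaligned word accesses are cheap, the accessors
instead expand to a single (possibly unaligned) load or store; the
point of the patch is that neither expansion ever emits the ldrd/ldm
forms that require 32-bit alignment.]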

Doesn't __crypto_xor() have the same problem too?

- Eric
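[For context, the word-at-a-time XOR pattern under discussion can be
sketched in user-space C, with memcpy() standing in for
get_unaligned()/put_unaligned(); the function name and byte-wise tail
handling here are illustrative, not the kernel's __crypto_xor():]

```c
#include <string.h>

/* Illustrative sketch of the patched fast path: XOR word-at-a-time
 * using memcpy() in place of get_unaligned()/put_unaligned(), then
 * finish any remaining bytes one at a time. Compilers typically lower
 * the fixed-size memcpy() calls to single loads/stores on targets
 * where unaligned accesses are efficient. */
static void xor_words(unsigned char *dst, const unsigned char *src,
		      unsigned int size)
{
	while (size >= sizeof(unsigned long)) {
		unsigned long d, s;

		memcpy(&d, dst, sizeof(d));
		memcpy(&s, src, sizeof(s));
		d ^= s;
		memcpy(dst, &d, sizeof(d));
		dst += sizeof(d);
		src += sizeof(d);
		size -= sizeof(d);
	}
	while (size--)
		*dst++ ^= *src++;
}
```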


Thread overview: 11+ messages
2018-10-08 21:15 [PATCH 0/3] crypto: use unaligned accessors in aligned fast paths Ard Biesheuvel
2018-10-08 21:15 ` [PATCH 1/3] crypto: memneq - use unaligned accessors for aligned fast path Ard Biesheuvel
2018-10-09  3:34   ` Eric Biggers
2018-10-09  6:01     ` Ard Biesheuvel
2018-10-08 21:15 ` [PATCH 2/3] crypto: crypto_xor " Ard Biesheuvel
2018-10-09  3:47   ` Eric Biggers [this message]
2018-10-09  8:38     ` Ard Biesheuvel
2018-10-08 21:15 ` [PATCH 3/3] crypto: siphash - drop _aligned variants Ard Biesheuvel
2018-10-09  4:11   ` Jason A. Donenfeld
2018-10-09  5:59     ` Ard Biesheuvel
2018-10-09  6:10       ` Jeffrey Walton
