public inbox for linux-crypto@vger.kernel.org
From: Eric Biggers <ebiggers@kernel.org>
To: Lukasz Stelmach <l.stelmach@samsung.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	linux-crypto@vger.kernel.org
Subject: Re: xor_blocks() assumptions
Date: Tue, 3 Jan 2023 23:46:44 -0800
Message-ID: <Y7Uu5GkxfrejPJXL@sol.localdomain>
In-Reply-To: <dleftjbknfoopx.fsf%l.stelmach@samsung.com>

On Tue, Jan 03, 2023 at 12:13:30PM +0100, Lukasz Stelmach wrote:
> > It also would be worth considering just optimizing crypto_xor() by
> > unrolling the word-at-a-time loop to 4x or so.
> 
> If I understand correctly the generic 8regs and 32regs implementations
> in include/asm-generic/xor.h are what you mean. Using xor_blocks() in
> crypto_xor() could enable them for free on architectures lacking SIMD or
> vector instructions.

I actually meant exactly what I said -- unrolling the word-at-a-time loop in
crypto_xor().  Not using xor_blocks().  Something like this:

diff --git a/include/crypto/algapi.h b/include/crypto/algapi.h
index 61b327206b557..c0b90f14cae18 100644
--- a/include/crypto/algapi.h
+++ b/include/crypto/algapi.h
@@ -167,7 +167,18 @@ static inline void crypto_xor(u8 *dst, const u8 *src, unsigned int size)
 		unsigned long *s = (unsigned long *)src;
 		unsigned long l;
 
-		while (size > 0) {
+		while (size >= 4 * sizeof(unsigned long)) {
+			l = get_unaligned(d) ^ get_unaligned(s++);
+			put_unaligned(l, d++);
+			l = get_unaligned(d) ^ get_unaligned(s++);
+			put_unaligned(l, d++);
+			l = get_unaligned(d) ^ get_unaligned(s++);
+			put_unaligned(l, d++);
+			l = get_unaligned(d) ^ get_unaligned(s++);
+			put_unaligned(l, d++);
+			size -= 4 * sizeof(unsigned long);
+		}
+		while (size > 0) {
 			l = get_unaligned(d) ^ get_unaligned(s++);
 			put_unaligned(l, d++);
 			size -= sizeof(unsigned long);

Actually, the compiler might unroll the loop automatically anyway, so the above
change might not even be necessary.  The point is, I expect that a proper
scalar implementation will perform well for pretty much anything other than
large input sizes.
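
To make the idea concrete outside the kernel, here is a minimal userspace sketch
of a 4x-unrolled word-at-a-time XOR.  It is not the kernel's crypto_xor(): the
names xor_word(), xor_unrolled() and xor_unrolled_selftest() are made up for
illustration, and memcpy() stands in for get_unaligned()/put_unaligned().  It
also accepts lengths that are not a multiple of the word size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * XOR one machine word of src into dst.  memcpy() keeps the unaligned
 * accesses well-defined; it plays the role of get_unaligned() and
 * put_unaligned() in this userspace sketch.
 */
static inline void xor_word(unsigned char *dst, const unsigned char *src)
{
	unsigned long d, s;

	memcpy(&d, dst, sizeof(d));
	memcpy(&s, src, sizeof(s));
	d ^= s;
	memcpy(dst, &d, sizeof(d));
}

/* dst ^= src over size bytes: 4 words per iteration, then words, then bytes. */
static void xor_unrolled(unsigned char *dst, const unsigned char *src,
			 size_t size)
{
	const size_t w = sizeof(unsigned long);

	while (size >= 4 * w) {
		xor_word(dst, src);
		xor_word(dst + w, src + w);
		xor_word(dst + 2 * w, src + 2 * w);
		xor_word(dst + 3 * w, src + 3 * w);
		dst += 4 * w;
		src += 4 * w;
		size -= 4 * w;
	}
	while (size >= w) {	/* up to three leftover words */
		xor_word(dst, src);
		dst += w;
		src += w;
		size -= w;
	}
	while (size > 0) {	/* trailing bytes */
		*dst++ ^= *src++;
		size--;
	}
}

/* Compare against a byte-at-a-time reference for every length up to 128. */
static void xor_unrolled_selftest(void)
{
	unsigned char dst[130], ref[130], src[130];
	size_t len, i;

	for (len = 0; len <= 128; len++) {
		for (i = 0; i < sizeof(dst); i++) {
			dst[i] = ref[i] = (unsigned char)(i * 7 + 1);
			src[i] = (unsigned char)(i * 13 + 5);
		}
		for (i = 0; i < len; i++)
			ref[1 + i] ^= src[1 + i];
		xor_unrolled(dst + 1, src + 1, len);	/* deliberately misaligned */
		/* Whole-buffer compare also catches writes past the range. */
		assert(memcmp(dst, ref, sizeof(dst)) == 0);
	}
}
```

Whether the explicit unrolling beats what the compiler generates from the plain
loop would have to be measured on the target in question.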

It's only for large input sizes that xor_blocks() might be worth it, considering
the significant overhead of the indirect call in xor_blocks() as well as
entering a SIMD code section.  (Note that indirect calls are very expensive
these days, due to the speculative execution mitigations.)
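
The size-based trade-off could be sketched as a dispatcher that keeps small
inputs on an inline scalar path and only takes the indirect call for large ones.
Everything here is hypothetical: XOR_BULK_THRESHOLD is an arbitrary placeholder
rather than a measured cutoff, and xor_bulk_standin() merely stands in for a
SIMD routine reached through a function pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cutoff; a real value would have to come from benchmarks. */
#define XOR_BULK_THRESHOLD 512

typedef void (*xor_bulk_fn)(unsigned char *dst, const unsigned char *src,
			    size_t size);

static unsigned int bulk_calls;	/* counts trips through the indirect path */

/* Stand-in for a SIMD routine; here it is just a byte loop. */
static void xor_bulk_standin(unsigned char *dst, const unsigned char *src,
			     size_t size)
{
	bulk_calls++;
	while (size--)
		*dst++ ^= *src++;
}

static xor_bulk_fn xor_bulk = xor_bulk_standin;

/* Small inputs stay inline; only large ones pay for the indirect call. */
static void xor_dispatch(unsigned char *dst, const unsigned char *src,
			 size_t size)
{
	if (size >= XOR_BULK_THRESHOLD) {
		assert(xor_bulk != NULL);
		xor_bulk(dst, src, size);	/* indirect call */
		return;
	}
	while (size--)
		*dst++ ^= *src++;
}
```

With this shape, a 16-byte XOR never touches the function pointer, while a
1024-byte one goes through it exactly once.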

Of course, the real question is what real-world scenario are you actually trying
to optimize for...

- Eric

Thread overview: 5+ messages
     [not found] <CGME20230102224447eucas1p1dad1a2362030eee0d3890dd3546a1532@eucas1p1.samsung.com>
2023-01-02 22:44 ` xor_blocks() assumptions Lukasz Stelmach
2023-01-02 23:03   ` Eric Biggers
2023-01-03 11:13     ` Lukasz Stelmach
2023-01-03 14:01       ` Ard Biesheuvel
2023-01-04  7:46       ` Eric Biggers [this message]
